
BelgaLogos Dataset

Content-based retrieval of logos and trademarks in large collections of natural images is of high interest for many applications, including dissemination impact evaluation, detection of prohibited or suspicious logos, and automatic annotation. Trademark recognition has been widely addressed by the pattern recognition community over the last decades, but surprisingly few works deal with collections of natural images. The BelgaLogos dataset was created specifically for this purpose within the scope of the European project VITALAS, with a major contribution from the BELGA press agency.


Images

The images of the BelgaLogos dataset were provided by, and are copyrighted by, the BELGA press agency. They are freely available for research purposes only. The dataset is composed of 10,000 images covering all aspects of life and current affairs: politics and economics, finance and social affairs, sports, culture and personalities. All images are in JPEG format and have been resized so that the maximum of height and width is 800 pixels, preserving the aspect ratio.
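The resizing rule can be illustrated with a small sketch. The function name `fit_within` is hypothetical, and the sketch assumes every image is scaled so that its longer side becomes exactly 800 pixels; whether smaller originals were upscaled is not stated here.

```python
from typing import Tuple


def fit_within(width: int, height: int, max_side: int = 800) -> Tuple[int, int]:
    """Scale (width, height) so the longer side equals max_side.

    Assumption: the same rule is applied to every image, including
    images whose longer side is already below max_side.
    """
    scale = max_side / max(width, height)
    # Round to whole pixels and never return a zero-sized side.
    return max(1, round(width * scale)), max(1, round(height * scale))


if __name__ == "__main__":
    print(fit_within(1600, 1200))  # landscape image
    print(fit_within(600, 1200))   # portrait image
```

Both sides are multiplied by the same factor, which is what preserves the aspect ratio.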

Download images


Annotations

The 10,000 images of the BelgaLogos dataset have been manually annotated. Two different groundtruths are provided: a global groundtruth and a local groundtruth.

Global groundtruth

In the global groundtruth, each image is labelled for each of 26 different logos with 1 if the logo is present in the image and with 0 if it is not. An image can contain one logo, several logos, or no logo at all. The localization of the logo is not provided for all 10,000 images, but only for the queries (see the Queries section). The annotated logos are listed in the table below. Logos whose bounding box has a height or width smaller than 10 pixels were not annotated.
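As a minimal sketch of this labelling scheme, the code below builds one 0/1 label per logo for an image and applies the 10-pixel size rule. The function names and the three-logo list are illustrative assumptions, and the sketch does not reflect the layout of the distributed groundtruth files.

```python
from typing import Iterable, List

# Hypothetical subset of the annotated logos, for illustration only.
LOGO_NAMES = ["Adidas", "Base", "Dexia"]


def label_vector(present: Iterable[str], logo_names: List[str]) -> List[int]:
    """Return one label per logo: 1 if the logo appears in the image, else 0."""
    present_set = set(present)
    return [1 if name in present_set else 0 for name in logo_names]


def is_annotatable(box_width: int, box_height: int, min_side: int = 10) -> bool:
    """A logo is annotated only if neither side of its box is below min_side."""
    return min(box_width, box_height) >= min_side


if __name__ == "__main__":
    print(label_vector(["Dexia"], LOGO_NAMES))  # image showing only Dexia
    print(label_vector([], LOGO_NAMES))         # image with no logo at all
    print(is_annotatable(9, 40))                # too small on one side
```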

Local groundtruth

In the local groundtruth, every instance of 37 different logos has been surrounded with a rectangular bounding box. An image can contain several bounding boxes. The annotated instances were then visually classified as "OK" or "junk" by a set of 3 users, according to how easily an instance can be recognized without the image context.

Logo name           #OK  #Junk  Total
Adidas              147    896   1043
Adidas-text          63    115    178
Airness              11    109    120
Base                162     86    248
BFGoodrich           86    222    308
Bik                  65    205    270
Bouygues             14     18     32
Bridgestone          31   Junk    105
Bridgestone-text     64     74    201
Carglass             18     47     65
Citroen              78    164    242
Citroen-text        197    134    331
CocaCola             40     33     73
Cofidis              45     45     90
Dexia               235    391    626
ELeclerc             15      5     20
Ferrari              77    136    213
Gucci                 2      2      4
Kia                 141    101    242
Mercedes             86    193    279
Nike                235   2007   2242
Peugeot               6      2      8
Puma                157    643    800
Puma-text            27     53     80
Quick                57    196    253
Reebok               18     48     66
Roche                 2      0      2
Shell               123    113    236
SNCF                  7      3     10
Std-Liege            98    283    381
StellaArtois         21      8     29
TNT                 102     81    183
Total                78     18     96
US-President         14      0     14
Umbro               153    506    659
Veolia               12     65     77
VRT                  10      8     18

Download local groundtruth


Queries

Three distinct pools of queries can be used for evaluation: Qset1, Qset2 and Qset3.

  • Qset1 is composed of 55 internal queries, each defined by an image name and the coordinates of the logo bounding box in this image. The most frequent logos in the dataset (see the table above) are represented by more queries than the less frequent ones. Queries targeting the same logo share the same root name followed by an incremental number (e.g. Adidas1, Adidas2, etc.).

    Download internal Qset1 queries

    Download internal Qset1 groundtruth


  • Qset2 is composed of 26 JPEG thumbnails downloaded from the first page of Google results after querying 'logo $logoname'. The logo illustrations accompanying the table above are resized versions of the 26 thumbnails composing Qset2.

    Download external Qset2 queries

    Download external Qset2 groundtruth


  • Qset3 is composed of 2697 internal queries, representing all the instances of the 37 logos annotated as "OK", each defined by an image name and the coordinates of the logo bounding box in this image. Queries targeting the same logo share the same root name followed by an incremental number (e.g. Adidas1, Adidas2, etc.).

    Download internal and local Qset3 queries

    Download internal and local Qset3 groundtruth
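The query naming convention (root name plus incremental number) makes it easy to group queries by target logo. The following sketch shows one way to do this; the helper names are hypothetical, and it assumes that no logo root name itself ends with a digit, which holds for the logos listed in the table above.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Shortest possible root followed by a trailing run of digits.
_QUERY_NAME = re.compile(r"^(?P<root>.+?)(?P<index>\d+)$")


def split_query_name(name: str) -> Tuple[str, int]:
    """Split a query name such as 'Adidas12' into ('Adidas', 12)."""
    match = _QUERY_NAME.match(name)
    if match is None:
        raise ValueError(f"not a '<root><number>' query name: {name!r}")
    return match.group("root"), int(match.group("index"))


def group_by_logo(names: Iterable[str]) -> Dict[str, List[str]]:
    """Map each logo root name to the list of queries targeting it."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name in names:
        root, _ = split_query_name(name)
        groups[root].append(name)
    return dict(groups)


if __name__ == "__main__":
    print(split_query_name("Adidas-text3"))
    print(group_by_logo(["Adidas1", "Adidas2", "Nike1"]))
```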



    Evaluation

    Evaluation Metric

    The primary metric used for the evaluation is the Mean Average Precision (MAP) over all queries of a given query set (Qset1, Qset2 or Qset3). Each query has to be searched independently of all other queries, even when a targeted logo is represented by several queries. The average precision is computed for each query, and the mean over the query set is computed afterwards.
    Secondary metrics can be used to study in detail the performance for each of the 26 logos. In this case, the Mean Average Precision has to be computed as the mean of the average precisions of all queries targeting the same logo (in a given query set).
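The two levels of averaging described above can be sketched as follows. This is an illustrative implementation of the usual non-interpolated average precision, not the code of the official evaluation tools; the function names are hypothetical, each ranked list is assumed to contain no duplicate images, and queries are assumed to follow the root-name-plus-number convention.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def average_precision(ranked: List[str], relevant: Iterable[str]) -> float:
    """Non-interpolated average precision of one ranked result list."""
    relevant_set = set(relevant)
    if not relevant_set:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, image in enumerate(ranked, start=1):
        if image in relevant_set:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant rank
    # Relevant images that are never retrieved contribute zero.
    return precision_sum / len(relevant_set)


def mean_average_precision(ap_by_query: Dict[str, float]) -> float:
    """Primary metric: mean of the per-query average precisions."""
    return sum(ap_by_query.values()) / len(ap_by_query)


def map_per_logo(ap_by_query: Dict[str, float]) -> Dict[str, float]:
    """Secondary metric: mean AP over the queries sharing a root name."""
    grouped: Dict[str, List[float]] = defaultdict(list)
    for query, ap in ap_by_query.items():
        logo = query.rstrip("0123456789")  # 'Adidas2' -> 'Adidas'
        grouped[logo].append(ap)
    return {logo: sum(aps) / len(aps) for logo, aps in grouped.items()}


if __name__ == "__main__":
    aps = {"Adidas1": 1.0, "Adidas2": 0.5, "Nike1": 0.25}
    print(mean_average_precision(aps))
    print(map_per_logo(aps))
```

Note that the primary metric weights every query equally, so logos represented by many queries have more influence on it than on the per-logo scores.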

    Evaluation software

    Qset1 and Qset2 are evaluated with trec_eval.
    Qset3 is evaluated with a dedicated tool (BelgaLogosEval) that uses the spatial position of the instances.
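For Qset1 and Qset2, trec_eval reads a results file in which each line has the form `query_id Q0 doc_id rank score run_tag`, and it is invoked with the relevance file first and the results file second. The sketch below formats the results of one query in that layout. The function name, the run tag and the use of image names as document identifiers are assumptions; whether the distributed groundtruth files can be passed to trec_eval unchanged should be checked against the package.

```python
from typing import Iterable, List, Tuple


def trec_run_lines(
    query_id: str,
    scored_images: Iterable[Tuple[str, float]],
    run_tag: str = "myrun",  # hypothetical run identifier
) -> List[str]:
    """Format one query's results as trec_eval results-file lines."""
    # Highest score first; rank numbering starts at 1.
    ranked = sorted(scored_images, key=lambda pair: pair[1], reverse=True)
    return [
        f"{query_id} Q0 {image} {rank} {score:.6f} {run_tag}"
        for rank, (image, score) in enumerate(ranked, start=1)
    ]


if __name__ == "__main__":
    # Image names below are made up for the example.
    for line in trec_run_lines("Adidas1", [("b.jpg", 0.2), ("a.jpg", 0.9)]):
        print(line)
```

Since every query must be searched independently, the lines of all queries are simply concatenated into one results file.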

    Download BelgaLogosEval
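Evaluating with spatial positions means that a returned region has to be compared with an annotated bounding box, not only with an image name. The matching criterion implemented in BelgaLogosEval is not described here, so the sketch below uses intersection over union (IoU) with a 0.5 threshold purely as an illustrative assumption. The `(x_min, y_min, x_max, y_max)` box layout and the function names are also assumptions.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed layout


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_spatial_match(found: Box, annotated: Box, threshold: float = 0.5) -> bool:
    """Illustrative criterion only: boxes match if their IoU reaches threshold."""
    return iou(found, annotated) >= threshold


if __name__ == "__main__":
    print(iou((0, 0, 10, 10), (5, 0, 15, 10)))
    print(is_spatial_match((0, 0, 10, 10), (1, 1, 10, 10)))
```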



    Download

    Download the full BelgaLogos package



    References

    All publications making use of the BelgaLogos dataset must include the following reference:

    Alexis Joly and Olivier Buisson, Logo retrieval with a contrario visual query expansion, In Proceedings of the 17th ACM International Conference on Multimedia, 2009.

    @inproceedings{belgalogos09,
    author = {Joly, Alexis and Buisson, Olivier},
    title = {Logo retrieval with a contrario visual query expansion},
    booktitle = {MM '09: Proceedings of the 17th ACM international conference on Multimedia},
    year = {2009},
    pages = {581--584},
    }

    If you use the local groundtruth or the Qset3 queries, you must include the following reference:

    Pierre Letessier, Olivier Buisson and Alexis Joly, Scalable mining of small visual objects, In Proceedings of the 20th ACM International Conference on Multimedia, 2012.

    @inproceedings{letessier2012scalable,
    title={Scalable mining of small visual objects},
    author={Letessier, Pierre and Buisson, Olivier and Joly, Alexis},
    booktitle={Proceedings of the 20th ACM international conference on Multimedia},
    pages={599--608},
    year={2012},
    organization={ACM}
    }


    Related publications

    Please send your publications related to BelgaLogos to belgalogos@inria.fr


    Contact

    Alexis Joly, alexis(dot)joly(at)inria.fr
    Pierre Letessier,


    INRIA - Rocquencourt - IMEDIA Project

    INRIA   - updated July 3, 2012