FlickrBelgaLogos Dataset
Content-based retrieval of logos and trademarks in large natural image collections is of high interest for many applications, including dissemination impact evaluation, detection of prohibited or suspicious logos, and automatic annotation. Trademark recognition has been widely addressed by the pattern recognition community over the last decades, but surprisingly few works deal with natural image collections. The FlickrBelgaLogos dataset was created specifically for this purpose within the French ANR project OTMedia, using the logos from the BelgaLogos dataset.
Images
Evaluating the accuracy of object discovery and mining algorithms is more challenging than evaluating object retrieval with a predefined set of queries. It requires a complete groundtruth covering all repeated objects of the dataset, with the precise location of all their instances. No previous evaluation dataset met these objectives, so one contribution of our paper (see References) was to build one.
We first extended the image-level groundtruth of the BelgaLogos dataset by manually annotating the bounding boxes of all instances of the 37 targeted logos (correcting a few errors along the way). The 9842 annotated instances were then visually classified as kept or rejected by 3 users, depending on whether or not all of them could recognize the instance with confidence after it had been cropped from its image. After this step, only 2695 instances were classified as kept.
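The kept/rejected rule described above amounts to a unanimous vote among the annotators. A minimal sketch of that rule follows; the instance identifiers, the vote representation (one boolean per user) and the function names are hypothetical and only serve the illustration, they are not taken from the dataset files.

```python
def is_kept(votes):
    """Return True only if every annotator recognized the cropped instance."""
    # An instance with no votes at all is treated as rejected (assumption).
    return len(votes) > 0 and all(votes)


def filter_instances(instances):
    """Keep the identifiers of instances recognized by all annotators."""
    return [name for name, votes in instances if is_kept(votes)]


# Hypothetical instances, each with the votes of 3 users.
instances = [
    ("nike_0001", (True, True, True)),       # recognized by everyone: kept
    ("nike_0002", (True, False, True)),      # one user unsure: rejected
    ("adidas_0001", (False, False, False)),  # nobody recognized it: rejected
]
kept = filter_instances(instances)
```

A single negative vote is enough to reject an instance, which is consistent with only 2695 of the 9842 annotated instances being kept.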
This extended annotation is, however, not sufficient to evaluate the precision of object mining algorithms. Besides the 37 logos, other objects also occur several times in the dataset (including other logos, buildings, faces, near duplicates, etc.), and they would be counted as false positives when detected. We therefore decided to create a new synthetic dataset by cutting and pasting the cropped logos of BelgaLogos II into a dataset of 10K distractor images crawled from Flickr. To reduce the probability of finding repeated objects in the distractors, all images come from distinct users and distinct geographic areas. The BelgaLogos instances were then pasted without any modification (no rotation, scaling, etc.) at random positions in the distractors. Here are some thumbnail examples:
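The cut-and-paste step described in this section can be sketched as follows. This is only an illustration under stated assumptions, not the script used to build the dataset: images are represented as plain lists of pixel rows, the position is drawn uniformly, and the logo is assumed to fit entirely inside the distractor. The function name `paste_logo` is hypothetical.

```python
import random


def paste_logo(distractor, logo, rng):
    """Paste `logo` unmodified into a copy of `distractor` at a random position.

    Both images are lists of rows of pixel values. Returns the new image and
    the bounding box (x, y, width, height) of the pasted instance.
    """
    img_h, img_w = len(distractor), len(distractor[0])
    logo_h, logo_w = len(logo), len(logo[0])
    if logo_h > img_h or logo_w > img_w:
        raise ValueError("logo does not fit inside the distractor image")
    # Assumption: the logo must lie entirely inside the distractor.
    x = rng.randint(0, img_w - logo_w)
    y = rng.randint(0, img_h - logo_h)
    out = [row[:] for row in distractor]  # leave the distractor untouched
    for dy, logo_row in enumerate(logo):
        # No rotation or scaling: pixels are copied as they are.
        out[y + dy][x:x + logo_w] = logo_row
    return out, (x, y, logo_w, logo_h)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
distractor = [[0] * 8 for _ in range(6)]  # toy 8x6 background
logo = [[1, 2, 3], [4, 5, 6]]             # toy 3x2 logo crop
image, bbox = paste_logo(distractor, logo, rng)
```

Because the paste position is chosen by the generator, the returned bounding box gives an exact groundtruth location for every pasted instance.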
Annotations
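The 10,000 images of the BelgaLogos dataset have been manually annotated. Every instance of the 37 different logos has been surrounded with a rectangular bounding box. In the table below, #OK counts the instances classified as kept after the visual check described above (2695 in total) and #Junk counts the rejected ones.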
Logo name | Illustration | #OK | #Junk | Total |
Adidas | | 147 | 896 | 1043 |
Adidas-text | | 63 | 115 | 178 |
Airness | | 11 | 109 | 120 |
Base | | 162 | 86 | 248 |
BFGoodrich | | 86 | 222 | 308 |
Bik | | 65 | 205 | 270 |
Bouygues | | 14 | 18 | 32 |
Bridgestone | | 31 | 74 | 105 |
Bridgestone-text | | 64 | 137 | 201 |
Carglass | | 18 | 47 | 65 |
Citroen | | 78 | 164 | 242 |
Citroen-text | | 197 | 134 | 331 |
CocaCola | | 40 | 33 | 73 |
Cofidis | | 45 | 45 | 90 |
Dexia | | 235 | 391 | 626 |
ELeclerc | | 15 | 5 | 20 |
Ferrari | | 77 | 136 | 213 |
Gucci | | 2 | 2 | 4 |
Kia | | 141 | 101 | 242 |
Mercedes | | 86 | 193 | 279 |
Nike | | 235 | 2007 | 2242 |
Peugeot | | 5 | 2 | 7 |
Puma | | 157 | 643 | 800 |
Puma-text | | 27 | 53 | 80 |
Quick | | 57 | 196 | 253 |
Reebok | | 18 | 48 | 66 |
Roche | | 2 | 0 | 2 |
Shell | | 123 | 113 | 236 |
SNCF | | 7 | 3 | 10 |
Std-Liege | | 98 | 283 | 381 |
StellaArtois | | 20 | 8 | 28 |
TNT | | 102 | 81 | 183 |
Total | | 78 | 18 | 96 |
US-President | | 14 | 0 | 14 |
Umbro | | 153 | 506 | 659 |
Veolia | | 12 | 65 | 77 |
VRT | | 10 | 8 | 18 |
Download
To download the FlickrBelgaLogos dataset, please send an email to
belgalogos@inria.fr with the following information:
References
All publications making use of the FlickrBelgaLogos dataset must include the following reference:
Pierre Letessier, Olivier Buisson, Alexis Joly, Scalable Mining of Small Visual Objects, In Proceedings of the 20th ACM International Conference on Multimedia, 2012.
@inproceedings{letessier12,
author = {Letessier, Pierre and Buisson, Olivier and Joly, Alexis},
title = {Scalable Mining of Small Visual Objects},
booktitle = {MM '12: Proceedings of the 20th ACM international conference on Multimedia},
year = {2012},
}
Related publications
Please send your publications related to BelgaLogos to
belgalogos@inria.fr
Alexis Joly, alexis(dot)joly(at)inria.fr
Pierre Letessier,