Works
http://www-sop.inria.fr/members/Alexis.Joly/loss.png |
[Trans. on Multimedia 2017]\\
[Author version] [Transactions on Multimedia]\\
[Author version] [Springer book]\\
[Springer version]\\
[Hal version]
A deep learning approach to Species Distribution Modelling
Species distribution models (SDM) are widely used for ecological research and conservation purposes. Given a set of species occurrences, the aim is to infer their spatial distribution over a given territory. Because of the limited number of occurrences of specimens, this is usually achieved through environmental niche modeling approaches, i.e. by predicting the distribution in geographic space on the basis of a mathematical representation of the known distribution in environmental space (the realized ecological niche). The environment is in most cases represented by climate data (such as temperature and precipitation), but other variables such as soil type or land cover can also be used. In this paper, we propose a deep learning approach to the problem in order to improve predictive effectiveness. Non-linear prediction models have been of interest for SDM for more than a decade, but our study is the first to bring empirical evidence that deep, convolutional and multilabel models can help resolve the limitations of SDM. Indeed, the main challenge is that the realized ecological niche is often very different from the theoretical fundamental niche, due to the history of environmental perturbations, species propagation constraints and biotic interactions. Thus, the realized abundance in the environmental feature space can have a very irregular shape that is difficult to capture with classical models. Deep neural networks, on the other hand, have been shown to be able to learn complex non-linear transformations in a wide variety of domains. Moreover, spatial patterns in environmental variables often contain useful information for species distribution but are usually not considered in classical models. Our study shows empirically how convolutional neural networks efficiently use this information to improve prediction performance.
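The two ingredients highlighted in the abstract (convolution over spatial environmental patches, and a multilabel head giving one independent presence probability per species) can be caricatured in a few lines of NumPy. Everything here — the patch size, the random filters, the number of species — is purely illustrative, not the architecture of the paper:

```python
import numpy as np

def conv2d_valid(patch, kernel):
    """Naive 'valid' 2-D cross-correlation of one environmental raster with one kernel."""
    ph, pw = patch.shape
    kh, kw = kernel.shape
    out = np.empty((ph - kh + 1, pw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(patch[i:i + kh, j:j + kw] * kernel)
    return out

def predict_species(env_patch, kernels, weights, bias):
    """Map one environmental patch to independent per-species presence probabilities."""
    # Convolve then global max-pool: one scalar feature per filter.
    feats = np.array([conv2d_valid(env_patch, k).max() for k in kernels])
    logits = weights @ feats + bias          # one logit per species (multilabel head)
    return 1.0 / (1.0 + np.exp(-logits))     # independent sigmoids, not a softmax

rng = np.random.default_rng(0)
env_patch = rng.standard_normal((8, 8))      # e.g. a temperature raster around an occurrence
kernels = [rng.standard_normal((3, 3)) for _ in range(4)]
weights = rng.standard_normal((5, 4))        # 5 species, 4 convolutional features
probs = predict_species(env_patch, kernels, weights, np.zeros(5))
print(probs.shape)  # (5,)
```

The sigmoid (rather than softmax) output is what makes the model multilabel: several species can co-occur at the same location.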
[Hal version]\\
[Editor version (Springer)]\\
[Editor version]
[HDRThesis2015] HDR habilitation (highest French academic qualification) - defended on 26/05/2015 \\
https://images-na.ssl-images-amazon.com/images/I/519yVLqVRwL.jpg | This edited volume focuses on the latest and most impactful advancements of multimedia data globally available for environmental and earth biodiversity. The data reflects the status, behavior, change as well as human interests and concerns which are increasingly crucial for understanding environmental issues and phenomena. This volume addresses the need for the development of advanced methods, techniques and tools for collecting, managing, analyzing, understanding and modeling environmental & biodiversity data, including the automated or collaborative species identification, the species distribution modeling and their environment, such as the air quality or the bio-acoustic monitoring. Researchers and practitioners in multimedia and environmental topics will find the chapters essential to their continued studies. |
Large-scale Content-based Visual Information Retrieval
Multimedia Tools and Applications for Environmental & Biodiversity Informatics
https://media.springernature.com/w306/springer-static/cover-hires/book/978-3-319-76445-0.jpeg | Rather than restricting search to the use of metadata, content-based information retrieval methods attempt to index, search and browse digital objects by means of signatures or features describing their actual content. This thesis describes several of my works related to this domain. The different contributions are presented in a bottom-up fashion reflecting a typical three-tier software architecture of an end-to-end multimedia information retrieval system. The lowest layer is only concerned with managing, indexing and searching large sets of high-dimensional feature vectors. The middle layer rather works at the document level and is in charge of analyzing, indexing and searching collections of documents. The upper layer works at the applicative level and is in charge of providing useful and interactive functionalities to the end-user. |
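As a toy illustration of the lowest layer described above — managing and searching large sets of high-dimensional feature vectors — here is the exact brute-force baseline that large-scale indexing structures (hashing, quantization, etc.) are designed to approximate. The data and dimensions are made up:

```python
import numpy as np

def knn_search(index, query, k=3):
    """Exact k-nearest-neighbour search over L2-normalised feature vectors.

    With pre-normalised vectors, the dot product is the cosine similarity;
    this O(n*d) scan is the baseline that approximate indexing schemes trade
    a little accuracy against for large n.
    """
    sims = index @ query                 # cosine similarity against every indexed vector
    top = np.argsort(-sims)[:k]          # indices of the k highest similarities
    return top, sims[top]

rng = np.random.default_rng(42)
feats = rng.standard_normal((1000, 64))                      # 1000 indexed feature vectors
feats /= np.linalg.norm(feats, axis=1, keepdims=True)        # L2-normalise rows
ids, scores = knn_search(feats, feats[7], k=3)
print(ids[0])  # 7 — the query is its own nearest neighbour
```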
http://www-sop.inria.fr/members/Alexis.Joly/tpg.png |
Crowdsourcing Thousands of Specialized Labels: a Bayesian active training approach
In classical crowdsourcing frameworks, the labels correspond to well-known or easy-to-learn concepts, so that it is straightforward to train the annotators by giving a few examples with known answers. Neither is true when there are thousands of complex domain-specific labels. The originality of this work is to focus on annotations that usually require expert knowledge (such as plant species names, architectural styles, medical diagnostic tags, etc.). We consider that common knowledge is not sufficient to perform the task, but that anyone can be taught to recognize a small subset of domain-specific concepts. In such a context, it is best to take advantage of the various capabilities of each annotator through teaching (annotators can enhance their knowledge), assignment (annotators can be focused on tasks they have the knowledge to complete) and inference (different annotator propositions can be aggregated to enhance labeling quality). This work presents a set of theoretical contributions and data-driven algorithms that allow the crowdsourcing of thousands of specialized labels thanks to the pro-active training of the annotators. The framework relies on deep learning, variational Bayesian inference and task assignment to adapt to the skills of each annotator, both in the questions asked and in the weights given to their answers. The underlying judgements are Bayesian, based on adaptive priors. To run live experiments, the whole framework has been implemented as a serious game available on the web (www.theplantgame.com).
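A heavily simplified sketch of the inference step: each annotator's vote is weighted by an estimated skill, and a posterior over the true label follows from Bayes' rule. The skill model used here (correct with probability p, otherwise uniform over the other labels) is a caricature of the variational Bayesian model actually used in the paper; all numbers are illustrative:

```python
import numpy as np

def aggregate_votes(votes, skills, n_labels):
    """Posterior over the true label given independent annotator votes.

    votes:  list of (annotator_index, voted_label) pairs
    skills: per-annotator probability of answering correctly; wrong answers
            are assumed uniform over the remaining labels (a strong
            simplification of the adaptive-prior model).
    """
    log_post = np.full(n_labels, -np.log(n_labels))   # uniform prior over labels
    for a, vote in votes:
        p_correct = skills[a]
        p_wrong = (1.0 - p_correct) / (n_labels - 1)
        lik = np.full(n_labels, p_wrong)              # likelihood of this vote ...
        lik[vote] = p_correct                         # ... under each candidate truth
        log_post += np.log(lik)
    post = np.exp(log_post - log_post.max())          # normalise in a stable way
    return post / post.sum()

# Two skilled annotators agree on label 2; a weak one says label 0.
post = aggregate_votes([(0, 2), (1, 2), (2, 0)], skills=[0.9, 0.8, 0.4], n_labels=4)
print(post.argmax())  # 2
```

Task assignment then closes the loop: annotators are sent the questions their estimated skills make them most useful for, and their answers in turn refine the skill estimates.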
Our object mining and retrieval techniques were integrated within a visual-based media event detection system in the scope of a French project called the transmedia observatory. It allows the automatic discovery of the most circulated images across the main news media (news websites, press agencies, TV news and newspapers). The main originality of the detection is that it relies on transmedia contextual information to denoise the raw visual detections and consequently focus on the most salient transmedia events. This work was presented at the ACM Multimedia Grand Challenge 2012, where it received an award. The video presented during this event is available here:
[ICMR2011] [ACM-MM2012]
A PhD student of mine (Riadh Trad) worked on visual-based event retrieval and discovery in social data (Flickr images). He built a new event-record matching technique making use of both the visual content and the social context [pdf].
[ACM-MM2012] [[http://dl.acm.org/citation.cfm?id=1992049|ICMR2011]]
The data collected through this workflow is used each year since 2011 in the ImageCLEF and LifeCLEF evaluation campaigns that I am coordinating: http://www.imageclef.org/system/files/bannerImageCLEF2013PlantTaskMini.png
[EcologicalInformatics2013]\\
[EcologicalInformatics2014]\\
[ACM-MM2012]\\
[CVPR2011]
http://www-sop.inria.fr/members/Alexis.Joly/equation.png
http://www-sop.inria.fr/members/Alexis.Joly/archi.png | Rather than restricting search to the use of metadata, content-based information retrieval methods attempt to index, search and browse digital objects by means of signatures or features describing their actual content. This thesis describes several of my works related to this domain. The different contributions are presented in a bottom-up fashion reflecting a typical three-tier software architecture of an end-to-end multimedia information retrieval system. The lowest layer is only concerned with managing, indexing and searching large sets of high-dimensional feature vectors. The middle layer rather works at the document level and is in charge of analyzing, indexing and searching collections of documents. The upper layer works at the applicative level and is in charge of providing useful and interactive functionalities to the end-user. |
http://www-sop.inria.fr/members/Alexis.Joly/PlantNet.png | Pl@ntNet iPhone and Android app: an image sharing and retrieval application for the identification of plants. It is developed in the context of the Pl@ntNet project by scientists from four French research organisations (INRIA, Cirad, INRA, IRD) and the members of the Tela Botanica social network, with the financial support of Agropolis Fondation. Among other features, this free app helps identify plant species from photographs, through a visual search engine using several of my works (Large-scale matching, A posteriori multi-probe, RMMH). Pl@ntNet is now on Facebook and Twitter |
This paper introduces a new image representation relying on the spatial pooling of geometrically consistent visual matches. To this end, we introduce a new match kernel based on the inverse rank of the shared nearest neighbors, combined with local geometric constraints. To avoid overfitting and reduce processing costs, the dimensionality of the resulting over-complete representation is further reduced by hierarchically pooling the raw consistent matches according to their spatial position in the training images. The final image representation is obtained by concatenating the resulting feature vectors at several resolutions. Learning from these representations with a logistic regression classifier is shown to provide excellent fine-grained classification performance, outperforming the results reported in the literature on several classification tasks.
Kernelizing Spatially Consistent Visual Matches
[ICMR2015]
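The inverse-rank weighting over shared nearest neighbors can be sketched as follows. This is a toy illustration under assumed data structures (dictionaries mapping neighbor ids to their rank in each image's k-NN list), not the exact kernel of the paper; the local geometric-consistency check and the hierarchical spatial pooling are omitted:

```python
def inverse_rank_match_kernel(ranks_q, ranks_t, max_rank=100):
    """Score two images from the nearest neighbors they share.

    ranks_q / ranks_t: dict mapping a neighbor id to the rank
    (1 = closest) it occupies in each image's k-NN list. A shared
    neighbor contributes 1/r, where r is its worst rank in the two
    lists, so early, mutually close matches dominate the score.
    """
    score = 0.0
    for nn in set(ranks_q) & set(ranks_t):   # shared nearest neighbors
        r = max(ranks_q[nn], ranks_t[nn])    # penalize late matches
        if r <= max_rank:
            score += 1.0 / r                 # inverse-rank weighting
    return score
```

For instance, two images sharing a single neighbor at ranks 1 and 2 would score 1/2 under this toy weighting.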
HDR habilitation (highest French academic qualification) - defended on 26/05/2015 [HDRThesis2015]\\
Large-scale Content-based Visual Information Retrieval [pdf]
Rather than restricting search to the use of metadata, content-based information retrieval methods attempt to index, search and browse digital objects by means of signatures or features describing their actual content. Such methods have been intensively studied in the multimedia community to allow managing the massive amount of raw multimedia documents created every day (e.g. video is expected to account for 84% of U.S. internet traffic by 2018). Recent years have consequently witnessed a consistent growth of content-aware and multi-modal search engines deployed on massive multimedia data. Popular multimedia search applications such as Google Images, Youtube, Shazam, Tineye or MusicID have clearly demonstrated that the first generation of large-scale audio-visual search technologies is now mature enough to be deployed on real-world big data. All these successful applications greatly benefited from 15 years of research on multimedia analysis and efficient content-based indexing techniques. Yet the maturity reached by the first generation of content-based search engines does not preclude an intensive research activity in the field. There are actually still a lot of hard problems to be solved before we can retrieve any information in images or sounds as easily as we do in text documents. Content-based search methods have to reach a finer understanding of the contents as well as a higher semantic level. This requires modeling the raw signals by more and more complex and numerous features, so that the algorithms for analyzing, indexing and searching such features have to evolve accordingly. This thesis describes several of my works related to large-scale content-based information retrieval. The different contributions are presented in a bottom-up fashion reflecting a typical three-tier software architecture of an end-to-end multimedia information retrieval system.
The lowest layer is only concerned with managing, indexing and searching large sets of high-dimensional feature vectors, whatever their origin or role in the upper levels (visual or audio features, global or part-based descriptions, low or high semantic level, etc. ). The middle layer rather works at the document level and is in charge of analyzing, indexing and searching collections of documents. It typically extracts and embeds the low-level features, implements the querying mechanisms and post-processes the results returned by the lower layer. The upper layer works at the applicative level and is in charge of providing useful and interactive functionalities to the end-user. It typically implements the front-end of the search application, the crawler and the orchestration of the different indexing and search services.
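The two lower tiers of this layering can be illustrated with a minimal sketch. The class names, the brute-force k-NN search and the per-document voting scheme are illustrative assumptions, not the actual components of the thesis; the applicative layer (front-end, crawler, orchestration of services) is left out:

```python
import numpy as np

class FeatureIndex:
    """Lowest layer: stores and searches raw high-dimensional feature
    vectors, whatever their origin (visual, audio, global, part-based)."""
    def __init__(self):
        self.ids, self.vectors = [], []
    def add(self, vec_id, vec):
        self.ids.append(vec_id)
        self.vectors.append(np.asarray(vec, dtype=float))
    def knn(self, query, k=5):
        # Brute-force Euclidean search; a real system would use an
        # approximate high-dimensional index here.
        dists = [float(np.linalg.norm(query - v)) for v in self.vectors]
        order = np.argsort(dists)[:k]
        return [(self.ids[i], dists[i]) for i in order]

class DocumentIndex:
    """Middle layer: extracts per-document features, queries the lower
    layer, and post-processes its results (here: votes per document)."""
    def __init__(self, feature_index, extractor):
        self.fi, self.extract = feature_index, extractor
    def add(self, doc_id, doc):
        for j, f in enumerate(self.extract(doc)):
            self.fi.add((doc_id, j), f)
    def search(self, doc, k=5):
        votes = {}
        for f in self.extract(doc):
            for (doc_id, _), _dist in self.fi.knn(np.asarray(f, dtype=float), k):
                votes[doc_id] = votes.get(doc_id, 0) + 1
        return sorted(votes, key=votes.get, reverse=True)
```

The point of the sketch is the separation of concerns: the lower layer never knows what a "document" is, and the middle layer never touches the vector search internals.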
Interactive plant identification based on social images
Scalable Mining of Small Visual Objects
Visual based Event Mining
Hash-based SVM approximation
Random Maximum Margin Hashing
Logo retrieval with a contrario visual query expansion
Interactive objects retrieval with efficient boosting
High-dimensional Hashing
Content-based Video Copy Detection
Visual Local Features
Dissociated dipoles [DIPOLES07]
Density-based selection of local features
Large-scale Content-based Visual Information Retrieval (HDR habilitation)
highest French academic qualification - defended on 26/05/2015
The growing data collected through this workflow has been used each year since 2011 in the ImageCLEF evaluation campaigns, where I co-organize the plant identification task.
http://www.imageclef.org/system/files/bannerImageCLEF2013PlantTaskMini.png
Interactive plant identification based on social image data
Speeding up the collection and integration of raw botanical observation data is a crucial step towards a sustainable development of agriculture and the conservation of biodiversity. Initiated in the context of a citizen science project, the main contribution of this work is an innovative collaborative workflow focused on image-based plant identification as a means to enlist new contributors and facilitate access to botanical data. Since 2010, hundreds of thousands of geo-tagged and dated plant photographs have been collected and revised by hundreds of novice, amateur and expert botanists of a specialized social network. An image-based identification tool - available as both a web and an iPhone application - is synchronized with that growing data and allows any user to query or enrich the system with new observations. An important originality is that it works with up to five different organs (pdf), unlike previous approaches that mainly relied on the leaf. This allows querying the system at any period of the year and with complementary images composing a plant observation. Extensive experiments with the visual search engine show that it is already very helpful to determine a plant among hundreds or thousands of species (to appear). At the time of writing, the whole framework covers about half of the plant species living in France (3500 species), which already makes it the widest existing automated identification tool.
Visual-based Event Mining
Besides, another PhD student of mine (Riadh Trad) worked on visual-based event retrieval and discovery in social data (Flickr images). He built a new event-record matching technique making use of both the visual content and the social context (pdf).
http://www-sop.inria.fr/members/Alexis.Joly/bieres.png | Small objects query suggestion in a large web-image collection, developed within the OTMedia project; accepted demo at ACM MM 2013, based on the following publications of my PhD students: MTAP-2013, ACM-MM-2012 |
Applying this technique to web images allows suggesting trustful hyper-visual links to the user and finally lets him browse the web in a radically new way, as illustrated in this video:
This new search paradigm is published in MTAP-2013 and will be demonstrated at ACM MM 2013.
High-dimensional data hashing
High-dimensional data hashing is essential for scaling up and distributing data analysis applications involving feature-rich objects, such as text documents, images or multi-modal entities (scientific observations, events, etc.). We recently investigated the use of high-dimensional hashing methods for efficiently approximating K-NN graphs (pdf), particularly in distributed environments. We highlighted the impact of balancing issues on the performance of such approaches and showed why the baseline approach using Locality Sensitive Hashing does not perform well. Our new KNN-join method is based on RMMH, a hash function family based on randomly trained classifiers that we introduced in 2011. We show that the resulting hash tables are much more balanced and that the number of resulting collisions can be greatly reduced without degrading quality. We further improve the load balancing of our distributed approach by designing a parallelized local join algorithm, implemented within the MapReduce framework.
Hash-based SVM approximation
We addressed the problem of speeding up the prediction phase of linear Support Vector Machines via Locality Sensitive Hashing (pdf). Whereas mainstream work in the field focuses on training classifiers on huge amounts of data, less effort is spent on the counterpart scalability issue: how can big trained models be applied efficiently to huge non-annotated collections? In this work, we propose building space-and-time-efficient hash-based classifiers that are applied in a first stage in order to approximate the exact results and filter the hypothesis space. Experiments performed with millions of one-against-one classifiers show that the proposed hash-based classifiers can be more than two orders of magnitude faster than the exact classifiers with minor losses in quality.
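As a hedged illustration of the general idea (not the exact method of the paper: the sign-random-projection codes and the cosine estimate below are a standard stand-in), a hash-based first stage can rank the linear classifiers cheaply in Hamming space and reserve exact dot products for a short list:

```python
import numpy as np

def srp_codes(X, planes):
    # sign-random-projection codes: the Hamming distance between two codes
    # approximates the angle between the underlying vectors
    return X @ planes.T > 0

def filter_then_predict(x, classifiers, planes, keep=10):
    """First stage: estimate |cos(w, x)| for every linear classifier w from
    hash codes only, and keep the `keep` most confident ones. Second stage:
    evaluate the exact decision values on that short list only."""
    W = np.array([w for w, _ in classifiers])
    cx = srp_codes(x[None, :], planes)[0]
    hamming = (srp_codes(W, planes) != cx).sum(axis=1)
    est_cos = np.cos(np.pi * hamming / planes.shape[0])
    short_list = np.argsort(-np.abs(est_cos))[:keep]
    return {int(i): float(W[i] @ x + classifiers[i][1]) for i in short_list}
```

With millions of one-against-one classifiers, the Hamming-space ranking replaces almost all exact dot products, which is where the claimed speed-up comes from.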
RMMH is a new hashing function family aimed at embedding high-dimensional feature spaces into compact and indexable hash codes. Several data-dependent hash functions have been proposed recently to closely fit the data distribution and provide better selectivity than the usual random projections such as LSH. However, improvements occur only for relatively small hash code sizes, up to 64 or 128 bits. As discussed in the paper, this is mainly due to the lack of independence between the produced hash functions. RMMH attempts to solve this issue in any kernel space. Rather than boosting the collision probability of close points, our method focuses on data scattering. By training purely random splits of the data, regardless of the closeness of the training samples, it is indeed possible to generate consistently more independent hash functions. On the other hand, the use of large-margin classifiers allows maintaining good generalization performance. Experiments show that our Random Maximum Margin Hashing scheme (RMMH) outperforms four state-of-the-art hashing methods, notably in kernel spaces.
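A minimal sketch of the RMMH idea (the paper trains a max-margin classifier per bit; the least-squares fit below is a hypothetical stand-in for any large-margin solver):

```python
import numpy as np

def train_rmmh(data, n_bits, m=32, seed=0):
    """Each hash bit is a hyperplane fit on a purely random, balanced
    labeling of m points sampled from the data, regardless of their
    closeness, which scatters the data and decorrelates the bits."""
    rng = np.random.default_rng(seed)
    planes = []
    for _ in range(n_bits):
        idx = rng.choice(len(data), size=m, replace=False)
        y = np.array([1.0] * (m // 2) + [-1.0] * (m - m // 2))
        rng.shuffle(y)
        X = np.hstack([data[idx], np.ones((m, 1))])  # affine term
        w, *_ = np.linalg.lstsq(X, y, rcond=None)
        planes.append(w)
    return np.array(planes)  # shape (n_bits, dim + 1)

def hash_codes(data, planes):
    X = np.hstack([data, np.ones((len(data), 1))])
    return (X @ planes.T > 0).astype(np.uint8)
```

Because each split is trained on a balanced random labeling, the resulting hash tables tend to stay balanced, which is the property exploited by the KNN-join work above.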
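The hash-based KNN-join described above can be sketched as follows (single-machine version, with illustrative names; in the distributed variant each bucket would be shipped to its own MapReduce reducer):

```python
import numpy as np
from collections import defaultdict
from itertools import combinations

def hash_knn_join(X, planes, k=2):
    """Bucket points by their hash code, then run an exact local join
    inside each bucket; only colliding pairs are ever compared."""
    codes = (X @ planes.T > 0).astype(int)
    buckets = defaultdict(list)
    for i, code in enumerate(map(tuple, codes)):
        buckets[code].append(i)
    dists = defaultdict(dict)
    for members in buckets.values():
        for i, j in combinations(members, 2):
            d = float(np.linalg.norm(X[i] - X[j]))
            dists[i][j] = d
            dists[j][i] = d
    # keep the k closest collisions per point
    return {i: sorted(ds, key=ds.get)[:k] for i, ds in dists.items()}
```

The cost of each local join is quadratic in the bucket size, which is exactly why balanced hash tables matter so much for this approach.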
Scalable Mining of Visual Objects
Scalable Mining of Small Visual Objects
Automatically linking multimedia documents that contain one or several instances of the same visual object has many applications including: salient events detection, relevant patterns discovery in scientific data or simply web browsing through hyper-visual links. Whereas efficient methods now exist for searching rigid objects in large collections, discovering them from scratch is still challenging in terms of scalability, particularly when the targeted objects are rather small. In this work pdf, we formally revisited the problem of mining or discovering such objects, and then generalized two kinds of existing methods for probing candidate object seeds: weighted adaptive sampling and hashing based methods. We then introduce a new hashing strategy, working first at the visual level, and then at the geometric level. Experiments conducted on millions of images show that our method outperforms state-of-the-art.\\
Automatically linking multimedia documents that contain one or several instances of the same visual object has many applications including: salient events detection, relevant patterns discovery in scientific data or simply web browsing through hyper-visual links. In this work pdf, we formally revisited the problem of mining or discovering such objects, and introduced a new hashing strategy, working first at the visual level, and then at the geometric level. Experiments conducted both on FlickrBelgaLogo dataset and on millions of images shows the efficiency of our method.\\
Scalable Mining of Small Visual Objects
Scalable Mining of Visual Objects
This method was integrated within a visual-based media event detection system in the scope of a French project called the transmedia observatory. It allows the automatic discovery of the most circulated images across the main news media (news websites, press agencies, TV news and newspapers). The main originality of the detection is to rely on the transmedia contextual information to denoise the raw visual detections and consequently focus on the most salient trans-media events. This work was presented at ACM Multimedia Grand Challenge 2012 \cite{}. The movie presented during this event is available here:
This method was integrated within a visual-based media event detection system in the scope of a French project called the transmedia observatory. It allows the automatic discovery of the most circulated images across the main news media (news websites, press agencies, TV news and newspapers). The main originality of the detection is to rely on the transmedia contextual information to denoise the raw visual detections and consequently focus on the most salient trans-media events. This work was presented at ACM Multimedia Grand Challenge 2012 pdf. The movie presented during this event is available here:
http://www.otmedia.fr/wp-content/uploads/2012/11/OTMediaPres001-300x166.jpg
Automatically linking multimedia documents that contain one or several instances of the same visual object has many applications including: salient events detection, relevant patterns discovery in scientific data or simply web browsing through hyper-visual links. Whereas efficient methods now exist for searching rigid objects in large collections, discovering them from scratch is still challenging in terms of scalability, particularly when the targeted objects are rather small. In this work \cite{letessier:hal-00739735}, we formally revisited the problem of mining or discovering such objects, and then generalized two kinds of existing methods for probing candidate object seeds: weighted adaptive sampling and hashing based methods. We then introduce a new hashing strategy, working first at the visual level, and then at the geometric level. Experiments conducted on millions of images show that our method outperforms state-of-the-art.
This method was integrated within a visual-based media event detection system in the scope of a French project called the transmedia observatory. It allows the automatic discovery of the most circulated images across the main news media (news websites, press agencies, TV news and newspapers). The main originality of the detection is to rely on the transmedia contextual information to denoise the raw visual detections and consequently focus on the most salient trans-media events. This work was presented at ACM Multimedia Grand Challenge 2012 \cite{}. The movie presented during this event is available at \url{http://www.otmedia.fr/?p=217}.
Automatically linking multimedia documents that contain one or several instances of the same visual object has many applications including: salient events detection, relevant patterns discovery in scientific data or simply web browsing through hyper-visual links. Whereas efficient methods now exist for searching rigid objects in large collections, discovering them from scratch is still challenging in terms of scalability, particularly when the targeted objects are rather small. In this work pdf, we formally revisited the problem of mining or discovering such objects, and then generalized two kinds of existing methods for probing candidate object seeds: weighted adaptive sampling and hashing based methods. We then introduce a new hashing strategy, working first at the visual level, and then at the geometric level. Experiments conducted on millions of images show that our method outperforms state-of-the-art.
This method was integrated within a visual-based media event detection system in the scope of a French project called the transmedia observatory. It allows the automatic discovery of the most circulated images across the main news media (news websites, press agencies, TV news and newspapers). The main originality of the detection is to rely on the transmedia contextual information to denoise the raw visual detections and consequently focus on the most salient trans-media events. This work was presented at ACM Multimedia Grand Challenge 2012 \cite{}. The movie presented during this event is available here:
Scalable Mining of Small Visual Objects
Automatically linking multimedia documents that contain one or several instances of the same visual object has many applications including: salient events detection, relevant patterns discovery in scientific data or simply web browsing through hyper-visual links. Whereas efficient methods now exist for searching rigid objects in large collections, discovering them from scratch is still challenging in terms of scalability, particularly when the targeted objects are rather small. In this work \cite{letessier:hal-00739735}, we formally revisited the problem of mining or discovering such objects, and then generalized two kinds of existing methods for probing candidate object seeds: weighted adaptive sampling and hashing based methods. We then introduce a new hashing strategy, working first at the visual level, and then at the geometric level. Experiments conducted on millions of images show that our method outperforms state-of-the-art.
This method was integrated within a visual-based media event detection system in the scope of a French project called the transmedia observatory. It allows the automatic discovery of the most circulated images across the main news media (news websites, press agencies, TV news and newspapers). The main originality of the detection is that it relies on transmedia contextual information to denoise the raw visual detections and consequently focus on the most salient transmedia events. This work was presented at the ACM Multimedia Grand Challenge 2012 \cite{}. The movie presented during this event is available at \url{http://www.otmedia.fr/?p=217}.
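The two-level hashing idea can be sketched as follows. This is a hypothetical simplification, not the actual scheme from the paper: the random-projection visual hash, the grid-quantized geometric key, and all names and parameters here are illustrative stand-ins.

```python
import numpy as np
from collections import defaultdict

def mine_repeated_objects(images, vis_bits=8, geo_cells=4, min_support=2):
    """Two-level hashing sketch: each local feature is first bucketed by a
    coarse visual hash of its descriptor (random hyperplane signs), then,
    inside each visual bucket, by a quantized geometric signature (here
    simply the feature's normalized image position). Buckets hit by several
    distinct images become candidate object seeds."""
    rng = np.random.default_rng(0)
    planes = None
    buckets = defaultdict(set)
    for img_id, feats in images.items():
        for desc, (x, y) in feats:
            desc = np.asarray(desc, dtype=float)
            if planes is None:
                planes = rng.standard_normal((vis_bits, desc.size))
            vkey = tuple((planes @ desc > 0).astype(int))   # visual level
            gkey = (int(x * geo_cells), int(y * geo_cells)) # geometric level
            buckets[(vkey, gkey)].add(img_id)
    return [b for b, imgs in buckets.items() if len(imgs) >= min_support]
```

A bucket returned by this sketch plays the role of a candidate seed that would then be verified and expanded by an actual mining pipeline.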
http://www-sop.inria.fr/members/Alexis.Joly/quicktime.jpeg | http://www-sop.inria.fr/members/Alexis.Joly/realplayer.jpeg | http://www-sop.inria.fr/members/Alexis.Joly/dailymotion.jpeg |
I am currently less involved in research on CBCD but still strongly involved in benchmarking (more info here).
http://www-sop.inria.fr/members/Alexis.Joly/watch.jpg
http://www-sop.inria.fr/members/Alexis.Joly/dipoles.jpg
The most recent research paper summarizing my work on this topic: [TMA07]
http://deuxalex.free.fr/rmmh-fig.jpg
Random Maximum Margin Hashing
RMMH is a new hashing function aimed at embedding high-dimensional feature spaces in compact and indexable hash codes. Several data-dependent hash functions have been proposed recently to closely fit the data distribution and provide better selectivity than purely random projections such as LSH. However, improvements occur only for relatively small hash code sizes, up to 64 or 128 bits. As discussed in the paper, this is mainly due to the lack of independence between the produced hash functions. RMMH attempts to solve this issue in any kernel space. Rather than boosting the collision probability of close points, our method focuses on data scattering. By training purely random splits of the data, regardless of the closeness of the training samples, it is indeed possible to generate consistently more independent hash functions. On the other hand, the use of large-margin classifiers maintains good generalization performance. Experiments show that our new Random Maximum Margin Hashing scheme (RMMH) outperforms four state-of-the-art hashing methods, notably in kernel spaces.
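A minimal sketch of the RMMH construction, assuming a plain linear input space (the paper works in arbitrary kernel spaces) and using a small sub-gradient SVM as the max-margin classifier; all function names and parameters here are illustrative.

```python
import numpy as np

def train_margin_split(X, y, epochs=100, lam=0.01):
    """Linear SVM via sub-gradient descent on the L2-regularized hinge
    loss: a small stand-in for the max-margin classifier used in RMMH."""
    rng = np.random.default_rng(0)
    w = np.zeros(X.shape[1])
    b, t = 0.0, 0
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            t += 1
            eta = 1.0 / (lam * t)
            if y[i] * (X[i] @ w + b) < 1:       # hinge active: pull margin
                w = (1 - eta * lam) * w + eta * y[i] * X[i]
                b += eta * y[i]
            else:                                # only shrink (regularize)
                w = (1 - eta * lam) * w
    return w, b

def rmmh_hash_functions(data, n_bits, m=32, seed=0):
    """Each bit: draw m random samples, label a random half +1 and the
    other half -1 (a purely random split, regardless of closeness), then
    train a max-margin separator; the bit is the side of the hyperplane."""
    rng = np.random.default_rng(seed)
    planes = []
    for _ in range(n_bits):
        idx = rng.choice(len(data), size=m, replace=False)
        y = np.array([1] * (m // 2) + [-1] * (m - m // 2))
        rng.shuffle(y)
        planes.append(train_margin_split(data[idx], y))
    return planes

def hash_code(x, planes):
    return tuple(int(x @ w + b > 0) for w, b in planes)
```

Because each split is drawn independently of the data geometry, the resulting bits are close to independent, which is the property the paper identifies as missing in other data-dependent schemes.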
http://www-rocq.inria.fr/~ajoly/hashing.jpg
We hope to provide an open-source version in the next few months...
http://www-sop.inria.fr/members/Alexis.Joly/boosting.jpg http://www-sop.inria.fr/members/Alexis.Joly/boosting1.jpg
Interactive objects retrieval with efficient boosting [ACM09]
We developed, jointly with my PhD student Saloua Litayem, an efficient boosting method [ACM09boosting] to predict classifiers trained on local features in sublinear time. This technique allows online relevance feedback or active learning on image regions.
Take a look at the flash demo:
Logo retrieval with a contrario visual query expansion [ACM09]
I worked on logo retrieval within the VITALAS European project as an application of large-scale local feature matching. I introduced a new visual query expansion method using an a contrario thresholding strategy in order to improve the accuracy of expanded query images [ACM09logo]. I also created a new challenging dataset, called BelgaLogos, in collaboration with professionals of a press agency, in order to evaluate logo retrieval technologies in real-world scenarios.
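The a contrario thresholding principle can be sketched on global match scores as follows. This is a hypothetical, simplified variant: the paper derives its threshold from a null model of local feature matches, whereas this sketch estimates the null distribution empirically from scores of known non-matching image pairs, and all names are illustrative.

```python
import numpy as np

def a_contrario_threshold(background_scores, n_tests, eps=1.0):
    """Smallest score threshold t such that fewer than eps false alarms are
    expected over n_tests tests, i.e. n_tests * P_bg(score >= t) < eps,
    with P_bg estimated empirically from background (non-match) scores."""
    bg = np.sort(np.asarray(background_scores))
    tail = 1.0 - np.arange(1, len(bg) + 1) / len(bg)  # P_bg(score > bg[i])
    ok = n_tests * tail < eps
    return bg[np.argmax(ok)] if ok.any() else np.inf

def expand_query(query_matches, background_scores, eps=1.0):
    """Expand the query with only the retrieved images whose match score
    passes the a contrario threshold, keeping false expansions controlled."""
    t = a_contrario_threshold(background_scores,
                              n_tests=len(query_matches), eps=eps)
    return [img for img, score in query_matches if score > t]
```

The point of the a contrario formulation is that `eps` directly bounds the expected number of wrongly expanded images, instead of requiring a hand-tuned score cutoff.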
http://www-rocq.inria.fr/~ajoly/arbre.jpeg
Take a look at the video demo: http://www-sop.inria.fr/members/Alexis.Joly/vitalas-coca.jpg
Visit [http://www-roc.inria.fr/imedia/belga-logo.html|BelgaLogos] home page to get the evaluation dataset
My PhD at INA is hopefully also still of interest ;-)
Density-based selection of local features [MIR05]
Keywords: image retrieval, local features, discriminant, density estimation
This work started in collaboration with the NII (National Institute of Informatics, Japan) within the scope of my visit to Tokyo (July 2005).
Local features are well suited to content-based image retrieval because of their locality, their local uniqueness and their high information content [4]. However, as they are selected only according to the local information content of the image, there is no guarantee that they will be distinctive in a large set of images. A local feature corresponding to a high saliency in the image can be highly redundant in some specific databases, such as the TV news database stored at NII, in which textual characters are extremely frequent. To overcome this issue, we propose [5] to select relevant local features directly according to their discrimination power in a specific set of images. By computing the density of the local features in a source database with a new fast non-parametric density estimation technique, it is indeed possible to quickly select the rarest local features in a large set of images. The figure below illustrates the difference between the 20 most salient points of an image and the 20 rarest points according to their density in a large image database. Currently, we are also looking at selecting local features according to their density in a single image or in a class of images, as done for textual features with TF/IDF techniques.
http://www-sop.inria.fr/members/Alexis.Joly/images/femme_harris648.jpg http://www-sop.inria.fr/members/Alexis.Joly/images/femme_rares648.jpg left: 20 most salient points - right: 20 most rare points
[1] "Selection of Scale-Invariant Parts for Object Class Recognition", G. Dorko, C. Schmid, IEEE Int. Conf. on Computer Vision, vol. 1, pp. 634--640, 2003.
[2] "Distinctive image features from scale-invariant keypoints", D. Lowe, Int. Journal of Computer Vision, vol. 60, no. 2, pp. 91--110, 2004.
[3] "Content-based video copy detection in large databases: A local fingerprints statistical similarity search approach", A. Joly, C. Frélicot and O. Buisson, in Proceedings of the Int. Conf. on Image Processing, 2005.
[4] "A performance evaluation of local descriptors", K. Mikolajczyk and C. Schmid, IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 27, no. 10, pp. 1615--1630, 2005.
[5] "Discriminant Local Features Selection using Efficient Density Estimation in a Large Database", A. Joly and O. Buisson, ACM Int. Workshop on Multimedia Information Retrieval, invited paper, 2005.
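As a rough sketch of the selection principle in [5], assuming a brute-force k-nearest-neighbour density proxy in place of the fast dedicated estimator used in the paper; function names and parameters are illustrative.

```python
import numpy as np

def knn_density(features, database, k=10):
    """Non-parametric density proxy for each feature: the (squared)
    distance to its k-th nearest neighbour in the database. A large k-NN
    distance means the feature lies in a sparse region, i.e. it is rare."""
    d2 = ((features[:, None, :] - database[None, :, :]) ** 2).sum(-1)
    kth = np.sort(d2, axis=1)[:, k]
    return 1.0 / (kth + 1e-12)

def select_rare(features, database, n, k=10):
    """Keep the n features whose estimated database density is lowest,
    i.e. the most discriminant ones for retrieval."""
    dens = knn_density(features, database, k)
    return np.argsort(dens)[:n]
```

This is the same inversion as TF/IDF for text: instead of keeping the locally most salient points, keep the ones that are globally rare in the reference collection.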
Geometric consistency of local descriptors
Enhancing the performance of local features by using their geometric distribution or their relative positions is still a challenge. We have shown that, in the copy detection scenario, the robust estimation of a global geometric transformation model after the search is widely profitable to improve the discrimination of the detection. However, for other scenarios, using the geometry remains a challenging task: including the geometric distribution in the descriptor itself often leads to a lack of robustness during the search of similar local features, whereas post-processing techniques are generally highly time-consuming and thus limited to very small datasets. Moreover, in most of them, the geometric consistency is limited to rigid transformation models, which cannot enforce the matching when two geometric distributions are dependent but not linearly related. We are currently investigating the use of non-parametric geometric consistency measures such as mutual information and the robust correlation ratio, and we plan to combine them with robust local geometric properties that could be included in the descriptor itself in order to limit the number of matches during the second step.
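A histogram-based mutual information estimate is one possible instance of the non-parametric consistency measures mentioned above: it detects a dependency between matched coordinate sequences even when the mapping is non-linear. The bin count and names below are illustrative choices, not the paper's.

```python
import numpy as np

def mutual_information(a, b, bins=8):
    """Plug-in MI estimate (in nats) from a joint 2D histogram of two
    matched coordinate sequences: high MI signals geometric dependency
    even when no rigid/linear transformation links the two point sets."""
    h, _, _ = np.histogram2d(a, b, bins=bins)
    p = h / h.sum()                               # joint distribution
    px = p.sum(axis=1, keepdims=True)             # marginal of a
    py = p.sum(axis=0, keepdims=True)             # marginal of b
    nz = p > 0
    return float((p[nz] * np.log(p[nz] / (px @ py)[nz])).sum())
```

For example, coordinates related by a sinusoidal warp (which defeats any rigid model) still yield a much higher MI than independent coordinates.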
Multi-probe locality sensitive hashing [ACM08]
We developed, jointly with Olivier Buisson at INA, a new similarity search structure dedicated to high-dimensional features. Multi-probe LSH is built on the well-known LSH technique, but it intelligently probes multiple buckets that are likely to contain query results in a hash table. Our method is inspired by our previous work on probabilistic similarity search structures and improves upon recent theoretical work on multi-probe and query-adaptive LSH. Whereas these methods are based on likelihood criteria that a given bucket contains query results, we define a more reliable a posteriori model taking into account prior knowledge about the queries and the searched objects. This prior knowledge allows a better quality control of the search and a more accurate selection of the most probable buckets. We implemented a nearest neighbors search based on this paradigm and performed experiments on different real visual feature datasets. We show that our a posteriori scheme outperforms other multi-probe LSH schemes while offering better quality control. Comparisons to the basic LSH technique show that our method allows consistent improvements in both space and time efficiency.
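The multi-probe principle can be sketched as follows. This toy version ranks perturbed buckets by the query's distance to the quantization boundaries, which is the simple likelihood criterion that our a posteriori model replaces; the class and all parameter names here are hypothetical.

```python
import numpy as np

class MultiProbeLSH:
    """Single-table LSH with multi-probe querying: besides the query's own
    bucket, probe neighbouring buckets ranked by how close the query falls
    to each quantization boundary (a likelihood proxy for containing near
    neighbours)."""
    def __init__(self, dim, k=8, w=4.0, seed=0):
        rng = np.random.default_rng(seed)
        self.A = rng.standard_normal((k, dim))   # random projections
        self.b = rng.uniform(0, w, size=k)       # random offsets
        self.w = w
        self.table = {}

    def _proj(self, x):
        return (self.A @ x + self.b) / self.w

    def _key(self, x):
        return tuple(np.floor(self._proj(x)).astype(int))

    def add(self, i, x):
        self.table.setdefault(self._key(x), []).append(i)

    def query(self, x, n_probes=4):
        p = self._proj(x)
        key = np.floor(p).astype(int)
        frac = p - key
        # for each component: distance to the nearer boundary, and the
        # direction (+1/-1) of the adjacent bucket on that side
        deltas = sorted((min(f, 1 - f), j, 1 if f > 0.5 else -1)
                        for j, f in enumerate(frac))
        probes = [tuple(key)]
        for _, j, s in deltas[:n_probes - 1]:
            k2 = key.copy()
            k2[j] += s
            probes.append(tuple(k2))
        out = []
        for pk in probes:
            out.extend(self.table.get(pk, []))
        return out
```

Probing more buckets can only widen the candidate set, which is why multi-probe schemes reach a target recall with far fewer hash tables (and thus far less memory) than basic LSH.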