Modeling visual knowledge from large-scale data

Thoth is a joint team of Inria and Laboratoire Jean Kuntzmann that started in January 2016. It is the follow-up to the LEAR team (2003-2015).

Thoth is motivated by today's context, in which the quantity of digital images and videos available online continues to grow at a phenomenal speed. The main objectives of the team are: (i) designing and learning structured models capable of representing this visual information; (ii) learning visual models from minimal supervision or unstructured meta-data; and (iii) large-scale learning and optimization. An additional focus of Thoth is the collection of appropriate datasets and the design of accompanying evaluation protocols.

For more information see our research description page, and annual reports of 2015, 2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2004, 2003.

Highlights

2016

Thoth is one of the recipients of a hardware donation in the Facebook AI Research Partnership Program.
Julien Mairal was awarded one of the 2016 ERC Starting Grants.
Cordelia Schmid was awarded the Longuet-Higgins Prize at CVPR 2016 for the paper co-authored with Svetlana Lazebnik (University of Illinois at Urbana-Champaign) and Jean Ponce (ENS Paris/Inria) entitled "Beyond bags of features: spatial pyramid matching for recognizing natural scene categories".
We organized a Symposium on Computer Vision and Deep Learning on June 9th. Check the program and slides.
Cordelia Schmid has received the Humboldt Research Award, granted by the Alexander von Humboldt Foundation. More details here.
Our new associate team GAYA, in collaboration with the Robotics Institute, CMU, has started.
Our recent papers in major conferences (3 NIPS'16 papers, 1 ICML'16 paper, 2 ECCV'16 papers and 1 CVPR'16 paper) are available on our publications page.

2015

We obtained top-ranked results in the VOT-TIR track of the Visual Object Tracking Challenge 2015. For more details see the competition results summary.
INRIA Grenoble has been selected as an NVIDIA GPU Research Center. For more details see NVIDIA academic collaboration.
Ramazan Gokberk Cinbis (PhD, 2014) was awarded the 2014 AFRIF thesis prize for his thesis entitled "Fisher kernel based models for image classification and object localization". He was supervised by Jakob Verbeek and Cordelia Schmid. More details at AFRIF laureats.
Navneet Dalal (PhD, 2006) and Bill Triggs, two former members of the team, were awarded the Longuet-Higgins Prize for their paper entitled "Histograms of Oriented Gradients for Human Detection" (CVPR 2005 paper). More details at awards CVPR'15.
We organized a 2-day workshop at Inria Grenoble. The program and the slides from the talks are available online.
Our recent papers in major conferences (3 CVPR'15 papers, 8 ICCV'15 papers, 1 COLT'15 paper and 2 NIPS'15 papers) are available on our publications page.

2014

We obtained top-ranked results in the localization track of the Thumos 2014 Action Recognition Challenge. The goal of the challenge is to evaluate large-scale action recognition in natural settings. The dataset used is the UCF101 dataset, which is currently the largest action dataset both in terms of number of categories and clips, with more than 13,000 clips drawn from 101 action classes. This year, special attention was paid to the classification of uncropped videos, where the action of interest appears in videos that also contain non-relevant sections.
Cordelia Schmid was awarded the Longuet-Higgins Prize (for the 2nd time) in 2014 for her CVPR paper co-authored with Krystian Mikolajczyk entitled "A performance evaluation of local descriptors" (extended TPAMI version). More details at awards CVPR'14.
We organized a 2-day workshop on "Weakly Supervised Learning and Video Recognition" at Inria Grenoble. The program and the slides from the talks are available online.
Our recent papers in major conferences (5 CVPR'14 papers, 5 ECCV'14 papers, 2 ICML'14 papers and 1 NIPS'14 paper) are available on our publications page.

2013

We obtained top-ranked results in the Thumos 2013 Action Recognition Challenge. The goal of the challenge is to evaluate large-scale action recognition in natural settings. The dataset used is the newly released UCF101 dataset, which is currently the largest action dataset both in terms of number of categories and clips, with more than 13,000 clips drawn from 101 action classes.
Our recent papers in major conferences (3 CVPR'13 papers, 9 ICCV'13 papers, 1 NIPS'13 paper and 1 ICML'13 paper) are available on our publications page.
LEAR participated together with the AXES project in the TRECVID MED 2013 challenge and finished in first position. The Multimedia Event Detection (MED) evaluation track is part of the TRECVID Evaluation. The goal of MED is to assemble core detection technologies into a system that can search multimedia recordings for user-defined events based on pre-computed metadata.
We are co-organizing the workshop on "Greedy Algorithms, Frank-Wolfe and Friends - A modern perspective" as part of NIPS 2013, Lake Tahoe, Nevada, USA, December 10, 2013.
The fourth INRIA Visual Recognition and Machine Learning Summer School took place at the Ecole Normale Superieure campus in Paris, from July 22 to 26, 2013.

2012

Cordelia Schmid was awarded one of the 2012 ERC Advanced Grants. Congratulations!
LEAR participated together with the AXES project in the TRECVID MED challenge and finished first and second. The Multimedia Event Detection (MED) evaluation track is part of the TRECVID Evaluation. The goal of MED is to assemble core detection technologies into a system that can search multimedia recordings for user-defined events based on pre-computed metadata.
The third INRIA Visual Recognition and Machine Learning Summer School took place at the INRIA Grenoble campus, from July 9 to July 13, 2012.
Our recent publications in major computer vision conferences: 5 CVPR'12 papers, 2 ECCV'12 papers, and 1 BMVC'12 paper. See our publications page for downloads.

2011

The second INRIA Visual Recognition and Machine Learning Summer School took place at the Ecole Normale Superieure (ENS) campus, from July 25 to July 29, 2011.
Our recent publications in major computer vision conferences: 4 CVPR'11 papers, 2 ICCV'11 papers, and 2 BMVC'11 papers. See our publications page for details.

2010

We co-organized the workshop on machine learning for next generation computer vision challenges at NIPS 2010, December 10, Whistler BC, Canada. Papers and slides of the talks are now available online.
In the PASCAL VOC 2010 challenge, our work on human action recognition achieved the best results on three out of nine action classes. At the ECCV'10 International Workshop on Sign, Gesture, and Activity, our paper "Human Focused Action Localization in Video" was awarded the best paper prize.
For the Photo Annotation task of ImageCLEF 2010, our joint submissions with Xerox Research Centre Europe achieved the best results on 56 of the 93 annotation concepts. For 88 concepts, our runs were among the 6 best runs out of the 63 submitted. See this paper for details.
The first INRIA Visual Recognition and Machine Learning Summer School took place at our institute in Grenoble, from July 26 to July 30. The school had 150 international attendees. Most lecture slides are now available.
Our recent publications in major computer vision conferences: 5 CVPR'10 papers (2 orals), 2 ECCV'10 papers, and 1 BMVC'10 paper. See our publications page for details.

2009

For both the Photo Annotation and Image Retrieval tasks of ImageCLEF'09, LEAR obtained second place among the 19 participating teams for each task. The methods that were used are described in this paper.
Recent publications in major computer vision conferences: 4 ICCV'09 papers (2 orals), 3 BMVC'09 papers (2 orals), and 3 CVPR'09 papers. See our publications page for details.

2008

LEAR obtained excellent results at TRECVID 2008. The method used is described in this paper.
In the PASCAL VOC 2008 challenge, LEAR won the detection contest for 11 out of 20 classes (see example detections here) and the classification contest for 7 out of 20 classes.
Recent publications in major computer vision conferences: 4 ECCV'08 and 4 CVPR'08 papers. See our publications page for details.
Development of an image indexing system that searches in real time for similar images in very large databases. It is currently being transferred to and tested by the start-up MilPix. Our image search demo on 10,000,000 images: Bigimbaz.
Organization of an International Workshop on Object Recognition, Como, May 2008.

2007

Winner of the PASCAL VOC 2007 image classification competition. LEAR's approach won the classification contest for 19 of the 20 object classes.