S. Banerjee and T. Pedersen, Extended gloss overlaps as a measure of semantic relatedness, IJCAI, 2003.

H. Bannour and C. Hudelot, Towards ontologies for image interpretation and annotation, 2011 9th International Workshop on Content-Based Multimedia Indexing (CBMI), 2011.
DOI : 10.1109/CBMI.2011.5972547
URL : https://hal.archives-ouvertes.fr/hal-00825255

K. Barnard, P. Duygulu, D. Forsyth, N. de Freitas, D. M. Blei et al., Matching words and pictures, JMLR, vol.3, pp.1107-1135, 2003.

E. Bart, I. Porteous, P. Perona, and M. Welling, Unsupervised learning of visual taxonomies, 2008 IEEE Conference on Computer Vision and Pattern Recognition, 2008.
DOI : 10.1109/CVPR.2008.4587620

A. Budanitsky and G. Hirst, Evaluating WordNet-based Measures of Lexical Semantic Relatedness, Computational Linguistics, vol.32, issue.1, pp.13-47, 2006.
DOI : 10.1162/coli.2006.32.1.13

C. Cortes and V. Vapnik, Support-vector networks, Machine Learning, 1995.
DOI : 10.1007/BF00994018

J. Deng, A. C. Berg, K. Li, and L. Fei-Fei, What Does Classifying More Than 10,000 Image Categories Tell Us?, ECCV, 2010.
DOI : 10.1007/978-3-642-15555-0_6

J. Deng, W. Dong, R. Socher, L. Li, K. Li et al., ImageNet: A large-scale hierarchical image database, CVPR, 2009.

J. Fan, Y. Gao, and H. Luo, Hierarchical classification for automatic image annotation, Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, SIGIR '07, 2007.
DOI : 10.1145/1277741.1277763

J. Fan, H. Luo, Y. Shen, and C. Yang, Integrating visual and semantic contexts for topic network generation and word sense disambiguation, Proceeding of the ACM International Conference on Image and Video Retrieval, CIVR '09, 2009.
DOI : 10.1145/1646396.1646440

C. Fellbaum, WordNet: An Electronic Lexical Database, 1998.

G. Griffin and P. Perona, Learning and using taxonomies for fast visual categorization, 2008 IEEE Conference on Computer Vision and Pattern Recognition, 2008.
DOI : 10.1109/CVPR.2008.4587410

A. Hauptmann, R. Yan, and W. Lin, How many high-level concepts will fill the semantic gap in news video retrieval?, CIVR, 2007.

V. Lavrenko, R. Manmatha, and J. Jeon, A model for learning the semantics of pictures, NIPS, 2003.

L. Li, C. Wang, Y. Lim, D. Blei, and L. Fei-Fei, Building and using a semantivisual image hierarchy, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2010.
DOI : 10.1109/CVPR.2010.5540027

Y. Liu, D. Zhang, G. Lu, and W. Ma, A survey of content-based image retrieval with high-level semantics, Pattern Recognition, vol.40, issue.1, pp.262-282, 2007.
DOI : 10.1016/j.patcog.2006.04.045

D. G. Lowe, Object recognition from local scale-invariant features, Proceedings of the Seventh IEEE International Conference on Computer Vision, 1999.
DOI : 10.1109/ICCV.1999.790410

M. Marszalek and C. Schmid, Semantic Hierarchies for Visual Object Recognition, 2007 IEEE Conference on Computer Vision and Pattern Recognition, 2007.
DOI : 10.1109/CVPR.2007.383272
URL : https://hal.archives-ouvertes.fr/inria-00548680

M. Naphade, J. R. Smith, J. Tesic, S. Chang, W. Hsu et al., Large-Scale Concept Ontology for Multimedia, IEEE Multimedia, vol.13, issue.3, 2006.
DOI : 10.1109/MMUL.2006.63

S. Patwardhan and T. Pedersen, Using WordNet-based context vectors to estimate the semantic relatedness of concepts, EACL, 2006.

P. Resnik, Using information content to evaluate semantic similarity in a taxonomy, IJCAI, 1995.

J. Sivic, B. C. Russell, A. Zisserman, W. T. Freeman, and A. A. Efros, Unsupervised discovery of visual object class hierarchies, 2008 IEEE Conference on Computer Vision and Pattern Recognition, 2008.
DOI : 10.1109/CVPR.2008.4587622

A. W. Smeulders, M. Worring, S. Santini, A. Gupta, and R. Jain, Content-based image retrieval at the end of the early years, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.22, issue.12, 2000.
DOI : 10.1109/34.895972

X. Wei and C. Ngo, Ontology-enriched semantic space for video search, Proceedings of the 15th international conference on Multimedia, MULTIMEDIA '07, pp.981-990, 2007.
DOI : 10.1145/1291233.1291447

L. Wu, X. Hua, N. Yu, W. Ma, and S. Li, Flickr distance, Proceeding of the 16th ACM international conference on Multimedia, MM '08, 2008.
DOI : 10.1145/1459359.1459364

B. Yao, X. Yang, L. Lin, M. W. Lee, and S. C. Zhu, I2T: Image Parsing to Text Description, Proceedings of the IEEE, 2010.
DOI : 10.1109/JPROC.2010.2050411