Encoding Lexico-Semantic Knowledge using Ensembles of Feature Maps from Deep Convolutional Neural Networks
Semantic models derived from visual information have helped to overcome some of the limitations of solely text-based distributional semantic models. Researchers have demonstrated that text and image-based representations encode complementary semantic information, which when combined provide a more complete representation of word meaning, in particular when compared with data on human conceptual knowledge. In this work, we reveal that these vision-based representations, whilst quite effective, do not make use of all the semantic information available in the neural network that could be used to inform vector-based models of semantic representation. Instead, we build image-based meta-embeddings from computer vision models, which can incorporate information from all layers of the network, and show that they encode a richer set of semantic attributes and yield a more complete representation of human conceptual knowledge.
PDF Abstract