Word Similarity
111 papers with code • 0 benchmarks • 2 datasets
Calculate a numerical score for the semantic similarity between two words.
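In practice, word similarity is most often scored as the cosine of the angle between the two words' embedding vectors. A minimal sketch, using toy 3-dimensional vectors (the values and vocabulary here are illustrative, not from a real model):

```python
from math import sqrt

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy embeddings: "cat" and "dog" point in similar directions, "car" does not.
embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}

print(cosine_similarity(embeddings["cat"], embeddings["dog"]))  # high
print(cosine_similarity(embeddings["cat"], embeddings["car"]))  # low
```

With real pretrained embeddings the same function applies unchanged; only the vectors come from a trained model instead of being written by hand.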
Benchmarks
These leaderboards are used to track progress in Word Similarity
Libraries
Use these libraries to find Word Similarity models and implementations
Most implemented papers
Efficient Estimation of Word Representations in Vector Space
We propose two novel model architectures for computing continuous vector representations of words from very large data sets.
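The two architectures proposed are the continuous bag-of-words (CBOW) model, which predicts a word from its context, and the skip-gram model, which predicts context words from the current word. The skip-gram training objective (as formulated in the word2vec line of work) maximizes the average log-probability of context words:

$$\frac{1}{T}\sum_{t=1}^{T}\ \sum_{-c \le j \le c,\ j \ne 0} \log p(w_{t+j} \mid w_t)$$

where $T$ is the number of training words and $c$ is the size of the context window.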
Enriching Word Vectors with Subword Information
A vector representation is associated with each character $n$-gram, and words are represented as the sum of these representations.
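The core idea can be sketched in a few lines: extract the word's character $n$-grams (with boundary markers, as in fastText) and sum their vectors. The `n_min`/`n_max` defaults and the tiny `ngram_vectors` table below are illustrative assumptions, not trained values:

```python
def char_ngrams(word, n_min=3, n_max=6):
    """Character n-grams of a word, with fastText-style boundary markers."""
    marked = f"<{word}>"
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(marked) - n + 1):
            grams.append(marked[i:i + n])
    return grams

def word_vector(word, ngram_vectors, dim):
    """Represent a word as the sum of its n-gram vectors.

    N-grams missing from the table contribute a zero vector, which is
    how the scheme still yields a vector for out-of-vocabulary words.
    """
    vec = [0.0] * dim
    for g in char_ngrams(word):
        for j, x in enumerate(ngram_vectors.get(g, [0.0] * dim)):
            vec[j] += x
    return vec

# Toy 2-d n-gram table (made-up values for illustration).
ngram_vectors = {"<ca": [1.0, 0.0], "at>": [0.0, 1.0]}
print(word_vector("cat", ngram_vectors, 2))
```

Because unseen words still decompose into known $n$-grams, this representation degrades gracefully on rare and out-of-vocabulary words.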
All-but-the-Top: Simple and Effective Postprocessing for Word Representations
The postprocessing is empirically validated on a variety of lexical-level intrinsic tasks (word similarity, concept categorization, word analogy) and sentence-level tasks (semantic textual similarity and text classification), on multiple datasets, with a variety of representation methods and hyperparameter choices, and in multiple languages; in each case, the processed representations are consistently better than the original ones.
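The postprocessing itself is simple: subtract the mean embedding, then remove the projections onto the top few principal directions. A minimal sketch with NumPy (the random matrix and the choice `d=2` are illustrative; the paper suggests `d` on the order of `dim / 100`):

```python
import numpy as np

def all_but_the_top(X, d=2):
    """Post-process an embedding matrix X of shape (num_words, dim):
    subtract the mean vector, then null out the top-d principal directions."""
    X_centered = X - X.mean(axis=0, keepdims=True)
    # Principal directions of the centered matrix via SVD (rows of Vt).
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    top = Vt[:d]                                  # (d, dim)
    # Remove each vector's component along the top-d directions.
    return X_centered - X_centered @ top.T @ top

# Synthetic "embeddings" with a strong shared offset, which the
# postprocessing is designed to strip away.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20)) + 5.0
X_post = all_but_the_top(X, d=2)
```

After postprocessing, the embeddings are mean-centered and carry no energy along the removed dominant directions, which is what makes downstream cosine similarities more discriminative.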
How to evaluate word embeddings? On importance of data efficiency and simple supervised tasks
Maybe the single most important goal of representation learning is making subsequent learning faster.
Calculating the similarity between words and sentences using a lexical database and corpus statistics
To calculate the semantic similarity between words and sentences, the proposed method follows an edge-based approach using a lexical database.
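An edge-based approach scores two words by the length of the shortest path between them in the lexical database's taxonomy: fewer edges means higher similarity. A toy sketch below uses a hand-built is-a hierarchy standing in for a real lexical database such as WordNet (the words, links, and the `1/(1 + length)` scoring function are illustrative assumptions, not the paper's exact formula):

```python
from collections import deque

# Tiny is-a taxonomy (illustrative, not real WordNet data).
taxonomy = {
    "entity": ["animal", "vehicle"],
    "animal": ["cat", "dog"],
    "vehicle": ["car"],
}

# Build an undirected graph so paths can travel up and down the hierarchy.
graph = {}
for parent, children in taxonomy.items():
    for child in children:
        graph.setdefault(parent, set()).add(child)
        graph.setdefault(child, set()).add(parent)

def path_length(a, b):
    """Shortest number of edges between two nodes, by BFS; None if unreachable."""
    if a == b:
        return 0
    seen, frontier = {a}, deque([(a, 0)])
    while frontier:
        node, dist = frontier.popleft()
        for nxt in graph.get(node, ()):
            if nxt == b:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, dist + 1))
    return None

def edge_similarity(a, b):
    """Edge-counting similarity: 1 / (1 + shortest path length)."""
    d = path_length(a, b)
    return 0.0 if d is None else 1.0 / (1.0 + d)

print(edge_similarity("cat", "dog"))  # shares parent "animal": short path
print(edge_similarity("cat", "car"))  # connected only via "entity": long path
```

Real edge-based measures refine this with edge weights, node depth, and corpus statistics, but the path-counting core is the same.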
Speech2Vec: A Sequence-to-Sequence Framework for Learning Word Embeddings from Speech
In this paper, we propose Speech2Vec, a novel deep neural network architecture for learning fixed-length vector representations of audio segments excised from a speech corpus. The vectors contain semantic information pertaining to the underlying spoken words, and are close to one another in the embedding space when their corresponding spoken words are semantically similar.
Unsupervised Multilingual Word Embeddings
Multilingual Word Embeddings (MWEs) represent words from multiple languages in a single distributional vector space.
SemGloVe: Semantic Co-occurrences for GloVe from BERT
In this paper, we propose SemGloVe, which distills semantic co-occurrences from BERT into static GloVe word embeddings.
WordRank: Learning Word Embeddings via Robust Ranking
Based on this insight, we propose WordRank, a novel framework that efficiently estimates word representations via robust ranking, in which the attention mechanism and robustness to noise are readily achieved via DCG-like ranking losses.
Definition Modeling: Learning to define word embeddings in natural language
Distributed representations of words have been shown to capture lexical semantics, as demonstrated by their effectiveness in word similarity and analogical relation tasks.