- PaperHighlights
  - 2019
    - 03
      - Not All Contexts Are Created Equal: Better Word Representations with Variable Attention
      - Learning Context-Sensitive Word Embeddings with Neural Tensor Skip-Gram Model
      - Approximating CNNs with Bag-of-local-Features models works surprisingly well on ImageNet
      - pair2vec: Compositional Word-Pair Embeddings for Cross-Sentence Inference
      - Contextual Word Representations: A Contextual Introduction
      - Not All Neural Embeddings are Born Equal
      - High-risk learning: acquiring new word vectors from tiny data
      - Learning word embeddings from dictionary definitions only
      - Dependency-Based Word Embeddings
    - 02
      - Improving Word Embedding Compositionality using Lexicographic Definitions
      - From Word Embeddings To Document Distances
      - Progressive Growing of GANs for Improved Quality, Stability, and Variation
      - Retrofitting Word Vectors to Semantic Lexicons
      - Bag of Tricks for Image Classification with Convolutional Neural Networks
      - Multi-Task Deep Neural Networks for Natural Language Understanding
      - Snapshot Ensembles: Train 1, get M for free
      - EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks
      - Counter-fitting Word Vectors to Linguistic Constraints
      - AdaScale: Towards Real-time Video Object Detection Using Adaptive Scaling
      - Learning semantic similarity in a continuous space
      - Progressive Neural Networks
      - BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
      - Language Models are Unsupervised Multitask Learners
    - 01
      - Querying Word Embeddings for Similarity and Relatedness
      - Data Distillation: Towards Omni-Supervised Learning
      - A Rank-Based Similarity Metric for Word Embeddings
      - Dict2vec: Learning Word Embeddings using Lexical Dictionaries
      - Graph Convolutional Networks for Text Classification
      - Improving Distributional Similarity with Lessons Learned from Word Embeddings
      - Real-time Personalization using Embeddings for Search Ranking at Airbnb
      - Glyce: Glyph-vectors for Chinese Character Representations
      - Auto-Encoding Dictionary Definitions into Consistent Word Embeddings
      - Distilling the Knowledge in a Neural Network
      - Uncovering divergent linguistic information in word embeddings with lessons for intrinsic and extrinsic evaluation
      - The (Too Many) Problems of Analogical Reasoning with Word Vectors
      - Linear Ensembles of Word Embedding Models
      - Intrinsic Evaluation of Word Vectors Fails to Predict Extrinsic Performance
      - Dynamic Meta-Embeddings for Improved Sentence Representations
  - 2018
    - 11
      - Think Globally, Embed Locally — Locally Linear Meta-embedding of Words
      - Learning linear transformations between counting-based and prediction-based word embeddings
      - Learning Word Meta-Embeddings by Autoencoding
      - Learning Word Meta-Embeddings
      - Frustratingly Easy Meta-Embedding – Computing Meta-Embeddings by Averaging Source Word Embeddings
    - 06
      - Universal Language Model Fine-tuning for Text Classification
      - Semi-supervised sequence tagging with bidirectional language models
      - Consensus Attention-based Neural Networks for Chinese Reading Comprehension
      - Attention-over-Attention Neural Networks for Reading Comprehension
      - Baseline Needs More Love: On Simple Word-Embedding-Based Models and Associated Pooling Mechanisms
      - Convolutional Neural Networks for Sentence Classification
      - Deep contextualized word representations
      - Neural Architectures for Named Entity Recognition
      - Improving Language Understanding by Generative Pre-Training
      - A Sensitivity Analysis of (and Practitioners’ Guide to) Convolutional Neural Networks for Sentence Classification
      - Teaching Machines to Read and Comprehend
    - 05
      - Text Understanding with the Attention Sum Reader Network
      - Effective Approaches to Attention-based Neural Machine Translation
      - Distance-based Self-Attention Network for Natural Language Inference
      - Deep Residual Learning for Image Recognition
      - U-Net: Convolutional Networks for Biomedical Image Segmentation
      - Memory Networks
      - Neural Machine Translation by Jointly Learning to Align and Translate
      - Convolutional Sequence to Sequence Learning
      - An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling
      - Graph Attention Networks
      - Attention is All You Need
      - DiSAN: Directional Self-Attention Network for RNN/CNN-Free Language Understanding
      - A Structured Self-attentive Sentence Embedding
      - Hierarchical Attention Networks for Document Classification
      - Grammar as a Foreign Language
      - Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
      - Transforming Auto-encoders
      - Self-Attention with Relative Position Representations
    - 01
  - 2017
    - 11
  - Paper Title as Note Title