Highlights from CoNLL and EMNLP 2019
December 03, 2019

CoNLL and EMNLP, two top-tier natural language processing conferences, were held in Hong Kong last month. A large contingent of the Square AI team, myself included, attended, and our fantastic intern, Justin Dieter, presented our work on a new contextual language generation task: mimic rephrasals. Despite being a regular conference attendee, I was surprised by the sheer quantity and quality of innovative ideas presented at the two conferences: a true testament to how fast the field is moving. It’s impossible to cover everything that happened, but in this post I’ve tried to capture a sample of the ideas I found most exciting in the sessions I attended.


The ideas covered in this post fall into three groups: methodological advances, datasets and tasks, and results worth remembering.


This section covers the methodological advances I found interesting – these are techniques that I think could be broadly applicable and are worth adding to your toolbox.

Significance testing done right when searching over hyperparameters

Show Your Work: Improved Reporting of Experimental Results
Jesse Dodge, Suchin Gururangan, Dallas Card, Roy Schwartz, Noah A. Smith
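
As a rough illustration of the kind of reporting this paper argues for, the sketch below estimates the expected best validation score after n random hyperparameter trials, given a pool of scores you have already observed. It treats those scores as an empirical distribution and computes the expectation of the maximum of n independent draws. The function name and the sample scores are my own placeholders, not taken from the authors' code.

```python
def expected_max_performance(scores, n):
    """Expected maximum of n i.i.d. draws from the empirical distribution of `scores`."""
    if not scores or n < 1:
        raise ValueError("need at least one score and n >= 1")
    ordered = sorted(scores)
    total = len(ordered)
    expected = 0.0
    for i, value in enumerate(ordered, start=1):
        # P(max of n draws <= i-th smallest score) = (i / total) ** n,
        # so the difference below is P(max of n draws == this score).
        prob_is_max = (i / total) ** n - ((i - 1) / total) ** n
        expected += value * prob_is_max
    return expected


# Hypothetical validation accuracies from four hyperparameter trials.
validation_scores = [0.1, 0.2, 0.3, 0.4]
budget_curve = {n: expected_max_performance(validation_scores, n) for n in (1, 2, 4)}
```

With a budget of one trial the estimate is just the mean of the observed scores, and it climbs toward the best observed score as the budget grows, which is what makes the curve useful for comparing models at equal tuning budgets.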

Unlearning dataset bias by fitting the residual

Unlearn Dataset Bias in Natural Language Inference by Fitting the Residual
He He, Sheng Zha, Haohan Wang
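
To make "fitting the residual" concrete, here is a minimal sketch of the training loss as I understand it: a bias-only model is trained first and frozen, and the debiased model is then trained so that its logits, added to the biased model's log-probabilities, explain the gold label. At test time only the debiased model would be used. The function names and the toy logits are assumptions for illustration, and the exact parameterization in the paper may differ.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_norm for x in logits]


def residual_fit_loss(biased_logits, debiased_logits, gold):
    """Cross-entropy of the combined model softmax(log p_biased + debiased logits)."""
    biased_log_probs = log_softmax(biased_logits)  # frozen bias-only model
    combined = [b + d for b, d in zip(biased_log_probs, debiased_logits)]
    return -log_softmax(combined)[gold]


# Toy 3-class example: the bias-only model strongly predicts class 0.
bias_prediction = [4.0, 0.0, 0.0]
untrained_debiased = [0.0, 0.0, 0.0]
loss_when_bias_is_right = residual_fit_loss(bias_prediction, untrained_debiased, gold=0)
loss_when_bias_is_wrong = residual_fit_loss(bias_prediction, untrained_debiased, gold=1)
```

The loss is already small on examples the biased model gets right, so the debiased model receives most of its learning signal from examples where the bias fails.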

Bidirectional sequence generation

Attending to Future Tokens for Bidirectional Sequence Generation
Carolin Lawrence, Bhushan Kotnis, Mathias Niepert

A promising approach to evaluate content relevance for summarization: automated pyramid scores

Automated Pyramid Summarization Evaluation
Yanjun Gao, Chen Sun, Rebecca J. Passonneau

A general-purpose algorithm for constrained sequential inference

A General-Purpose Algorithm for Constrained Sequential Inference
Daniel Deutsch, Shyam Upadhyay, Dan Roth

Using generalized CCA to combine embeddings for unsupervised duplicate question detection

Multi-View Domain Adapted Sentence Embeddings for Low-Resource Unsupervised Duplicate Question Detection
Nina Poerner, Hinrich Schütze

Predicting performance drop under domain shift

To Annotate or Not? Predicting Performance Drop under Domain Shift
Hady Elsahar, Matthias Gallé

A sparsity regularizer that’s differentiable in its sparseness

Adaptively Sparse Transformers
Gonçalo M. Correia, Vlad Niculae, André F. T. Martins
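
The paper swaps the softmax in attention for α-entmax and learns α for each head. The sketch below only covers the fixed special case α = 2, known as sparsemax, to show how such a mapping can assign exactly zero weight to low-scoring tokens. It is a plain-Python illustration with a function name of my choosing, not the authors' implementation, and it does not include the learned α.

```python
def sparsemax(scores):
    """Euclidean projection of `scores` onto the probability simplex."""
    if not scores:
        raise ValueError("scores must be non-empty")
    ordered = sorted(scores, reverse=True)
    running = 0.0
    support_size, support_sum = 0, 0.0
    for j, value in enumerate(ordered, start=1):
        running += value
        # The j-th largest score stays in the support while this holds.
        if 1.0 + j * value > running:
            support_size, support_sum = j, running
    threshold = (support_sum - 1.0) / support_size
    # Scores at or below the threshold get exactly zero probability.
    return [max(s - threshold, 0.0) for s in scores]


attention_weights = sparsemax([2.0, 1.0, -1.0])  # all mass on the first token
close_scores = sparsemax([1.0, 0.8])             # both tokens keep some mass
```

Unlike softmax, which always gives every token some probability, the output here contains true zeros whenever one score is far enough ahead of the rest.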

Reducing complex question answering tasks to simpler ones by generating answering templates

A Discrete Hard EM Approach for Weakly Supervised Question Answering
Sewon Min, Danqi Chen, Hannaneh Hajishirzi, Luke Zettlemoyer

Datasets / Tasks

This section lists some datasets that I thought were particularly exciting. I was surprised by how few made it onto this list, though that’s probably a reflection of which sessions I sat in on.

A more natural fact-checking corpus based on Snopes

A Richly Annotated Corpus for Different Tasks in Automated Fact-Checking
Andreas Hanselowski, Christian Stab, Claudia Schulz, Zile Li, Iryna Gurevych

Detecting framing in news stories

Detecting Frames in News Headlines and Its Application to Analyzing News Framing Trends Surrounding U.S. Gun Violence
Siyi Liu, Lei Guo, Kate Mays, Margrit Betke, Derry Tanti Wijaya

Zero-shot entity linking from entity descriptions

Zero-Shot Entity Linking by Reading Entity Descriptions
Lajanugen Logeswaran, Ming-Wei Chang, Kenton Lee, Kristina Toutanova, Jacob Devlin, Honglak Lee


In this section, I’d like to highlight results that I think transcend the particular model or dataset used and are worth keeping around in the back of my head.

Learning a document retriever using paragraph vectors outperforms IR when you don’t know what to query

Latent Retrieval for Weakly Supervised Open Domain Question Answering (ACL 2019)
Kenton Lee, Ming-Wei Chang, Kristina Toutanova

Using distant supervision to pretrain embeddings can improve relation extraction performance

Kristina Toutanova

Language models internalize disturbing biases that can be triggered innocuously

Universal Adversarial Triggers for Attacking and Analyzing NLP
Eric Wallace, Shi Feng, Nikhil Kandpal, Matt Gardner, Sameer Singh

Attention might actually be a valid form of explanation

Attention is not not Explanation
Sarah Wiegreffe, Yuval Pinter

For (non-English) languages with grammatical gender, be wary of their noun representations

How Does Grammatical Gender Affect Noun Representations in Gender-Marking Languages?
Hila Gonen, Yova Kementchedjhieva, Yoav Goldberg

Active learning ties the collected data to the model, posing an obstacle to deploying it in practice

Practical Obstacles to Deploying Active Learning
David Lowell, Zachary C. Lipton, Byron C. Wallace

When using probes to study a model’s representations, make sure you control for the probe itself!

Designing and Interpreting Probes with Control Tasks
John Hewitt, Percy Liang
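
A small sketch of what "controlling for the probe" can look like: a control task gives every word type a fixed random label, so a probe can only do well on it by memorizing word identities, and selectivity is the gap between the probe's accuracy on the real task and on the control task. The helper names are hypothetical, and sampling labels uniformly is a simplification I am assuming here rather than the paper's exact recipe.

```python
import random


def control_labels(tokens, num_labels, seed=0):
    """Assign each word type a fixed random label, shared by all of its occurrences."""
    rng = random.Random(seed)
    label_of_type = {}
    labels = []
    for token in tokens:
        if token not in label_of_type:
            # Simplification: labels are drawn uniformly at random.
            label_of_type[token] = rng.randrange(num_labels)
        labels.append(label_of_type[token])
    return labels


def selectivity(task_accuracy, control_accuracy):
    """Gap between real-task and control-task probe accuracy."""
    return task_accuracy - control_accuracy


sentence = ["the", "cat", "saw", "the", "dog"]
labels = control_labels(sentence, num_labels=5)
# Hypothetical probe accuracies, for illustration only.
gap = selectivity(task_accuracy=0.97, control_accuracy=0.92)
```

A probe with high accuracy but low selectivity may be expressive enough to memorize the task, so its accuracy says little about what the representation itself encodes.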

Special thanks to Aniruddh Raghu and Maithra Raghu for feedback on earlier drafts of this post.