[WIP] The Missing Ingredient in Zero-Shot Neural Machine Translation #1714

francoishernandez · 2020-01-30T17:43:55Z

This PR intends to add an implementation of the cosine similarity alignment loss introduced as a regularization term in The Missing Ingredient in Zero-Shot Neural Machine Translation.
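For readers unfamiliar with the regularizer, here is a minimal, dependency-free sketch of the idea. It is not the code in this PR. It assumes the encoder states of each sentence are max-pooled over time and that the loss is the cosine distance between the pooled source and target vectors (my reading of the paper). The function names are hypothetical; only `lambda_cosine` is taken from the commit messages below. A real batched implementation would also need to mask padding positions before pooling.

```python
import math


def max_pool_over_time(states):
    """Max-pool a sentence's encoder states over the time axis.

    `states` is a list of time steps, each a list of floats (one sentence).
    """
    return [max(column) for column in zip(*states)]


def cosine_distance(u, v, eps=1e-8):
    """Return 1 - cos(u, v); eps guards against zero-norm vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / max(norm_u * norm_v, eps)


def alignment_loss(src_states, tgt_states):
    """Cosine distance between pooled source and target representations."""
    return cosine_distance(
        max_pool_over_time(src_states), max_pool_over_time(tgt_states)
    )


def total_loss(xent, src_states, tgt_states, lambda_cosine=0.0):
    """Cross-entropy plus the weighted alignment term (assumed combination)."""
    if lambda_cosine == 0.0:
        # Regularizer disabled: the objective is the usual cross-entropy.
        return xent
    return xent + lambda_cosine * alignment_loss(src_states, tgt_states)
```

With `lambda_cosine=0.0` the sketch reduces to plain cross-entropy, so the regularizer is strictly opt-in.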

francoishernandez · 2020-02-07T17:38:54Z

Note:
The impact on speed is quite significant: since we need to reduce the batch size to make room in memory for the additional representations, we can lose up to 20-25% in training speed, in both FP32 and FP16 modes.

vince62s · 2020-02-17T17:17:13Z

For the record, we discussed offline whether this should live in the code of NMTModel or Trainer.
For performance reasons it needs to be in NMTModel (src and tgt are both encoded in the model's forward), but this makes the API a little less "clear". We opted for performance, and kept the API intact when this new loss is not used.
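To illustrate the trade-off, a toy sketch of such a forward pass could look like the following. This is not the actual `NMTModel` code: the class, the `return_encodings` flag, and the return shapes are all hypothetical. It only shows how the extra encoder pass over tgt can be made opt-in so that callers that do not use the loss see the same return value as before.

```python
class ToyNMTModel:
    """Toy stand-in for an encoder-decoder model (hypothetical, not OpenNMT-py)."""

    def __init__(self, encoder, decoder):
        self.encoder = encoder  # callable: tokens -> representation
        self.decoder = decoder  # callable: (tgt tokens, src representation) -> output

    def forward(self, src, tgt, return_encodings=False):
        enc_src = self.encoder(src)
        dec_out = self.decoder(tgt, enc_src)
        if not return_encodings:
            # Default path: same return value as before, no extra compute.
            return dec_out
        # Only when the alignment loss is enabled do we pay for a second
        # encoder pass (over tgt) and keep both representations in memory.
        enc_tgt = self.encoder(tgt)
        return dec_out, enc_src, enc_tgt
```

The second encoder pass and the extra tensors kept for the loss are what would cost memory, consistent with the slowdown noted above.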

francoishernandez added 9 commits January 30, 2020 18:39

add lambda_cosine, move normalization to compute_loss, adapt stats

f35b34e

fix some flake

845c989

Merge branch 'master' of https://github.com/OpenNMT/OpenNMT-py into c…

86f2c52

…osine_loss

fix forward tests

84e472f

disable sharded loss if lambda_cosine

1a6e737

fix some flake

e99daaf

move cosine loss compute to function, fix some args

1faea3c

fix flake

f8bc7f4

add arxiv link

9d26360

pltrdy reviewed Feb 7, 2020

onmt/utils/loss.py (outdated, resolved)

onmt/trainer.py (resolved)

broadcast instead of explicitly create ones

306b2e5

return encoder representations only if necessary

8616b99

francoishernandez force-pushed the cosine_loss branch from ff858ac to 8616b99 Compare February 17, 2020 17:23

roll back tests for model forward

75d645a

vince62s reviewed Feb 17, 2020

onmt/trainer.py (outdated, resolved)

fix typo

f53ea4d

francoishernandez force-pushed the cosine_loss branch from 9bf2856 to f53ea4d Compare February 17, 2020 18:02
