Chapters 6, 7, 8, and 10 of the Deep Learning book: http://www.deeplearningbook.org/
The original sequence-to-sequence paper: https://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf
One of the original attention papers (Bahdanau et al.): https://arxiv.org/pdf/1409.0473.pdf
For the class last spring, when we covered attention, I used https://talbaumel.github.io/attention/ for its figures and simplified explanations.
I also pointed everyone to Chris Olah's Distill article: https://distill.pub/2016/augmented-rnns/
As for how attention mechanisms actually work, I think I gained the most insight from the pointer networks paper https://arxiv.org/pdf/1506.03134.pdf. For some reason, I had always thought of attention as something closer to a pointer network.
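To make that comparison concrete, here is a minimal numpy sketch of the machinery the two share; all names, dimensions, and weights are illustrative toys, not taken from any of the papers above. Both attention and a pointer network compute a softmax distribution over input positions from the same kind of score; attention uses that distribution to mix encoder states into a context vector, while a pointer network emits the distribution itself as the output.

```python
# A minimal sketch contrasting attention with a pointer network.
# All names and dimensions are illustrative, not from any specific paper.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
T, d = 5, 8                      # T encoder positions, hidden size d
enc = rng.normal(size=(T, d))    # encoder hidden states h_1..h_T
dec = rng.normal(size=d)         # current decoder state s_t

# Additive (Bahdanau-style) scoring: score_i = v . tanh(W1 h_i + W2 s_t)
W1 = rng.normal(size=(d, d))
W2 = rng.normal(size=(d, d))
v = rng.normal(size=d)
scores = np.tanh(enc @ W1.T + dec @ W2.T) @ v
weights = softmax(scores)        # a distribution over the T input positions

# Attention: the distribution mixes encoder states into a context vector
# that conditions the prediction of the next output token.
context = weights @ enc

# Pointer network: the same distribution IS the output -- the model "points"
# at an input position instead of emitting a token from a fixed vocabulary.
pointer = int(weights.argmax())
```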
Sections 8.4-8.6 of Jurafsky & Martin's Speech and Language Processing (3rd ed. draft): https://web.stanford.edu/~jurafsky/slp3/8.pdf
Sections 10.4-10.6 and 10.10 of the Deep Learning book's RNN chapter: http://www.deeplearningbook.org/contents/rnn.html
Neural MT tutorial slides from MTMA 2015: http://www.statmt.org/mtma15/uploads/mtma15-neural-mt.pdf
Graham Neubig's tutorial on neural machine translation and sequence-to-sequence models: https://arxiv.org/pdf/1703.01619.pdf
Cho et al. (2014), Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation: https://www.aclweb.org/anthology/D14-1179
Mikolov et al. (2010), Recurrent Neural Network Based Language Model: http://www.fit.vutbr.cz/research/groups/speech/publi/2010/mikolov_interspeech2010_IS100722.pdf