
Im2Latex-TensorFlow-2


TensorFlow-2 implementation of the Im2Latex deep learning model for LaTeX code generation from images of mathematical expressions, described in the HarvardNLP paper "What You Get Is What You See: A Visual Markup Decompiler"

What You Get Is What You See: A Visual Markup Decompiler  
Yuntian Deng, Anssi Kanervisto, and Alexander M. Rush
http://arxiv.org/pdf/1609.04938v1.pdf

This is a general-purpose, deep learning-based system to decompile an image into presentational markup. For example, we can infer the LaTeX or HTML source from a rendered image.

Training data


The source im2latex-100k dataset has been preprocessed and resized to suit the model. Download the data from this link and move it to the "images" folder before training.
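For orientation, here is a minimal tf.data loading sketch, assuming the preprocessed images are grayscale PNGs in the "images" folder mentioned above. The target size and file pattern are assumptions; in the actual pipeline each image is also paired with its tokenized LaTeX formula from the im2latex-100k annotations.

```python
import tensorflow as tf

IMG_DIR = "images"                 # folder named in this README
IMG_HEIGHT, IMG_WIDTH = 64, 512    # assumed target size; adjust to the preprocessed data

def load_image(path):
    """Read a preprocessed formula image and scale pixel values to [0, 1]."""
    raw = tf.io.read_file(path)
    img = tf.io.decode_png(raw, channels=1)            # grayscale formula renders
    img = tf.image.resize(img, (IMG_HEIGHT, IMG_WIDTH))
    return tf.cast(img, tf.float32) / 255.0

# Build a simple dataset of images; labels (tokenized LaTeX) would be zipped in alongside.
paths = tf.data.Dataset.list_files(IMG_DIR + "/*.png", shuffle=True)
dataset = (paths
           .map(load_image, num_parallel_calls=tf.data.AUTOTUNE)
           .batch(32)
           .prefetch(tf.data.AUTOTUNE))
```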

Sample results


Training and evaluating the model


Steps to train and evaluate the model are given in the notebooks "im2latex_train.ipynb" and "im2latex_test.ipynb" respectively.
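For orientation, the model follows the paper's design of a convolutional encoder feeding an attention-based RNN decoder. The sketch below is a heavily simplified Keras version of that idea; the layer sizes, vocabulary size, and layer choices are illustrative assumptions and are not copied from the notebooks.

```python
import tensorflow as tf
from tensorflow.keras import layers

VOCAB_SIZE = 500    # assumed LaTeX token vocabulary size
EMBED_DIM = 80      # assumed token embedding size
HIDDEN_DIM = 512    # assumed decoder state size

# CNN encoder: maps the formula image to a grid of feature vectors.
encoder = tf.keras.Sequential([
    layers.Conv2D(64, 3, padding="same", activation="relu"),
    layers.MaxPool2D(),
    layers.Conv2D(128, 3, padding="same", activation="relu"),
    layers.MaxPool2D(),
    layers.Conv2D(256, 3, padding="same", activation="relu"),
])

class AttentionDecoder(tf.keras.Model):
    """One decoding step: attend over encoder features, then predict the next token."""

    def __init__(self):
        super().__init__()
        self.embedding = layers.Embedding(VOCAB_SIZE, EMBED_DIM)
        self.cell = layers.LSTMCell(HIDDEN_DIM)
        self.feat_proj = layers.Dense(HIDDEN_DIM)     # match feature width to query width
        self.attention = layers.AdditiveAttention()
        self.out = layers.Dense(VOCAB_SIZE)

    def call(self, token, state, features):
        # features: (batch, positions, channels) -- the flattened encoder grid
        x = self.embedding(token)                     # (batch, embed)
        h, state = self.cell(x, state)                # (batch, hidden)
        keys = self.feat_proj(features)               # (batch, positions, hidden)
        context = self.attention([h[:, tf.newaxis, :], keys])  # (batch, 1, hidden)
        logits = self.out(tf.concat([h, context[:, 0, :]], axis=-1))
        return logits, state
```

At training time a decoder like this is unrolled over the ground-truth token sequence with teacher forcing and trained with token-level cross-entropy; see the notebooks for the actual training and decoding code.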

Model Performance


BLEU score

  1. Validation dataset (10340 images): 84.44%
  2. Test dataset (9340 images): 84.30%

Validation and train perplexity

Exact match accuracy
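As an illustration of how these metrics can be computed from decoded token sequences, the sketch below uses NLTK's corpus BLEU together with a simple exact-match check; the toy reference and hypothesis lists are placeholders, and this is not necessarily the exact scoring code used in the notebooks.

```python
from nltk.translate.bleu_score import corpus_bleu

# Ground-truth and predicted LaTeX token sequences, one list per image
# (toy placeholders; real sequences come from the test set and the decoder).
references = [["\\frac", "{", "a", "}", "{", "b", "}"]]
hypotheses = [["\\frac", "{", "a", "}", "{", "b", "}"]]

# corpus_bleu expects a list of reference lists per hypothesis.
bleu = corpus_bleu([[ref] for ref in references], hypotheses)

# Exact match: the predicted token sequence equals the ground truth exactly.
exact = sum(r == h for r, h in zip(references, hypotheses)) / len(references)

# Perplexity is typically reported as exp(mean token-level cross-entropy loss).
print(f"BLEU: {bleu:.4f}  exact match: {exact:.4f}")
```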

Preferred versions:


  • TensorFlow 2.8.0
  • NumPy 1.21.6

Previous implementations:


  1. Original implementation by HarvardNLP in Torch (Lua)
  2. TensorFlow-1 implementation
