MathBot is a transformer-based Math Word Problem (MWP) solver developed as the lab project for CSE 4622: Machine Learning Lab.
We have deployed the model with a simple Gradio UI. Visit https://huggingface.co/spaces/Casio991ms/MathBot and check it out!
- Syed Rifat Raiyan - 180041205
- Md. Nafis Faiyaz - 180041101
- Shah Md. Jawad Kabir - 180041234
The goal of this model is to translate an MWP statement into a valid math expression which, when evaluated, yields the solution to the problem. For a better understanding of the underlying transformer model, please go through the MathBot.ipynb file and the relevant literature that has been cited.
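Since the predicted expression is just a string, the final numeric answer comes from evaluating it. The snippet below is a minimal sketch of one safe way to do that with Python's `ast` module; the example problem and expression are hypothetical, and MathBot's own pipeline may evaluate expressions differently.

```python
import ast
import operator

# Binary operators allowed in a predicted expression (an assumption for this sketch).
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}

def evaluate_expression(expr: str) -> float:
    """Safely evaluate a predicted arithmetic expression such as '5 + 3 * 2'."""
    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -_eval(node.operand)
        raise ValueError(f"Unsupported token in expression: {ast.dump(node)}")

    return _eval(ast.parse(expr, mode="eval"))

# Hypothetical example (not taken from the dataset):
# Problem: "Sara has 31 red balloons and 15 green balloons. How many balloons does she have in all?"
print(evaluate_expression("31 + 15"))  # 46
```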
A Math Word Problem is a textual narrative that states a problem description and poses a question about one or more unknown quantities. These types of problems are generally found in the math textbooks of 1st to 3rd grade students.
Problem:
Expression:
Our approach is to use a Transformer-based sequence-to-sequence (encoder-decoder) model that translates the problem statement into the corresponding math expression.
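For illustration, a bare-bones version of such an encoder-decoder Transformer could be written in PyTorch as below. This is only a sketch: the vocabulary sizes, model dimension, number of heads, and positional-encoding scheme are placeholders, not the actual hyperparameters from MathBot.ipynb.

```python
import torch
import torch.nn as nn

class MWPTransformer(nn.Module):
    """Sketch of a seq2seq Transformer that maps problem tokens to expression tokens."""

    def __init__(self, src_vocab=5000, tgt_vocab=40, d_model=128, nhead=8,
                 num_layers=4, dim_ff=512, max_len=256):
        super().__init__()
        self.src_embed = nn.Embedding(src_vocab, d_model)
        self.tgt_embed = nn.Embedding(tgt_vocab, d_model)
        self.pos_embed = nn.Embedding(max_len, d_model)   # learned positional encoding
        self.transformer = nn.Transformer(
            d_model=d_model, nhead=nhead,
            num_encoder_layers=num_layers, num_decoder_layers=num_layers,
            dim_feedforward=dim_ff, batch_first=True,
        )
        self.out = nn.Linear(d_model, tgt_vocab)

    def forward(self, src, tgt):
        src_pos = torch.arange(src.size(1), device=src.device)
        tgt_pos = torch.arange(tgt.size(1), device=tgt.device)
        src_emb = self.src_embed(src) + self.pos_embed(src_pos)
        tgt_emb = self.tgt_embed(tgt) + self.pos_embed(tgt_pos)
        # Causal mask so each target position only attends to earlier positions.
        tgt_mask = nn.Transformer.generate_square_subsequent_mask(tgt.size(1)).to(src.device)
        hidden = self.transformer(src_emb, tgt_emb, tgt_mask=tgt_mask)
        return self.out(hidden)   # logits over the expression vocabulary
```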
The dataset we used is MAWPS, which contains 3,320 problems along with their solution expressions. Out of those, we took the 2,373 problems specific to our interest, as the rest were geometry problems. We then used a question generator to generate similar problems, bringing the final dataset to 38,144 problems in total. And our train-test split was
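A rough sketch of this kind of dataset preparation is shown below. The file name, the MAWPS field names (`sQuestion`, `lEquations`), the geometry filter, and the split ratio are all assumptions for illustration; the actual filtering, augmentation, and split are implemented in MathBot.ipynb.

```python
import json
from sklearn.model_selection import train_test_split

# Load the MAWPS problems; the file name and field names here follow the public
# MAWPS JSON dumps (an assumption; adjust to the actual file used in MathBot.ipynb).
with open("mawps.json") as f:
    problems = json.load(f)

# Hypothetical filter: keep only single-equation arithmetic problems and drop
# geometry-style ones that are outside the scope of this project.
GEOMETRY_WORDS = {"area", "perimeter", "radius", "angle"}
arithmetic = [
    p for p in problems
    if len(p["lEquations"]) == 1
    and not GEOMETRY_WORDS & set(p["sQuestion"].lower().split())
]

# After augmentation with a question generator, split into train and test sets
# (the ratio below is a placeholder, not the split actually used).
train, test = train_test_split(arithmetic, test_size=0.1, random_state=42)
print(len(train), len(test))
```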
Provide a simple Math Word Problem statement in the text box on the left and click the "Submit" button. After a few seconds, the model should yield a predicted math expression.
You can also click on one of the many MWP examples shown below the text boxes.
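If you prefer to query the deployed Space programmatically, the `gradio_client` package can be used roughly as follows; the endpoint name and argument order are assumptions, so check the Space's "Use via API" panel or `client.view_api()` for the exact signature.

```python
from gradio_client import Client

# Connect to the deployed MathBot Space on Hugging Face.
client = Client("Casio991ms/MathBot")

# Endpoint name and argument order are assumptions; verify with client.view_api().
result = client.predict(
    "Tom has 9 marbles and finds 7 more. How many marbles does he have now?",
    api_name="/predict",
)
print(result)  # expected to be a predicted math expression such as "9 + 7"
```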
- Training-set Accuracy → $98.4$%
- Test-set Accuracy → $73.7$%
- Corpus BLEU (BiLingual Evaluation Understudy) → $87.2$%
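For reference, exact-match accuracy and corpus BLEU of the kind reported above could be computed along the following lines using NLTK; this is only a sketch with hypothetical token sequences, not the evaluation code from MathBot.ipynb.

```python
from nltk.translate.bleu_score import corpus_bleu

# Hypothetical model outputs and reference expressions, tokenized at the same
# granularity the model uses (digit-level tokens in this example).
predictions = [["3", "1", "+", "1", "5"], ["9", "*", "7"]]
references  = [["3", "1", "+", "1", "5"], ["9", "+", "7"]]

# Exact-match accuracy: a prediction counts only if the whole expression matches.
accuracy = sum(p == r for p, r in zip(predictions, references)) / len(references)

# corpus_bleu expects a list of reference *lists* per hypothesis.
bleu = corpus_bleu([[r] for r in references], predictions)

print(f"accuracy = {accuracy:.3f}, corpus BLEU = {bleu:.3f}")
```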
Let’s look at a test sample (please overlook the bad English)...
Problem:
Predicted Translation:
Here, we can see the tokens from the prompt in columns and the tokens from the target expression in rows. These attention heads are somewhat analogous to kernels in Convolutional Neural Networks (CNNs). We can see that every single head, except one, attends to the right parts of the prompt when generating the expression.
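A grid of per-head attention heatmaps like the one described above can be plotted roughly as follows; the sketch assumes the decoder's cross-attention weights are available as an array of shape `(num_heads, target_len, source_len)`, which is an assumption about how MathBot.ipynb exposes them.

```python
import matplotlib.pyplot as plt
import numpy as np

def plot_attention_heads(attn, src_tokens, tgt_tokens):
    """attn: (num_heads, len(tgt_tokens), len(src_tokens)) cross-attention weights."""
    num_heads = attn.shape[0]
    fig, axes = plt.subplots(2, num_heads // 2, figsize=(3 * (num_heads // 2), 6))
    for head, ax in enumerate(axes.flat):
        ax.matshow(attn[head], cmap="viridis")
        ax.set_xticks(range(len(src_tokens)))
        ax.set_xticklabels(src_tokens, rotation=90, fontsize=7)   # prompt tokens in columns
        ax.set_yticks(range(len(tgt_tokens)))
        ax.set_yticklabels(tgt_tokens, fontsize=7)                # expression tokens in rows
        ax.set_title(f"head {head}")
    plt.tight_layout()
    plt.show()

# Hypothetical usage with random weights in place of the model's real attention:
plot_attention_heads(np.random.rand(8, 5, 12),
                     src_tokens=[f"w{i}" for i in range(12)],
                     tgt_tokens=["9", "+", "7", "=", "x"])
```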
Overall, the model:

- Correctly identifies where to give attention to figure out the expression.
- Is robust to grammatical errors.
- Achieves $73.7$% test-set accuracy, better than some of the previous works on this dataset.
However, the model also has some limitations. It:

- Is trained on a small dataset.
- Struggles with problems that require multiple steps and more than two operators.
- Uses tokens of individual digits rather than whole numbers, so the output can change dramatically when a single number in the problem is changed (illustrated in the sketch after this list).
- Can produce erroneous outputs if the statement's grammar is slightly changed, or if the given problem deviates too much from the structure of the problems in the training set.
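The digit-level tokenization issue mentioned above can be illustrated with the following toy tokenizer; it is a hypothetical sketch, not MathBot's actual tokenizer, but it shows why changing one number perturbs several positions in the input sequence.

```python
def digit_tokenize(text: str) -> list[str]:
    """Split numbers into individual digit tokens, leaving other words intact."""
    tokens = []
    for word in text.split():
        if word.isdigit():
            tokens.extend(list(word))   # "58" -> ["5", "8"]
        else:
            tokens.append(word)
    return tokens

print(digit_tokenize("Tom has 58 marbles"))  # ['Tom', 'has', '5', '8', 'marbles']
print(digit_tokenize("Tom has 9 marbles"))   # ['Tom', 'has', '9', 'marbles']
```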
We were inspired by similar research works and projects like: