
About prepro and MMI training #41

Open
liehtman opened this issue May 8, 2020 · 0 comments
liehtman commented May 8, 2020

I have two questions about training the reversed model.

The first one is about the training data. I can't see an objective reason why prepro.py cuts off a large part of the training data. I just realized that almost all samples which have only one sentence in the source are cut off by the _make_feature function, more specifically by `if all(w == 0 for w in ws[1:]): return None`. I use the --reverse parameter when preparing the data.
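To illustrate what I mean, here is a minimal sketch of that check (function and variable names are my own simplification, not the actual prepro.py code; I'm assuming `ws` holds one weight per turn, with 0 meaning "don't train on this turn"):

```python
# Simplified sketch of the filter I'm referring to (assumed semantics:
# ws holds one weight per dialogue turn, 0 = turn is not a training target).
def make_feature(ws):
    # If every turn after the first has weight 0, there is nothing left
    # to predict, so the whole sample is dropped.
    if all(w == 0 for w in ws[1:]):
        return None
    return ws

# A sample whose source is a single sentence ends up with only the first
# turn weighted, so it gets discarded:
print(make_feature([1.0, 0, 0]))    # None -> sample is thrown away
print(make_feature([1.0, 0, 1.0]))  # kept
```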

The second question is about the validation data. If we train the forward model, it obviously needs to look like `src1 <eos> src2 \t tgt`, but how should it look when we train the backward model? My assumption was `tgt \t src2 <eos> src1`, because of `inputs = list(reversed(inputs))`, but the model's performance is very poor during training, and the quality on such a validation set stops improving after a very small number of training steps.
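Here is how I arrived at that assumption, as a small sketch (the line format is my guess, not something confirmed by the repo; only the `list(reversed(inputs))` call is from the actual code):

```python
# Forward validation line:  src1 <eos> src2 \t tgt
# My assumed backward line, derived from `inputs = list(reversed(inputs))`:
turns = ["src1", "src2", "tgt"]
rev = list(reversed(turns))  # ['tgt', 'src2', 'src1']

# First element becomes the new source, the rest the new target:
backward_line = rev[0] + "\t" + " <eos> ".join(rev[1:])
print(backward_line)  # tgt\tsrc2 <eos> src1
```

Is this the intended format, or should the backward validation set be built differently?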

Thanks in advance.
