Adding annotated training dataset #368

Open
mathieudumayet opened this issue Oct 13, 2020 · 0 comments

Hi,

In a blog post, you wrote:

You can also improve the performance of the pre-trained Reader, which was pre-trained on SQuAD 1.1 dataset. If you have an annotated dataset (that can be generated by the help of the cdQA-annotator) in the same format as SQuAD dataset you can fine-tune the reader on it:

# Put the path to your json file in SQuAD format here
path_to_data = './data/SQuAD_1.1/train-v1.1.json'
cdqa_pipeline.fit_reader(path_to_data)

Should we add our own training dataset, which we built with the cdQA-annotator, to the "train-v1.1.json" file by editing it?
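
To make the question concrete, merging the two datasets without touching the original file could look roughly like the sketch below. It assumes both files follow the SQuAD v1.1 layout (a top-level "data" list of articles); the helper name `merge_squad_files` and all file paths are hypothetical, and whether the reader should be fine-tuned on a merged file or on the custom file alone is exactly what is being asked.

```python
import json


def merge_squad_files(base_path, custom_path, out_path):
    """Write a new SQuAD-format file holding the articles of both inputs.

    Assumption: each input is a JSON object with a top-level "data" list,
    as in SQuAD v1.1. Neither input file is modified.
    """
    with open(base_path, encoding="utf-8") as f:
        base = json.load(f)
    with open(custom_path, encoding="utf-8") as f:
        custom = json.load(f)

    merged = {
        # Keep the version tag of the base file, defaulting to "1.1".
        "version": base.get("version", "1.1"),
        # Append the custom articles after the original ones.
        "data": base["data"] + custom["data"],
    }

    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(merged, f, ensure_ascii=False)
    return merged
```

The path of the merged file would then be passed to `cdqa_pipeline.fit_reader(...)` in place of the path from the blog snippet.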

Thanks!
