Adding annotated training dataset #368

Open

mathieudumayet opened this issue Oct 13, 2020 · 0 comments

mathieudumayet commented Oct 13, 2020

Hi,

In a blog post, you wrote:

You can also improve the performance of the pre-trained Reader, which was pre-trained on SQuAD 1.1 dataset. If you have an annotated dataset (that can be generated by the help of the cdQA-annotator) in the same format as SQuAD dataset you can fine-tune the reader on it:

# Put the path to your json file in SQuAD format here
path_to_data = './data/SQuAD_1.1/train-v1.1.json'
cdqa_pipeline.fit_reader(path_to_data)

Should we add our own training dataset, which we built with cdQA-annotator, to the "train-v1.1.json" file by editing it?
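To make the question concrete, here is a minimal sketch of one alternative to editing the file in place: writing a new combined file and pointing path_to_data at it. It assumes both files follow the SQuAD 1.1 layout (a top-level "data" list of articles). The function name merge_squad_files and the file names in the comments are hypothetical, and whether the reader should be fine-tuned on the combined data or on the custom data alone is not something this sketch settles.

```python
import json


def merge_squad_files(paths, out_path, version="1.1"):
    """Concatenate the "data" lists of several SQuAD-format JSON files.

    Assumes each input file has a top-level "data" list, as in SQuAD 1.1.
    Returns the number of articles written to out_path.
    """
    merged = []
    for path in paths:
        with open(path, encoding="utf-8") as f:
            content = json.load(f)
        merged.extend(content["data"])

    # Write a new file so the original train-v1.1.json stays untouched.
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump({"version": version, "data": merged}, f, ensure_ascii=False)
    return len(merged)


# Hypothetical usage (file names are placeholders, not from the project):
# merge_squad_files(
#     ["./data/SQuAD_1.1/train-v1.1.json", "./data/my_annotations.json"],
#     "./data/combined-train.json",
# )
```

The resulting path could then be passed to cdqa_pipeline.fit_reader in place of the train-v1.1.json path from the quoted snippet.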

Thanks!
