This repository has been archived by the owner on Aug 14, 2019. It is now read-only.

RuntimeError: Unrecognized line format #71

Open
p-null opened this issue Jun 28, 2018 · 2 comments

Comments

@p-null
Contributor

p-null commented Jun 28, 2018

Hi, I am running the BiMPM model to predict and get the following error:

 66%|███████████████████████████████████████████████████████████████▊                                 | 2345735/3563475 [09:22<04:51, 4172.73it/s]
Traceback (most recent call last):
  File "run_bimpm.py", line 267, in <module>
    main()
  File "run_bimpm.py", line 160, in main
    mode="word+character")
  File "../../duplicate_questions/data/data_manager.py", line 390, in get_test_data_from_file
    self.instance_type)
  File "../../duplicate_questions/data/dataset.py", line 145, in read_from_file
    return TextDataset.read_from_lines(lines, instance_class)
  File "../../duplicate_questions/data/dataset.py", line 177, in read_from_lines
    instances = [instance_class.read_from_line(line) for line in tqdm(lines)]
  File "../../duplicate_questions/data/dataset.py", line 177, in <listcomp>
    instances = [instance_class.read_from_line(line) for line in tqdm(lines)]
  File "../../duplicate_questions/data/instances/sts_instance.py", line 118, in read_from_line
    raise RuntimeError("Unrecognized line format: " + line)
RuntimeError: Unrecognized line format: "life in dublin?"""

For now, my temporary workaround is to delete the else branch, so that unrecognized lines are skipped.

@nelson-liu
Owner

Yeah, that isn't a proper NLI instance, right? It expects [id],[question1],[question2].

Skipping is an acceptable workaround, but I think the better solution would be to reformat the data you use :)
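
To find which lines need reformatting, something like the following might help. It is only a sketch: `find_malformed_lines` is a hypothetical helper, not part of this repository, and it assumes each physical line is parsed on its own as CSV and should contain exactly three fields ([id],[question1],[question2]).

```python
import csv


def find_malformed_lines(lines, expected_fields=3):
    """Return (line_number, line) pairs whose CSV field count is unexpected.

    Hypothetical helper: assumes every physical line is parsed in isolation,
    the way a line-by-line reader would see it.
    """
    bad = []
    for number, line in enumerate(lines, start=1):
        # Parse this single line as CSV; an empty line yields no fields.
        fields = next(csv.reader([line]), [])
        if len(fields) != expected_fields:
            bad.append((number, line))
    return bad


# Made-up sample: one well-formed row and the fragment from the error message.
sample = [
    '"1","What is A?","What is B?"',
    '"life in dublin?"""',
]
print(find_malformed_lines(sample))
```

Running it over the lines of the test file would list the offending line numbers, so they can be inspected in context rather than silently skipped.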

@p-null
Contributor Author

p-null commented Jun 29, 2018

I am using the Quora question pairs dataset from Kaggle, which is the same one you use.
I found that line in test_final.csv:
"2162206","What is the minimum salary needed to live a decent life in Malaysia?","What is the minimum salary needed to live a decent life in dublin?"
It is a proper instance.
I am confused, since both this instance and the code look correct.
By the way, I shouldn't delete any rows, because Kaggle won't score the submission if the number of rows is wrong.
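
One possible explanation, not confirmed in this thread, is that some other row in the file has a quoted field containing a line break. Reading such a file line by line splits one record into fragments, while a CSV parser keeps it whole. The sketch below uses made-up data (not rows from test_final.csv) to show the difference:

```python
import csv
import io

# Made-up data: the second record has a newline inside its quoted third field.
raw = (
    '"1","What is A?","What is B?"\n'
    '"2","What is the cost of living in X?","What is the cost of\n'
    'living in Y?"\n'
)

# Splitting on newlines breaks the second record into two fragments.
physical_lines = raw.splitlines()

# csv.reader treats the quoted newline as part of the field.
records = list(csv.reader(io.StringIO(raw, newline="")))

print(len(physical_lines), "physical lines")
print(len(records), "CSV records")
```

If that is what happens here, parsing the file with a CSV reader instead of splitting on newlines would keep the row count intact, which matters for the Kaggle submission.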
