ValueError: Stop argument for islice() must be None or an integer: 0 <= x <= sys.maxsize. #285

Open
jayralencar opened this issue Mar 29, 2021 · 2 comments

Comments

@jayralencar

jayralencar commented Mar 29, 2021

Hi,

I am getting
ValueError: Stop argument for islice() must be None or an integer: 0 <= x <= sys.maxsize.

My code:

databunch_lm = BertLMDataBunch.from_raw_corpus(
    data_dir=DATA_PATH,
    text_list = examples,
    tokenizer = loaded_tokenizer,
    batch_size_per_gpu=128,
    max_seq_length=32,
    model_type="bert",
    multi_gpu=False,
    logger=logger,
    test_size=0.01
)

Stack trace:

ValueError                                Traceback (most recent call last)
<ipython-input-22-ed19e05ac7d1> in <module>()
      9     multi_gpu=False,
     10     logger=logger,
---> 11     test_size=0.01
     12 )

3 frames
/usr/local/lib/python3.7/dist-packages/fast_bert/data_lm.py in from_raw_corpus(data_dir, text_list, tokenizer, batch_size_per_gpu, max_seq_length, multi_gpu, test_size, model_type, logger, clear_cache, no_cache)
    207             model_type=model_type,
    208             logger=logger,
--> 209             clear_cache=clear_cache,
    210             no_cache=no_cache,
    211         )

/usr/local/lib/python3.7/dist-packages/fast_bert/data_lm.py in __init__(self, data_dir, tokenizer, train_file, val_file, batch_size_per_gpu, max_seq_length, multi_gpu, model_type, logger, clear_cache, no_cache)
    279                 train_filepath,
    280                 cached_features_file,
--> 281                 self.logger,
    282                 block_size=self.tokenizer.max_len_single_sentence,
    283             )

/usr/local/lib/python3.7/dist-packages/fast_bert/data_lm.py in __init__(self, tokenizer, file_path, cache_path, logger, block_size)
    151             text = itertools.chain.from_iterable(text)
    152             text = more_itertools.chunked(text, block_size)
--> 153             self.examples = list(text)[:-1]
    154             # Note that we are loosing the last truncated example here for the sake of simplicity (no padding)

/usr/local/lib/python3.7/dist-packages/more_itertools/recipes.py in take(n, iterable)
     71 
     72     """
---> 73     return list(islice(iterable, n))
     74 
     75 

ValueError: Stop argument for islice() must be None or an integer: 0 <= x <= sys.maxsize.
@wikd13

wikd13 commented Sep 11, 2021

Same for me. Have you found any way to solve it?

@az7dev

az7dev commented Oct 31, 2021

Wrap the value in int().
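A minimal sketch of what the int() suggestion might look like, under some assumptions. The traceback shows that block_size comes from self.tokenizer.max_len_single_sentence and ends up as the stop argument of islice() inside more_itertools. islice() raises exactly this ValueError when the stop value is not an integer, for example a float. Whether the tokenizer value is actually a float in this setup is an assumption, since the thread does not confirm it. The chunked function below is a simplified stand-in for more_itertools.chunked, and as_block_size is a hypothetical helper, not part of fast_bert.

```python
from itertools import islice


def chunked(iterable, n):
    """Simplified stand-in for more_itertools.chunked: yield lists of up to n items."""
    iterator = iter(iterable)
    while True:
        # Same kind of call as the failing line in the traceback.
        chunk = list(islice(iterator, n))
        if not chunk:
            return
        yield chunk


def raises_value_error(stop):
    """Return True if islice() rejects this stop value with a ValueError."""
    try:
        list(islice(range(10), stop))
    except ValueError:
        return True
    return False


def as_block_size(raw_size, cap=None):
    """Hypothetical helper: coerce raw_size to int, optionally capped.

    raw_size stands in for tokenizer.max_len_single_sentence (assumed
    here to possibly be a float); cap could be max_seq_length.
    """
    size = int(raw_size)
    if cap is not None:
        size = min(size, cap)
    return size


if __name__ == "__main__":
    print(raises_value_error(32.0))  # a float stop is rejected
    print(raises_value_error(32))    # an int stop is accepted
    print(list(chunked(range(5), as_block_size(2.0))))
```

If this is the cause, the same coercion would apply wherever block_size is computed before it reaches more_itertools.chunked. Capping the value (for instance at max_seq_length) is an optional extra, since a very large block size would put the whole corpus in a single chunk.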
