Kindly consider changing the _expand_paragraphs function in the cdqa_sklearn.py file to accommodate larger datasets. Expanding the dataframe directly needs a lot of memory for bigger data, so it would be better to build a list of dicts first and convert it to a dataframe only at the end.
Below is the modification I made so that I would not get a MemoryError:
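@staticmethod
def _expand_paragraphs(df):
    # Collect one dict per paragraph, then build the DataFrame once.
    data = []
    for n in range(len(df)):
        title = df.iloc[n, 0]       # first column: document title
        stringlist = df.iloc[n, 1]  # second column: list of paragraphs
        for paragraph in stringlist:
            data.append({'title': title, 'content': paragraph})
    dfx = pd.DataFrame(data)
    return dfx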
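The same idea can be sketched as a standalone function that looks columns up by name instead of by position, which may be safer if the column order of the input differs. This is only an illustration: the function name expand_paragraphs and the column names "title" and "paragraphs" are assumptions, not confirmed details of cdqa_sklearn.py.

```python
import pandas as pd


def expand_paragraphs(df, title_col="title", paragraphs_col="paragraphs"):
    """Return one row per paragraph, built from a list of dicts.

    The column names are assumptions made for this sketch.
    """
    rows = []
    for title, paragraphs in zip(df[title_col], df[paragraphs_col]):
        for paragraph in paragraphs:
            rows.append({"title": title, "content": paragraph})
    # Passing columns keeps the expected schema even when rows is empty.
    return pd.DataFrame(rows, columns=["title", "content"])


# Small made-up input: two documents with two and one paragraphs.
example = pd.DataFrame({
    "title": ["Doc A", "Doc B"],
    "paragraphs": [["a1", "a2"], ["b1"]],
})
expanded = expand_paragraphs(example)
print(expanded)
```

On pandas 0.25 or later, DataFrame.explode offers a built-in way to turn a list column into one row per element; whether it uses less memory than the loop above on a given dataset would need to be measured.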
Very good point. +1 @nortz8
However, your workaround did not work for me. I ended up with the following error: ValueError: empty vocabulary; perhaps the documents only contain stop words