Error running the script #4

eperezv · 2020-10-19T15:06:09Z

Hello,

I am trying to run the script, but I get an error partway through. I am running "python3 analyze_papers.py My_Library_v2.csv" on Ubuntu 20.04 after installing the dependencies (pdfminer and so on).

multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 48, in mapstar
    return list(map(*args))
  File "analyze_papers.py", line 156, in article_worker
    pdf_result, text, pdf_log = process_pdf(metadata)
  File "analyze_papers.py", line 115, in process_pdf
    original_page_count, pages = pdf_to_text_list(first_pdf)
  File "analyze_papers.py", line 35, in pdf_to_text_list
    pages = layout_scanner.get_pages(file_loc, images_folder=None)  # you can try os.path.abspath("output/imgs")
  File "/home/eduardo/Downloads/zotero_export/layout_scanner.py", line 214, in get_pages
    return with_pdf(pdf_doc, _parse_pages, pdf_pwd, *tuple([images_folder]))
  File "/home/eduardo/Downloads/zotero_export/layout_scanner.py", line 37, in with_pdf
    result = fn(doc, *args)
  File "/home/eduardo/Downloads/zotero_export/layout_scanner.py", line 204, in _parse_pages
    interpreter.process_page(page)
  File "/usr/lib/python3/dist-packages/pdfminer/pdfinterp.py", line 852, in process_page
    self.render_contents(page.resources, page.contents, ctm=ctm)
  File "/usr/lib/python3/dist-packages/pdfminer/pdfinterp.py", line 864, in render_contents
    self.execute(list_value(streams))
  File "/usr/lib/python3/dist-packages/pdfminer/pdfinterp.py", line 888, in execute
    func(*args)
  File "/usr/lib/python3/dist-packages/pdfminer/pdfinterp.py", line 772, in do_TJ
    self.device.render_string(self.textstate, seq, self.ncs, self.graphicstate.copy())
  File "/usr/lib/python3/dist-packages/pdfminer/pdfdevice.py", line 85, in render_string
    textstate.linematrix = self.render_string_horizontal(
  File "/usr/lib/python3/dist-packages/pdfminer/pdfdevice.py", line 100, in render_string_horizontal
    for cid in font.decode(obj):
  File "/usr/lib/python3/dist-packages/pdfminer/pdffont.py", line 511, in decode
    return bytearray(bytes)  # map(ord, bytes)
TypeError: cannot convert 'PSKeyword' object to bytearray
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "analyze_papers.py", line 242, in <module>
    result = pool.map(list_worker, list(titles_dict.items()), chunksize=5)
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 364, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 771, in get
    raise self._value
TypeError: cannot convert 'PSKeyword' object to bytearray
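
The traceback shows the TypeError being raised inside a pool worker while decoding one PDF and then re-raised by pool.map, so a single unparseable file aborts the whole run. A possible stopgap, sketched below under assumptions, is to catch exceptions per item and report them instead of letting them propagate. The names safe_worker and fake_worker are hypothetical and not part of analyze_papers.py; fake_worker only simulates the failure, and this does not fix the underlying pdfminer error.

```python
def fake_worker(item):
    """Hypothetical stand-in for the script's per-article worker."""
    title, pdf_path = item
    if pdf_path.endswith("bad.pdf"):
        # Simulates the failure reported in the traceback above.
        raise TypeError("cannot convert 'PSKeyword' object to bytearray")
    return title, "extracted text"


def safe_worker(item):
    """Call fake_worker and return (item, result, error) instead of raising."""
    try:
        return item, fake_worker(item), None
    except Exception as exc:  # deliberately broad: any parser error skips the file
        return item, None, f"{type(exc).__name__}: {exc}"


if __name__ == "__main__":
    items = [("Paper A", "a.pdf"), ("Paper B", "bad.pdf")]
    for item, result, error in map(safe_worker, items):
        if error is not None:
            print(f"skipped {item[1]}: {error}")
```

Because safe_worker is a plain module-level function, the same wrapper could in principle be handed to pool.map in place of the original worker, which would at least identify the PDF that triggers the error.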

andreacatta · 2021-02-02T12:39:08Z

Same thing here, on the same OS version:

andrea@WS:~/Apps/citation_map-master$ python3 analyze_papers.py exported.csv
Traceback (most recent call last):
  File "analyze_papers.py", line 233, in <module>
    titles_dict = read_titles(args.zotero_csv)
  File "analyze_papers.py", line 81, in read_titles
    'file': r[FILE_I],
IndexError: list index out of range
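
This IndexError suggests that at least one row of the exported CSV has fewer columns than the index read_titles uses for the file field. A rough way to check, assuming nothing about the script beyond what the traceback shows, is to list the rows that are too short. The helper find_short_rows and the sample data are hypothetical, and the real value of FILE_I is not visible here, so the index 2 below is only a placeholder.

```python
import csv
import io


def find_short_rows(csv_file, required_index):
    """Return (line_number, column_count) for rows that cannot be indexed at required_index."""
    short = []
    reader = csv.reader(csv_file)
    for row in reader:
        if len(row) <= required_index:
            # reader.line_num is the number of source lines read so far.
            short.append((reader.line_num, len(row)))
    return short


# Hypothetical sample export: the third line is missing its file column.
sample = io.StringIO(
    'Title,Author,File\n'
    '"Paper A","Smith","a.pdf"\n'
    '"Paper B","Jones"\n'
)
print(find_short_rows(sample, required_index=2))  # prints [(3, 2)]
```

Run against the real export (opened with newline=""), an empty result would point away from short rows and toward a column layout that differs from what the script expects.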
