Memory Issues on GetWords() and crashes with given file #820
Comments
@stephen-williamson Thanks for sharing the document. The main issue I see with your document is that the page contains about 2 million letters. Fixing that involves a deep optimisation of the layout analysis algorithms. The document you provided will be very useful for benchmarking, though.
After further analysis, the letter count can be brought down to 300k by only taking into account the letters that are within the boundary of the page. Related to #681.
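The filtering idea in the comment above can be sketched roughly as follows. This is a library-neutral illustration in Python, not PdfPig code: the `Rect` and `Letter` types, the `letters_on_page` helper, and the page dimensions are all hypothetical stand-ins, and the overlap test used here is only one possible reading of "within the boundary of the page".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle in page coordinates (hypothetical type)."""
    left: float
    bottom: float
    right: float
    top: float

    def intersects(self, other: "Rect") -> bool:
        # Two rectangles overlap unless one lies entirely to one side of the other.
        return not (
            self.right < other.left
            or self.left > other.right
            or self.top < other.bottom
            or self.bottom > other.top
        )


@dataclass(frozen=True)
class Letter:
    """A single glyph with its bounding box (hypothetical type)."""
    value: str
    box: Rect


def letters_on_page(letters, page_bounds):
    """Keep only the letters whose bounding box overlaps the page area."""
    return [letter for letter in letters if page_bounds.intersects(letter.box)]


# Page size chosen purely for illustration (A4 in points).
page = Rect(0, 0, 595, 842)
letters = [
    Letter("a", Rect(10, 10, 20, 20)),          # inside the page
    Letter("b", Rect(590, 800, 600, 810)),      # straddles the right edge
    Letter("c", Rect(5000, 5000, 5010, 5010)),  # far outside the page
]
visible = letters_on_page(letters, page)
print([letter.value for letter in visible])
```

Running the filter before word extraction means the expensive layout analysis only sees the letters that can actually appear on the page, which is where the drop from about 2 million to 300k would come from.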
0020.pdf
I am having an issue with a given PDF. The PDF itself is larger than most that I use PdfPig for, at around 13 MB (normally my PDFs are <1 MB).
It takes longer than normal to call the GetPage() method (about 5 seconds instead of instant), but it does succeed. The GetWords() method, however, hangs for a long time (multiple minutes) before eventually crashing completely. In that time, memory shoots right up: I end up with a 1.5 GB GC Heap Size and an Allocation Rate of around 5 GiB, according to the diagnostics session in Visual Studio.
I cannot even catch the error with a try/catch.
Any help would be great, even if it was just to be able to catch the crash nicely. I've attached a snapshot of the memory.