Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Resolve memory leaks caused by adding and commiting to postgres (related to #494) #541

Closed
wants to merge 10 commits into from
Closed

Conversation

YasushiMiyata
Copy link
Contributor

@YasushiMiyata YasushiMiyata commented May 7, 2021

Description of the problems or issues

Is your pull request related to a problem? Please describe.
Memory leaks happen when Fonduer adds and commits data to postgres after parsing many documents.
This causes OOM killer of Ubuntu and hangup Fonduer process. Please refer to #494.

Does your pull request fix any issue.
One of the memory leak problems is because sqlalchemy keeps all data, which are parsed from documents, on memory untill commit from add.

Description of the proposed changes

Add commit to immediately after add to free memory (L136 of src/fonduer/parser/parser.py).
I’m definitely sure that no side effect happens because no processes exist with the data on the memory after add and commit.
By the way, there are two cases doing commit after multiple add; one is preparing a rollback of inserting data to postgres, and the other is accelerating insert processes for many small data. In contrast, document data in Fonduer are a few large data.

Test plan

Run an existing tests (No additional tests).

Checklist

  • I have updated the documentation accordingly.
  • I have added tests to cover my changes.
  • All new and existing tests passed.
  • I have updated the CHANGELOG.rst accordingly.

@lukehsiao
Copy link
Contributor

Please rebase this off master.

@YasushiMiyata
Copy link
Contributor Author

So this PR failes rebase, I close and create new PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants