HashStore 1.1.0 Release #84

Merged 456 commits on Oct 2, 2024
@doulikecookiedough commented Jan 17, 2024

Summary of Changes:

  • Major refactor of HashStore: objects are now stored under their content identifiers (cid), with reference files linking pids and cids
    • Storing an object with a pid moves the object to where it belongs, creates a pid reference file that contains the cid, and creates a cid reference file that lists the pids that reference that cid
    • Storing an object without a pid only stores the object. The caller/client must then call tag_object to create the reference files (otherwise the object is orphaned), and can optionally call delete_if_invalid_object if they have data against which to validate the object_metadata returned from storing the object.
  • Metadata is now stored in an object named with the hash of the pid+format_id, inside a directory formed from the hash of the pid under /metadata
  • New Public API Methods
    • tag_object, delete_if_invalid_object
  • Improved thread safety and synchronization to address possible race conditions when working with data objects and added thread safety when working with metadata.
  • The hashstore config file is now assembled from two pieces, with the actual .yaml content generated by a library
  • HashStore Client Improvements
    • New script entry point after installing hashstore through poetry install
  • Revised python docstrings into reStructuredText, a format compatible with Sphinx autodocumentation, and added type hints
  • New functionality to use python threads or processes by setting an environment variable
  • Revised README.md to provide context on the HashStore library and to add how-to sections
  • Overall code cleanup, revised logging statements, and various minor bug fixes
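
To make the pid/cid reference layout above easier to picture while reviewing, here is a minimal sketch of the idea. It is not HashStore's implementation: the helper names (store_bytes, tag, metadata_path), the flat directory layout, and the choice of SHA-256 are assumptions made purely for illustration, and it has none of the locking or validation the real code needs.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_hex(value):
    """Hex SHA-256 digest of a str (UTF-8 encoded) or bytes value."""
    data = value.encode("utf-8") if isinstance(value, str) else value
    return hashlib.sha256(data).hexdigest()


def store_bytes(root: Path, data: bytes) -> str:
    """Hypothetical helper: store data under its cid and return the cid.

    No pid is involved, so the object stays orphaned until tag() is called.
    """
    cid = sha256_hex(data)
    obj_path = root / "objects" / cid
    obj_path.parent.mkdir(parents=True, exist_ok=True)
    if not obj_path.exists():  # identical content is only stored once
        obj_path.write_bytes(data)
    return cid


def tag(root: Path, pid: str, cid: str) -> None:
    """Hypothetical helper: create the two reference files for a pid/cid pair."""
    pid_ref = root / "refs" / "pids" / sha256_hex(pid)
    cid_ref = root / "refs" / "cids" / cid
    pid_ref.parent.mkdir(parents=True, exist_ok=True)
    cid_ref.parent.mkdir(parents=True, exist_ok=True)

    # The pid reference file contains the cid.
    pid_ref.write_text(cid)

    # The cid reference file lists every pid that references this cid.
    pids = cid_ref.read_text().splitlines() if cid_ref.exists() else []
    if pid not in pids:
        pids.append(pid)
    cid_ref.write_text("\n".join(pids) + "\n")


def metadata_path(root: Path, pid: str, format_id: str) -> Path:
    """Assumed layout: /metadata/<hash(pid)>/<hash(pid + format_id)>."""
    return root / "metadata" / sha256_hex(pid) / sha256_hex(pid + format_id)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        cid = store_bytes(root, b"example data")  # object only, no refs yet
        tag(root, "example-pid", cid)             # now the refs exist
        print(cid)
        print(metadata_path(root, "example-pid", "example-format"))
```

Two pids tagged against the same content end up as two lines in a single cid reference file, which is what lets one stored object be shared by several identifiers.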

Greetings @artntek - When you have scope, can you please help review HashStore to assist me in releasing HashStore 1.1.0? Please do not hesitate to scrutinize any of the code or bring to my attention any concerns. Thank you 🙏!

@artntek left a comment

see inline comments

@doulikecookiedough

Thanks again @artntek for your review! I've made the updates in another branch and merged them into develop. I'm also going to hold off on the 1.1.0 release until the changes to deleteObject are implemented (and merged into develop). I will ping you again then to wrap this PR up.

…rmatid is passed, causing metadata document names created via the client to be incorrect
Feature-93: Additional Logging Statements
…ct the appropriate thread or multiprocessing 'tag_object'
…or a shared list if multiprocessing is being used
… use multiprocessing lock and list when global variable set appropriately
…if syntax instead of pre-generating exist booleans
…eption and adding a new debug statement, and update pytest
… setup tmp refs files and paths outside of synchronization code
Feature-139: Path Construction & Clean Up
…n with 'yaml' library, and then join separately with a comments string to minimize yaml gotchas
Feature-138: Construct HashStore Config Yaml with Library
…to get a logger instance for the 'filehashstore' module name and revise init process
… try block, and another bug where a cid was not locked during an exception scenario
…x potential dead lock due to sync being outside try block
…hods, and optimize sync method logging calls, add missing logging statements
@doulikecookiedough merged commit c11d5bc into main on Oct 2, 2024
4 checks passed