Conversation
|
From previous PR: This PR looks to improve the handling for converting a topology to a dataframe. This currently lives as a method for topology. It is now being moved to a convert_dataframe.py module. A few different formats are available which give some nice default ways to view a topology. Notably, we have the formats: There is also an added function that allows you to generate dataframes that cover the parameters for a set of topologies. Finally, there will be some function that prints the dataframes with the rdkit mols which are labeled to match the dataframes. TODO Checklist:
|
|
From discussion with @Vtsoch, there are a few more general use cases that would be nice to have working in the arguments for the main function, to_dataframeDict. dfDict = to_dataframeDict(ptop, parameters="sites", columns=["name", "atom_type.name", "atom_type.parameters", "charge", "molecule.name"], format="remove_duplicates")should be put as an example, since it could be hard to find the molecule info or parameters info if you don't know how the parsing works of these attributes. I would even consider these attributes to be in the dfDict = to_dataframeDict(ptop, parameters=["sites", "bonds"], columns=["name"], format="specific_columns")This will fail currently. However, I think it should just go through the columns and only grab attributes that exist, skipping the others. |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #826 +/- ##
==========================================
- Coverage 94.07% 93.31% -0.77%
==========================================
Files 65 66 +1
Lines 6953 7088 +135
==========================================
+ Hits 6541 6614 +73
- Misses 412 474 +62 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
chrisjonesBSU
left a comment
There was a problem hiding this comment.
Left a couple comments.
| format: str = "default", | ||
| columns: list[str] = None, | ||
| handle_unyts: str = "to_headers", | ||
| ) -> pd.DataFrame: |
There was a problem hiding this comment.
This should be dict correct?
| ``` | ||
|
|
||
|
|
||
| >>> gmso.external.convert_dataframe.to_dataframeDict(ptop, parameters='sites', columns=['charge'], handle_unyts="to_headers") |
There was a problem hiding this comment.
@Zeerakkhan47 I think you could expand the information passed into the columns parameter here to include some of the things @CalCraven mention in his comment on June 16th.
This is a replacement PR for the features requested in #814, which has some commits in the merge history that deviated when switching to ruff linting.