-
Notifications
You must be signed in to change notification settings - Fork 13
Cleanup public interface #589
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
@matentzn @joeflack4 I'd appreciate a review! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cthoyt Thanks for adding these aliases! Other misc changes and docs look good, too.
Just one more thing on my wishlist: method/func names and locations
with open(yml_filepath, "w") as y: | ||
yaml.safe_dump(meta, y) | ||
|
||
|
||
def write_tsv( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Method/Func names and locations
@cthoyt @matentzn One of the main ideas I had in #517 was to improve UX by mirroring pandas.
pandas.read_csv()
--> sssom.read_csv()
df.to_csv(sep='\t')
--> msdf.to_csv() # sep defaults to or can only be tabs
I'd also be willing to name or alias these as *_tsv()
instead of *_csv()
.
What do you guys think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm mildly opposed to any notion that CSV is an okay format, we should only ever use TSV
whether it's read_*
or parse_*
make no difference to me. However, in this PR I matched the existing nomenclature
wrt adding in the to_*
methods on the MSDF - that should be in a separate PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re CSV I personally agree with you but mapping-commons/sssom#444
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's unfortunate that pandas
put csv
in these methods, but they do support other separators. In any case, I'll approve!
Future method(s) would be appreciated! For now we can leave #517 open for that purpose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in 5d936d7 I made a more explicit parse_csv and parse_tsv that don't infer the file type (and also fixed several bugs in the file type inference code ;)).
After thinking about it more, I wouldn't want to have a pandas-like interface, since they use CSV by default and I think it's a bad idea for this to be default functionality in SSSOM-py (it'll cause everyone headaches)
This PR does the following:
parse_tsv()
andparse_csv()
as aliases forparse_sssom_table
that don't try and guess the right separator, which makes it easier to understand exactly what the function is forsssom.parse_tsv()
instead of needing to know it's hiding insssom.writers.parse_tsv
sssom.parse
function which just wrapspd.read_csv()
, and is only used by a testDemo