adding command line tool for dumping all metadata #99

Open
wants to merge 4 commits into
base: master

@ekzhu

ekzhu commented Nov 15, 2016

This addresses issue #98.

@wardi
Contributor

wardi commented Nov 15, 2016

dump datasets is stable because its output is always ordered by package id. Does this command have a stable ordering? If not, could you look at adding one? That should also help when concurrent updates, deletes, and creations are happening.

@TkTech
Member

TkTech commented Nov 15, 2016

The default sort order is 'relevance asc, metadata_modified desc', so a sort needs to be passed into the package_search call.

@wardi
Contributor

wardi commented Nov 15, 2016

package_id asc would be nice; then we can easily compare the output with that of dump datasets.

@ekzhu
Author

ekzhu commented Nov 15, 2016

It looks like the metadata id field is called id instead of package_id.
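
To make the ordering point above concrete, here is a minimal sketch of a paged dump that passes an explicit sort into every package_search call. It is not the code in this pull request: `iter_datasets`, `search`, and `page_size` are hypothetical names, `search` stands in for any callable with package_search-style `sort`, `rows`, and `start` parameters, and the sort string `id asc` is an assumption based on the field name mentioned above.

```python
from typing import Any, Callable, Dict, Iterator


def iter_datasets(
    search: Callable[..., Dict[str, Any]],
    page_size: int = 100,
    sort: str = "id asc",  # assumed sort string; the thread only says the field is "id"
) -> Iterator[Dict[str, Any]]:
    """Yield datasets page by page, always sending the same explicit sort.

    `search` is a hypothetical stand-in for a package_search-style callable
    that returns a dict with a "results" list.
    """
    start = 0
    while True:
        page = search(sort=sort, rows=page_size, start=start)
        results = page["results"]
        if not results:
            break  # an empty page means we have walked past the last dataset
        yield from results
        start += len(results)
```

With ckanapi, `search` could plausibly be `RemoteCKAN(url).action.package_search`. A fixed sort keeps page boundaries stable between requests, although datasets created or deleted while the dump is running can presumably still shift later pages.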

@davidread
Contributor

This is excellent work.

Maybe calling it 'dump_datasets2' is a bit more specific than 'dump_metadata'?

@wardi
Contributor

wardi commented Dec 7, 2016

Yes, sorry I've been slow in merging this. I like @davidread's command-name suggestion; dump_datasets2 is better. We should also document why you might want to use this command (accessing sites like data.gov, because it's X% faster, etc.).

@wardi
Contributor

wardi commented Jan 20, 2017

Or even better: let's call this command search datasets and allow the parameters accepted by the package_search call to be provided (like you can with ckanapi action package_search ...). That makes this command much more useful and doesn't require strange naming or explanation (like "because data.gov...").
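
As a rough illustration of that suggestion, the sketch below turns KEY=VALUE command-line arguments into keyword parameters for a package_search-style call and falls back to a stable sort when none is given. `parse_search_args` is a hypothetical helper, not part of ckanapi, and both the KEY=VALUE convention and the `id asc` default are assumptions made for the example.

```python
from typing import Dict, List


def parse_search_args(args: List[str]) -> Dict[str, str]:
    """Convert ["q=water", "rows=10"] into {"q": "water", "rows": "10", ...}.

    Hypothetical helper: values are kept as strings and passed through as-is.
    """
    params: Dict[str, str] = {}
    for arg in args:
        key, sep, value = arg.partition("=")
        if not sep or not key:
            raise ValueError(f"expected KEY=VALUE, got {arg!r}")
        params[key] = value
    # Assumed default so repeated runs produce comparable output.
    params.setdefault("sort", "id asc")
    return params
```

The resulting dict could then be unpacked into the search call, for example `search(**parse_search_args(sys.argv[1:]))`, so any parameter the action accepts can be supplied without the command needing to know about it.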

@davidread
Contributor

Yes that would be even better, although perhaps we've messed the author around enough!

@ekzhu
Author

ekzhu commented Jan 24, 2017

I guess dump_datasets2 is better. I am not trying to add too much functionality here. If you call it search datasets, it still overlaps with package_search and is more confusing. Maybe it's better to reserve search ... for non-filtering search such as keyword search.

@wardi
Contributor

wardi commented Jan 24, 2017

@ekzhu no worries, I'll finish this off if you're not interested in making my suggested change.

@frafra
Contributor

frafra commented Feb 1, 2022

dump is different from package_search, as dump can download resources too.

5 participants