detect duplicate records for pd types #1469

Open
wants to merge 1 commit into
base: master

Conversation

RabiaSajjad
Member

No description provided.

@JVickery-TBS
Contributor

@RabiaSajjad do you think this might be needed in the future for checking duplicate PD records when merging organizations? Or is this just a one-time thing?

If we could use it in the future, it might be easier to make this a CKAN command in our Canada plugin. That way we could get all the information from the PD chromos (YAML files) and would not need to download the large CSV files, as they would already be on the server in the backups directory.

Otherwise this one-off script would be fine. Instead of using pandas for this, you could try the built-in csv library (or unicodecsv if on Python 2), so that we would not need the extra dependency just for this one script.
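For reference, a rough sketch of what the csv-only approach could look like. This is not the script in this PR, just an illustration under assumptions: the helper name `find_duplicates`, the sample data, and the key columns `owner_org` and `ref_number` are all hypothetical, and the real key columns would depend on each PD type.

```python
import csv
import io
from collections import defaultdict


def find_duplicates(csv_file, key_columns):
    """Return {key: [row numbers]} for every key that occurs more than once.

    key_columns is the list of column names that should uniquely
    identify a record (an assumption; it differs per PD type).
    """
    reader = csv.DictReader(csv_file)
    seen = defaultdict(list)
    # start=2 because the header counts as row 1
    for row_number, row in enumerate(reader, start=2):
        key = tuple(row[col] for col in key_columns)
        seen[key].append(row_number)
    # keep only the keys that appear on more than one row
    return {key: rows for key, rows in seen.items() if len(rows) > 1}


# Hypothetical sample data standing in for a real PD CSV export.
sample = io.StringIO(
    "owner_org,ref_number,amount\n"
    "tbs-sct,C-2024-001,100\n"
    "tbs-sct,C-2024-002,250\n"
    "tbs-sct,C-2024-001,100\n"
)
duplicates = find_duplicates(sample, ["owner_org", "ref_number"])
print(duplicates)
```

For a real file, the `io.StringIO` object would be replaced by something like `open(path, newline="", encoding="utf-8")`. The reader only keeps the key tuples and row numbers in memory, which should help with the large CSV files.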

@wardi
Member

wardi commented Apr 26, 2024

pandas is good stuff, though. No reason to use worse tools to avoid adding something to dev-requirements.txt


3 participants