EOL Darwin Core Archive #731
I ran the manual test: I unzipped the downloaded file, which produced 3 well-formed files.
@JoeCohen The first column of multimedia.csv should match the first column of taxa.csv. So if you look at a line in multimedia.csv, you should be able to figure out which taxon it is associated with. Similarly, if you search multimedia.csv for an id from taxa.csv, you should find all the images for that taxon (as URLs, of course).
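The relationship described above can be checked mechanically. The following is a minimal sketch, not code from this PR: the helper name `orphaned_media_rows`, the header names, and the sample rows are all made up for illustration. It only assumes that the first column of each file holds the taxon id.

```ruby
require "csv"
require "set"

# Hypothetical helper: returns the multimedia rows whose first column does
# not match any first-column id in the taxa data. Both arguments are CSV text.
def orphaned_media_rows(taxa_csv, multimedia_csv)
  taxon_ids = CSV.parse(taxa_csv, headers: true).map { |row| row[0] }.to_set
  CSV.parse(multimedia_csv, headers: true).reject do |row|
    taxon_ids.include?(row[0])
  end
end

# Made-up sample data; the header names are assumptions, not the real export.
SAMPLE_TAXA = <<~CSV
  taxonID,scientificName
  1,Amanita muscaria
  2,Boletus edulis
CSV

SAMPLE_MEDIA = <<~CSV
  taxonID,accessURI
  1,https://example.org/images/10.jpg
  1,https://example.org/images/11.jpg
  3,https://example.org/images/12.jpg
CSV

orphans = orphaned_media_rows(SAMPLE_TAXA, SAMPLE_MEDIA)
puts "Media rows without a matching taxon: #{orphans.length}"
```

An empty result would mean every image line can be traced back to a taxon, which is the property the comment asks for.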
Thanks @mo-nathan. I wasn't looking far enough down the first column of the multimedia file; I saw a bunch of "1"s and stopped looking.
Here's the feedback from Jen Hammock @ EOL:
"Looks to me like you're almost there. You do need a couple of things, mostly in your meta, a couple in your media file.
The taxonID column in the taxa file should go to http://rs.tdwg.org/dwc/terms/taxonID
And the first four columns in your media file should go to http://purl.org/dc/terms/identifier
And the values in the http://purl.org/dc/terms/identifier column should be unique in the file. You can re-use the values from the http://rs.tdwg.org/ac/terms/accessURI column for this if you like.
Finally, your http://rs.tdwg.org/dwc/terms/taxonID column should also appear in the media file, to provide the taxon mappings.
Does that make sense?"
I sent her another zip file based on the new code changes and will update this PR when I hear back.
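The uniqueness requirement on the identifier column is easy to verify before sending a new zip. This is a rough sketch under assumptions: `duplicate_values` is a hypothetical helper, and the column order and sample rows are invented rather than taken from the real media file.

```ruby
require "csv"

# Hypothetical helper: values that occur more than once in the given column.
def duplicate_values(csv_text, column_index)
  values = CSV.parse(csv_text, headers: true).map { |row| row[column_index] }
  values.tally.select { |_value, count| count > 1 }.keys
end

# Made-up media rows that reuse the accessURI value as the identifier.
MEDIA_SAMPLE = <<~CSV
  identifier,taxonID,accessURI
  https://example.org/images/10.jpg,1,https://example.org/images/10.jpg
  https://example.org/images/11.jpg,1,https://example.org/images/11.jpg
  https://example.org/images/10.jpg,2,https://example.org/images/10.jpg
CSV

duplicates = duplicate_values(MEDIA_SAMPLE, 0)
puts "Duplicate identifiers: #{duplicates.inspect}"
```

In this sample the same image is attached to two taxa, so reusing the accessURI as the identifier would break uniqueness for that row.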
…etter optimized and refined
… some of waster memory the old way used to generate
query.project(attribute(:images, :id),
attribute(:images, :when),
attribute(:images, :copyright_holder),

Empty line detected around arguments.
attribute(:observations, :long),
attribute(:observations, :alt),
attribute(:observations, :notes),

Empty line detected around arguments.
attribute(:names, :text_name),
attribute(:names, :author),
attribute(:names, :rank),

Empty line detected around arguments.
attribute(:locations, :west),
attribute(:locations, :high),
attribute(:locations, :low),

Empty line detected around arguments.
attribute(:users, :name),
attribute(:users, :login),

Empty line detected around arguments.
family = extract_name(level) if level.start_with?("Family")
return [family, kingdom] if family && kingdom
end
return ["", kingdom] if kingdom
Add empty line after guard clause.
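For context, here is a self-contained sketch of how the fragment above might fit together, with the blank line after the guard clause that the linter asks for. It assumes the classification text has one "Rank: _Name_" entry per line; the method name `family_and_kingdom` and the loop around the fragment are guesses, and only `extract_name` is taken from the diff.

```ruby
# Taken from the diff: pulls "Fungi" out of "Kingdom: _Fungi_".
def extract_name(level)
  level.split(": _")[1].chomp("_")
end

# Hypothetical wrapper around the reviewed fragment.
def family_and_kingdom(classification)
  family = kingdom = nil
  classification.each_line(chomp: true) do |level|
    level = level.strip
    kingdom = extract_name(level) if level.start_with?("Kingdom")
    family = extract_name(level) if level.start_with?("Family")
    return [family, kingdom] if family && kingdom
  end
  return ["", kingdom] if kingdom

  nil
end

# Made-up classification text in the assumed format.
sample = "Kingdom: _Fungi_\nPhylum: _Basidiomycota_\nFamily: _Amanitaceae_"
puts family_and_kingdom(sample).inspect
```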
private

def parse_genus(name)
Inconsistent indentation detected.
name.split(' ')[0]
end

def higher_taxa(row)
Inconsistent indentation detected.
result
end

def parse_classification(classification)
Inconsistent indentation detected.
nil
end

def extract_name(level)
Inconsistent indentation detected.
level.split(": _")[1].chomp("_")
end

def genus_classification(genus)
Inconsistent indentation detected.
private

def parse_genus(name)
Use 2 (not 4) spaces for indentation.
name.split(' ')[0]
end

def higher_taxa(row)
Use 2 (not 4) spaces for indentation.
result
end

def parse_classification(classification)
Use 2 (not 4) spaces for indentation.
nil
end

def extract_name(level)
Use 2 (not 4) spaces for indentation.
level.split(": _")[1].chomp("_")
end

def genus_classification(genus)
Use 2 (not 4) spaces for indentation.
private

def parse_genus(name)
name.split(' ')[0]
Prefer double-quoted strings unless you need single quotes to avoid extra backslashes for escaping.
private

def parse_genus(name)
name.split(' ')[0]
Argument ' ' is redundant because it is implied by default.
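One way to satisfy both string-related comments on this line is to drop the argument entirely. A small sketch, assuming `name` is a plain text name such as "Amanita muscaria" (the sample value is made up):

```ruby
# Sketch of the reviewed helper with the redundant argument removed.
def parse_genus(name)
  # With no argument, split breaks on runs of whitespace, same as split(" ").
  name.split[0]
end

puts parse_genus("Amanita muscaria var. formosa")
```

Because no string literal is left, the quoting comment no longer applies either.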
# generate CSV & meta.xml and bundle into a Zip
def render
filename = "#{::Rails.root}/public/dwca/eol_meta.xml"
content << ["meta.xml", File.open(filename).read]
Use File.read.
# generate CSV & meta.xml and bundle into a Zip
def render
filename = "#{::Rails.root}/public/dwca/gbif_meta.xml"
content << ["meta.xml", File.open(filename).read]
Use File.read.
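The reason behind this suggestion is that `File.open(filename).read` leaves the file handle open until it is garbage collected, while `File.read` opens, reads, and closes in one call. A minimal sketch of the change; `meta_entry` is a hypothetical helper and the temporary file merely stands in for the real meta.xml:

```ruby
require "tempfile"

# Hypothetical helper mirroring the reviewed line: pairs the archive entry
# name with the file's contents, using File.read so the handle is closed.
def meta_entry(path)
  ["meta.xml", File.read(path)]
end

# A temporary file stands in for the real meta.xml in this sketch.
entry = Tempfile.create("meta") do |file|
  file.write("<archive/>")
  file.flush
  meta_entry(file.path)
end

puts entry.inspect
```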
locality = locality.blank? ? county : "#{county}, #{locality}"
county = nil
@country, @state, @county, @locality = val.split(", ", 4)
if @county && !@county.sub!(/ (Co\.|Parish)$/, "")
Use a guard clause (return unless @county && !@county.sub!(/ (Co\.|Parish)$/, "")) instead of wrapping the code inside a conditional expression.
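A sketch of what the guard-clause form could look like. Everything outside the two lines from the diff is assumed: the class `LocationParts`, the method name, and the folding of a non-county third part into the locality (inferred from the removed lines) are illustrative only, and the condition is assumed to call `sub!` on `@county`.

```ruby
# Hypothetical stand-in class; not the class touched by this PR.
class LocationParts
  attr_reader :country, :state, :county, :locality

  def initialize(val)
    @country, @state, @county, @locality = val.split(", ", 4)
    fold_non_county_into_locality
  end

  private

  def fold_non_county_into_locality
    # sub! strips a " Co." or " Parish" suffix and returns nil when there is
    # none, so the guard returns early for real counties (already cleaned).
    return unless @county && !@county.sub!(/ (Co\.|Parish)$/, "")

    # Assumed behaviour: a third part without the suffix is part of the locality.
    @locality = @locality.to_s.empty? ? @county : "#{@county}, #{@locality}"
    @county = nil
  end
end

parts = LocationParts.new("USA, California, Marin Co., Point Reyes")
puts [parts.county, parts.locality].inspect
```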
Code Climate has analyzed commit 183e8e7 and detected 16 issues on this pull request.
This PR also includes changes to support the GBIF Darwin Core Archive. However, there are some issues with the resulting ZIP file.
The file can be indexed by GBIF.
These should be reviewed and cleaned up where reasonable.
Implements the taxon based Darwin Core Archive (DwCA) that EOL expects. Involves a bit of renaming and refactoring to differentiate between the two forms of DwCA.
To test: