
Clarification about handling of changed MH records in import API #25


Description

@emiliom

@BobTorgerson @brucecrevensten I'm trying to clarify a behavior in the import API. The situation and examples have arisen in the context of MountainHub data, and I'll use that data to illustrate, but the question applies to all 3 providers.

We need to know what happens when the import API encounters a recent record that was previously ingested but has since changed at the source; specifically, the snowpack depth has been modified. I know the import API only queries for records with timestamps < 1 week from run time, so the edited record does meet that criterion.

The SQL INSERT statement has an ON CONFLICT clause that rejects the insertion if the new record conflicts with one that already exists:

INSERT INTO observations(location, id, author, depth, timestamp, source, elevation)
VALUES ${observations}
ON CONFLICT DO NOTHING

So, my interpretation is that since depth will be different, the record will be treated as new and inserted. Can you confirm?
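
For my own understanding, here is a minimal sketch of how that conflict behavior works in PostgreSQL (the schema and values below are illustrative, not the real observations table). As far as I can tell, ON CONFLICT DO NOTHING only skips a row that violates a unique constraint, presumably the one on id, rather than comparing the whole row:

-- Illustrative schema; the real observations table may differ.
CREATE TABLE observations (
  id        text PRIMARY KEY,  -- assumed unique constraint that drives the conflict
  location  text,
  author    text,
  depth     numeric,
  timestamp timestamptz,
  source    text,
  elevation numeric
);

-- First ingest: the row is inserted.
INSERT INTO observations(location, id, author, depth, timestamp, source, elevation)
VALUES ('site-a', 'abc12345', 'someone', 100, '2019-01-01T00:00:00Z', 'MountainHub', 500)
ON CONFLICT DO NOTHING;

-- Re-ingest with an edited depth but the same id: the row is skipped, because
-- the conflict is decided by the unique id, not by a full-row comparison.
INSERT INTO observations(location, id, author, depth, timestamp, source, elevation)
VALUES ('site-a', 'abc12345', 'someone', 120, '2019-01-01T00:00:00Z', 'MountainHub', 500)
ON CONFLICT DO NOTHING;

-- If the edit changes the data that id is hashed from, the second row gets a
-- different id and is inserted alongside the original.

So whether the edited record ends up as a new row seems to hinge on what goes into generating id.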

In addition, the id is generated as a random hash via the generateId function:

const crypto = require('crypto');

// Used to generate random ids
const generateId = (data) =>
  crypto.createHash('sha1').update(data).digest('base64').slice(0, 8);

This function is called in each provider script; for MH, it's here. I don't know enough JavaScript to tell whether crypto.createHash would produce a new "random" id even when data is the same as in a previous record. Can you clarify?
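
In case it helps, here is a quick Node.js check of that question (the input string is hypothetical; the real scripts build it from the record's fields). My understanding is that SHA-1 is deterministic, so the same data should always produce the same id, but please confirm:

const crypto = require('crypto');

// Same generateId as in the provider scripts.
const generateId = (data) =>
  crypto.createHash('sha1').update(data).digest('base64').slice(0, 8);

// Hypothetical MountainHub-style input.
const record = 'MountainHub|2019-01-01T00:00:00Z|someone';

console.log(generateId(record));              // some 8-character id
console.log(generateId(record));              // the same id again: the hash is not random
console.log(generateId(record + '|edited'));  // a different input produces a different id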

Thanks!
cc @gjwolken
