This repository has been archived by the owner on May 5, 2022. It is now read-only.

scripting manual source downloads #689

Open
andrewharvey opened this issue Mar 6, 2018 · 0 comments

Comments

@andrewharvey
Contributor

Quite a few sources require manually downloading fresh data. For these, OA provides https://results.openaddresses.io/upload-cache, which caches the upstream files on S3.

Doing this by hand is very time-consuming and results in OA always lagging behind the upstream source.

What do people think about trying to automate this? I'm thinking of a Node script using https://github.com/GoogleChrome/puppeteer for each source where this is needed.

I'm happy to work on the puppeteer scripts, but we'd need a machine to actually run them. What do people think about this?

If not, then what do people think about changing https://results.openaddresses.io/upload-cache so that it produces a curl command line you can run, instead of uploading files through the browser?

My workaround for slow upload speeds is to do the upload from a remote server, which means running this script in the browser Console while logged into https://results.openaddresses.io/upload-cache and then running the curl command it returns on that server.

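// Build a curl command equivalent to submitting the upload form.
// `file` is the path of the file on the machine where curl will run.
function curlCommand(file) {
    var form = new FormData(document.querySelector('form[action="https://s3.amazonaws.com/data.openaddresses.io"]'));
    var curl = "curl -v -X POST";
    for (var pair of form.entries()) {
        // A File value stringifies to "[object File]"; it is swapped for @file below.
        curl += " -F '" + pair[0] + '=' + pair[1] + "'";
    }
    curl += ' https://s3.amazonaws.com/data.openaddresses.io';
    curl = curl.replace('[object File]', '@' + file);
    // Fill in the literal ${filename} placeholder left in the form values.
    curl = curl.replace('${filename}', file);
    return curl;
}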
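
The same idea can be sketched as a pure function that runs in Node without the browser form, and that shell-quotes each value so a stray single quote in a form field does not break the command. This is only a sketch: `shellQuote`, `buildCurlCommand`, and the field names and key prefix in the usage note are hypothetical and not taken from the upload-cache page.

```javascript
// Quote a string for a POSIX shell: wrap it in single quotes and
// escape any embedded single quote as '\''.
function shellQuote(value) {
    return "'" + String(value).replace(/'/g, "'\\''") + "'";
}

// Build a curl command from an array of [name, value] form pairs.
// `fileField` is the name of the field that carries the file upload;
// `file` is the path of the file on the machine where curl will run.
function buildCurlCommand(url, fields, fileField, file) {
    var parts = ['curl', '-v', '-X', 'POST'];
    for (var pair of fields) {
        var name = pair[0];
        var value = name === fileField
            ? '@' + file
            // Replace every literal ${filename} placeholder in the value.
            : String(pair[1]).split('${filename}').join(file);
        parts.push('-F', shellQuote(name + '=' + value));
    }
    parts.push(shellQuote(url));
    return parts.join(' ');
}
```

For example, with made-up fields, `buildCurlCommand('https://s3.amazonaws.com/data.openaddresses.io', [['key', 'uploads/${filename}'], ['file', '']], 'file', 'data.zip')` returns `curl -v -X POST -F 'key=uploads/data.zip' -F 'file=@data.zip' 'https://s3.amazonaws.com/data.openaddresses.io'`. In the browser, the pairs could come from `Array.from(form.entries())` on the same form the script above reads.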