Explain how datapusher works and add API documentation #18
Comments
@rgrp The datapusher uses ckanserviceprovider. It runs independently of CKAN and talks to it through the API. As for documentation of the API, ckanserviceprovider should give you an idea; further pull requests welcome. I've updated the issue slightly to read as a bug.
These are really good questions, and really timely, as we are working on the DataPusher docs before the release. As @nigelbabu mentions, the DataPusher is a standalone application (although generally installed on the same server) and all communication with CKAN core and the DataStore is done via HTTP.
Here's a simple schema of the whole process in case it helps: [diagram]
We'll try and improve the docs with these details.
Also, I now understand this is an instance of CKAN Service Provider and follows its docs. The actual job type is
For us non-core developers, it would be great to have some docs on the requests sent between datapusher and the CKAN API. For deployments behind firewalls and proxies, it matters that datapusher sends HTTP requests to the ckan.site_url, and those requests must get through the firewall and proxy. It's quite tricky to figure out why a perfectly fine datapusher gets a mysterious "could not post to result_url" from a perfectly fine CKAN API. Of course this is not a problem of datapusher per se, but it's in the nature of CKAN/datapusher that they will get installed for bigger audiences, often on cloud services with weird and wonderful proxy and firewall settings. I'm happy to contribute a section on using curl to debug failing HTTP requests between datapusher and the CKAN API if that's any good!
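As a rough illustration of the kind of check described above, here is a minimal Python sketch that tests whether the CKAN action API answers at a given site URL when called from the machine running datapusher. It is not part of datapusher or CKAN: the helper names `action_url` and `check_ckan_reachable` are made up for this example, and it assumes the CKAN instance exposes the standard `status_show` action under `/api/3/action/`. The `fetch` parameter is only there so the logic can be exercised without a network.

```python
import json
import urllib.error
import urllib.request
from urllib.parse import urljoin


def action_url(site_url, action="status_show"):
    """Build a CKAN action API URL from the configured site URL (hypothetical helper)."""
    return urljoin(site_url.rstrip("/") + "/", "api/3/action/" + action)


def check_ckan_reachable(site_url, fetch=None, timeout=10):
    """Return (ok, detail): did the CKAN API answer with a JSON success response?"""
    if fetch is None:
        # Default fetcher: a plain HTTP GET using the standard library.
        def fetch(url):
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                return resp.status, resp.read().decode("utf-8")

    url = action_url(site_url)
    try:
        status, body = fetch(url)
    except (urllib.error.URLError, OSError) as exc:
        # Connection refused, DNS failure, timeout, or an HTTP error status.
        return False, "could not reach %s: %s" % (url, exc)

    if status != 200:
        return False, "%s answered with HTTP %s" % (url, status)

    try:
        payload = json.loads(body)
    except ValueError:
        # A proxy or login page often returns HTML instead of JSON.
        return False, "%s returned a non-JSON body" % url

    if isinstance(payload, dict) and payload.get("success"):
        return True, "CKAN API answered at %s" % url
    return False, "%s returned JSON without success=true" % url
```

Running `check_ckan_reachable("http://ckan.example.org")` on the datapusher host (the URL is a placeholder for your own ckan.site_url) gives a quick indication of whether a firewall or proxy sits in the way before digging into the datapusher logs.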
+1 on documentation for this HTTP traffic: we have been stuck for several weeks trying to figure out why datapusher and harvesting aren't working on our deployments. It's cost us A LOT of money.
Update for those stuck between their firewall and a hard place: multi-tenant setup from source (should also work for single-tenant installs) and a diagram illustrating HTTP traffic crossing the boundaries of the installation's localhost. Also worth reading is boxkite's setup.
Need to add the following to the documentation: