Collections Docs #599
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,200 @@ | ||
--- | ||
title: Collections Adaptor | ||
--- | ||
|
||
## Collections Overview | ||
|
||
The Collections API is a key/value storage solution. It is designed for high | ||
Review comment: We need to make clear that this is a data store in OpenFn... not something external (like all other adaptors). |
||
performance over a large volume of data. | ||
|
||
Use-cases include: | ||
|
||
- Storing mapping objects for use in workflows | ||
- Buffering and aggregating high volumes of incoming data | ||
- Caching and sharing state between workflows | ||
|
||
A Collection is bound to a project. Collections can only be accessed with a | ||
|
||
token associated with that project. When running on the app, a workflow is | ||
|
||
automatically granted access to all collections on the same project. When | ||
running in the CLI, a Personal Access Token can be used (generated from the app | ||
at /profile/tokens). | ||
|
||
## The Collections Adaptor | ||
|
||
The Collections adaptor is special: uniquely, it is designed to be run | ||
_alongside_ other adaptors, and is injected for you by the platform. | ||
|
||
This makes the Collections API available to every step in a workflow, regardless | ||
of which adaptor it is using. | ||
|
||
## Usage Guide | ||
|
||
### Set some data in a collection | ||
|
||
The Collections API allows you to set a JSON object (or any primitive JS value) | ||
under a given key: | ||
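As a rough illustration of setting a single value, the sketch below uses a hypothetical in-memory stand-in for the injected `collections` object so that it runs outside the platform. The `set(name, key, value)` signature and the sample mapping values are assumptions for illustration, not confirmed by this page.

```js
// Hypothetical in-memory stand-in for the injected `collections` object.
// This is NOT the real adaptor; it only mimics an assumed call shape.
const store = new Map();
const collections = {
  // Assumed signature: set(collectionName, key, value)
  set: (name, key, value) => {
    store.set(`${name}/${key}`, value);
  },
};

// Store one JSON object under one key (sample values are made up)
collections.set('my-collection', 'commcare-fhir-value-mappings', {
  current_smoker: 'Smoker',
  never_smoked: 'Non-smoker',
});

console.log(store.get('my-collection/commcare-fhir-value-mappings'));
```

In a real job the call would be run by the platform as an operation against the project's collection rather than a local `Map`.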
|
||
You can also pass an array of items for a batch-set. | ||
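A batch-set might look something like the sketch below. It assumes that, when an array is passed, the key argument can be a function that derives a key from each item; that shape, the stand-in `collections` object, and the sample records are all assumptions for illustration.

```js
// Hypothetical in-memory stand-in for the injected `collections` object.
// This is NOT the real adaptor; it only mimics an assumed call shape.
const store = new Map();
const collections = {
  // Assumed signature: set(collectionName, keyOrKeyFunction, valueOrValues)
  set: (name, key, values) => {
    const items = Array.isArray(values) ? values : [values];
    for (const item of items) {
      // Derive a key per item when a function is given, else use the fixed key
      const itemKey = typeof key === 'function' ? key(item) : key;
      store.set(`${name}/${itemKey}`, item);
    }
  },
};

// Made-up sample records
const patients = [
  { id: 'patient-001', diagnosis: 'malaria' },
  { id: 'patient-002', diagnosis: 'typhoid' },
];

// One call writes every item, keyed by its id
collections.set('my-collection', (item) => item.id, patients);

console.log([...store.keys()]);
```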
|
||
### Getting data from a collection | ||
|
||
To retrieve multiple items from a Collection, we recommend using the `each()` | ||
function. | ||
|
||
`each()` will stream each value individually, greatly reducing the memory | ||
|
||
overhead of downloading a large amount of data to the client. | ||
|
||
```js | ||
each('my-collection', '2024*', (state, value, key) => { | ||
console.log(value); | ||
// No need to return state here | ||
}); | ||
``` | ||
|
||
The second argument to `each` is a query string or object. Pass a key with a | ||
pattern, or an object including different query strings. Check the API reference | ||
for a full listing. | ||
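To build intuition for what a key pattern such as `2024*` selects, here is a small local sketch. It assumes `*` matches any run of characters and that everything else is matched literally; the server's actual matching rules are not specified here, and `matchesPattern` is a hypothetical helper, not part of the adaptor.

```js
// Hypothetical helper: approximates wildcard key matching locally.
// Assumes `*` means "any run of characters"; all other characters are literal.
function matchesPattern(pattern, key) {
  const source = pattern
    .split('*')
    // Escape regex metacharacters in the literal parts of the pattern
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, '\\$&'))
    .join('.*');
  return new RegExp(`^${source}$`).test(key);
}

// Made-up keys for illustration
const keys = ['20240601-a', '20241231-b', '20230101-c'];
const selected = keys.filter((key) => matchesPattern('2024*', key));

console.log(selected);
```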
|
||
```js | ||
each( | ||
'my-collection', | ||
{ key: '2024*', createdAfter: '20240601' }, | ||
(state, value, key) => { | ||
console.log(value); | ||
} | ||
); | ||
``` | ||
|
||
You can limit the amount of data you want to download with the `limit` key. If | ||
more values remain on the server, a `cursor` key will be written to | ||
`state.data`. | ||
|
||
```js | ||
each('my-collection', { key: '2024*', limit: 1000 }, (state, value, key) => { | ||
console.log(value); | ||
}).then(state => { | ||
state.nextCursor = state.data.cursor; | ||
// state.data.cursor now contains the cursor position | ||
return state; | ||
}); | ||
``` | ||
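The general idea behind a limit plus a cursor can be sketched with plain JavaScript. This is a generic cursor-pagination loop over made-up in-memory data, not adaptor code: `fetchPage` is a hypothetical stand-in for one limited request, and the numeric cursor is an assumption (a real cursor may be an opaque value).

```js
// Made-up dataset standing in for items held on the server
const records = Array.from({ length: 25 }, (_, i) => ({
  key: `2024-${i}`,
  value: i,
}));

// Hypothetical stand-in for a single request with a `limit`.
// Returns a `cursor` only while more items remain.
function fetchPage({ cursor = 0, limit }) {
  const items = records.slice(cursor, cursor + limit);
  const next = cursor + limit;
  return { items, cursor: next < records.length ? next : undefined };
}

const seen = [];
let cursor;
let pages = 0;

do {
  // Resume from the last cursor position (undefined on the first request)
  const page = fetchPage({ cursor, limit: 10 });
  seen.push(...page.items);
  cursor = page.cursor;
  pages += 1;
} while (cursor !== undefined);

console.log(`fetched ${seen.length} items in ${pages} pages`);
```

In a workflow, the saved cursor would more likely be carried on state between steps or runs than consumed in a single loop.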
|
||
You can fetch a single item with `get()`. The result is written to | ||
`state.data`: | ||
|
||
```js | ||
collections.get('my-collection', 'commcare-fhir-value-mappings').then(state => { | ||
state.mappings = state.data; | ||
return state; | ||
}); | ||
each($.inputs, state => { | ||
const mappedString = state.mappings[state.data.diagnosis]; | ||
state.resources ??= {}; | ||
state.resources[state.data.id] = mappedString; | ||
return state; | ||
}); | ||
``` | ||
|
||
You can also fetch multiple items with `get()`, which supports the same query | ||
options as `each()`. | ||
|
||
Bear in mind that all the items will be loaded into memory at once. For large | ||
Review comment: Hmm... will ping you next week to help me understand tradeoffs of using
Reply: Don't ping me - let me fix it here! |
||
datasets and structures, this may cause problems. | ||
|
||
When bulk-loading with `get()`, `state.data` will be an array of items, and | ||
`state.data.cursor` will contain the cursor position from the server. | ||
|
||
```js | ||
collections.get('my-collection', '2024*').then(state => { | ||
state.allRecords = state.data; | ||
return state; | ||
}); | ||
``` | ||
|
||
### Remove data from a collection | ||
|
||
You can remove an individual item by key: | ||
|
||
```js | ||
collections.remove('my-collection', 'commcare-fhir-value-mappings'); | ||
``` | ||
|
||
You can also use the same query options as `get()` and `each()` to bulk delete: | ||
|
||
```js | ||
collections.remove('my-collection', { createdBefore: '20240601' }); | ||
``` | ||
|
||
## Collection Administration | ||
|
||
Collections must be created in the platform Admin page before they can be used. | ||
|
||
Collections can be removed from the Admin page. | ||
|
||
## CLI usage | ||
|
||
Collections are designed for close integration with the platform app, but can be | ||
used from the CLI too. | ||
|
||
You will need to: | ||
|
||
- Set the job to use two adaptors | ||
- Pass a Personal Access Token | ||
- Set the Collections endpoint | ||
|
||
You can get a Personal Access Token from any v2 deployment. | ||
|
||
Remember that a Collection must be created from the Admin page before it can be | ||
used! | ||
|
||
### For a single job | ||
|
||
You can pass multiple adaptors from the CLI: | ||
|
||
```bash | ||
openfn job.js -a collections -a http -s state.json | ||
``` | ||
|
||
You'll need to set the configuration in `state.json`: | ||
|
||
```json | ||
{ | ||
"configuration": { | ||
"collections_endpoint": "http://localhost:4000/collections", | ||
"collections_token": "...paste the token from the app..." | ||
} | ||
} | ||
``` | ||
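If you would rather not paste the token into a file by hand, one option is to generate the JSON from an environment variable. The sketch below is only a suggestion: the `OPENFN_PAT` variable name and the `buildState` helper are hypothetical, and the endpoint default is simply the example value shown above.

```js
// Hypothetical helper: builds the state object shown above.
function buildState(token, endpoint = 'http://localhost:4000/collections') {
  return {
    configuration: {
      collections_endpoint: endpoint,
      collections_token: token,
    },
  };
}

// OPENFN_PAT is an assumed variable name, not something the CLI reads itself
const token = process.env.OPENFN_PAT ?? '...paste the token from the app...';

// Print the JSON so it can be redirected into state.json
console.log(JSON.stringify(buildState(token), null, 2));
```

Saved as a script, its output could be redirected into `state.json` before running the job.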
|
||
### For a workflow | ||
|
||
If you're using `workflow.json`, set the token and endpoint on | ||
`workflow.credentials`: | ||
|
||
```json | ||
{ | ||
"workflow": { | ||
"steps": [ ... ], | ||
"credentials": { | ||
"collections_endpoint": "http://localhost:4000/collections", | ||
"collections_token": "...paste the token from the app..." | ||
} | ||
} | ||
} | ||
``` | ||
|
||
And make sure that any steps which use collections have multiple adaptors set: | ||
|
||
```json | ||
{ | ||
"workflow": { | ||
"steps": [ | ||
{ | ||
"expression": "...", | ||
"adaptors": ["@openfn/language-http", "@openfn/language-collections"] | ||
} | ||
] | ||
} | ||
} | ||
``` |
Review comment: Wondering if we can somehow tag `collections` and `common` as special OpenFn adaptors?
Reply: Maybe. Collections is definitely special. Common is only sort of special 🤔
So I guess to answer this I'd have to ask what we mean by "special" and what exactly we want to flag