Local UI (#14)
* WIP - prototyping local ui

* This is a test file, work in progress, committing so we don't lose it.  Hasn't been vetted, looked at etc.

* Reorganize CLI and server to be able to run against a downloaded data directory and UI directory

* Include CORS middleware

* Write out a STAC catalog with references to every contained search directory

* wip

* wip

* Use stac.json as stac names, add ui view path

* Open stac browser

* fix path

* wip ui futzing around

* Added login option

* ui tweaks

* Adding built ui to checkout

* refactored client as runnable module and add as isample entry point

* refactored client as runnable module and add as isample entry point

* Moving ui files to client package

* Add collection title and description parameters to the CLI

* Also test that the stac catalog is valid according to the stac validator

* Add a test for writing manifest file

* Test the create method

* Updating readme

* Change name for pipx install

* Add built ui to sources

* More ExportClient tests

* Move heavy libs to load on demand

* Missed this piece earlier…use the rsession we are initialized with

* Treat the query as encoded JSON

* make path to jsonl absolute; make pipx install less generic

* flake8 and mypy fixes

* flake8 again

---------

Co-authored-by: datadave <[email protected]>
dannymandel and datadavev authored Jun 4, 2024
1 parent 65a6feb commit 7202e29
Showing 68 changed files with 6,559 additions and 118 deletions.
95 changes: 91 additions & 4 deletions README.md
@@ -1,10 +1,97 @@
# Export Client
A CLI for the [iSamples Export Service](https://github.com/isamplesorg/isamples_inabox/blob/develop/docs/export_service.md).
Provides the command line client `isample` for retrieving content from the [iSamples Export Service](https://github.com/isamplesorg/isamples_inabox/blob/develop/docs/export_service.md).

## Authentication
All operations require a JWT. The process to obtain one is described in [iSamples in a Box Documentation](https://github.com/isamplesorg/isamples_inabox/blob/develop/docs/authentication_and_identifiers.md).
```
Usage: isample [OPTIONS] COMMAND [ARGS]...
Options:
--help Show this message and exit.
Commands:
export Export records from iSamples to a local copy.
login Open a browser to login to the iSamples site.
refresh Refresh an existing download by re-running the original query.
server Run a local web server to view exported data.
```

## Installation

The iSample client is currently under active development and the sources will be updated frequently.

The iSample client may be installed using `pipx`:

```
pipx install "git+https://github.com/isamplesorg/export_client.git"
```

or from a specific branch:

```
pipx install "git+https://github.com/isamplesorg/export_client.git@local_ui"
```

Alternatively, check out the source from GitHub and install it into a virtual environment using Poetry:

```
git clone https://github.com/isamplesorg/export_client.git
cd export_client
poetry install
poetry run isample
```


## login

```
Usage: isample login [OPTIONS]
Open a browser to login to the iSamples site.
Options:
-u, --url TEXT iSamples server URL
--help Show this message and exit.
```

All data retrieval operations require a JWT which may be retrieved using
the `isample login` command or through the process described in [iSamples in a Box Documentation](https://github.com/isamplesorg/isamples_inabox/blob/develop/docs/authentication_and_identifiers.md).

The `login` command opens a browser to the iSamples ORCID authentication page and, after authentication,
presents the raw JWT, which may be copied and used for export and refresh operations.

After selecting and copying the JWT to the clipboard, the JWT can be assigned to an environment variable
for convenience. For example (on OS X):

```
export TOKEN="$(pbpaste)"
```

The JWT is then available in the same shell as the environment variable `${TOKEN}`.

## export

```
Usage: isample export [OPTIONS]
Export records from iSamples to a local copy.
Options:
-t, --jwt TEXT The JWT for the authenticated user.
[required]
-u, --url TEXT The URL to the iSamples export service.
-q, --query TEXT The solr query to execute. [required]
-d, --destination TEXT The destination directory where the
downloaded content should be written.
[required]
-f, --format [jsonl|csv|geoparquet]
The format of the exported content.
--help Show this message and exit.
```

The `export` command initiates retrieval of a subset of content from the iSamples central
aggregation of physical specimen records. The subset of records is determined by a query which
is expressed in Lucene or Solr query syntax. The query may be manually crafted or retrieved
from the iSamples web UI by navigating to the subset of interest and clicking `Export`.
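
As a rough illustration of how the `export` options fit together, the following Python sketch assembles an `isample export` command line, reading the token from the `TOKEN` environment variable set in the login section. The helper name `build_export_command`, the placeholder token and the destination path are hypothetical; only the flags listed in the usage text (`-t`, `-q`, `-d`, `-f`, `-u`) are taken from the CLI.

```python
import os
import shlex


def build_export_command(jwt, query, destination, fmt=None, url=None):
    """Assemble an argv list for `isample export` (hypothetical helper)."""
    # -t, -q and -d are marked [required] in the usage text.
    argv = ["isample", "export", "-t", jwt, "-q", query, "-d", destination]
    if fmt is not None:
        # The usage text lists exactly these three format choices.
        if fmt not in ("jsonl", "csv", "geoparquet"):
            raise ValueError(f"unsupported format: {fmt}")
        argv += ["-f", fmt]
    if url is not None:
        # -u is not marked required, so it is only added when given.
        argv += ["-u", url]
    return argv


if __name__ == "__main__":
    # Assumes the JWT was stored with: export TOKEN="$(pbpaste)"
    token = os.environ.get("TOKEN", "<JWT>")
    argv = build_export_command(
        token, "searchText:feldspar", "./isamples_export", fmt="geoparquet"
    )
    # shlex.join quotes each argument so the line is safe to paste into a shell.
    print(shlex.join(argv))
```

Once `isample` is installed, the same list could be passed to `subprocess.run(argv, check=True)`. Keeping the arguments as a list avoids shell quoting problems with Solr query characters such as `[`, `]` and `*`.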

## Usage
```
Usage: export_client_cli.py [OPTIONS]
...
```
Binary file added example/test/isamples_export_geo.parquet
Binary file not shown.
11 changes: 11 additions & 0 deletions example/test/manifest.json
@@ -0,0 +1,11 @@
[
{
"query": "searchText:feldspar AND producedBy_resultTimeRange:[1800 TO 2024] AND -relation_target:*",
"uuid": "f40eecbe-7da0-416c-a6e4-3208eefd2bf1",
"format": "jsonl",
"start_time": "2024-05-20T12:44:41.259605Z",
"num_results": 3696,
"export_server_url": "https://central.isample.xyz/isamples_central/export/",
"is_geoparquet": true
}
]
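
Judging only from the example above, the manifest appears to be a JSON list with one object per export run. The following Python sketch reads such a list and summarizes each entry. The function name `summarize_manifest` is hypothetical, and the field names are taken from this single example rather than from a documented schema.

```python
import json
from datetime import datetime


def summarize_manifest(text):
    """Return one summary dict per manifest entry (hypothetical helper)."""
    entries = json.loads(text)
    summaries = []
    for entry in entries:
        # start_time ends in "Z"; swap it for an explicit UTC offset so that
        # datetime.fromisoformat also accepts it on Python versions before 3.11.
        started = datetime.fromisoformat(entry["start_time"].replace("Z", "+00:00"))
        summaries.append(
            {
                "uuid": entry["uuid"],
                "query": entry["query"],
                "started": started,
                "num_results": entry["num_results"],
            }
        )
    return summaries


if __name__ == "__main__":
    # Path as added in this commit; adjust to wherever the export was written.
    with open("example/test/manifest.json", encoding="utf-8") as handle:
        for summary in summarize_manifest(handle.read()):
            print(summary["uuid"], summary["started"].isoformat(), summary["num_results"])
```
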
137 changes: 137 additions & 0 deletions example/test/stac.json
@@ -0,0 +1,137 @@
{
"stac_version": "1.0.0",
"stac_extensions": [
"https://stac-extensions.github.io/table/v1.2.0/schema.json",
"https://stac-extensions.github.io/alternate-assets/v1.1.0/schema.json"
],
"type": "Collection",
"id": "iSamples Export Service result f40eecbe-7da0-416c-a6e4-3208eefd2bf1",
"title": "iSamples Stac Collection f40eecbe-7da0-416c-a6e4-3208eefd2bf1",
"license": "CC-BY-4.0",
"extent": {
"spatial": {
"bbox": [
[
-179.40000915527344,
-86.80000305175781,
179.68333435058594,
70.375
]
]
},
"temporal": {
"interval": [
[
"1905-01-01T00:00:00+00:00",
"2021-09-27T04:10:19+00:00"
]
]
}
},
"properties": {
"datetime": "2024-05-20T12:44:41.259605Z"
},
"description": "iSamples Export Service results initiated at 2024-05-20 12:44:41.259605. The solr query that produced this collection was \n```searchText:feldspar AND producedBy_resultTimeRange:[1800 TO 2024] AND -relation_target:*```. \n",
"links": [
{
"href": "./stac-item.json",
"rel": "self",
"type": "application/json",
"title": "iSample export STAC collection"
}
],
"table:columns": [
{
"name": "sample_identifier",
"description": "URI that identifies the physical sample described by this record",
"type": "string"
},
{
"name": "label",
"description": "a human intelligible string used to identify a thing, i.e. the name to use for the thing; should be unique in the scope of a sample collection or dataset.",
"type": "string"
},
{
"name": "description",
"description": "Free text description of the subject of a triple.",
"type": "string"
},
{
"name": "alternate_identifiers",
"description": "one or more identifiers used to identify the sample in other contexts. In this context, the identifier property and scheme_name should be required.",
"type": "array"
},
{
"name": "produced_by",
"description": "object that documents the sampling event--who, where, when the specimen was obtained",
"type": "string"
},
{
"name": "sampling_purpose",
"description": "term to specify why a sample was collected.",
"type": "string"
},
{
"name": "has_context_category",
"description": "Top level context, based on the kind of feature sampled. Specific identification of the sampled feature of interest is done through the SamplingEvent/Feature of Interest property. At least one value is an instance of skos:Concept from the iSamples sampledfeaturevocabulary.",
"type": "array"
},
{
"name": "has_material_category",
"description": "The kind of material that constitutes the sample. At least one value is an instance of skos:Concept from the iSamples MaterialTypeVocabulary; extension vocabularies can be used for more precise categorization.",
"type": "array"
},
{
"name": "has_specimen_category",
"description": "The kind of object the specimen is. At least one value is an instance of skos:Concept from the iSamples SpecimenTypeVocabulary; extension vocabularies can be used for more precise categorization.",
"type": "array"
},
{
"name": "keywords",
"description": "free text terms or formal categories associated with the sample to support discovery. As in DataCite metadata, each keyword is a separate element. Multiple keywords should NOT be included as a comma-delimited list.",
"type": "array"
},
{
"name": "related_resource",
"description": "link to related resource with relationship property to indicate nature of connection. Target should be identifier for a resource.",
"type": "array"
},
{
"name": "complies_with",
"description": "a list of policies, recommendations, best practices (etc.) that have been followed in the collection and curation of the sample.",
"type": "array"
},
{
"name": "dc_rights",
"description": "a statement about various property rights associated with the resource, including intellectual property rights. Recommended practice is to refer to a rights statement with a URI. If this is not possible or feasible, a literal value (name, label, or short text) may be provided.",
"type": "string"
},
{
"name": "curation",
"description": "Information about the current storage of sample, access to sample, and events in curation history. Curation as used here starts when the sample is removed from its original context, and might include various processing steps for preservation. Processing related to analysis preparation such as crushing, dissolution, evaporation, filtering are considered part of the sampling method for the derived child sample.",
"type": "string"
},
{
"name": "registrant",
"description": "identification of the agent that registered the sample, with contact information. Should include person name and affiliation, or position name and affiliation, or just organization name. e-mail address is preferred contact information.",
"type": "string"
}
],
"assets": {
"1": {
"href": "./isamples_export_geo.parquet",
"type": "application/x-parquet",
"title": "iSamples Stac Collection f40eecbe-7da0-416c-a6e4-3208eefd2bf1 parquet export",
"roles": [
"data"
],
"description": "GeoParquet representation of the collection.",
"alternate": {
"view": {
"title": "View parquet file",
"href": "/ui/ds_view.html#/data/test/isamples_export_geo.parquet"
}
}
}
}
}
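
To show how the collection above might be consumed, the following Python sketch pulls out the column names declared under `table:columns` and the `href` of every asset whose `roles` include `data`. It uses only the standard library and the keys visible in this example. The function name `describe_collection` is hypothetical, and no STAC client library is assumed.

```python
import json


def describe_collection(collection):
    """Extract column names and data asset hrefs from a STAC collection dict."""
    # Column definitions come from the STAC table extension ("table:columns").
    columns = [col["name"] for col in collection.get("table:columns", [])]
    # Keep only assets that declare the "data" role.
    data_assets = {
        key: asset["href"]
        for key, asset in collection.get("assets", {}).items()
        if "data" in asset.get("roles", [])
    }
    return {"id": collection["id"], "columns": columns, "data_assets": data_assets}


if __name__ == "__main__":
    # Path as added in this commit; adjust to wherever the export was written.
    with open("example/test/stac.json", encoding="utf-8") as handle:
        info = describe_collection(json.load(handle))
    print(info["id"])
    print(", ".join(info["columns"]))
    for key, href in info["data_assets"].items():
        print(key, href)
```
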
51 changes: 0 additions & 51 deletions export_client_cli.py

This file was deleted.

1 change: 1 addition & 0 deletions isamples_export_client/__init__.py
@@ -0,0 +1 @@
__version__ = '0.1.0'