S2025 - documentation only. (#133)
SharonGoliath authored Aug 5, 2020
1 parent 5e867ef commit 8886a86
Showing 6 changed files with 336 additions and 6 deletions.
12 changes: 7 additions & 5 deletions .travis.yml
@@ -20,12 +20,14 @@ install:
script:
- for i in $(ls -d */);
do
cd $i;
pytest --cov $i || exit -1;
if [[ $TRAVIS_PYTHON_VERSION == '3.7' ]]; then
flake8 -v $i || exit -1;
if [[ $i != "doc/" ]]; then
cd $i;
pytest --cov $i || exit -1;
if [[ $TRAVIS_PYTHON_VERSION == '3.7' ]]; then
flake8 -v $i || exit -1;
fi;
cd ..;
fi;
cd ..;
done

after_success:
2 changes: 1 addition & 1 deletion caom2utils/setup.cfg
@@ -32,7 +32,7 @@ license = AGPLv3
url = http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2
edit_on_github = False
github_project = opencadc/caom2tools
install_requires = cadcdata>=1.2.3 caom2>=2.4 astropy>=2.0 spherical-geometry==1.2.11;python_version=="2.7" spherical-geometry>=1.2.17;python_version>="3.4" vos>=3.0.6
install_requires = cadcdata>=1.2.3 caom2>=2.4 astropy>=2.0 spherical-geometry==1.2.11;python_version=="2.7" spherical-geometry>=1.2.17;python_version>="3.4" vos>=3.1.1
# version should be PEP386 compatible (http://www.python.org/dev/peps/pep-0386)
version = 1.4.6

95 changes: 95 additions & 0 deletions doc/README.md
@@ -0,0 +1,95 @@
# Working With CAOM2

For observations to appear in [CADC search services](http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/), an observation must first be described by a CAOM record. That description will then need to be loaded into the CADC CAOM repository, using a CADC web service. This web service will create a corresponding database record.

Once an Observation has been described and loaded, it is searchable from CADC's UI.

* If you are interested in using CADC Python Data Engineering tools, you should start [here](./user/cli_description.md).

* If you are interested in scripting with the CADC Python Data Engineering tools, you should start [here](./user/script_description.md).

## Preconditions

1. These descriptions assume:
    1. a working knowledge of python. [Prefer python3, please](https://pythonclock.org/),
    1. a linux-type environment,
    1. a working directory location, where all files discussed are placed, and
    1. that you have a [CADC account](http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/en/auth/request.html), which is configured by CADC to have read and write access to a CAOM `COLLECTION`.

1. This description uses the parameters `TEST_FILE.FITS`, `TEST_OBS.XML` and `COLLECTION`. Replace these values appropriately when executing the commands.

1. Copy the file `TEST_FILE.FITS` into the working directory. The metadata in this file will be described in the CAOM Observation created during this example.

1. The example will cause an instance to be created in the [CAOM2 sandbox](http://sc2.canfar.net/search/). If you visit the CAOM2 sandbox URL before the first CAOM instance for a `COLLECTION` has been created, that `COLLECTION` will not appear under `Additional Constraints -> Collection`. Even after successful creation of a CAOM instance, it can take up to one day for the `COLLECTION` to become selectable in the UI. The CAOM2 'sandbox' is a site that mimics the production CADC CAOM2 storage service and search UI. This sandbox site allows developers and scientists to debug collection-specific code for creating and updating CAOM2 Observations. It also allows developers and scientists to immediately view CAOM2 records as they will appear to users in the search interface.

To use production CADC services, remove the `--resource-id` parameter from the `caom2-repo` commands.

1. Install the following python dependencies:

```
pip install caom2repo
pip install caom2utils
```
1. Get credentials organized. The examples assume the use of a [.netrc file](https://www.systutorials.com/docs/linux/man/5-netrc/), named `./.netrc`, located in the working directory, with permissions set to `-rw-------`. The `./.netrc` file content should include the following, with `canfarusername` and `canfarpassword` replaced with your CADC username and password values:
````
machine www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca login canfarusername password canfarpassword
machine www.canfar.net login canfarusername password canfarpassword
machine ws-cadc.canfar.net login canfarusername password canfarpassword
machine sc2.canfar.net login canfarusername password canfarpassword
````
To set the `./.netrc` file permissions:
```
chmod 600 ./.netrc
```
1. The `caom2-repo` client also supports username/password and X509 certificates. If you want to use X509 certificates, use the `--cert` parameter instead of the `-n` parameter in all the commands. The command line client `cadc-get-cert` is installed with the prerequisites for the `caom2repo` package, and `cadc-get-cert --help` from a terminal prompt will describe how to obtain a CADC certificate.
1. Test the install. Commands are case-sensitive.
```
caom2-repo read --netrc ./.netrc --resource-id ivo://cadc.nrc.ca/sc2repo COLLECTION abc
```
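If the install was successful, this will report a `Not Found` error, which is expected here because the command asks for the placeholder observation ID `abc`: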
```
Client Error: Not Found for url: http://sc2.canfar.net/sc2repo/auth-observations/COLLECTION/abc.
```
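
The `./.netrc` requirements listed above (file permissions of `-rw-------` and one `machine` entry per CADC host) can be checked before running any `caom2-repo` command. The following is a minimal sketch using only the Python standard library. The names `check_netrc` and `EXPECTED_HOSTS` are illustrative and are not part of any CADC package.

```python
import netrc
import os
import stat

# The four hosts listed in the example ./.netrc content.
EXPECTED_HOSTS = (
    'www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca',
    'www.canfar.net',
    'ws-cadc.canfar.net',
    'sc2.canfar.net',
)


def check_netrc(path, expected_hosts=EXPECTED_HOSTS):
    """Return a list of problems found with the netrc file at `path`.

    An empty list means the file has mode 0o600 and contains an entry
    for every expected host. Hypothetical helper, for illustration only.
    """
    problems = []
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode != 0o600:
        problems.append(f'permissions are {oct(mode)}, expected 0o600')
    # netrc.netrc parses the file; .hosts maps machine name to credentials
    hosts = netrc.netrc(path).hosts
    for host in expected_hosts:
        if host not in hosts:
            problems.append(f'no entry for machine {host}')
    return problems
```

Calling `check_netrc('./.netrc')` from the working directory returns an empty list when the file is set up as described.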
## Troubleshooting
1. If `pip install caom2utils` fails with the following error:
```
AttributeError: module 'enum' has no attribute 'IntFlag'
```
Ensure the version of vos is >= 3.1.1:
```
pip list | grep vos
```
Upgrade vos if necessary:
```
pip install --upgrade vos
```
Uninstall `enum34`, the package raising the AttributeError:
```
pip uninstall enum34
```
Then retry the `caom2utils` install:
```
pip install caom2utils
```
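
The same checks can be made from Python instead of with `pip list`. This is a rough sketch, assuming Python 3.8 or later (for `importlib.metadata`). The helper names `installed_version`, `version_tuple`, `meets_minimum` and `diagnose` are made up for this example, and the version comparison is deliberately simple rather than fully PEP 440 aware.

```python
import re
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_tuple(version):
    """Turn '3.1.1' into (3, 1, 1), keeping only leading digits of each part."""
    parts = []
    for piece in version.split('.'):
        match = re.match(r'\d+', piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(version, minimum):
    """True if `version` is at least `minimum`, compared numerically."""
    return version_tuple(version) >= version_tuple(minimum)


def diagnose():
    """Return hints matching the troubleshooting steps above."""
    hints = []
    vos_version = installed_version('vos')
    if vos_version is None:
        hints.append('vos is not installed')
    elif not meets_minimum(vos_version, '3.1.1'):
        hints.append(f'vos {vos_version} is older than 3.1.1: '
                     'pip install --upgrade vos')
    if installed_version('enum34') is not None:
        hints.append('enum34 is installed: pip uninstall enum34')
    return hints
```

Running `print(diagnose())` in the affected environment lists which of the steps above still apply.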
1 change: 1 addition & 0 deletions doc/developer/README.md
@@ -0,0 +1 @@
This is the developer documentation for the CADC Python Data Engineering Tools.
46 changes: 46 additions & 0 deletions doc/user/cli_description.md
@@ -0,0 +1,46 @@
# How to describe and load a CAOM2 Observation using the Command Line

1. Ensure the preconditions described [here](https://github.com/opencadc/caom2tools/blob/master/doc#preconditions) are met.

1. Use the file [test_obs.blueprint](https://github.com/opencadc-metadata-curation/collection2caom2/blob/master/test_obs.blueprint) as the initial version of the blueprint file. For more information on the concept of blueprints, and their use, see [here](https://github.com/opencadc/caom2tools/blob/master/doc/user/script_description.md#observation-blueprints).

1. Run caom2gen. The value provided for the `--local` parameter must be a fully qualified path name.

```
caom2gen --out TEST_OBS.XML --observation COLLECTION TEST_OBS --blueprint ./test_obs.blueprint \
    --local /fully/qualified/path/TEST_FILE.FITS --lineage test_file/ad:COLLECTION/TEST_FILE.FITS
```
1. There should be a file named `TEST_OBS.XML` in the working directory.
1. Run caom2-repo. There will be no output if the command succeeds.
```
caom2-repo create --netrc ./.netrc --resource-id ivo://cadc.nrc.ca/sc2repo TEST_OBS.XML
```
1. Every step after this one refines the mapping between file content and CAOM2 instance members. Changes are sent to the server with `caom2-repo update` commands instead of `caom2-repo create` commands. However, `caom2-repo` is particular about its ids, so after the first successful execution of `caom2-repo create`, read the observation back from the service:
```
caom2-repo read --netrc ./.netrc --resource-id ivo://cadc.nrc.ca/sc2repo COLLECTION TEST_OBS > TEST_OBS_READ.XML
```
1. There should be a file named `TEST_OBS_READ.XML` on disk. It will differ from the `TEST_OBS.XML` used for `create`, because the service generates a parallel set of keys that must be honoured. For each of the observation, plane, artifact, part, and chunk elements there are `id`, `lastModified`, `maxLastModified`, `metaChecksum`, and `accMetaChecksum` values. In particular, the `id` values must be consistent when doing `caom2-repo update` calls, or a "This observation already exists" error will occur.
1. After you've generated this output file, use the following commands to iteratively make and view changes to the mapping between the `COLLECTION` data and the CAOM2 instance:
```
caom2gen -o TEST_OBS.XML --in TEST_OBS_READ.XML --blueprint ./test_obs.blueprint --local /fully/qualified/path/TEST_FILE.FITS \
    --lineage TEST_FILE/ad:COLLECTION/TEST_FILE.FITS
caom2-repo update --netrc ./.netrc --resource-id ivo://cadc.nrc.ca/sc2repo TEST_OBS.XML
```
1. In your browser, go to http://sc2.canfar.net/search, enter `TEST_OBS` into the `Observation ID` search field, click search, then click the `TEST_OBS` link in the `Obs. ID` column of the `Results` tab. This will display the details of the CAOM2 instance for `TEST_OBS` in a new tab.
1. Modify the blueprint to change mappings between the `COLLECTION` data model and the CAOM2 data model. If more complicated metadata mappings are required, investigate the use of the `--module` and `--plugin` parameters to [caom2gen](https://github.com/opencadc/caom2tools/tree/master/caom2utils). Additional `caom2gen` parameters are described on that page as well.
1. Should entries ever need to be deleted from the CAOM2 repository, replace `COLLECTION` with the appropriate value, and replace `TEST_OBS` with the observation ID that is being deleted:
```
caom2-repo delete --netrc ./.netrc --resource-id ivo://cadc.nrc.ca/sc2repo COLLECTION TEST_OBS
```
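
The create, read, regenerate, update cycle above can also be driven from Python. The sketch below only builds the argument lists for the commands shown in the steps, using the flags that appear there, and leaves running them to `subprocess.run`. The function names `caom2gen_args` and `caom2_repo_args` are illustrative, and `--out` is assumed to be the long form of the `-o` flag used in the update step.

```python
SANDBOX = 'ivo://cadc.nrc.ca/sc2repo'


def caom2gen_args(collection, obs_id, blueprint, local_fqn, lineage,
                  out_xml, in_xml=None):
    """Build a caom2gen command line (hypothetical helper).

    With in_xml=None this is the first-time form, which names the
    observation. Otherwise it is the refinement form, which starts from
    the XML previously read back from the service.
    """
    args = ['caom2gen', '--out', out_xml]
    if in_xml is None:
        args += ['--observation', collection, obs_id]
    else:
        args += ['--in', in_xml]
    args += ['--blueprint', blueprint,
             '--local', local_fqn,
             '--lineage', lineage]
    return args


def caom2_repo_args(action, *positional, netrc='./.netrc',
                    resource_id=SANDBOX):
    """Build a caom2-repo command line (hypothetical helper).

    Pass resource_id=None to omit --resource-id and use production services.
    """
    args = ['caom2-repo', action, '--netrc', netrc]
    if resource_id is not None:
        args += ['--resource-id', resource_id]
    return args + list(positional)
```

For example, `subprocess.run(caom2_repo_args('create', 'TEST_OBS.XML'), check=True)` runs the create step. For the `read` step, capture standard output and write it to `TEST_OBS_READ.XML`.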
186 changes: 186 additions & 0 deletions doc/user/script_description.md
@@ -0,0 +1,186 @@
# How to describe and load a CAOM2 Observation using Python scripts

Ensure the preconditions described [here](../README.md) are met.

The function `caom2utils.fits2caom2.augment` uses the concept of a blueprint to capture the description of a CAOM2 Observation as a
mapping of a Telescope Data Model (TDM) to the CAOM2 data model. This document describes how to extend that application to customize the mapping for a `COLLECTION`.

`augment` works by creating or augmenting a CAOM2 Observation record, which can then be loaded via the CADC service.

`augment` creates the Observation record using information contained in a FITS file. The python module `fits2caom2`, from the python package `caom2utils`,
examines the FITS file and uses a blueprint, embodied in an instance of the ObsBlueprint class, to define default values, override values, and mappings to augment the FITS header. The keywords and values in the augmented FITS header are then used to fill in corresponding CAOM2 entities and attributes.

There are two alternative ways to provide input file metadata to the `caom2gen` application:
* have the file located on disk, and use the `--local` parameter, or
* have the file located in a CADC archive. The artifact URI portion of the lineage parameter will be used to resolve the archive and file name.

## Observation Blueprints

The blueprint is one way to capture the mapping of the TDM to the CAOM2 data model. The blueprint can identify:
* what information to obtain from the FITS header,
* defaults in case the FITS header is incomplete,
* hard-coded values to use when the FITS header should be ignored, or doesn't have the information, and
* python functions which will be loaded and executed at run-time to augment FITS keyword values. See [this section](https://github.com/opencadc/caom2tools/blob/master/doc/user/script_description.md#putting-it-all-together) for an example.

The blueprint is a set of key-value pairs, where the values have three possible representations: defaults, overrides, and FITS keyword mappings.

There is a sample blueprint in [this file](https://github.com/opencadc-metadata-curation/collection2caom2/blob/master/test_obs.blueprint).

The keys are the long-form names for the CAOM2 model elements and attributes. The complete set of valid keys can be found by executing the following:

    pydoc caom2utils.fits2caom2.ObsBlueprint

### Changing What a Blueprint Looks Like, By Extension

A blueprint may be provided by one of two ways: as a file on disk, or programmatically.

#### File Blueprint Usage

    Observation.observationID = ['OBSID'], default = TEST_OBS
    Plane.dataRelease = 2017-08-31T00:00:00
    Chunk.position.coordsys = ['RADECSYS,RADESYS']

* Observation.observationID provides a default value of `TEST_OBS`, which is used if the `OBSID` keyword does not exist in the FITS file.
* Plane.dataRelease provides an override value, which is always used.
* Chunk.position.coordsys provides a list of FITS keywords to try. If the first value is not in the FITS header, the second one is queried. If neither of them exist, there will be no value for Chunk.position.coordsys in the CAOM2 observation.
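
The three representations can be pictured with a small stand-alone function. This is a toy model of the behaviour described in the bullets above, not the `caom2utils` implementation. The name `resolve` and its parameters are invented for the illustration, and a plain dict stands in for the FITS header.

```python
def resolve(header, keywords=(), default=None, override=None):
    """Pick the value for one blueprint key (illustrative only).

    override: always wins when set, like Plane.dataRelease above.
    keywords: FITS keywords tried in order; the first one present wins.
    default:  used only when none of the keywords is in the header.
    """
    if override is not None:
        return override
    for keyword in keywords:
        if keyword in header:
            return header[keyword]
    return default


# A made-up header that has RADESYS but neither OBSID nor RADECSYS.
header = {'RADESYS': 'FK5'}

obs_id = resolve(header, keywords=['OBSID'], default='TEST_OBS')
release = resolve(header, override='2017-08-31T00:00:00')
coordsys = resolve(header, keywords=['RADECSYS', 'RADESYS'])
```

Here `obs_id` falls back to the default, `release` takes the override, and `coordsys` comes from the second keyword in the list.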

#### Programmatic Blueprint Usage

An example of this implementation is in [vlass2caom2](https://github.com/opencadc-metadata-curation/vlass2caom2).

    bp = ObsBlueprint(position_axes=(1,2), time_axis=3, energy_axis=4, polarization_axis=5, observable_axis=6)
    bp.set_default('Observation.observationID', 'TEST_OBS')
    bp.set('Plane.dataRelease', '2017-08-31T00:00:00')
    bp.add_fits_attribute('Chunk.position.coordsys', 'RADECSYS')
    bp.add_fits_attribute('Chunk.position.coordsys', 'RADESYS')

* Observation.observationID provides a default value of `TEST_OBS`, which is used if the `OBSID` keyword does not exist in the FITS file.
* Plane.dataRelease provides an override value, which is always used when setting the plane-level data release date in the CAOM2 instance.
* Chunk.position.coordsys provides a list of FITS keywords to try. The last keyword listed will be tried first, and the first keyword found will be used to set the value.

To make WCS content available in the blueprint without setting the axis indices in the `ObsBlueprint` constructor, call any of the following functions on a blueprint instance, one for each axis that has metadata in the FITS file:

    bp = ObsBlueprint()
    bp.configure_position_axes((1, 2))
    bp.configure_energy_axis(3)
    bp.configure_time_axis(4)
    bp.configure_polarization_axis(5)
    bp.configure_observable_axis(6)
    bp.configure_custom_axis(7)

## Putting It All Together

The following script is an end-to-end example of describing and loading a CAOM2 Observation to the CADC service, given a FITS file and a programmatically constructed blueprint.

    import importlib
    import os
    from cadcutils import net
    from caom2 import obs_reader_writer, DataProductType, CalibrationLevel
    from caom2repo import CAOM2RepoClient
    from caom2utils import fits2caom2


    def get_meta_release(header):
        """
        Use functions when the values of several header keywords are needed
        to set one CAOM2 attribute.
        """
        obs_type = header.get('OBSTYPE')
        if obs_type == 'OBJECT':
            # science observation
            rel_date = header.get('REL_DATE')
        else:
            # calibration observation
            rel_date = header.get('DATE-OBS')
        return rel_date


    # configure and create the CADC service client
    this_dir = os.path.dirname(os.path.realpath(__file__))
    netrc_fqn = f'{this_dir}/netrc'
    subject = net.Subject(netrc=netrc_fqn)
    # remove the resource_id parameter to use production resources
    repo_client = CAOM2RepoClient(subject, resource_id='ivo://cadc.nrc.ca/sc2repo')

    # describe the Observation by setting up the mapping between the
    # COLLECTION and CAOM2, which is captured in an instance of
    # ObsBlueprint

    # so functions can be used in the blueprint
    module = importlib.import_module(__name__)

    bp = fits2caom2.ObsBlueprint(module=module)
    bp.configure_position_axes((1, 2))
    # set a default value that will be used if FITS header values are not
    # available
    bp.set_default('Observation.observationID', 'TEST_OBS')
    # set a hard-coded value
    bp.set('Plane.dataRelease', '2017-08-31T00:00:00')
    # use an enumerated value for a hard-coded value
    bp.set('Plane.calibrationLevel', CalibrationLevel.RAW_STANDARD)
    bp.set('Plane.dataProductType', DataProductType.IMAGE)
    # add the FITS keyword 'RADECSYS' to the list of FITS keywords
    # checked for a value
    bp.add_fits_attribute('Chunk.position.coordsys', 'RADECSYS')
    # execute a function to set a value - parameter may be either
    # 'header' or 'uri'
    bp.set('Plane.metaRelease', 'get_meta_release(header)')

    # apply the mapping to the FITS file, which writes the Observation to
    # an xml file on disk
    kwargs = {}
    uri = 'ad:COLLECTION/TEST_FILE.FITS'
    blueprints = {uri: bp}
    fits2caom2.augment(blueprints=blueprints,
                       no_validate=False,
                       dump_config=False,
                       plugin=None,
                       out_obs_xml='./TEST_OBS.XML',
                       in_obs_xml=None,
                       collection='COLLECTION',
                       observation='TEST_OBS',
                       product_id='TEST_PRODUCT_ID',
                       uri=uri,
                       netrc=netrc_fqn,
                       file_name='file:///test_files/TEST_FILE.FITS',
                       verbose=False,
                       debug=True,
                       quiet=False,
                       caom_namespace=obs_reader_writer.CAOM23_NAMESPACE,
                       **kwargs)

    # load the observation into memory
    reader = obs_reader_writer.ObservationReader(False)
    observation = reader.read('./TEST_OBS.XML')

    # create the observation record with the service
    #
    # use 'update' if the observation has already been loaded to the CADC service.
    # The service generates a parallel set of database keys that must be honoured.
    # The id values must be consistent when doing 'create' and 'update' calls, or
    # a "This observation already exists" error will occur.
    #
    # existing_obs = repo_client.read('COLLECTION', 'TEST_OBS_ID')
    # writer = obs_reader_writer.ObservationWriter()
    # writer.write(existing_obs, '/fully/qualified/EXISTING.XML')
    # fits2caom2.augment(...
    #                    in_obs_xml='/fully/qualified/EXISTING.XML',
    #                    ...)
    # load the observation into memory
    # then use:
    # repo_client.update(observation)
    repo_client.create(observation)
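
Because `get_meta_release` only calls `header.get`, it can be exercised without a FITS file or a CADC account by passing a plain dict in place of the header. The sketch below repeats the function from the script so that it runs on its own. The keyword values are made up for the illustration.

```python
def get_meta_release(header):
    """Same logic as in the script above: choose a date by OBSTYPE."""
    obs_type = header.get('OBSTYPE')
    if obs_type == 'OBJECT':
        # science observation
        rel_date = header.get('REL_DATE')
    else:
        # calibration observation
        rel_date = header.get('DATE-OBS')
    return rel_date


# Plain dicts standing in for FITS headers; values are invented.
science_header = {
    'OBSTYPE': 'OBJECT',
    'REL_DATE': '2018-08-31T00:00:00',
    'DATE-OBS': '2017-08-31T00:00:00',
}
calibration_header = {
    'OBSTYPE': 'FLAT',
    'DATE-OBS': '2017-08-31T00:00:00',
}

science_release = get_meta_release(science_header)
calibration_release = get_meta_release(calibration_header)
```

A science header yields its `REL_DATE` value, while any other `OBSTYPE` falls back to `DATE-OBS`. Checking blueprint functions this way before calling `augment` keeps mapping errors separate from service errors.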

## More Information

If you want:

* to add direct CAOM2 model manipulation to your script, see [here](https://github.com/opencadc/caom2tools/tree/master/caom2) for an introduction to the possibilities.

* a description of the latest version of the CAOM2 model, see [here](http://www.opencadc.org/caom2/).

* a description of the operational version of the CADC Archive Metadata Service, see [here](http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/ams/).

* examples of the model and the client embedded in end-to-end workflows, see [here](https://github.com/opencadc-metadata-curation). Each application in this repository uses the tactic of programmatically creating a unique blueprint for each file that is ingested, and then creating or updating the resulting CAOM2 Observation.
