This package provides common functions and components for the Digital Curation Manager project.
This includes:
db
: database implementations and adapter definitions,
models
: data-model interface and common models,
orchestration
: job orchestration-system,
plugins
: an interface and optional extensions for a general plugin-system,
services
: various general and dcm-specific components for the definition of Flask-based web-applications,
daemon
: background-process utility,
logger
: logging and related definitions, and
util
: miscellaneous functions
Using a virtual environment is recommended.
Install this package and its dependencies from this repository by issuing
pip install .
Alternatively, consider installing via the extra-index-url https://zivgitlab.uni-muenster.de/api/v4/projects/9020/packages/pypi/simple
with
pip install --extra-index-url https://zivgitlab.uni-muenster.de/api/v4/projects/9020/packages/pypi/simple dcm-common
This package defines optional dependencies related to flask-webservices.
These can be installed by entering
pip install ".[services]"
The db-subpackage imposes additional requirements.
These can be installed using
pip install ".[db]"
The orchestration-extra shares its additional requirements with the db-extra due to its dependence on the db-subpackage.
Install additional dev-dependencies with
pip install -r dev-requirements.txt
Run unit-tests with
pytest -v -s
Requires the extra services.
ALLOW_CORS
[DEFAULT 0]: have flask-app allow cross-origin-resource-sharing; needed for hosting swagger-ui with try-it functionality
ORCHESTRATION_PROCESSES
[DEFAULT 1]: maximum number of simultaneous job processes
ORCHESTRATION_AT_STARTUP
[DEFAULT 1]: whether orchestration-loop is automatically started with app
ORCHESTRATION_TOKEN_EXPIRATION
[DEFAULT 1]: whether job tokens (and their associated info like report) expire
ORCHESTRATION_TOKEN_DURATION
[DEFAULT 3600]: time until job token expires in seconds
ORCHESTRATION_DEBUG
[DEFAULT 0]: whether to have orchestrator print debug-information
ORCHESTRATION_CONTROLS_API
[DEFAULT 0]: whether the orchestration-controls API is available
ORCHESTRATION_QUEUE_ADAPTER
[DEFAULT "native"]: which adapter-type to use for the queue
ORCHESTRATION_REGISTRY_ADAPTER
: same as ORCHESTRATION_QUEUE_ADAPTER for registry-adapter
ORCHESTRATION_QUEUE_SETTINGS
[DEFAULT {"backend": "memory"}]: JSON object containing the relevant information for initializing the adapter
- "backend": "disk" | "memory",
- kwargs expected/accepted by the selected adapter/backend (like "dir", "url", "timeout", ...; see db-package docs for more information)
ORCHESTRATION_REGISTRY_SETTINGS
: same as ORCHESTRATION_QUEUE_SETTINGS for registry-adapter
ORCHESTRATION_DAEMON_INTERVAL
[DEFAULT None]: time in seconds between each iteration of the orchestrator daemon
ORCHESTRATION_ORCHESTRATOR_INTERVAL
[DEFAULT None]: time in seconds between each iteration of the orchestrator
ORCHESTRATION_ABORT_NOTIFICATIONS
[DEFAULT 0]: whether the Notification API is used for job abortion (only relevant in parallel deployment)
ORCHESTRATION_ABORT_NOTIFICATIONS_URL
[DEFAULT None]: Notification API url (only relevant in parallel deployment)
ORCHESTRATION_ABORT_NOTIFICATIONS_CALLBACK
[DEFAULT None]: base-url to which abortion requests are sent following a broadcast of the Notification API (only relevant in parallel deployment)
ORCHESTRATION_ABORT_TIMEOUT
[DEFAULT 1.0]: timeout duration for notify-requests to the Notification API (only relevant in parallel deployment)
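As an illustration of how some of the variables above could be supplied and interpreted, the following minimal sketch reads a few of them from a mapping such as the process environment and falls back to the documented defaults. It is not the package's own configuration code: the helper read_orchestration_settings and the keys of the returned dictionary are hypothetical, and the sketch assumes that flag-like settings are passed as the integers 0 and 1. Only the variable names, defaults, and the "dir" keyword are taken from the list above.

```python
import json
import os


def read_orchestration_settings(env=None):
    """Collect a few orchestration settings (hypothetical helper)."""
    # Fall back to the real process environment if no mapping is given.
    env = os.environ if env is None else env
    return {
        "processes": int(env.get("ORCHESTRATION_PROCESSES", "1")),
        # Assumption: flags are encoded as 0/1.
        "at_startup": int(env.get("ORCHESTRATION_AT_STARTUP", "1")) == 1,
        "token_duration": int(env.get("ORCHESTRATION_TOKEN_DURATION", "3600")),
        "queue_adapter": env.get("ORCHESTRATION_QUEUE_ADAPTER", "native"),
        # The settings variable holds a JSON object.
        "queue_settings": json.loads(
            env.get("ORCHESTRATION_QUEUE_SETTINGS", '{"backend": "memory"}')
        ),
    }


# Example input; the directory path is an arbitrary illustrative value.
example_env = {
    "ORCHESTRATION_PROCESSES": "2",
    "ORCHESTRATION_QUEUE_SETTINGS": '{"backend": "disk", "dir": "/tmp/queue"}',
}
settings = read_orchestration_settings(example_env)
print(settings["processes"], settings["queue_settings"]["backend"])
```

Passing the mapping explicitly keeps the sketch testable without modifying the actual environment.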
In addition to the BaseConfig-environment settings, the FSConfig introduces the following:
FS_MOUNT_POINT
[DEFAULT "/file_storage"]: Path to the working directory (typically mount point of the shared file system)
The db-subpackage requires the extra db (see above).
Currently, db contains only key_value_store-type implementations.
This is itself organized in multiple subpackages:
- backend: actual database implementations
  - memory: in-memory implementation without persistent data
  - disk: implementation that persists its data onto disk (in a working directory)
- middleware: provides creation of flask-apps (factory pattern) that implement the 'LZV.nrw - KeyValueStore-API' using a backend-component. Running this app provides a shared database for multiple clients (ensures correct handling of concurrency). Minimal example:

      from dcm_common.db import MemoryStore, key_value_store_app_factory

      app = key_value_store_app_factory(MemoryStore(), "db")

- adapter: provides client-side access to key-value store databases through a common interface, regardless of whether the database is native or accessed over the network
  - native: native python database (be aware that concurrent requests can lead to unexpected results)
  - http: network-database (like the flask-middleware provided here) that implements the 'LZV.nrw - KeyValueStore-API'
The package is designed to seamlessly support both local in-memory testing and actual deployment of DCM-services.
- For a local test-setup, the adapter for a native database (initialized with an in-memory backend) is used.
- In case of a deployment (e.g. with horizontal scaling and persistent data), the http-adapter is used in conjunction with a deployment of the middleware (internally using the disk-backend implementation).
Due to the common interface for adapters, the service implementation can be agnostic regarding the database (aside from initialization).
Furthermore, this approach can be easily extended with other databases like Redis by adding a corresponding adapter class.
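The idea of a database-agnostic service can be sketched as follows. This is a generic illustration of the adapter pattern under stated assumptions, not the actual dcm_common interface: the class names KeyValueAdapter, InMemoryAdapter, and SerializingAdapter, the method names write and read, and the function store_report are all hypothetical. SerializingAdapter merely imitates an external store that holds serialized values; it does not talk to any real database.

```python
import json
from abc import ABC, abstractmethod


class KeyValueAdapter(ABC):
    """Hypothetical common interface; not the actual dcm_common API."""

    @abstractmethod
    def write(self, key, value):
        """Store value under key."""

    @abstractmethod
    def read(self, key):
        """Return the value stored under key or None."""


class InMemoryAdapter(KeyValueAdapter):
    """Stand-in for a native adapter using an in-memory backend."""

    def __init__(self):
        self._data = {}

    def write(self, key, value):
        self._data[key] = value

    def read(self, key):
        return self._data.get(key)


class SerializingAdapter(KeyValueAdapter):
    """Stand-in for an adapter to a store that keeps serialized values."""

    def __init__(self):
        self._raw = {}

    def write(self, key, value):
        # Values are JSON-encoded before they are stored.
        self._raw[key] = json.dumps(value)

    def read(self, key):
        raw = self._raw.get(key)
        return None if raw is None else json.loads(raw)


def store_report(db: KeyValueAdapter, token: str, report: dict):
    """Service-side code that relies only on the common interface."""
    db.write(token, report)
    return db.read(token)


# The same service function works with either adapter.
for db in (InMemoryAdapter(), SerializingAdapter()):
    print(type(db).__name__, store_report(db, "token-1", {"progress": 100}))
```

In this sketch, only the line that constructs the adapter would differ between a test setup and a deployment; adding support for another database amounts to writing one more subclass.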
- Sven Haubold
- Orestis Kazasidis
- Stephan Lenartz
- Kayhan Ogan
- Michael Rahier
- Steffen Richters-Finger
- Malte Windrath
- Roman Kudinov