Commit
Merge pull request #1 from worldbank/u/gblackadder/pydantic_and_excel
U/gblackadder/pydantic and excel
Showing 41 changed files with 11,872 additions and 11 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
# Byte-compiled / optimized / DLL files | ||
__pycache__/ | ||
*.py[cod] | ||
|
||
# Unit test | ||
.pytest_cache/ | ||
|
||
# Environments | ||
.venv | ||
|
||
# Visual Studio Code | ||
.vscode/ | ||
|
||
# Environment variables | ||
.env |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
repos: | ||
- repo: https://github.com/Yelp/detect-secrets | ||
rev: v1.5.0 | ||
hooks: | ||
- id: detect-secrets | ||
exclude: package.lock.json | ||
args: ["--exclude-lines", "\\s*\"image/png\": \".+\""] | ||
|
||
- repo: https://github.com/pre-commit/mirrors-isort | ||
rev: v5.10.1 # Use the latest version | ||
hooks: | ||
- id: isort | ||
|
||
- repo: https://github.com/charliermarsh/ruff-pre-commit | ||
rev: v0.0.287 # Use the latest version | ||
hooks: | ||
- id: ruff | ||
|
||
- repo: https://github.com/psf/black | ||
rev: 23.3.0 # Use the latest version | ||
hooks: | ||
- id: black |
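Review note on the `--exclude-lines` argument in the detect-secrets hook above: after YAML unescaping, the pattern reads `\s*"image/png": ".+"`, which appears intended to skip base64 image payloads embedded in Jupyter notebook outputs. The following is a minimal sketch of which lines such a pattern matches. It assumes search-style matching against each line; the helper `is_excluded` and the sample lines are hypothetical and are not part of detect-secrets.

```python
import re

# The pattern as it reads once the YAML escapes in the hook's args are resolved.
EXCLUDE_LINES = re.compile(r'\s*"image/png": ".+"')


def is_excluded(line: str) -> bool:
    """Return True if the line matches the exclusion pattern (search semantics assumed)."""
    return EXCLUDE_LINES.search(line) is not None


# Hypothetical sample lines: a notebook image output and an ordinary assignment.
notebook_line = '      "image/png": "iVBORw0KGgoAAAANSUhEUgAA...",'
ordinary_line = 'api_key = "placeholder-value"'

print(is_excluded(notebook_line))
print(is_excluded(ordinary_line))
```

Only lines carrying an `"image/png"` key with a non-empty string value match, so other content in the same notebook would still be scanned.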
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,97 @@ | ||
# metadata-schemas | ||
Metadata JSON Schemas | ||
This repository contains both the definitions of Metadata Schemas and a Python library for creating schema objects with pydantic and Excel. | ||
|
||
View documentation - https://worldbank.github.io/metadata-schemas/ | ||
## Defining Metadata Schemas | ||
|
||
The schemas are defined in the JSON Schema format in the folder `schemas`. For more information, see the documentation at https://worldbank.github.io/metadata-schemas/ | ||
|
||
## Excel | ||
|
||
Excel sheets formatted for each metadata type are located in the `excel_sheets` folder of this repository. | ||
|
||
## Python library | ||
|
||
To install the library, run: | ||
|
||
```pip install metadataschemas``` | ||
|
||
### Creating a pydantic metadata object | ||
|
||
To create a timeseries metadata object, run: | ||
|
||
```python | ||
from metadataschemas import timeseries_schema | ||
|
||
timeseries_metadata = timeseries_schema.TimeseriesSchema(idno='project_idno', series_description=timeseries_schema.SeriesDescription(idno='project_idno', name='project_name')) | ||
``` | ||
|
||
Depending on your IDE, selecting `TimeseriesSchema` could show you what fields the schema contains and their corresponding object definitions. | ||
|
||
There are metadata objects for each of the following metadata types: | ||
|
||
| Metadata Type | Metadata Object | | ||
|------------------|-------------------------------------------------| | ||
| document | `document_schema.ScriptSchemaDraft` | | ||
| geospatial | `geospatial_schema.GeospatialSchema` | | ||
| script | `script_schema.ResearchProjectSchemaDraft` | | ||
| series | `series_schema.Series` | | ||
| survey | `microdata_schema.MicrodataSchema` | | ||
| table | `table_schema.Model` | | ||
| timeseries | `timeseries_schema.TimeseriesSchema` | | ||
| timeseries_db | `timeseries_db_schema.TimeseriesDatabaseSchema` | | ||
| video | `video_schema.Model` | | ||
|
||
### Python - Metadata Manager | ||
|
||
The Manager provides an interface to Excel and offers light assistance when creating schema objects. | ||
|
||
For Excel we can: | ||
|
||
1. Create blank Excel files formatted for a given metadata type | ||
2. Write metadata objects to Excel | ||
3. Read an appropriately formatted Excel file containing metadata into a pydantic metadata object | ||
|
||
To use it, run: | ||
|
||
```python | ||
from metadataschemas import MetadataManager | ||
|
||
mm = MetadataManager() | ||
|
||
filename = mm.write_metadata_outline_to_excel('timeseries') | ||
|
||
filename = mm.save_metadata_to_excel('timeseries', | ||
object=timeseries_metadata) | ||
|
||
# Then after you have updated the metadata in the Excel file | ||
|
||
updated_timeseries_metadata = mm.read_metadata_from_excel(filename) | ||
``` | ||
|
||
Note that the Excel write and save functions do not currently support Geospatial metadata. | ||
|
||
The manager can also create an empty pydantic object for a given metadata type, which you can then fill in as needed. This is a convenient way to get started creating metadata in pydantic. | ||
|
||
```python | ||
# list the supported metadata types | ||
mm.metadata_type_names | ||
|
||
# get the pydantic class for a given metadata type | ||
survey_type = mm.metadata_class_from_name("survey") | ||
|
||
# create an instantiated pydantic object and then fill in your data | ||
survey_metadata = mm.type_to_outline(metadata_type="survey") | ||
survey_metadata.repositoryid = "repository id" | ||
survey_metadata.study_desc.title_statement.idno = "project_idno" | ||
``` | ||
|
||
|
||
## Updating Pydantic definitions and Excel sheets | ||
|
||
To update the pydantic schemas so that they match the latest JSON schemas, run: | ||
|
||
`python pydantic_schemas/generators/generate_pydantic_schemas.py` | ||
|
||
Then, to update the Excel sheets, run: | ||
|
||
`python pydantic_schemas/generators/generate_excel_files.py` |
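Review note on the "empty object" workflow described in the README above: the nested assignment `survey_metadata.study_desc.title_statement.idno = ...` only works if the nested objects already exist on the outline. The following is a rough sketch of that idea using standard-library dataclasses as a stand-in for pydantic models. The classes `TitleStatement`, `StudyDesc`, and `Survey` and the helper `make_outline` are hypothetical illustrations; this is not how `type_to_outline` is implemented in the library.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Optional


# Hypothetical stand-ins for a small slice of a survey schema.
@dataclass
class TitleStatement:
    idno: Optional[str] = None
    title: Optional[str] = None


@dataclass
class StudyDesc:
    title_statement: TitleStatement


@dataclass
class Survey:
    repositoryid: Optional[str]
    study_desc: StudyDesc


def make_outline(cls):
    """Build an instance of cls with nested dataclasses created recursively and leaf fields set to None."""
    kwargs = {}
    for f in fields(cls):
        # Recurse into nested dataclass types; everything else starts empty.
        kwargs[f.name] = make_outline(f.type) if is_dataclass(f.type) else None
    return cls(**kwargs)


survey = make_outline(Survey)
survey.repositoryid = "repository id"
survey.study_desc.title_statement.idno = "project_idno"
print(survey)
```

Because every nested object is instantiated up front, fields at any depth can be assigned directly without first constructing their parents.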
Binary file not shown.