Supports a single YAML file hierarchical catalog to organize datasets and avoid a data swamp.

zillow/intake-nested-yaml-catalog

Welcome to the Intake plugin for nested YAML catalogs

This Intake plugin lets a single YAML file define a hierarchical (nested) catalog, so that datasets are organized into groups instead of piling up in a flat data swamp.

The following example organizes datasets by business domain entity:

metadata:
  hierarchical_catalog: true
entity:
  customer:
    customer_attributes:
      args:
        urlpath: s3://foo
      driver: parquet
  user:
    user_profile:
      args:
        urlpath: s3://foo
      driver: parquet

With the catalog loaded as `catalog`, each dataset can be read by following its dotted path through the hierarchy:

df = catalog.entity.customer.customer_attributes.read()
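To illustrate how nested YAML keys can map onto dotted attribute access, here is a minimal sketch in plain Python. It does not use Intake or this plugin and is not the plugin's implementation. The dictionary mirrors the YAML example above (without the `metadata` block), and the helper names `is_source`, `to_namespace`, and `dotted_paths` are hypothetical. It also assumes that any mapping with a `driver` key is a dataset entry and that every other mapping is a group.

```python
from types import SimpleNamespace

# Plain-dict mirror of the YAML example above (metadata block omitted).
catalog_spec = {
    "entity": {
        "customer": {
            "customer_attributes": {
                "args": {"urlpath": "s3://foo"},
                "driver": "parquet",
            },
        },
        "user": {
            "user_profile": {
                "args": {"urlpath": "s3://foo"},
                "driver": "parquet",
            },
        },
    },
}


def is_source(node):
    """Assumption: a mapping with a 'driver' key is a dataset entry, not a group."""
    return isinstance(node, dict) and "driver" in node


def to_namespace(node):
    """Recursively turn nested groups into attribute-accessible namespaces."""
    if is_source(node):
        return SimpleNamespace(driver=node["driver"], args=dict(node.get("args", {})))
    return SimpleNamespace(**{name: to_namespace(child) for name, child in node.items()})


def dotted_paths(node, prefix=""):
    """Yield the dotted path of every dataset entry in the hierarchy."""
    for name, child in node.items():
        path = f"{prefix}.{name}" if prefix else name
        if is_source(child):
            yield path
        elif isinstance(child, dict):
            yield from dotted_paths(child, path)


catalog = to_namespace(catalog_spec)
print(catalog.entity.customer.customer_attributes.driver)  # parquet
print(list(dotted_paths(catalog_spec)))
# ['entity.customer.customer_attributes', 'entity.user.user_profile']
```

In this sketch the leaves only carry the `driver` and `args` values. In a real catalog the leaf would be a data source object whose `read()` method loads the data.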
