Datakit is a pluggable command-line tool for managing the life cycle of data projects.
The Associated Press Data Team uses Datakit to auto-generate project skeletons, archive and share data on Amazon S3, and handle other routine tasks.
Datakit is a thin wrapper around the Cliff command-line framework and is intended for use with a growing ecosystem of plugins.
Feel free to use our plugins on GitHub, or fork and modify them to suit your needs.
If you're comfortable programming in Python, you can create your own plugins (see :ref:`creating-plugins`).
For a system-wide install, run the following from the command line:
$ sudo pip install datakit-core
After installing one or more plugins, Datakit can be used to invoke the commands provided by those plugins.
To see which commands plugins provide, try the --help flag:
$ datakit --help
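Cliff-based tools typically discover commands through Python entry points, so the list printed by --help reflects what the installed packages have registered. The sketch below shows one way to inspect such registrations using only the standard library. The group name ``datakit.plugins`` is an assumption made for illustration, so check a plugin's packaging metadata for the actual value. ``list_plugin_commands`` is a hypothetical helper, not part of Datakit.

```python
from importlib.metadata import entry_points


def list_plugin_commands(group):
    """Return sorted (name, target) pairs for entry points registered under `group`."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+ exposes a selectable interface
        selected = eps.select(group=group)
    else:
        # Python 3.8 and 3.9 return a plain dict keyed by group name
        selected = eps.get(group, [])
    return sorted((ep.name, ep.value) for ep in selected)


if __name__ == "__main__":
    # "datakit.plugins" is an assumed group name, used only for illustration
    for name, target in list_plugin_commands("datakit.plugins"):
        print(f"{name} -> {target}")
```

If no plugins are installed, or the assumed group name does not match, the script prints nothing.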
For example, install the datakit-project plugin:
$ pip install datakit-project
The plugin provides a project create command. You need to specify a Cookiecutter template to use this command, for example the AP's R template:
That's the basic recipe for working with plugins: install, explore, and invoke! [1]
This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.
[1] Plugins may also provide more robust docs, so don't forget to check those out when available.