A cookiecutter template for science and data science projects that include data, code, and dissemination.
- Optimized for data-based publications
- Optimized for use with VS Code
- Docker-based, version-controlled environment using VS Code Dev Containers
- uv-based environment inside the Dev Container
- To add a package, follow the uv workflow: open the VS Code terminal, go to the code folder, and run: uv add pandas
- Uses Dev Container Features with Python and LaTeX pre-installed
- Set up for Python, but could also be adapted for Julia and R
- Make commands for collecting data, generating figures, typesetting LaTeX, cleaning temp files, and cleaning demo files
- VS Code tasks to trigger data collection, plotting, and paper compilation
- LaTeX-based paper
- Path definitions in the project_package Python module
- Kedro-inspired data folder structure
- Filled with a demo, which can be removed with "make delete_demo"
- Used in at least 5 papers
For more detailed information, please see the README of the resulting project.
cookiecutter https://github.com/tgoelles/cookiecutter_science
├── Makefile                              # Automation script for common tasks
├── README.md                             # Project overview and instructions
├── code                                  # Python source code and notebooks
│   ├── notebooks                         # Jupyter notebooks for analysis
│   │   └── exploratory                   # Exploratory data analysis
│   │       └── 1.0-tg-example.ipynb      # Example exploratory notebook
│   └── project_package                   # The project package where refined code goes
│       ├── pyproject.toml                # project_package dependencies and configuration
│       └── src                           # Source code directory
│           └── project_package
│               ├── __init__.py
│               ├── data                  # Data processing module and scripts
│               │   ├── __init__.py
│               │   ├── config.py         # Configuration settings
│               │   ├── example.py        # Example script
│               │   ├── import_data.py    # Data import functions
│               │   └── make_dataset.py   # Dataset creation script, used by make data
│               ├── tools                 # Utility scripts
│               │   ├── __init__.py
│               │   └── convert_latex.py  # LaTeX conversion script
│               └── visualization         # Visualization module and scripts
│                   ├── __init__.py
│                   ├── make_plots.py     # Plot generation functions
│                   └── visualize.py      # Data visualization utilities
├── data
│   ├── 01_raw                            # Raw data; do not modify
│   │   └── demo.csv                      # Example raw data file
│   ├── 02_intermediate                   # Processed but unrefined data
│   │   └── demo_clean.csv                # Example cleaned data file
│   ├── 03_primary                        # Primary processed datasets
│   ├── 04_feature                        # Feature-engineered datasets
│   ├── 05_model_input                    # Data ready for modeling
│   ├── 06_models                         # Trained models
│   ├── 07_model_output                   # Model predictions/results
│   └── 08_reporting                      # Reports and summaries
├── dissemination                         # Outputs for publication/presentation
│   ├── figures                           # Figures and plots go in here
│   │   └── demo.png                      # Example figure
│   ├── papers                            # LaTeX sources for a paper or thesis
│   │   ├── paper.pdf                     # Final paper output
│   │   └── paper.tex                     # LaTeX source for the paper
│   └── presentations                     # Presentation slides and materials
├── literature                            # References and related work
│   └── references.bib                    # Bibliography file
├── pyproject.toml                        # All project dependencies and tool settings, managed by uv
└── uv.lock                               # Dependency lock file for reproducibility
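To illustrate how path definitions and the numbered data folders can work together, here is a minimal sketch. It is not the template's actual config.py: the names `data_paths` and `clean_demo` are hypothetical, and the cleaning rule (dropping rows with an empty field) is only an assumption chosen to make the raw-to-intermediate step concrete.

```python
import csv
from pathlib import Path


def data_paths(project_root: Path) -> dict:
    """Return the data folders of the layout above, keyed by short names.

    Hypothetical helper; the generated project may define its paths differently.
    """
    data = project_root / "data"
    return {
        "raw": data / "01_raw",
        "intermediate": data / "02_intermediate",
        "primary": data / "03_primary",
        "reporting": data / "08_reporting",
    }


def clean_demo(raw_csv: Path, clean_csv: Path) -> int:
    """Write the rows of raw_csv that have no empty field to clean_csv.

    The raw file is only read, never modified. Returns the number of rows kept.
    The cleaning rule itself is an assumption made for illustration.
    """
    with raw_csv.open(newline="") as src:
        rows = list(csv.reader(src))
    header, body = rows[0], rows[1:]
    kept = [row for row in body if all(cell.strip() for cell in row)]

    clean_csv.parent.mkdir(parents=True, exist_ok=True)
    with clean_csv.open("w", newline="") as dst:
        writer = csv.writer(dst)
        writer.writerow(header)
        writer.writerows(kept)
    return len(kept)
```

Keeping all folder locations in one place means notebooks and scripts can refer to `paths["raw"]` instead of hard-coding relative paths, which is the main point of central path definitions.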
[Figure: use of VS Code tasks]
Requirements:
- Git: usually already part of your OS; install it if it is missing
- GitHub account
- GitHub CLI
- Docker Desktop
- VS Code
- VS Code extension: Remote Development
- Cookiecutter Python package: install like this:
pip install cookiecutter
For Mac users:
brew install cookiecutter
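Before generating a project, it can help to confirm that the command-line tools listed above are on the PATH. The sketch below uses only the Python standard library. The executable names (`gh`, `code`, and so on) are assumptions about a typical installation, and finding `docker` on the PATH does not prove that Docker Desktop is running.

```python
import shutil

# Executable name -> human-readable label. The names are assumptions about
# how each tool is usually exposed on the PATH.
REQUIRED_TOOLS = {
    "git": "Git",
    "gh": "GitHub CLI",
    "docker": "Docker Desktop",
    "code": "VS Code",
    "cookiecutter": "Cookiecutter",
}


def missing_tools(tools=None):
    """Return the labels of all tools whose executable is not found on the PATH."""
    tools = REQUIRED_TOOLS if tools is None else tools
    return [label for exe, label in tools.items() if shutil.which(exe) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All command-line tools found.")
```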
- Navigate to the folder where you want to create the project (on your local drive):
cookiecutter https://github.com/tgoelles/cookiecutter_science
- Answer the questions prompted by cookiecutter.
- A new VS Code window will open automatically.
- Click "OK" to reopen the folder in a container (only asked the first time).
- Read the README.md in the generated project folder.
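The generation step can also be started from Python instead of the shell, since the cookiecutter package exposes a `cookiecutter()` function. The wrapper below is a hypothetical convenience (`create_project` is not part of the template); it passes no answers in advance, so the template questions are still asked interactively.

```python
from pathlib import Path
from typing import Optional

TEMPLATE_URL = "https://github.com/tgoelles/cookiecutter_science"


def create_project(output_dir: Optional[Path] = None) -> str:
    """Generate a project from the template and return the new project path.

    Hypothetical wrapper around the cookiecutter Python API.
    """
    # Imported lazily so this module can be loaded before cookiecutter is installed.
    from cookiecutter.main import cookiecutter

    target = Path.cwd() if output_dir is None else output_dir
    # Prompts for the template questions, as the command-line call does.
    return cookiecutter(TEMPLATE_URL, output_dir=str(target))
```

Calling `create_project()` from the folder where the project should live corresponds to the command-line call in the first step.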
Cookiecutter can generate a GitHub repository for you. This initializes the git repo and pushes it to GitHub. You can then invite your team members to join the project.
- Each team member works on their local version of the project, regularly committing and pushing changes.
- Avoid working on the same folder over a network.
On Windows, if you want to use git inside the container (recommended), you need to clone the repo from WSL, as Windows may mess up the .git folder. Git inside the container uses the same .gitconfig as Windows, which is copied into the container.
Ensure user.email and user.name are set (in PowerShell):
git config --global user.name "your_name"
git config --global user.email "your_email@example.com"
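To verify that both values are actually set, the global git configuration can be queried. The following is a small sketch using the standard library and the `git config --global --get` command; the function names are hypothetical.

```python
import subprocess
from typing import Callable, List, Optional


def git_global(key: str) -> Optional[str]:
    """Return the global git config value for key, or None if unset or git is missing."""
    try:
        result = subprocess.run(
            ["git", "config", "--global", "--get", key],
            capture_output=True,
            text=True,
            check=False,  # git exits non-zero when the key is unset
        )
    except FileNotFoundError:
        return None
    return result.stdout.strip() or None


def unset_identity_keys(lookup: Callable[[str], Optional[str]] = git_global) -> List[str]:
    """Return which of user.name and user.email still need to be configured."""
    return [key for key in ("user.name", "user.email") if lookup(key) is None]
```

An empty list from `unset_identity_keys()` means both settings are present and will be available to git inside the container once the .gitconfig has been copied.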