Skip to content

cycle-sync-ai/deep-learning-project-directory-structure

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 

Repository files navigation

project_name/
│
├── data/                          # All data-related directories and files
│   ├── raw/                       # Original, immutable data dump
│   ├── processed/                 # Data transformed for modeling, such as TFRecords or HDF5
│   ├── interim/                   # Temporary processed datasets, possibly smaller or for prototyping
│
├── docs/                          # Project documentation
│   ├── api/                       # API reference
│   ├── design_docs/               # Design documents, architectural choices
│   ├── tutorials/                 # Step-by-step guides and tutorials
│   └── diagrams/                  # Network architecture diagrams, flowcharts
│
├── notebooks/                     # Jupyter notebooks
│   ├── eda/                       # Exploratory Data Analysis notebooks
│   ├── model_experiments/         # Prototyping and experiments
│   └── results/                   # Visualization of final results, comparisons
│
├── src/                           # Source code
│   ├── data_pipeline/             # All modules related to the data pipeline
│   │   ├── sourcing/              # Scripts and utilities for sourcing data
│   │   │   ├── downloaders/       # Scripts to download data from various sources
│   │   │   ├── scrapers/          # Web scraping scripts or API-based scripts
│   │   │   ├── validators/        # Scripts to validate data integrity after sourcing
│   │   │   └── annotations/       # Tools and scripts related to data annotations or labeling
│   │   │
│   │   ├── preprocessing/         # Data preprocessing modules
│   │   │   ├── cleaners/          # Scripts to clean data, handle missing values, outliers, etc.
│   │   │   ├── transformers/      # Transformation scripts (e.g., normalization, tokenization)
│   │   │   ├── encoders/          # Encoding techniques, like one-hot, label encoding, etc.
│   │   │   └── splitters/         # Scripts to split datasets into train/test/validation sets
│   │   │
│   │   ├── augmentations/         # Data augmentation modules
│   │   │   ├── image/             # Image data augmentation (e.g., rotations, flips)
│   │   │   ├── text/              # Text data augmentation (e.g., back translation, synonym replacement)
│   │   │   ├── audio/             # Audio data augmentation (e.g., noise injection, speed change)
│   │   │   └── utils/             # Common utilities for augmentation
│   │   │
│   │   ├── loaders/               # Data loading utilities for feeding to models
│   │   │   ├── dataset_classes/   # Framework-specific dataset classes (e.g., TensorFlow's tf.data.Dataset, PyTorch's Dataset)
│   │   │   ├── generators/        # Data generators for on-the-fly data feeding and augmentation
│   │   │   ├── batchers/          # Batch data and manage batch-related operations
│   │   │   └── async_loaders/     # Asynchronous data loading, especially useful for large datasets
│   │   │
│   │   ├── validation/            # Data validation post-processing
│   │   │   ├── sanity_checks/     # Scripts to check data sanity post-processing
│   │   │   ├── statistics/        # Scripts to compute and visualize data statistics
│   │   │   └── comparers/         # Compare data before and after preprocessing
│   │   │
│   │   └── utils/                 # Common utilities for the data pipeline
│   │       ├── visualization/     # Tools to visualize samples, augmentations, etc.
│   │       ├── logging/           # Data processing related logging utilities
│   │       └── helpers/           # Miscellaneous helper functions and scripts
│   │
│   ├── models/                    # Neural network models and components
│   │   ├── architectures/         # Different neural network architectures
│   │   ├── layers/                # Custom layers
│   │   ├── losses/                # Custom loss functions
│   │   └── metrics/               # Custom evaluation metrics
│   │
│   ├── training/
│   │   ├── experiments/           # Specific experiment settings and results
│   │   │   ├── experiment1/       # Each experiment can have its own subdirectory
│   │   │   │   ├── config.yml     # Configuration file for the experiment
│   │   │   │   ├── logs/          # Training logs for this experiment
│   │   │   │   └── results/       # Resulting plots, performance metrics, etc.
│   │   │   ├── experiment2/
│   │   │   └── ...
│   │   │
│   │   ├── scripts/               # Actual scripts to run training
│   │   │   ├── train_model.py     # Main training script
│   │   │   ├── validate_model.py  # Validation script
│   │   │   ├── resume_training.py # Script to resume training from checkpoints
│   │   │   └── distributed_train.py # For distributed and parallel training setups
│   │   │
│   │   ├── hyperparameters/       # Hyperparameter tuning and optimization
│   │   │   ├── search_algos/      # Algorithms for hyperparameter search
│   │   │   │   ├── grid_search.py
│   │   │   │   ├── random_search.py
│   │   │   │   └── bayesian_optimization.py
│   │   │   ├── search_spaces/     # Definitions of hyperparameter search spaces
│   │   │   └── tuning_results/    # Results from hyperparameter tuning runs
│   │   │
│   │   ├── callbacks/             # Custom callbacks used during training
│   │   │   ├── lr_schedulers/     # Learning rate scheduling callbacks
│   │   │   ├── early_stopping.py  # Early stopping callback
│   │   │   ├── model_checkpointing.py # Save model checkpoints during training
│   │   │   └── tensorboard_logging.py # Log metrics and other details to TensorBoard
│   │   │
│   │   ├── strategies/            # Training strategies for different setups
│   │   │   ├── single_gpu_strategy.py # Training strategy for a single GPU setup
│   │   │   ├── multi_gpu_strategy.py  # Multi-GPU training strategy
│   │   │   └── tpu_strategy.py    # TPU training strategy
│   │   │
│   │   ├── metrics/               # Metrics to evaluate model during training
│   │   │   ├── accuracy.py        # Accuracy metric
│   │   │   ├── f1_score.py        # F1 Score
│   │   │   └── custom_metric.py   # Any other custom metrics
│   │   │
│   │   └── utils/                 # Miscellaneous utilities for training
│   │       ├── gradient_clipping.py   # Gradient clipping utilities
│   │       ├── weight_initialization.py # Weight initialization strategies
│   │       └── mixed_precision.py # Mixed precision training utilities
│   │
│   ├── inference/                 # Inference related modules
│   │   ├── deployment/            # Deployment related utilities
│   │   │   ├── docker/            # Docker files for containerized deployment
│   │   │   ├── cloud_functions/   # Cloud function code for serverless deployment
│   │   │   ├── api_endpoints/     # RESTful API endpoints
│   │   │   └── edge_devices/      # Deployment scripts for edge devices (like Raspberry Pi, etc.)
│   │   │
│   │   ├── tools/                # Inference tools
│   │   │   ├── model_converters/  # Convert models to different formats (ONNX, TensorFlow Lite, etc.)
│   │   │   ├── benchmarking/      # Benchmarking the model's performance in real-world scenarios
│   │   │   └── visualization/     # Visualize predictions, attention maps, etc.
│   │   │
│   │   └── utils/                 # Common utilities for inference
│   │       ├── preprocessing/     # Preprocess input data for inference
│   │       ├── postprocessing/    # Convert raw model outputs to interpretable results
│   │       └── logging/           # Inference related logging utilities
│   │
│   ├── testing/
│   │   ├── unit_tests/            # Unit tests for individual functions and components
│   │   │   ├── data_pipeline_tests/
│   │   │   ├── model_tests/
│   │   │   └── training_utils_tests/
│   │   │
│   │   ├── integration_tests/     # Tests that check the interactions between modules
│   │   │   ├── pipeline_integration/
│   │   │   └── model_training_integration/
│   │   │
│   │   └── utils/                 # Utilities for testing
│   │       ├── fixtures/          # Sample data, models, etc. to be used during tests
│   │       ├── mockers/           # Mock functions, classes, etc. for testing
│   │       └── visualization/     # Visualize results, errors, etc. during tests
│   │
│   └── utils/                     # Miscellaneous utilities
│       ├── file_handlers/         # Handle file read/write operations
│       ├── visualization/         # Common visualization tools
│       └── others/                # Any other miscellaneous utilities
│
├── results/                       # Final results and outputs
│   ├── models/                    # Trained models, weights, checkpoints
│   ├── plots/                     # Resulting plots and visualizations
│   └── tables/                    # Resulting tables, usually in CSV or Excel format
│
├── config/                        # Configuration files and environment settings
│   ├── environment.yml            # Conda environment file
│   ├── requirements.txt           # Python package requirements file
│   └── model_config.yml           # Model and training configuration file
│
└── README.md                      # Overview and instructions for the project