project_name/
│
├── data/ # All data-related directories and files
│ ├── raw/ # Original, immutable data dump
│ ├── processed/ # Data transformed for modeling, such as TFRecords or HDF5
│ ├── interim/ # Temporary processed datasets, possibly smaller or for prototyping
│
├── docs/ # Project documentation
│ ├── api/ # API reference
│ ├── design_docs/ # Design documents, architectural choices
│ ├── tutorials/ # Step-by-step guides and tutorials
│ └── diagrams/ # Network architecture diagrams, flowcharts
│
├── notebooks/ # Jupyter notebooks
│ ├── eda/ # Exploratory Data Analysis notebooks
│ ├── model_experiments/ # Prototyping and experiments
│ └── results/ # Visualization of final results, comparisons
│
├── src/ # Source code
│ ├── data_pipeline/ # All modules related to the data pipeline
│ │ ├── sourcing/ # Scripts and utilities for sourcing data
│ │ │ ├── downloaders/ # Scripts to download data from various sources
│ │ │ ├── scrapers/ # Web scraping scripts or API-based scripts
│ │ │ ├── validators/ # Scripts to validate data integrity after sourcing
│ │ │ └── annotations/ # Tools and scripts related to data annotations or labeling
│ │ │
│ │ ├── preprocessing/ # Data preprocessing modules
│ │ │ ├── cleaners/ # Scripts to clean data, handle missing values, outliers, etc.
│ │ │ ├── transformers/ # Transformation scripts (e.g., normalization, tokenization)
│ │ │ ├── encoders/ # Encoding techniques, like one-hot, label encoding, etc.
│ │ │ └── splitters/ # Scripts to split datasets into train/test/validation sets
│ │ │
│ │ ├── augmentations/ # Data augmentation modules
│ │ │ ├── image/ # Image data augmentation (e.g., rotations, flips)
│ │ │ ├── text/ # Text data augmentation (e.g., back translation, synonym replacement)
│ │ │ ├── audio/ # Audio data augmentation (e.g., noise injection, speed change)
│ │ │ └── utils/ # Common utilities for augmentation
│ │ │
│ │ ├── loaders/ # Data loading utilities for feeding to models
│ │ │ ├── dataset_classes/ # Framework-specific dataset classes (e.g., TensorFlow's tf.data.Dataset, PyTorch's Dataset)
│ │ │ ├── generators/ # Data generators for on-the-fly data feeding and augmentation
│ │ │ ├── batchers/ # Batch data and manage batch-related operations
│ │ │ └── async_loaders/ # Asynchronous data loading, especially useful for large datasets
│ │ │
│ │ ├── validation/ # Data validation post-processing
│ │ │ ├── sanity_checks/ # Scripts to check data sanity post-processing
│ │ │ ├── statistics/ # Scripts to compute and visualize data statistics
│ │ │ └── comparers/ # Compare data before and after preprocessing
│ │ │
│ │ └── utils/ # Common utilities for the data pipeline
│ │ ├── visualization/ # Tools to visualize samples, augmentations, etc.
│ │ ├── logging/ # Data processing related logging utilities
│ │ └── helpers/ # Miscellaneous helper functions and scripts
│ │
│ ├── models/ # Neural network models and components
│ │ ├── architectures/ # Different neural network architectures
│ │ ├── layers/ # Custom layers
│ │ ├── losses/ # Custom loss functions
│ │ └── metrics/ # Custom evaluation metrics
│ │
│ ├── training/
│ │ ├── experiments/ # Specific experiment settings and results
│ │ │ ├── experiment1/ # Each experiment can have its own subdirectory
│ │ │ │ ├── config.yml # Configuration file for the experiment
│ │ │ │ ├── logs/ # Training logs for this experiment
│ │ │ │ └── results/ # Resulting plots, performance metrics, etc.
│ │ │ ├── experiment2/
│ │ │ └── ...
│ │ │
│ │ ├── scripts/ # Actual scripts to run training
│ │ │ ├── train_model.py # Main training script
│ │ │ ├── validate_model.py # Validation script
│ │ │ ├── resume_training.py # Script to resume training from checkpoints
│ │ │ └── distributed_train.py # For distributed and parallel training setups
│ │ │
│ │ ├── hyperparameters/ # Hyperparameter tuning and optimization
│ │ │ ├── search_algos/ # Algorithms for hyperparameter search
│ │ │ │ ├── grid_search.py
│ │ │ │ ├── random_search.py
│ │ │ │ └── bayesian_optimization.py
│ │ │ ├── search_spaces/ # Definitions of hyperparameter search spaces
│ │ │ └── tuning_results/ # Results from hyperparameter tuning runs
│ │ │
│ │ ├── callbacks/ # Custom callbacks used during training
│ │ │ ├── lr_schedulers/ # Learning rate scheduling callbacks
│ │ │ ├── early_stopping.py # Early stopping callback
│ │ │ ├── model_checkpointing.py # Save model checkpoints during training
│ │ │ └── tensorboard_logging.py # Log metrics and other details to TensorBoard
│ │ │
│ │ ├── strategies/ # Training strategies for different setups
│ │ │ ├── single_gpu_strategy.py # Training strategy for a single GPU setup
│ │ │ ├── multi_gpu_strategy.py # Multi-GPU training strategy
│ │ │ └── tpu_strategy.py # TPU training strategy
│ │ │
│ │ ├── metrics/ # Metrics to evaluate model during training
│ │ │ ├── accuracy.py # Accuracy metric
│ │ │ ├── f1_score.py # F1 Score
│ │ │ └── custom_metric.py # Any other custom metrics
│ │ │
│ │ └── utils/ # Miscellaneous utilities for training
│ │ ├── gradient_clipping.py # Gradient clipping utilities
│ │ ├── weight_initialization.py # Weight initialization strategies
│ │ └── mixed_precision.py # Mixed precision training utilities
│ │
│ ├── inference/ # Inference related modules
│ │ ├── deployment/ # Deployment related utilities
│ │ │ ├── docker/ # Docker files for containerized deployment
│ │ │ ├── cloud_functions/ # Cloud function code for serverless deployment
│ │ │ ├── api_endpoints/ # RESTful API endpoints
│ │ │ └── edge_devices/ # Deployment scripts for edge devices (like Raspberry Pi, etc.)
│ │ │
│ │ ├── tools/ # Inference tools
│ │ │ ├── model_converters/ # Convert models to different formats (ONNX, TensorFlow Lite, etc.)
│ │ │ ├── benchmarking/ # Benchmarking the model's performance in real-world scenarios
│ │ │ └── visualization/ # Visualize predictions, attention maps, etc.
│ │ │
│ │ └── utils/ # Common utilities for inference
│ │ ├── preprocessing/ # Preprocess input data for inference
│ │ ├── postprocessing/ # Convert raw model outputs to interpretable results
│ │ └── logging/ # Inference related logging utilities
│ │
│ ├── testing/
│ │ ├── unit_tests/ # Unit tests for individual functions and components
│ │ │ ├── data_pipeline_tests/
│ │ │ ├── model_tests/
│ │ │ └── training_utils_tests/
│ │ │
│ │ ├── integration_tests/ # Tests that check the interactions between modules
│ │ │ ├── pipeline_integration/
│ │ │ └── model_training_integration/
│ │ │
│ │ └── utils/ # Utilities for testing
│ │ ├── fixtures/ # Sample data, models, etc. to be used during tests
│ │ ├── mockers/ # Mock functions, classes, etc. for testing
│ │ └── visualization/ # Visualize results, errors, etc. during tests
│ │
│ └── utils/ # Miscellaneous utilities
│ ├── file_handlers/ # Handle file read/write operations
│ ├── visualization/ # Common visualization tools
│ └── others/ # Any other miscellaneous utilities
│
├── results/ # Final results and outputs
│ ├── models/ # Trained models, weights, checkpoints
│ ├── plots/ # Resulting plots and visualizations
│ └── tables/ # Resulting tables, usually in CSV or Excel format
│
├── config/ # Configuration files and environment settings
│ ├── environment.yml # Conda environment file
│ ├── requirements.txt # Python package requirements file
│ └── model_config.yml # Model and training configuration file
│
└── README.md # Overview and instructions for the project
-
Notifications
You must be signed in to change notification settings - Fork 0
cycle-sync-ai/deep-learning-project-directory-structure
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
About
Deep Learning Project Directory Structure
Topics
Resources
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published