A high-performance code scanning tool written in Rust that detects licenses, copyrights, and other relevant metadata in source code.
scancode-rust is designed to be a faster alternative to the Python-based ScanCode Toolkit, aiming to produce compatible output formats while delivering significantly improved performance. This tool currently scans codebases to identify:
- License information
- File metadata
- System information
More ScanCode features coming soon!
- Efficient file scanning with multi-threading
- Compatible output format with ScanCode Toolkit
- Progress indication for large scans
- Configurable scan depth
- File/directory exclusion patterns
You can download the appropriate binary for your platform from the GitHub Releases page. Extract the binary and place it in a directory on your system's PATH.
Alternatively, you can use the scancode-rust-installer.sh script to automatically download and install the correct binary for your architecture and platform:
curl -sSfL https://github.com/mstykow/scancode-rust/releases/latest/download/scancode-rust-installer.sh | sh
git clone https://github.com/mstykow/scancode-rust.git
cd scancode-rust
./setup.sh # Initialize the submodule and configure sparse checkout
cargo build --release
The compiled binary will be available at target/release/scancode-rust.
scancode-rust [OPTIONS] <DIR_PATH> --output-file <OUTPUT_FILE>
Options:
  -o, --output-file <OUTPUT_FILE>  Output JSON file path
  -d, --max-depth <MAX_DEPTH>      Maximum directory depth to scan [default: 50]
  -e, --exclude <EXCLUDE>...       Glob patterns to exclude from scanning
  -h, --help                       Print help
  -V, --version                    Print version
scancode-rust ~/projects/my-codebase -o scan-results.json --exclude "*.git*" "target/*" "node_modules/*"
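To make the interaction between --max-depth and --exclude more concrete, here is a minimal sketch of a depth-limited directory walk with wildcard exclusions, using only the Rust standard library. It is an illustration under stated assumptions, not the tool's actual code: the names wildcard_match and walk are hypothetical, the matcher supports only `*` and `?` (and lets `*` cross path separators, which real glob libraries often do not), and the exact depth and matching semantics of scancode-rust may differ.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper: `*` matches any run of characters (including none),
/// `?` matches exactly one character, everything else matches literally.
fn wildcard_match(pattern: &str, text: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let t: Vec<char> = text.chars().collect();
    let (mut pi, mut ti) = (0, 0);
    // Pattern index of the last `*` and the text index it currently covers up to.
    let mut star: Option<(usize, usize)> = None;
    while ti < t.len() {
        if pi < p.len() && p[pi] == '*' {
            star = Some((pi, ti));
            pi += 1;
        } else if pi < p.len() && (p[pi] == '?' || p[pi] == t[ti]) {
            pi += 1;
            ti += 1;
        } else if let Some((sp, st)) = star {
            // Backtrack: let the last `*` absorb one more character.
            pi = sp + 1;
            ti = st + 1;
            star = Some((sp, st + 1));
        } else {
            return false;
        }
    }
    // Text is consumed; only trailing `*`s may remain in the pattern.
    p[pi..].iter().all(|&c| c == '*')
}

/// Hypothetical helper: collects files under `root`. Directories directly
/// under `root` are at depth 1; directories at depth `max_depth` are listed
/// as skipped (not entered). Entries whose path relative to `root` matches
/// any exclude pattern are ignored entirely.
fn walk(root: &Path, max_depth: usize, excludes: &[&str]) -> io::Result<Vec<PathBuf>> {
    let mut files = Vec::new();
    let mut stack = vec![(root.to_path_buf(), 0usize)];
    while let Some((dir, depth)) = stack.pop() {
        for entry in fs::read_dir(&dir)? {
            let path = entry?.path();
            let rel = path
                .strip_prefix(root)
                .unwrap_or(path.as_path())
                .to_string_lossy()
                .into_owned();
            if excludes.iter().any(|pat| wildcard_match(pat, &rel)) {
                continue;
            }
            if path.is_dir() {
                if depth < max_depth {
                    stack.push((path, depth + 1));
                }
            } else {
                files.push(path);
            }
        }
    }
    Ok(files)
}
```

Under these assumptions, `walk(Path::new("."), 50, &["*.git*", "target/*", "node_modules/*"])` would roughly mirror the command above. Note that the sketch compares patterns against relative paths using the platform's separator, so patterns containing `/` would need adjusting on Windows.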
scancode-rust is designed to be significantly faster than the Python-based ScanCode Toolkit, especially for large codebases. Performance improvements come from:
- Native Rust implementation
- Efficient parallel processing
- Optimized file handling
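As an illustration of what parallel processing can look like in Rust, the following sketch splits a list of work items into chunks and processes each chunk on its own scoped thread. It uses only the standard library; the function name parallel_map is hypothetical, and the project may well use a different mechanism (for example a thread-pool crate), so treat this as a sketch of the general idea rather than a description of the tool's internals.

```rust
use std::thread;

/// Hypothetical helper: applies `f` to every item, using up to `workers`
/// threads. Results are returned in the same order as the input.
fn parallel_map<T, R, F>(items: &[T], workers: usize, f: F) -> Vec<R>
where
    T: Sync,
    R: Send,
    F: Fn(&T) -> R + Sync,
{
    if items.is_empty() {
        return Vec::new();
    }
    let workers = workers.max(1);
    // Ceiling division so every item lands in some chunk.
    let chunk_size = (items.len() + workers - 1) / workers;
    let f = &f;
    thread::scope(|scope| {
        // Spawn one scoped thread per chunk; scoped threads may borrow `items`.
        let handles: Vec<_> = items
            .chunks(chunk_size)
            .map(|chunk| scope.spawn(move || chunk.iter().map(f).collect::<Vec<R>>()))
            .collect();
        // Join in spawn order so the output order matches the input order.
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker thread panicked"))
            .collect()
    })
}
```

In a scanner, the items would be file paths and `f` would be the per-file analysis. Fixed-size chunks are the simplest scheme; a work-stealing pool tends to balance better when file sizes vary widely.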
The tool produces JSON output compatible with ScanCode Toolkit, including:
- Scan headers with timestamp information
- File-level data with license and metadata information
- System environment details
Contributions are welcome! Please feel free to submit a Pull Request.
To contribute to scancode-rust, follow these steps to set up the repository for local development:
1. Install Rust
   Ensure you have Rust installed on your system. You can install it using rustup:
   curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
2. Clone the Repository
   Clone the scancode-rust repository to your local machine:
   git clone https://github.com/mstykow/scancode-rust.git
   cd scancode-rust
3. Initialize the License Submodule
   Use the following script to initialize the submodule and configure sparse checkout:
   ./setup.sh
4. Install Dependencies
   Install the required Rust dependencies using cargo:
   cargo build
5. Run Tests
   Run the test suite to ensure everything is working correctly:
   cargo test
6. Set Up Pre-commit Hooks
   This repository uses pre-commit to run checks before each commit:
   # Using pip
   pip install pre-commit
   # Or using brew on macOS
   brew install pre-commit
   # Install the hooks
   pre-commit install
7. Start Developing
   You can now make changes and test them locally. Use cargo run to execute the tool:
   cargo run -- [OPTIONS] <DIR_PATH>
This project uses cargo-dist to automate the release process for both GitHub releases and crates.io.
1. Install cargo-dist:
   cargo install cargo-dist
2. Ensure you have the necessary permissions on the GitHub repository and for crates.io.
3. Authenticate with GitHub CLI (gh) and ensure you're logged in to crates.io:
   gh auth login
   cargo login
4. Update version in Cargo.toml:
   # Edit Cargo.toml to bump the version number
   vim Cargo.toml
5. Create a new git tag matching the version:
   git add Cargo.toml
   git commit -m "Bump version to x.y.z"
   git tag -a vx.y.z -m "Release version x.y.z"
6. Push the tag to trigger the release workflow:
   git push origin main --tags
7. The GitHub Actions workflow will:
   - Build binaries for all supported platforms
   - Create a GitHub release with the binaries
8. Monitor the GitHub Actions workflow to ensure the GitHub release completes successfully.
9. Publish to crates.io manually:
   cargo publish
To update the embedded license data, run the setup.sh script:
./setup.sh
This will reconfigure the sparse checkout and fetch the latest changes. After updating the license data, rebuild the binary:
cargo build --release
This will embed the latest changes from the license-list-data repository into the binary.
This project is licensed under the Apache License 2.0.