This project scans a list of hosts and attempts to locate potential archive files (e.g., .zip, .tar, .rar) on those hosts. It generates likely archive URLs based on known paths, domain name parts, and date-based patterns, then checks whether those URLs lead to real archives.
- Go 1.20 (or higher)
- A file containing a list of hosts (one per line)
- Clone this repository:
  git clone https://github.com/dsecuredcom/archive-finder
  cd archive-finder
- Initialize or update the Go modules (if needed):
  go mod tidy
- Build the binary:
  go build -o archive-finder
./archive-finder -hosts /path/to/hosts_file.txt [options]
-hosts string
Path to the hosts list file. (Required)
-timeout duration
Timeout for HTTP requests (default 60s).
-concurrency int
Maximum number of concurrent requests (default 2500).
-chunksize int
Chunksize for internal processing (default 500).
-verbose
Enable verbose output (default false).
-fasthttp
Use fasthttp instead of net/http for potentially faster requests (default false).
-intensity string
Choose scanning intensity: "small", "medium", or "big" (default "medium"). Controls the built-in wordlists, extensions, and backup folder names.
-words string
Comma-separated list of words (overrides intensity-based words).
-extensions string
Comma-separated list of extensions (overrides intensity-based extensions).
-backup-folders string
Comma-separated list of backup folders (overrides intensity-based folders).
-disable-dynamic-entries
Disable generation of entries based on host (default false).
-only-dynamic-entries
Use only dynamically generated entries (default false).
-with-host-parts
Generate based on host parts (default false).
-with-first-chars
Generate based on first 3-4 chars of first subdomain part (default false).
-with-year
Generate based on current year (default false).
-with-date
Generate based on current date (default false).
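To give a feel for how -concurrency and -chunksize might interact, the sketch below processes items chunk by chunk while capping the number of checks in flight with a buffered channel. How the tool actually uses these values internally is not documented here, so `chunk`, `processAll`, and the `check` callback are hypothetical names chosen for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// chunk splits items into slices of at most size elements.
func chunk(items []string, size int) [][]string {
	if size < 1 {
		size = 1
	}
	var out [][]string
	for len(items) > 0 {
		n := size
		if n > len(items) {
			n = len(items)
		}
		out = append(out, items[:n])
		items = items[n:]
	}
	return out
}

// processAll runs check on every item, one chunk at a time,
// with at most limit calls running concurrently.
func processAll(items []string, chunkSize, limit int, check func(string) bool) []string {
	if limit < 1 {
		limit = 1
	}
	var (
		mu   sync.Mutex
		hits []string
	)
	sem := make(chan struct{}, limit) // counting semaphore
	for _, c := range chunk(items, chunkSize) {
		var wg sync.WaitGroup
		for _, it := range c {
			wg.Add(1)
			sem <- struct{}{} // blocks while limit checks are running
			go func(it string) {
				defer wg.Done()
				defer func() { <-sem }()
				if check(it) {
					mu.Lock()
					hits = append(hits, it)
					mu.Unlock()
				}
			}(it)
		}
		wg.Wait() // finish this chunk before starting the next
	}
	return hits
}

func main() {
	urls := []string{"a.zip", "b.txt", "c.zip"}
	// Stand-in check: a real scanner would issue an HTTP request here.
	isZip := func(s string) bool { return len(s) > 4 && s[len(s)-4:] == ".zip" }
	fmt.Println(len(processAll(urls, 2, 2, isZip)), "hits")
}
```

The order of the returned hits is not deterministic, since goroutines within a chunk finish in any order.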
- When using dynamic entries (default behavior or with -only-dynamic-entries), you must activate at least one module using the -with-* flags.
- You cannot use both -disable-dynamic-entries and -only-dynamic-entries together.
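The two rules above can be expressed as a small validation function. This is a sketch that follows the notes as written; the `Options` struct and `validate` function are hypothetical, and the tool's own checks may differ.

```go
package main

import (
	"errors"
	"fmt"
)

// Options mirrors a subset of the command-line flags.
// The struct itself is hypothetical and exists only for this example.
type Options struct {
	DisableDynamic bool // -disable-dynamic-entries
	OnlyDynamic    bool // -only-dynamic-entries
	WithHostParts  bool // -with-host-parts
	WithFirstChars bool // -with-first-chars
	WithYear       bool // -with-year
	WithDate       bool // -with-date
}

// validate applies the two rules from the notes above.
func validate(o Options) error {
	if o.DisableDynamic && o.OnlyDynamic {
		return errors.New("-disable-dynamic-entries and -only-dynamic-entries cannot be combined")
	}
	anyModule := o.WithHostParts || o.WithFirstChars || o.WithYear || o.WithDate
	if !o.DisableDynamic && !anyModule {
		return errors.New("dynamic entries are enabled but no -with-* module is active")
	}
	return nil
}

func main() {
	fmt.Println(validate(Options{OnlyDynamic: true, WithHostParts: true}))
	fmt.Println(validate(Options{DisableDynamic: true, OnlyDynamic: true}))
}
```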
# Basic scan with default settings
./archive-finder -hosts myhosts.txt
# Verbose output with limited concurrency
./archive-finder -hosts myhosts.txt -verbose -concurrency 1000
# Use only domain-based dynamic entries
./archive-finder -hosts myhosts.txt -only-dynamic-entries -with-host-parts
# Comprehensive scan with all dynamic modules
./archive-finder -hosts myhosts.txt -with-host-parts -with-first-chars -with-year -with-date
# High intensity scan with fasthttp
./archive-finder -hosts myhosts.txt -intensity big -fasthttp -with-host-parts -with-year
- Reads host entries from the provided file
- Generates potential archive URLs based on:
  - Static wordlists (controlled by -intensity)
  - Dynamic patterns from domain parts (when -with-host-parts is enabled)
  - First characters of subdomain (when -with-first-chars is enabled)
  - Year-based patterns (when -with-year is enabled)
  - Date-based patterns (when -with-date is enabled)
- Checks each URL to determine if it contains an actual archive
- Reports findings in real-time
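The generation and checking steps above can be sketched roughly as follows. Everything here is an assumption made for illustration: the rules in `candidateNames` (host labels except the last, the first 3 and 4 characters of the first label, and the year), the `https://` scheme, and the use of file signatures ("magic bytes") in `looksLikeArchive`. The tool's real patterns and detection logic may differ, and date-based patterns are left out.

```go
package main

import (
	"bytes"
	"fmt"
	"strings"
)

// candidateNames derives base names from a host (hypothetical rules):
// every label except the last, the first 3 and 4 characters of the
// first label, and the given year.
func candidateNames(host string, year int) []string {
	labels := strings.Split(host, ".")
	seen := map[string]bool{}
	var names []string
	add := func(s string) {
		if s != "" && !seen[s] {
			seen[s] = true
			names = append(names, s)
		}
	}
	for i, l := range labels {
		if i < len(labels)-1 || len(labels) == 1 {
			add(l)
		}
	}
	first := labels[0]
	for _, n := range []int{3, 4} {
		if len(first) > n {
			add(first[:n])
		}
	}
	add(fmt.Sprint(year))
	return names
}

// candidateURLs combines each name with each extension.
func candidateURLs(host string, names, exts []string) []string {
	var urls []string
	for _, n := range names {
		for _, e := range exts {
			urls = append(urls, "https://"+host+"/"+n+"."+e)
		}
	}
	return urls
}

// looksLikeArchive inspects the first bytes of a response body
// for well-known archive signatures.
func looksLikeArchive(head []byte) bool {
	magics := [][]byte{
		[]byte("PK\x03\x04"),         // ZIP
		{0x1f, 0x8b},                 // gzip
		[]byte("Rar!\x1a\x07"),       // RAR
		[]byte("7z\xbc\xaf\x27\x1c"), // 7z
	}
	for _, m := range magics {
		if bytes.HasPrefix(head, m) {
			return true
		}
	}
	// tar archives carry "ustar" at byte offset 257.
	return len(head) >= 262 && string(head[257:262]) == "ustar"
}

func main() {
	host := "shop.example.com"
	names := candidateNames(host, 2024)
	for _, u := range candidateURLs(host, names, []string{"zip", "tar.gz"}) {
		fmt.Println(u)
	}
	fmt.Println(looksLikeArchive([]byte("PK\x03\x04...")))
}
```

Checking signatures rather than trusting the status code or Content-Type helps avoid false positives from servers that answer every path with an HTML page.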
- Fork this repository
- Create a new branch:
git checkout -b feature/my-feature
- Make your changes and commit them
- Push to your fork and open a pull request
This project is licensed under the MIT License. Feel free to use or modify for your own purposes.