GitHub

Activism Tools Against Cybertorture

Q.A. Metrics

Codacy

DeepSource

Summary

The current scraping algorithm is the following:

P ← starting URLs (primary queue)
S ← ∅ (secondary queue)
V ← ∅ (visited pages)
while P ≠ ∅ do
    Pick a page v, remove it from P and download it
    V ← V ∪ {v} (mark as visited)
    N+(v) ← v’s out-links pointing to new pages (“new” means not in P, S or V)
    if |N+(v)| > t then
        R ← first t out-links of N+(v)
        S ← S ∪ {v} (revisit v later for the remaining out-links)
    else
        R ← N+(v)
    end if
    P ← P ∪ R
    if P = ∅ then
        P ← S
        S ← ∅
    end if
end while

It is based on https://chato.cl/papers/castillo_06_controlling_queue_size_web_crawling.pdf
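
The pseudocode above can be sketched in Python roughly as follows. This is a minimal illustration, not the code used in atac: the function name `crawl`, the `fetch_links` callback (which stands in for downloading a page and extracting its out-links) and the choice of FIFO queues are all assumptions made for the example.

```python
from collections import deque
from typing import Callable, Iterable, List, Set


def crawl(start_urls: Iterable[str],
          fetch_links: Callable[[str], List[str]],
          t: int) -> List[str]:
    """Return pages in the order they were first downloaded.

    fetch_links is a hypothetical callback: given a URL it returns
    the out-links found on that page.
    """
    primary = deque(start_urls)   # P: pages to download next
    secondary = deque()           # S: pages with out-links still pending
    visited: Set[str] = set()     # V: pages already downloaded
    order: List[str] = []

    while primary:
        v = primary.popleft()
        links = fetch_links(v)    # "download" v
        if v not in visited:
            visited.add(v)
            order.append(v)

        # Keep only out-links that are not in P, S or V (no duplicates).
        # Rebuilding the sets each time is simple but not efficient.
        known = visited | set(primary) | set(secondary)
        new_links: List[str] = []
        for u in links:
            if u not in known and u not in new_links:
                new_links.append(u)

        if len(new_links) > t:
            accepted = new_links[:t]
            secondary.append(v)   # revisit v later for the remaining links
        else:
            accepted = new_links
        primary.extend(accepted)

        # When P runs dry, promote the secondary queue.
        if not primary:
            primary, secondary = secondary, deque()

    return order


# Toy link graph used only for illustration.
graph = {"a": ["b", "c", "d"], "b": [], "c": [], "d": ["a"]}
result = crawl(["a"], lambda url: graph.get(url, []), t=2)
print(result)
```

With `t=2`, page `a` contributes only its first two out-links on the first pass and is parked in the secondary queue; it is downloaded again once the primary queue empties, at which point its remaining link `d` is picked up.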

Usage

Create a virtual environment

python3 -m venv env

Activate the virtual environment on POSIX (bash)

source env/bin/activate

or

Activate the virtual environment on Windows (PowerShell)

PS C:\> env\Scripts\Activate.ps1

Install dependencies

pip3 install -r requirements.txt

Create contacts

python3 atac.py scrape -u url_to_scrape

Send an email to the contacts created above

python3 atac.py email -p path_to_csv -m path_to_message -s subject

Send IRC messages

python3 atac.py irc
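
To make the shape of the commands above concrete, here is one way a subcommand interface with these flags could be parsed using the standard `argparse` module. It is a hypothetical sketch only: the destination names (`url`, `csv_path`, `message_path`, `subject`), the help texts and the `build_parser` function are assumptions, and atac's real argument handling may differ.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the scrape / email / irc commands (assumed layout)."""
    parser = argparse.ArgumentParser(prog="atac.py")
    sub = parser.add_subparsers(dest="command", required=True)

    # scrape -u url_to_scrape
    scrape = sub.add_parser("scrape", help="collect contacts from a URL")
    scrape.add_argument("-u", dest="url", required=True)

    # email -p path_to_csv -m path_to_message -s subject
    email = sub.add_parser("email", help="send a message to contacts")
    email.add_argument("-p", dest="csv_path", required=True)
    email.add_argument("-m", dest="message_path", required=True)
    email.add_argument("-s", dest="subject", required=True)

    # irc takes no options in the usage shown above
    sub.add_parser("irc", help="send over IRC")
    return parser


args = build_parser().parse_args(["scrape", "-u", "https://example.org"])
print(args.command, args.url)
```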


neuro-rights/atac

Folders and files

Latest commit

History

Repository files navigation

Activism Tools Against Cybertorture

GitHub

Q.A. Metrics

Codacy

DeepSource

Summary

The current scraping algorithm is the following

Usage

About

Resources

Security policy

Uh oh!

Stars

Watchers

Forks

Releases

Sponsor this project

Uh oh!

Packages 0

Uh oh!

Languages

Packages