
wgrep

Web grep: search all rendered resources used by a URI


This Node command-line utility uses a headless browser (Puppeteer) to render a webpage and download every resource it requests. These resources, including the original HTML, are all saved locally and then searched one by one for a text string.

Because every resource is downloaded, the total download size of the page is also easy to determine.
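A rough sketch of the idea, using Puppeteer's response API (illustrative only, not the actual wgrep source; the function name and directory layout are assumptions):

// Sketch only: how a Puppeteer-based capture step might look.
const fs = require('fs');
const path = require('path');
const puppeteer = require('puppeteer');

async function capture(url, outDir) {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // Save every network response (HTML, CSS, JS, images, ...) under outDir,
  // mirroring the host and path of each resource.
  page.on('response', async (response) => {
    try {
      const body = await response.buffer();
      const { hostname, pathname } = new URL(response.url());
      const name = pathname === '/' ? 'index.html' : pathname;
      const file = path.join(outDir, hostname, name);
      fs.mkdirSync(path.dirname(file), { recursive: true });
      fs.writeFileSync(file, body);
    } catch (err) {
      // Redirect responses have no body; skip anything that cannot be saved.
    }
  });

  await page.goto(url, { waitUntil: 'networkidle0' });            // render the page
  await page.screenshot({ path: path.join(outDir, 'page.png') }); // the screen capture
  await browser.close();
}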

Features

  • Search using regular expressions (see the example below)
  • A screenshot of the rendered page is saved (not configurable)
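
For example, a character class matches the same page as the plain string (the pattern here is illustrative only):

$ wgrep 'st[aeiou]v' https://github.com/stav/wgrep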

Installation

$ git clone https://github.com/stav/wgrep.git
$ cd wgrep
$ npm install
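
Assuming the package declares a bin entry, npm link makes the wgrep command available on your PATH so the examples below work from any directory:

$ npm link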

Usage example

Let's search for the string "stav" on this repository's GitHub page:

$ wgrep stav https://github.com/stav/wgrep

Calling for "stav" in "output" from "https://github.com/stav/wgrep" with user "undefined"
Looking in "output" for 'stav'
Found 1 files
[ 'output/stav/wgrep/index.html' ]

The string was found only in the index.html page.
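
Because the rendered resources stay in the output directory, matches can also be inspected with ordinary tools, for example:

$ grep -c stav output/stav/wgrep/index.html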

Now let's see what the total download size was:

$ du -sh output
1.4M    output
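
To see which resources account for most of that, a per-file breakdown works too:

$ du -ah output | sort -h | tail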

Options

$ wgrep --help

Usage: wgrep [options] <regex> <url>

Options:
  -V, --version                output the version number
  -d, --directory <directory>  The output directory (default: "output")
  -u, --username <username>    The user to authenticate as
  -h, --help                   output usage information
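
For example, to save everything into a different directory and authenticate as a specific user (the directory name and username below are illustrative):

$ wgrep -d snapshot -u alice stav https://github.com/stav/wgrep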

Tests

$ npm test

$ npm run test-e2e

Contributing

Please file any issues you have.

If you fix a bug or add a new feature, it would be great if you forked this repo and submitted a pull request:

  1. Fork it (https://github.com/stav/wgrep/fork)
  2. Create your feature branch (git checkout -b feature/fooBar)
  3. Commit your changes (git commit -am 'Add some fooBar')
  4. Push to the branch (git push origin feature/fooBar)
  5. Create a new Pull Request

License

Apache 2.0