Merge pull request #35 from brave/ShivanKaul-patch-1
Update README.md
ShivanKaul authored Nov 27, 2023
2 parents a1d99d0 + 6b64df1 commit 7e4a94f
Showing 1 changed file with 7 additions and 47 deletions.
54 changes: 7 additions & 47 deletions README.md
@@ -26,55 +26,15 @@ The tests are defined in `test/test.js`. Test parameters are defined in `test/co

Usage
---
Since [PageGraph](https://github.com/brave/brave-browser/wiki/PageGraph) is built as part of Brave, you can simply point the binary path to your local installation.

```bash
npm run crawl -- -b /Applications/Brave\ Browser\ Nightly.app/Contents/MacOS/Brave\ Browser\ Nightly -u https://brave.com -t 5 -o output/ --debug debug
```
$ npm run crawl -- -h
> [email protected] crawl
> node ./built/run.js
usage: run.js [-h] [-v] [-b BINARY] [-r RECURSIVE_DEPTH] -o OUTPUT -u URL
[URL ...] [-e EXISTING_PROFILE] [-p PERSIST_PROFILE]
[-s {up,down}] [-t SECS] [--debug {none,debug,verbose}] [-i]
[-a USER_AGENT] [--proxy-server URL] [-x JSON_ARRAY]

The `-t` option specifies how many seconds to crawl the URL given by `-u`, using the PageGraph binary given by `-b`.

CLI tool for crawling and recording websites with PageGraph
Optional arguments:
-h, --help Show this help message and exit.
-v, --version Show program's version number and exit.
-b BINARY, --binary BINARY
Path to the PageGraph enabled build of Brave. If not
provided, try to guess where the binary is, or if its
in $PATH
-r RECURSIVE_DEPTH, --recursive-depth RECURSIVE_DEPTH
If provided, choose a link at random on page and do
another crawl to this depth. Default: 1 (no
recursion).
-o OUTPUT, --output OUTPUT
Path (directory) to write graphs to.
-u URL [URL ...], --url URL [URL ...]
The URLs(s) to record, in desired order (currently
only crawls the first URL)
-e EXISTING_PROFILE, --existing-profile EXISTING_PROFILE
The chromium profile to use when crawling. Cannot be
used with "--persist-profile"
-p PERSIST_PROFILE, --persist-profile PERSIST_PROFILE
If provided, the user profile will be saved at this
path. Cannot be used with "--existing-profile"
-s {up,down}, --shields {up,down}
Whether to measure with shields up or down. Ignored
when using "--existing-profile". Default: down
-t SECS, --secs SECS The dwell time in seconds. Defaults: 30 sec.
--debug {none,debug,verbose}
Print debugging information. Default: none.
-i, --interactive Suppress use of Xvfb to allow interaction with
spawned browser instance
-a USER_AGENT, --user-agent USER_AGENT
Override the browser's UserAgent string to USER_AGENT
--proxy-server URL Use an HTTP/SOCKS proxy at URL for all navigations
-x JSON_ARRAY, --extra-args JSON_ARRAY
Pass JSON_ARRAY as extra CLI argument to the browser
instance launched
You can see all supported options:
```bash
$ npm run crawl -- -h
```
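
As a minimal sketch of how the `-b`, `-u`, `-t`, and `-o` options fit together, the helper below assembles a crawl command and prints it without running it. The function name `build_crawl_cmd` and the `BRAVE_BINARY` variable are hypothetical names introduced for illustration and are not part of this repository; the default binary path is the macOS Nightly path from the usage example, so adjust it for your own installation.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of this repository): builds a crawl command
# from the -b, -u, -t, and -o options and prints it instead of executing it.

# Assumed default; point this at your local Brave installation.
BRAVE_BINARY="${BRAVE_BINARY:-/Applications/Brave Browser Nightly.app/Contents/MacOS/Brave Browser Nightly}"

build_crawl_cmd() {
  local url="$1"             # URL to crawl (-u)
  local secs="${2:-30}"      # dwell time in seconds (-t)
  local outdir="${3:-output/}"  # directory to write graphs to (-o)
  local args=(npm run crawl -- -b "$BRAVE_BINARY" -u "$url" -t "$secs" -o "$outdir")
  # printf %q shell-quotes each argument so the printed line can be pasted back into a shell.
  printf '%q ' "${args[@]}"
  printf '\n'
}

build_crawl_cmd https://brave.com 5 output/
```

Printing the command first makes it easy to check the quoting of a binary path that contains spaces before starting a real crawl.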
