From 6b64df16935e21c48ebb4f7b66b39a075df84862 Mon Sep 17 00:00:00 2001
From: Shivan Kaul Sahib
Date: Mon, 27 Nov 2023 13:44:36 -0800
Subject: [PATCH] Update README.md

---
 README.md | 54 +++++++-----------------------------------------------
 1 file changed, 7 insertions(+), 47 deletions(-)

diff --git a/README.md b/README.md
index 48d5218..867bcbe 100644
--- a/README.md
+++ b/README.md
@@ -26,55 +26,15 @@ The tests are defined in `test/test.js`. Test parameters are defined in `test/co
 Usage
 ---
+Since [PageGraph](https://github.com/brave/brave-browser/wiki/PageGraph) is built as part of Brave, you can simply point the binary path to be your local installation.
+```bash
+npm run crawl -- -b /Applications/Brave\ Browser\ Nightly.app/Contents/MacOS/Brave\ Browser\ Nightly -u https://brave.com -t 5 -o output/ --debug debug
 ```
-$ npm run crawl -- -h
-
-> pagegraph-crawl@1.0.0 crawl
-> node ./built/run.js
-
-usage: run.js [-h] [-v] [-b BINARY] [-r RECURSIVE_DEPTH] -o OUTPUT -u URL
-              [URL ...] [-e EXISTING_PROFILE] [-p PERSIST_PROFILE]
-              [-s {up,down}] [-t SECS] [--debug {none,debug,verbose}] [-i]
-              [-a USER_AGENT] [--proxy-server URL] [-x JSON_ARRAY]
+The `-t` specifies how many seconds to crawl the URL provided in `-u` using the PageGraph binary in `-b`.
-CLI tool for crawling and recording websites with PageGraph
-
-Optional arguments:
-  -h, --help            Show this help message and exit.
-  -v, --version         Show program's version number and exit.
-  -b BINARY, --binary BINARY
-                        Path to the PageGraph enabled build of Brave. If not
-                        provided, try to guess where the binary is, or if its
-                        in $PATH
-  -r RECURSIVE_DEPTH, --recursive-depth RECURSIVE_DEPTH
-                        If provided, choose a link at random on page and do
-                        another crawl to this depth. Default: 1 (no
-                        recursion).
-  -o OUTPUT, --output OUTPUT
-                        Path (directory) to write graphs to.
-  -u URL [URL ...], --url URL [URL ...]
-                        The URLs(s) to record, in desired order (currently
-                        only crawls the first URL)
-  -e EXISTING_PROFILE, --existing-profile EXISTING_PROFILE
-                        The chromium profile to use when crawling. Cannot be
-                        used with "--persist-profile"
-  -p PERSIST_PROFILE, --persist-profile PERSIST_PROFILE
-                        If provided, the user profile will be saved at this
-                        path. Cannot be used with "--existing-profile"
-  -s {up,down}, --shields {up,down}
-                        Whether to measure with shields up or down. Ignored
-                        when using "--existing-profile". Default: down
-  -t SECS, --secs SECS  The dwell time in seconds. Defaults: 30 sec.
-  --debug {none,debug,verbose}
-                        Print debugging information. Default: none.
-  -i, --interactive     Suppress use of Xvfb to allow interaction with
-                        spawned browser instance
-  -a USER_AGENT, --user-agent USER_AGENT
-                        Override the browser's UserAgent string to USER_AGENT
-  --proxy-server URL    Use an HTTP/SOCKS proxy at URL for all navigations
-  -x JSON_ARRAY, --extra-args JSON_ARRAY
-                        Pass JSON_ARRAY as extra CLI argument to the browser
-                        instance launched
+You can see all supported options:
+```bash
+$ npm run crawl -- -h
 ```