Add a database bootstrap guide #9390

Open
wants to merge 9 commits into
base: main

nirbosl
Contributor

@nirbosl nirbosl commented Sep 17, 2024

Description:

  • Added a new script, bootstrap.sh, that imports the exported mirrornode data from the mirrornode-db-export bucket
  • Added a new document, bootstrap.md, a guide to setting up a fresh mirrornode DB and importing the exported data available in that bucket
  • Updated docs/database/README.md with a paragraph linking to bootstrap.md

Related issue(s):

Checklist

  • Documented
    • In-script comments
    • bootstrap.md as a how-to guide
  • Tested by importing a small dataset taken from an actual mainnet DB export

…cript, and linking the bootstrap guide to the main DB README.md doc

Signed-off-by: Nir Ben-Or <[email protected]>

codecov bot commented Sep 17, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 92.56%. Comparing base (b4522b8) to head (e526a4f).
Report is 6 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #9390      +/-   ##
============================================
+ Coverage     92.55%   92.56%   +0.01%     
- Complexity     7039     7047       +8     
============================================
  Files           912      914       +2     
  Lines         29760    29810      +50     
  Branches       3760     3767       +7     
============================================
+ Hits          27544    27595      +51     
  Misses         1445     1445              
+ Partials        771      770       -1     


@nirbosl nirbosl marked this pull request as ready for review September 18, 2024 16:37
Contributor Author

nirbosl commented Sep 18, 2024

I tested the import script on a small dataset and it completed successfully.

@steven-sheehy steven-sheehy changed the title from "Initial commit, adding the bootstrap from export guide, bootstrap.sh script, and linking the bootstrap guide to the main DB README.md doc" to "Add a database bootstrap guide" Sep 18, 2024
@steven-sheehy steven-sheehy added the labels documentation (Type: Improvements or additions to documentation) and database (Area: Database) Sep 18, 2024
@steven-sheehy steven-sheehy requested a review from a team September 18, 2024 17:06
Co-authored-by: Steven Sheehy <[email protected]>
Signed-off-by: Nir Ben-Or <[email protected]>
@steven-sheehy steven-sheehy added this to the 0.115.0 milestone Sep 20, 2024
…voke of the extra grant for GCP cloud sql instances

Signed-off-by: Nir Ben-Or <[email protected]>
Comment on lines +72 to +73
export PGHOST="DB_IP_ADDRESS"
export PGPORT="DB_PORT"
Member

Better to show common examples for these instead of replacement text.

Suggested change
- export PGHOST="DB_IP_ADDRESS"
- export PGPORT="DB_PORT"
+ export PGHOST="127.0.0.1"
+ export PGPORT="5432"


### 3. Run the Initialization Script

Download the initialization script [`init.sh`](../../hedera-mirror-importer/src/main/resources/db/scripts/init.sh) from the repository:
Member

I meant relative to root, not current directory.

Suggested change
- Download the initialization script [`init.sh`](../../hedera-mirror-importer/src/main/resources/db/scripts/init.sh) from the repository:
+ Download the initialization script [`init.sh`](/hedera-mirror-importer/src/main/resources/db/scripts/init.sh) from the repository:

- [3. Run the Import Script](#3-run-the-import-script)
- [Handling Failed Imports](#handling-failed-imports)
- [Steps to Handle Failed Imports:](#steps-to-handle-failed-imports)
- [Additional Notes](#additional-notes)
Member

This ToC doesn't seem to be updated to reflect the headers. Please scan through all sections and update.

Comment on lines +98 to +105
export GRAPHQL_PASSWORD="SET_PASSWORD"
export GRPC_PASSWORD="SET_PASSWORD"
export IMPORTER_PASSWORD="SET_PASSWORD"
export OWNER_PASSWORD="SET_PASSWORD"
export REST_PASSWORD="SET_PASSWORD"
export REST_JAVA_PASSWORD="SET_PASSWORD"
export ROSETTA_PASSWORD="SET_PASSWORD"
export WEB3_PASSWORD="SET_PASSWORD"
Member

A better experience would be to create a `bootstrap.env` file with all the exports and ask the user to adjust the defaults. Then, inside `bootstrap.sh`, run `source ./bootstrap.env`.
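
The suggestion above could look roughly like the sketch below. The file name `bootstrap.env` comes from the comment; the default values, the shortened variable list, and the helper name `load_bootstrap_env` are illustrative assumptions, not the PR's final implementation.

```bash
# Sketch only: placeholder defaults, not values taken from the PR.

# 1) bootstrap.env: a single file of exports that the user edits once.
cat > bootstrap.env <<'EOF'
export PGHOST="127.0.0.1"
export PGPORT="5432"
export OWNER_PASSWORD="SET_PASSWORD"
export IMPORTER_PASSWORD="SET_PASSWORD"
EOF

# 2) Near the top of bootstrap.sh: load the file and reject unedited defaults.
#    load_bootstrap_env is a hypothetical helper name.
load_bootstrap_env() {
  env_file="${1:-./bootstrap.env}"
  if [ ! -f "$env_file" ]; then
    echo "missing $env_file" >&2
    return 1
  fi
  # `.` is the portable spelling of `source`.
  . "$env_file"
  # An exported *_PASSWORD still equal to the placeholder means the file was not edited.
  if env | grep -q '_PASSWORD=SET_PASSWORD$'; then
    echo "edit the SET_PASSWORD placeholders in $env_file" >&2
    return 1
  fi
}
```

Failing early on untouched placeholders keeps a half-configured import from starting, which seems in the spirit of asking users to "adjust all defaults".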

Download the initialization script [`init.sh`](../../hedera-mirror-importer/src/main/resources/db/scripts/init.sh) from the repository:

```bash
curl -O https://raw.githubusercontent.com/hashgraph/hedera-mirror-node/main/hedera-mirror-importer/src/main/resources/db/scripts/init.sh
```
Member

Technically they should download the init.sh version that's specified in MIRRORNODE_VERSION
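
One way to pin the download to that version is sketched below. It assumes `MIRRORNODE_VERSION` is a local file holding a bare version string (for example `0.115.0`) and that the repository tags releases as `v<version>`; both are assumptions to verify against the export and the repository. `init_script_url` is a hypothetical helper name.

```bash
# Sketch only: assumes MIRRORNODE_VERSION holds a bare version (e.g. 0.115.0)
# and that release tags follow the v<version> pattern.
init_script_url() {
  version_file="${1:-MIRRORNODE_VERSION}"
  # Strip whitespace and newlines; fail if the file cannot be read.
  version="$(tr -d '[:space:]' < "$version_file")" || return 1
  if [ -z "$version" ]; then
    echo "empty version in $version_file" >&2
    return 1
  fi
  echo "https://raw.githubusercontent.com/hashgraph/hedera-mirror-node/v${version}/hedera-mirror-importer/src/main/resources/db/scripts/init.sh"
}

# Usage (needs network access):
#   curl -fsSLO "$(init_script_url)"
```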

Import the database schema:

```bash
psql -f schema.sql
```
Member

It's confusing that the docs reference both the schema.sql and MIRRORNODE_VERSION files before asking users to download them. Users following the steps sequentially will get stuck.
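
Besides reordering the steps, a small guard could make the failure obvious when a file has not been downloaded yet. This is only a sketch; `require_files` is a hypothetical helper, not something in the PR.

```bash
# Sketch only: fail fast if files that later steps rely on are not present yet.
require_files() {
  missing=0
  for f in "$@"; do
    if [ ! -f "$f" ]; then
      echo "missing required file: $f" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage, before the schema import step:
#   require_files schema.sql MIRRORNODE_VERSION && psql -f schema.sql
```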

**Important Notes:**

- The bucket is **read-only** to the public.
- It is configured as **Requester Pays**, meaning you need a GCP account with a valid billing account attached to download the data.
Member

It might be useful to link to the Hedera docs that show how to create such an account.
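
For context, accessing a Requester Pays bucket with `gsutil` generally means naming the project to bill through the top-level `-u` option. The sketch below lists the bucket mentioned in the PR description; the helper name and the project ID argument are hypothetical, and the object layout inside the bucket is not assumed.

```bash
# Sketch only: list_export_bucket is a hypothetical helper; pass your own
# GCP project ID (with billing enabled) as the first argument.
list_export_bucket() {
  billing_project="$1"
  if [ -z "$billing_project" ]; then
    echo "usage: list_export_bucket <billing-project-id>" >&2
    return 1
  fi
  # -u sets the project that is charged for the Requester Pays request.
  gsutil -u "$billing_project" ls "gs://mirrornode-db-export/"
}

# Usage:
#   list_export_bucket my-billing-project
```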

Comment on lines +240 to +250
**Detach from the `screen` Session:**

Press `Ctrl+A` then `D`.

- This allows the import process to continue running in the background.

**Reattach to the `screen` Session Later:**

```bash
screen -r db_import
```
Member

Screen/tmux might be a bit more complex than just adding nohup or disown to the bootstrap.sh command.
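
A minimal sketch of the `nohup` variant follows. The helper name `run_detached` and the log and PID file names are illustrative, and any arguments `bootstrap.sh` takes are left out because they are not shown in this conversation.

```bash
# Sketch only: run a command detached from the terminal, capturing its output.
run_detached() {
  log_file="$1"
  shift
  # nohup keeps the command alive after the terminal closes; & backgrounds it.
  nohup "$@" > "$log_file" 2>&1 &
  # Record the PID so the job can be checked or stopped later.
  echo "$!" > "${log_file%.log}.pid"
}

# Usage:
#   run_detached bootstrap.log ./bootstrap.sh
#   tail -f bootstrap.log
```

Compared with `screen`, there is no session to detach from or reattach to; the log file becomes the only window into the run.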

After the script completes, check the exit status:

```bash
echo "EXIT STATUS: $?"
```
Member

This only works if they are running it in the foreground. Better to have them check the final few lines of the log.
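
Checking the log could be as simple as the sketch below. It assumes the run's output was redirected to a file such as `bootstrap.log`; that file name and the helper `show_import_result` are assumptions, and the exact final message the script prints is not specified here.

```bash
# Sketch only: print the last lines of the import log, if there is any output yet.
show_import_result() {
  log_file="${1:-bootstrap.log}"
  if [ ! -s "$log_file" ]; then
    echo "no output in $log_file yet" >&2
    return 1
  fi
  tail -n 20 "$log_file"
}

# Usage:
#   show_import_result bootstrap.log
```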

Labels
  • database (Area: Database)
  • documentation (Type: Improvements or additions to documentation)
Development

Successfully merging this pull request may close these issues.

  • Provide periodic snapshots of entity tables
  • Add ability to start a mirror node from a network state snapshot
3 participants