Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CONFIG-325] Increase ldb download speeds #117

Open
wants to merge 27 commits into
base: master
Choose a base branch
from

Conversation

zhou-hongyu
Copy link
Contributor

@zhou-hongyu zhou-hongyu commented Aug 3, 2023

  • Replace aws-sdk-go with aws-sdk-go-v2
  • Use concurrent downloading mechanism to download s3 snapshot with hardcoded concurrency configuration
  • replace golang's gzip with klauspost/gizp

@zhou-hongyu zhou-hongyu marked this pull request as ready for review August 4, 2023 13:35
@zhou-hongyu zhou-hongyu requested a review from a team as a code owner August 4, 2023 13:35
@@ -5,12 +5,17 @@ go 1.20
require (
github.com/AlekSi/pointer v1.0.0
github.com/aws/aws-sdk-go v1.37.8
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we free ourselves of all aws-sdk-go v1?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried getting rid of awserr lib but there is really no equivalence in sdk v2, the most closed one would be smithy-go, but that would still require us making some minor logic changes in the testing file. I can definitely give it another try and you can take a look

Copy link
Contributor Author

@zhou-hongyu zhou-hongyu Aug 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay so I freed reflector pkg from v1, but I then realized that there are much more places using v1, e.g., ledger_monitor which uses ecs client. I don't think it's gonna be a trivia task to completely free this repo from v1, would you think that's acceptable?

pkg/reflector/download.go Outdated Show resolved Hide resolved
}
}
n, err = io.Copy(w, reader)

events.Log("LDB inflated %d -> %d bytes", numBytes, n)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

n is going to be 0 here if the file isn't gzipped


events.Log("LDB inflated %d -> %d bytes", numBytes, n)

return
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the file isn't gzipped, we want n=numBytes

Bucket: aws.String(d.Bucket),
Key: aws.String(d.Key),
})
stats.Observe("snapshot_download_time", time.Now().Sub(start))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know the default string output of time.Now().Sub(start)? If it's ms then should you change the title of the metric to include _ms or whichever unit it is?

return
}

func Unzip(src io.Reader, dest io.Writer) (int64, error) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to add a metric around this start and end time as well in case? I'm not sure if it would only be in the io.Copy part or not.

@zhou-hongyu zhou-hongyu force-pushed the hzhou/CONFIG-325-increase-ldb-download-speeds branch from 2545d47 to 04be134 Compare August 9, 2023 19:21
@zhou-hongyu zhou-hongyu force-pushed the hzhou/CONFIG-325-increase-ldb-download-speeds branch from 39c1e03 to c410558 Compare August 21, 2023 19:03
@asaf-erlich
Copy link

I know I may have asked this in our last 1-on-1, but do you have any benchmark / at least a snapshot of time of before / after this change? Even if you didn't profile the whole thing? If not should a separate PR be sent out first with just the metrics and then the change so the rough difference can be measured and observed?

@zhou-hongyu
Copy link
Contributor Author

yes totally, the plan will be deploying to a ctlstore-beta node and validate the downloading speed via the dashboard I will circle back on this

@asaf-erlich
Copy link

yes totally, the plan will be deploying to a ctlstore-beta node and validate the downloading speed via the dashboard I will circle back on this

Sounds good

@zhou-hongyu zhou-hongyu force-pushed the hzhou/CONFIG-325-increase-ldb-download-speeds branch from c410558 to 467fb2e Compare August 29, 2023 22:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants