Profile memory consumption of harvest runner under load #4875

Open
3 tasks
btylerburton opened this issue Sep 3, 2024 · 1 comment
Labels
H2.0/Harvest-Runner Harvest Source Processing for Harvesting 2.0

Comments

@btylerburton
Contributor

User Story

To identify memory leaks in the harvester, datagovteam wants to conduct a formal analysis using industry-standard memory profiling tools.

Acceptance Criteria

[ACs should be clearly demoable/verifiable whenever possible. Try specifying them using BDD.]

  • GIVEN I have integrated a memory profiler into the datagov-harvester codebase
    AND it has been tested locally and shown to perform
    THEN I want to push it to cloud.gov and run a proper load test of a large harvest source, such as DOI.

Background

[Any helpful contextual notes or links to artifacts/evidence, if needed]

Security Considerations (required)

[Any security concerns that might be implicated in the change. "None" is OK, just be explicit here!]

Sketch

  • Select a memory profiler library
  • Confirm it outputs meaningful data locally
  • Push to cloud.gov
  • Test on a small source
  • Confirm output is meaningful
  • Test on DOI source
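
As a starting point for the first two steps above, Python's built-in `tracemalloc` module can be tried before committing to a third-party profiler. The sketch below is only an illustration under stated assumptions: `fake_harvest_batch` and `profile_growth` are hypothetical names that do not exist in datagov-harvester, and the list that deliberately holds on to every batch stands in for a suspected leak.

```python
import tracemalloc


def fake_harvest_batch(n_records):
    """Hypothetical stand-in for one harvest step; builds n_records small dicts."""
    return [{"identifier": f"record-{i}", "title": "example"} for i in range(n_records)]


def profile_growth(step, repeats=3, top=5):
    """Call `step` several times and report what grew between two snapshots.

    Returns (top allocation diffs, current traced bytes, peak traced bytes).
    """
    tracemalloc.start()
    retained = []  # assumption: results are never released, imitating a leak
    baseline = tracemalloc.take_snapshot()
    for _ in range(repeats):
        retained.append(step())
    final = tracemalloc.take_snapshot()
    current, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    # compare_to sorts by the absolute size difference, largest first
    diffs = final.compare_to(baseline, "lineno")
    return diffs[:top], current, peak


if __name__ == "__main__":
    diffs, current, peak = profile_growth(lambda: fake_harvest_batch(1000))
    print(f"current={current} bytes, peak={peak} bytes")
    for diff in diffs:
        print(diff)
```

If the per-line diffs point at recognizable harvester code when run locally, the output is probably meaningful enough to justify the cloud.gov test. Otherwise a different profiler may be a better fit.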
@btylerburton btylerburton added the H2.0/Harvest-Runner Harvest Source Processing for Harvesting 2.0 label Sep 3, 2024
@btylerburton
Contributor Author

Discussed whether breaking the harvest up into discrete processes (extract, transform, validate, sync/load) will make this a non-issue. We can leave this ticket as a might-do in case that doesn't work.
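
A rough sketch of why discrete processes might help: when each stage runs in its own OS process, whatever memory it allocated goes back to the system when that process exits, so a leak in one stage cannot accumulate across the whole harvest. Everything below is hypothetical. The stage names come from the comment above, but `run_stage`, `CHILD_TEMPLATE`, and the dummy workload are illustrative only and do not reflect how the harvester is structured.

```python
import json
import subprocess
import sys

# Hypothetical child program: allocates some data, then reports its own peak.
CHILD_TEMPLATE = """
import json, tracemalloc
tracemalloc.start()
data = [str(i) * 10 for i in range({n})]  # dummy workload, not real harvest logic
_, peak = tracemalloc.get_traced_memory()
print(json.dumps({{"stage": "{stage}", "peak_bytes": peak}}))
"""


def run_stage(stage, n):
    """Run one stage in a fresh interpreter and return its reported peak memory."""
    code = CHILD_TEMPLATE.format(stage=stage, n=n)
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,  # raise if the child process fails
    )
    return json.loads(result.stdout)


if __name__ == "__main__":
    for name in ("extract", "transform", "validate", "load"):
        print(run_stage(name, 10_000))
```

Each child reports only its own peak, which would also make it easier to see which stage, if any, is worth profiling in more depth.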

Projects
Status: H2.0 Backlog
Development

No branches or pull requests

1 participant