PLG Cloud.gov #3192

Open: elipe17 wants to merge 80 commits into develop

@elipe17 elipe17 commented Sep 18, 2024

Summary of Changes

  • Add manifests to deploy PLG as binaries via the binary buildpack
  • Add local vs deployed configurations for PLG apps
  • Add routing based on the dev environment for the interim
  • Add deploy script for PLG and PG exporters
  • Fix local Nginx proxying to Grafana

Pull request closes #3046

Considerations:

  • Should we move all PLG apps into the production environment?
  • We still need to calculate the actual memory requirements for PLG. My current estimates: Loki 2-4GB, Prometheus 2-4GB (calculator), Grafana 2GB, 3 PG exporters 324MB, 6 backend Promtails 664MB
  • We can/should hook Grafana up to our RDS to have a more persistent set of datasources, dashboards, etc.
  • Need to experiment further with provisioned dashboards on cloud.gov. I couldn't get them to work, so I uploaded the dashboards manually.
  • Need to re-evaluate getting Promtail to run with the frontend
  • Need to update/add our Loki pipelines to parse log messages better so we can filter our logs dashboard more effectively
  • Consider moving deployment of all apps to Docker containers to make life with PLG and frontend/backend deployment a bit easier.
  • Need to verify that logs in Loki remain queryable days, weeks, and months after ingestion
  • While TDP lives in cloud.gov, it might be worth seeing whether we can set up a log drain through cloud.gov to Loki. We would then get all of the cloud.gov logs as well.
  • Consider adding metric exporting for Elastic
  • Consider adding metric exporting for machine stats (cpu, memory, etc)
  • Consider using Grafana Alerting vs AlertManager
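
To make the memory estimate above concrete, here is a minimal sketch that sums the per-component guesses into a total range. It assumes 1GB = 1024MB and that the exporter and Promtail figures are totals for each group rather than per-instance values; neither assumption is confirmed in this PR, and the variable names are only illustrative.

```shell
#!/bin/sh
# Rough PLG memory budget in MB, using the estimates from the list above.
# Assumptions (not confirmed in the PR): 1GB = 1024MB, and the PG exporter
# and Promtail figures are totals for the whole group.
GB=1024

loki_min=$((2 * GB));  loki_max=$((4 * GB))
prom_min=$((2 * GB));  prom_max=$((4 * GB))
grafana=$((2 * GB))
pg_exporters=324   # 3 PG exporters combined
promtails=664      # 6 backend Promtails combined

# Components with a single estimate rather than a range.
fixed=$((grafana + pg_exporters + promtails))

total_min=$((loki_min + prom_min + fixed))
total_max=$((loki_max + prom_max + fixed))

echo "Estimated PLG memory: ${total_min}-${total_max} MB"
```

Under those assumptions the stack needs roughly 7GB to 11GB, which is worth comparing against the space's memory quota before deciding where PLG should live.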

How to Test

PLG is currently deployed in the dev environment. If you would like to browse Grafana, I have opened a public route to it for the interim, until we decide where PLG is going to live. Reach out to me for the username and password. Once you're logged in, feel free to browse the dashboards. Note that the Logs dashboard only has logging information going back to 09/25/2024 at ~9:40am ET, since that is when Promtail had its first successful exports.

Deliverables

More details on how the deliverables herein are assessed are included here.

Deliverable 1: Accepted Features

Checklist of ACs:

  • Prometheus, Loki, and Grafana are deployed in cloud.gov
  • Grafana is connected to Prometheus and Loki and can query the logs from Loki
  • Backend apps can push logs to Loki
  • Prometheus can pull metrics from backend apps and postgres exporters
  • Loki has persistent log storage via S3
  • Testing Checklist has been run and all tests pass
  • README is updated, if necessary
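
A quick smoke check for the first acceptance criterion could look like the sketch below. It probes each service's health endpoint (`/-/healthy` for Prometheus, `/ready` for Loki, `/api/health` for Grafana). The base URLs are placeholder defaults on the stock ports, not the routes used in this PR, and the function names `health_urls` and `check_plg` are hypothetical.

```shell
#!/bin/sh
# Placeholder base URLs (stock default ports); override them with the real
# routes for the environment being checked.
PROMETHEUS_URL="${PROMETHEUS_URL:-http://localhost:9090}"
LOKI_URL="${LOKI_URL:-http://localhost:3100}"
GRAFANA_URL="${GRAFANA_URL:-http://localhost:3000}"

# Print one health-check URL per line.
health_urls() {
    printf '%s\n' \
        "$PROMETHEUS_URL/-/healthy" \
        "$LOKI_URL/ready" \
        "$GRAFANA_URL/api/health"
}

# Probe every URL; return non-zero if any service fails to respond with 2xx.
check_plg() {
    failed=0
    for url in $(health_urls); do
        if curl --silent --fail --max-time 5 --output /dev/null "$url"; then
            echo "OK   $url"
        else
            echo "FAIL $url"
            failed=1
        fi
    done
    return "$failed"
}
```

Calling `check_plg` after a deploy gives a pass/fail signal for the three services. It does not verify that Grafana can actually query Loki or that Prometheus is scraping its targets, so those criteria still need a manual check.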

Deliverable 2: Tested Code

  • Are all areas of code introduced in this PR meaningfully tested?
    • If this PR introduces backend code changes, are they meaningfully tested?
    • If this PR introduces frontend code changes, are they meaningfully tested?
  • Are code coverage minimums met?
    • Frontend coverage: [insert coverage %] (see CodeCov Report comment in PR)
    • Backend coverage: [insert coverage %] (see CodeCov Report comment in PR)

Deliverable 3: Properly Styled Code

  • Are backend code style checks passing on CircleCI?
  • Are frontend code style checks passing on CircleCI?
  • Are code maintainability principles being followed?

Deliverable 4: Accessible

  • Does this PR complete the epic?
  • Are links included to any other gov-approved PRs associated with epic?
  • Does PR include documentation for Raft's a11y review?
  • Did automated and manual testing with iamjolly and ttran-hub using Accessibility Insights reveal any errors introduced in this PR?

Deliverable 5: Deployed

  • Was the code successfully deployed via automated CircleCI process to development on Cloud.gov?

Deliverable 6: Documented

  • Does this PR provide background for why coding decisions were made?
  • If this PR introduces backend code, is that code easy to understand and sufficiently documented, both inline and overall?
  • If this PR introduces frontend code, is that code easy to understand and sufficiently documented, both inline and overall?
  • If this PR introduces dependencies, are their licenses documented?
  • Can reviewer explain and take ownership of these elements presented in this code review?

Deliverable 7: Secure

  • Does the OWASP Scan pass on CircleCI?
  • Do manual code review and manual testing detect any new security issues?
  • If new issues are detected, is an investigation and/or remediation plan documented?

Deliverable 8: User Research

Research product(s) clearly articulate(s):

  • the purpose of the research
  • methods used to conduct the research
  • who participated in the research
  • what was tested and how
  • impact of research on TDP
  • (if applicable) final design mockups produced for TDP development

@elipe17 elipe17 self-assigned this Sep 18, 2024

codecov bot commented Sep 18, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.46%. Comparing base (a981311) to head (e1696da).
Report is 6 commits behind head on develop.

Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #3192      +/-   ##
===========================================
- Coverage    92.66%   89.46%   -3.21%     
===========================================
  Files           47      296     +249     
  Lines         1009     8655    +7646     
  Branches       169      819     +650     
===========================================
+ Hits           935     7743    +6808     
- Misses          42      791     +749     
- Partials        32      121      +89     
Flag Coverage Δ
dev-backend 89.04% <ø> (?)
dev-frontend 92.66% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
tdrs-backend/tdpservice/settings/common.py 99.33% <ø> (ø)

... and 248 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c5f87eb...e1696da. Read the comment docs.

@elipe17 elipe17 added the Deploy with CircleCI-raft Deploy to https://tdp-frontend-raft.app.cloud.gov through CircleCI label Sep 19, 2024
@elipe17 elipe17 added Deploy with CircleCI-raft Deploy to https://tdp-frontend-raft.app.cloud.gov through CircleCI and removed Deploy with CircleCI-raft Deploy to https://tdp-frontend-raft.app.cloud.gov through CircleCI labels Sep 19, 2024
@elipe17 elipe17 marked this pull request as ready for review September 19, 2024 14:38
- update volume mount
- update promtail to scrape new location
- update backend log file location
- Remove docker container scrape config
- Testing running bogus command for nginx
@elipe17 elipe17 added the Deploy with CircleCI-raft Deploy to https://tdp-frontend-raft.app.cloud.gov through CircleCI label Sep 25, 2024
@elipe17 elipe17 added Deploy with CircleCI-raft Deploy to https://tdp-frontend-raft.app.cloud.gov through CircleCI and removed Deploy with CircleCI-raft Deploy to https://tdp-frontend-raft.app.cloud.gov through CircleCI labels Sep 25, 2024
popd
}

update_plg_networking() {

Do we want PLG to live in prod? If so, we need to update the networking.
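
If PLG does move to another space, the extra network policies might look something like the sketch below. It only prints the commands (a dry run) so they can be reviewed first. It assumes cf CLI v7 or later, where the destination app is a positional argument; the app names, space name, port, and the helper `plg_network_policy_cmds` are hypothetical placeholders, not values from this PR.

```shell
#!/bin/sh
# Dry-run sketch: print the cf commands instead of running them.
# All names and the port below are hypothetical placeholders.
PLG_SPACE="${PLG_SPACE:-plg-space}"
LOKI_APP="${LOKI_APP:-loki}"
LOKI_PORT="${LOKI_PORT:-8080}"

# Print one policy command per backend app so it can push logs to Loki
# in the space where PLG lives.
plg_network_policy_cmds() {
    for backend in "$@"; do
        echo "cf add-network-policy $backend $LOKI_APP -s $PLG_SPACE --protocol tcp --port $LOKI_PORT"
    done
}

plg_network_policy_cmds backend-a backend-b
```

Prometheus scraping the backend apps would need policies in the opposite direction, with Prometheus as the source app and each backend as the destination.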

- add retention for prometheus
- change promtail log level
@elipe17 elipe17 added Deploy with CircleCI-raft Deploy to https://tdp-frontend-raft.app.cloud.gov through CircleCI and removed Deploy with CircleCI-raft Deploy to https://tdp-frontend-raft.app.cloud.gov through CircleCI labels Sep 26, 2024
@elipe17 elipe17 added Deploy with CircleCI-raft Deploy to https://tdp-frontend-raft.app.cloud.gov through CircleCI and removed Deploy with CircleCI-raft Deploy to https://tdp-frontend-raft.app.cloud.gov through CircleCI labels Sep 26, 2024
@elipe17 elipe17 added the raft review This issue is ready for raft review label Sep 30, 2024

raftmsohani commented Oct 2, 2024

So that we don't all have to deploy to test this, could you post screenshots from running the deploy command step by step (only for the changes to the deploy command)?

Labels
Deploy with CircleCI-raft Deploy to https://tdp-frontend-raft.app.cloud.gov through CircleCI raft review This issue is ready for raft review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

PLG deployed in Cloud.gov
2 participants