PLG Cloud.gov #3192

Open
wants to merge 80 commits into
base: develop
fd4e974
- Add log file rollover
elipe17 Sep 16, 2024
feec150
- Add initial configs for cloud deployments
elipe17 Sep 16, 2024
8805b94
- initial config for grafana deploy
elipe17 Sep 17, 2024
cd4b739
- Remove empty file
elipe17 Sep 17, 2024
c935ce0
- move data sources to template file
elipe17 Sep 18, 2024
b7bd9e4
- general deploy routine for pg and grafana
elipe17 Sep 18, 2024
092df96
- added deploy routine for prometheus
elipe17 Sep 18, 2024
1440b7f
- Added deploy routine for loki
elipe17 Sep 18, 2024
c6edd4a
- Initial update for promtail sidecars
elipe17 Sep 19, 2024
19b48a1
- allow deploy no matter test state
elipe17 Sep 19, 2024
0d2518a
- Update deploy scripts to prepare promtail config
elipe17 Sep 19, 2024
e495aac
- add quotes
elipe17 Sep 19, 2024
8388f40
- Update frontend to write error log to file
elipe17 Sep 19, 2024
57de9d4
-- for faster turnaround
elipe17 Sep 19, 2024
3da3526
- add ignore for file generation
elipe17 Sep 19, 2024
4797ced
- Move limits to per process
elipe17 Sep 19, 2024
6b1eedd
- update disk quota to match backend
elipe17 Sep 19, 2024
30ca7fc
- Uping promtail memory
elipe17 Sep 19, 2024
9926c4d
- Explicitely execute nginx
elipe17 Sep 19, 2024
79d578d
- Testing less memory
elipe17 Sep 19, 2024
5bb32d2
- Tell nginx to reload
elipe17 Sep 19, 2024
2316312
- try removing nginx command
elipe17 Sep 19, 2024
bd60f78
- remove stderr log
elipe17 Sep 19, 2024
7767c17
- try removing extra buildpaack
elipe17 Sep 19, 2024
60827e0
- re-add errorlog pipe
elipe17 Sep 19, 2024
96abb6b
- remove blank line
elipe17 Sep 19, 2024
c44692e
- remove error log for test
elipe17 Sep 19, 2024
e1e272b
- remove resolver directive as test
elipe17 Sep 19, 2024
d8b27d3
- test hard coded vals
elipe17 Sep 19, 2024
17b22f1
- revert conf changes
elipe17 Sep 19, 2024
8c121d9
Merge branch 'develop' into 3046-plg-cloud
elipe17 Sep 19, 2024
dfa6de8
- Testing with latest nginx buildpack
elipe17 Sep 20, 2024
0078a86
- revert to original manifest
elipe17 Sep 20, 2024
4f9d4d5
- revert buildpack and nginx.conf
elipe17 Sep 20, 2024
9f82e14
- test promtail as a sidecar
elipe17 Sep 20, 2024
4527ed6
- Update loki to store logs in s3
elipe17 Sep 20, 2024
764e041
- add bucket name
elipe17 Sep 20, 2024
4a35d01
- add path for local loki directories
elipe17 Sep 20, 2024
d6acc58
- Update path prefix
elipe17 Sep 20, 2024
7d70d19
- Add networking commands for PLG
elipe17 Sep 20, 2024
47fafaf
- alleviate secrets check
elipe17 Sep 20, 2024
7ceb3d9
- UPdated deploy script
elipe17 Sep 20, 2024
cc73759
- update comment in route
elipe17 Sep 23, 2024
9e00a31
- add internal apps to allowed hosts
elipe17 Sep 23, 2024
5e43a1f
- Updated local proxy config to correctly proxy grafana
elipe17 Sep 23, 2024
d08a4c9
- Explicitely mark netpols to route to dev env
elipe17 Sep 23, 2024
bc960b5
- intermediate commit
elipe17 Sep 23, 2024
0a4bbe8
- Updates to deploy script
elipe17 Sep 23, 2024
f3cc7c7
- Update prometheus scrape configs to have all envs
elipe17 Sep 23, 2024
5e64f68
Merge branch 'develop' of https://github.com/raft-tech/TANF-app into …
elipe17 Sep 23, 2024
fb7f807
- Remove promtail sidecar from frontend
elipe17 Sep 23, 2024
8cc8430
- remove manifest tremplate usage
elipe17 Sep 23, 2024
98e1cb2
- remove env expansion from loki
elipe17 Sep 23, 2024
6b3ab12
- Give loki a local config for comparison
elipe17 Sep 23, 2024
e24b3cd
- add db size visualizaiton
elipe17 Sep 24, 2024
654ccc4
- Update loki local to use local stack storage
elipe17 Sep 24, 2024
a43da06
- log level info
elipe17 Sep 24, 2024
25487cc
- get promtail logs to file
elipe17 Sep 24, 2024
2428d41
- Move promtail process into gunicorn script
elipe17 Sep 25, 2024
e949849
- Update job label to be templated
elipe17 Sep 25, 2024
f38cf87
- Add space switching to allow for correct networking
elipe17 Sep 25, 2024
88c6d4e
- Update dashboards
elipe17 Sep 25, 2024
69ee469
- export missing DB metrics
elipe17 Sep 25, 2024
3a05246
- fix dashboard for local use
elipe17 Sep 25, 2024
0dfb56c
- correct name
elipe17 Sep 25, 2024
3fe36bc
- Update to use datasource uid
elipe17 Sep 25, 2024
d519ca2
- fix name
elipe17 Sep 25, 2024
d40b2c4
- Move log file to /tmp
elipe17 Sep 25, 2024
8b310a9
- make deployments rolling
elipe17 Sep 26, 2024
0578058
- update terraform
elipe17 Sep 26, 2024
9488843
- re-enable testing
elipe17 Sep 26, 2024
9eeee32
Merge branch 'develop' into 3046-plg-cloud
elipe17 Sep 26, 2024
0d242d7
- Remove debug stuff
elipe17 Sep 26, 2024
9d1604d
Change scrape to happen every 15s
elipe17 Sep 26, 2024
1afb129
Merge branch 'develop' into 3046-plg-cloud
elipe17 Sep 30, 2024
e1696da
Merge branch 'develop' into 3046-plg-cloud
elipe17 Sep 30, 2024
84999f1
- extra tests. mroe to be added
elipe17 Oct 2, 2024
31bdfe2
Merge branch '3046-plg-cloud' of https://github.com/raft-tech/TANF-ap…
elipe17 Oct 2, 2024
2895f9c
Merge branch 'develop' into 3046-plg-cloud
elipe17 Oct 2, 2024
34798ae
Merge branch 'develop' into 3046-plg-cloud
elipe17 Oct 2, 2024
1 change: 1 addition & 0 deletions .gitconfig
@@ -14,3 +14,4 @@
allowed = .git/config:.*
allowed = .gitconfig:.*
allowed = .*DJANGO_SECRET_KEY=.*
allowed = ./tdrs-backend/plg/loki/manifest.yml:*
5 changes: 4 additions & 1 deletion .gitignore
@@ -108,4 +108,7 @@ tfapply
cypress.env.json

# Patches
*.patch
*.patch

# Logs
*.log
25 changes: 25 additions & 0 deletions scripts/deploy-backend.sh
@@ -105,6 +105,27 @@ update_kibana()
cf run-task $CGAPPNAME_BACKEND --command "$CMD" --name kibana-obj-upload
}

prepare_promtail() {
pushd tdrs-backend/plg/promtail
CONFIG=config.yml
yq eval -i ".scrape_configs[0].job_name = \"system-$backend_app_name\"" $CONFIG
yq eval -i ".scrape_configs[0].static_configs[0].labels.job = \"system-$backend_app_name\"" $CONFIG
yq eval -i ".scrape_configs[1].job_name = \"backend-$backend_app_name\"" $CONFIG
yq eval -i ".scrape_configs[1].static_configs[0].labels.job = \"backend-$backend_app_name\"" $CONFIG
popd
}
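As a rough illustration of what `prepare_promtail` templates into the Promtail config, the sketch below derives the two job names from an app name without calling `yq`. The helper name `promtail_job_names` and the sample app name `raft` are assumptions for illustration; only the `system-` and `backend-` prefixes come from the diff.

```shell
# Hypothetical helper: print the Promtail job names that prepare_promtail
# would template for a given backend app name, one per line.
promtail_job_names() {
    app_name=$1
    printf 'system-%s\n' "$app_name"
    printf 'backend-%s\n' "$app_name"
}

# "raft" is a made-up app name used only for this example.
promtail_job_names "raft"
```

Because the same value is written to both `job_name` and `labels.job`, logs from each backend app should be distinguishable in Loki by job label.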

update_plg_networking() {
Review comment (author): Do we want PLG to live in prod? If so, the networking needs to be updated.

# Need to switch the space after deploy since we're not always in dev space to handle specific networking from dev
# PLG apps to the correct backend app.
cf target -o hhs-acf-ofa -s tanf-dev
cf add-network-policy prometheus "$CGAPPNAME_BACKEND" -s "$CF_SPACE" --protocol tcp --port 8080
cf target -o hhs-acf-ofa -s "$CF_SPACE"

# Promtail needs to send logs to Loki
cf add-network-policy "$CGAPPNAME_BACKEND" loki -s "tanf-dev" --protocol tcp --port 8080
}
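The space switching in `update_plg_networking` can be easier to follow as a dry run. The sketch below only prints the commands in order and executes nothing; the helper name `plg_networking_plan` and the sample space and app names are assumptions, and it relies on the understanding that `cf add-network-policy` resolves the source app in the currently targeted space while `-s` names the destination space.

```shell
# Hypothetical dry-run helper: print the cf commands update_plg_networking
# would run for a given space and backend app, without executing them.
plg_networking_plan() {
    space=$1
    backend=$2
    plg_space="tanf-dev"   # space hosting Prometheus and Loki in this PR
    org="hhs-acf-ofa"

    # Prometheus is the source app, so target the PLG space first.
    echo "cf target -o $org -s $plg_space"
    echo "cf add-network-policy prometheus $backend -s $space --protocol tcp --port 8080"
    # Switch back so the backend is the source app for the Loki policy.
    echo "cf target -o $org -s $space"
    echo "cf add-network-policy $backend loki -s $plg_space --protocol tcp --port 8080"
}

# Sample values, assumed for illustration only.
plg_networking_plan "tanf-staging" "tdp-backend-staging"
```

Printing the plan first may help when deciding whether PLG should also live in prod, since the hard-coded `tanf-dev` space is the part that would change.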

update_backend()
{
cd tdrs-backend || exit
@@ -143,6 +164,9 @@ update_backend()
# Add network policy to allow frontend to access backend
cf add-network-policy "$CGAPPNAME_FRONTEND" "$CGAPPNAME_BACKEND" --protocol tcp --port 8080

# Add PLG routing
update_plg_networking

if [ "$CF_SPACE" = "tanf-prod" ]; then
# Add network policy to allow backend to access tanf-prod services
cf add-network-policy "$CGAPPNAME_BACKEND" clamav-rest --protocol tcp --port 9000
@@ -229,6 +253,7 @@ else
CYPRESS_TOKEN=$CYPRESS_TOKEN
fi

prepare_promtail
if [ "$DEPLOY_STRATEGY" = "rolling" ] ; then
# Perform a rolling update for the backend and frontend deployments if
# specified, otherwise perform a normal deployment
5 changes: 3 additions & 2 deletions scripts/deploy-frontend.sh
@@ -13,6 +13,7 @@ CF_SPACE=${5}
ENVIRONMENT=${6}

env=${CF_SPACE#"tanf-"}
frontend_app_name=$(echo $CGHOSTNAME_FRONTEND | cut -d"-" -f3)
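A small sketch of what these two derivations produce, using sample values that are assumptions rather than real deployment names:

```shell
# Sample values, assumed for illustration only.
CF_SPACE="tanf-dev"
CGHOSTNAME_FRONTEND="tdp-frontend-raft"

# Strip the leading "tanf-" from the space name.
env=${CF_SPACE#"tanf-"}

# Take the third dash-separated field of the hostname.
frontend_app_name=$(echo "$CGHOSTNAME_FRONTEND" | cut -d"-" -f3)

echo "env=$env frontend_app_name=$frontend_app_name"
```

Note that `cut -f3` returns only the third field, so a hostname whose suffix itself contains a dash would be truncated at that dash.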

# Update the Kibana name to include the environment
KIBANA_BASE_URL="${CGAPPNAME_KIBANA}-${env}.apps.internal"
@@ -52,7 +53,7 @@ update_frontend()

cf set-env "$CGHOSTNAME_FRONTEND" BACKEND_HOST "$CGHOSTNAME_BACKEND"
cf set-env "$CGHOSTNAME_FRONTEND" KIBANA_BASE_URL "$KIBANA_BASE_URL"

npm run build:$ENVIRONMENT
unlink .env.production
mkdir deployment
@@ -86,7 +87,7 @@ update_frontend()
else
cf map-route "$CGHOSTNAME_FRONTEND" app.cloud.gov --hostname "${CGHOSTNAME_FRONTEND}"
fi

cd ../..
rm -r tdrs-frontend/deployment
}
4 changes: 4 additions & 0 deletions scripts/localstack-setup.sh
@@ -5,3 +5,7 @@ awslocal s3api create-bucket --bucket $AWS_BUCKET --region $AWS_REGION_NAME

# Enable object versioning on the bucket
awslocal s3api put-bucket-versioning --bucket $AWS_BUCKET --versioning-configuration Status=Enabled

# Add bucket for Loki to store logs
awslocal s3api create-bucket --bucket loki-logs --region $AWS_REGION_NAME
awslocal s3api put-bucket-versioning --bucket loki-logs --versioning-configuration Status=Enabled
26 changes: 16 additions & 10 deletions tdrs-backend/docker-compose.yml
@@ -76,21 +76,25 @@ services:
image: grafana/grafana:11.2.0
ports:
- 9400:9400
environment:
- GF_PATHS_PROVISIONING=/usr/share/grafana/conf/provisioning
- GF_SERVER_HTTP_PORT=9400
volumes:
- ./plg/grafana/datasources.yml:/etc/grafana/provisioning/datasources/default.yml
- ./plg/grafana/dashboards/provider.yml:/etc/grafana/provisioning/dashboards/default.yml
- ./plg/grafana/dashboards:/var/lib/grafana/provisioning/dashboards
- ./plg/grafana/custom.ini:/etc/grafana/grafana.ini
- ./plg/grafana/datasources.local.yml:/usr/share/grafana/conf/provisioning/datasources/datasources.yml
- ./plg/grafana/providers.local.yml:/usr/share/grafana/conf/provisioning/dashboards/providers.yml
- ./plg/grafana/dashboards:/var/lib/grafana/dashboards
- ./plg/grafana/custom.local.ini:/usr/share/grafana/conf/custom.ini
- grafana_data:/var/lib/grafana
command: --config /usr/share/grafana/conf/custom.ini

prometheus:
restart: always
image: prom/prometheus:v2.54.1
ports:
- 9090:9090
volumes:
- ./plg/prometheus/prometheus.yml:/etc/prometheus/prometheus.yml
- ./plg/prometheus/django_rules.yml:/etc/prometheus/prom_django_rules.yml
- ./plg/prometheus/prometheus.local.yml:/etc/prometheus/prometheus.yml
- ./plg/prometheus/django_rules.yml:/etc/prometheus/django_rules.yml
- prometheus_data:/prometheus
depends_on:
- web
@@ -99,22 +103,24 @@ services:

promtail:
restart: always
image: grafana/promtail:3.0.1
image: grafana/promtail:3.1.1
ports:
- 9080:9080
volumes:
- ./plg/promtail/config.yml:/etc/promtail/config.yml
- ./plg/promtail/config.local.yml:/etc/promtail/config.yml
- ~/tdp-logs/nginx:/var/log/nginx
- logs:/logs
command: -config.file=/etc/promtail/config.yml

loki:
restart: always
image: grafana/loki:3.0.1
image: grafana/loki:3.1.1
ports:
- 3100:3100
volumes:
- loki_data:/loki
- ./plg/loki/loki.local.yml:/loki/loki.yml
command: -config.file=/loki/loki.yml

celery-exporter:
restart: always
@@ -178,7 +184,7 @@ services:
- ELASTICSEARCH_LOG_INDEX_SLOW_LEVEL
volumes:
- .:/tdpapp
- logs:/tdpapp
- logs:/tmp
image: tdp
build: .
command: >
7 changes: 7 additions & 0 deletions tdrs-backend/gunicorn_start.sh
@@ -36,4 +36,11 @@ fi

gunicorn_cmd="gunicorn $gunicorn_params"

if [[ $1 == "cloud" ]]; then
echo "Starting Promtail"
wget https://github.com/grafana/loki/releases/download/v3.1.1/promtail-linux-amd64.zip
unzip -a promtail-linux-amd64.zip && rm -rf promtail-linux-amd64.zip
./promtail-linux-amd64 -config.file=./plg/promtail/config.yml &
fi

exec $gunicorn_cmd
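The hunk above downloads Promtail at start time and launches it in the background before `exec`-ing gunicorn. A minimal sketch of a slightly more defensive launch follows; the helper `start_sidecar` is hypothetical and not part of this PR. It checks that the binary exists and is executable before backgrounding it, so a failed download does not go unnoticed.

```shell
# Hypothetical helper: start a sidecar binary in the background only if it is
# present and executable. Sets SIDECAR_PID on success, returns 1 otherwise.
start_sidecar() {
    bin=$1
    shift
    if [ ! -x "$bin" ]; then
        echo "sidecar not found or not executable: $bin" >&2
        return 1
    fi
    "$bin" "$@" &
    SIDECAR_PID=$!
}

# Example call, mirroring the paths used in gunicorn_start.sh:
# start_sidecar ./promtail-linux-amd64 -config.file=./plg/promtail/config.yml \
#     || echo "continuing without Promtail"
```

Whether the app should continue without log shipping or fail fast is a design decision this sketch leaves to the caller.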
5 changes: 3 additions & 2 deletions tdrs-backend/manifest.buildpack.yml
@@ -3,10 +3,11 @@ applications:
- name: tdp-backend
memory: 2G
instances: 1
disk_quota: 2G
disk_quota: 4G
command: "./gunicorn_start.sh cloud"
env:
REDIS_URI: redis://localhost:6379
buildpacks:
- https://github.com/cloudfoundry/apt-buildpack
- https://github.com/cloudfoundry/python-buildpack.git#v1.8.3
command: "./gunicorn_start.sh"
- https://github.com/cloudfoundry/binary-buildpack
113 changes: 113 additions & 0 deletions tdrs-backend/plg/deploy.sh
@@ -0,0 +1,113 @@
#!/bin/bash
set -e

help() {
echo "Deploy the PLG stack or a Postgres exporter to the Cloud Foundry space you're currently authenticated in."
echo "Syntax: deploy.sh [-h|a|p|u|d]"
echo "Options:"
echo "h Print this help message."
echo "a Deploy the entire PLG stack."
echo "p Deploy a postgres exporter to the given environment. Requires -u and -d."
echo "u Requires -p. The database URI the exporter should connect with."
echo "d Requires -p. The Cloud Foundry service name of the RDS instance."
echo
}

deploy_pg_exporter() {
pushd postgres-exporter
MANIFEST=manifest.$1.yml
cp manifest.yml $MANIFEST

APP_NAME="pg-exporter-$1"

yq eval -i ".applications[0].name = \"$APP_NAME\"" $MANIFEST
yq eval -i ".applications[0].env.DATA_SOURCE_NAME = \"$2\"" $MANIFEST
yq eval -i ".applications[0].services[0] = \"$3\"" $MANIFEST

cf push --no-route -f $MANIFEST -t 180 --strategy rolling
cf map-route $APP_NAME apps.internal --hostname $APP_NAME

# Add policy to allow prometheus to talk to pg-exporter
# TODO: this logic needs to be updated to allow routing across spaces based on where we want PLG to live.
cf add-network-policy prometheus $APP_NAME -s "tanf-dev" --protocol tcp --port 9187
rm $MANIFEST
popd
}

deploy_grafana() {
pushd grafana
APP_NAME="grafana"
DATASOURCES="datasources.yml"
cp datasources.template.yml $DATASOURCES

yq eval -i ".datasources[0].url = \"http://prometheus.apps.internal:8080\"" $DATASOURCES
yq eval -i ".datasources[1].url = \"http://loki.apps.internal:8080\"" $DATASOURCES

cf push --no-route -f manifest.yml -t 180 --strategy rolling
# cf map-route $APP_NAME apps.internal --hostname $APP_NAME
# Give Grafana a public route for now. Might be able to swap to internal route later.
cf map-route "$APP_NAME" app.cloud.gov --hostname "${APP_NAME}"

# Add policy to allow grafana to talk to prometheus and loki
cf add-network-policy $APP_NAME prometheus --protocol tcp --port 8080
cf add-network-policy $APP_NAME loki --protocol tcp --port 8080
rm $DATASOURCES
popd
}

deploy_prometheus() {
pushd prometheus
cf push --no-route -f manifest.yml -t 180 --strategy rolling
cf map-route prometheus apps.internal --hostname prometheus
popd
}

deploy_loki() {
pushd loki
cf push --no-route -f manifest.yml -t 180 --strategy rolling
cf map-route loki apps.internal --hostname loki
popd
}

while getopts ":hap:u:d:" option; do
case $option in
h) # display Help
help
exit;;
a) # Deploy PLG stack
DEPLOY="plg";;
p) # Deploy a Postgres exporter to $ENV
ENV=$OPTARG
DEPLOY="pg-exporter";;
u) # Bind a Postgres exporter to $DB_URI
DB_URI=$OPTARG;;
d) # Bind a Postgres exporter to $DB_SERVICE_NAME
DB_SERVICE_NAME=$OPTARG;;
\?) # Invalid option
echo "Error: Invalid option"
exit 1;;
esac
done

if [ "$#" -eq 0 ]; then
help
exit
fi

pushd "$(dirname "$0")"
if [ "$DEPLOY" == "plg" ]; then
deploy_prometheus
deploy_loki
deploy_grafana
fi
if [ "$DEPLOY" == "pg-exporter" ]; then
if [ "$DB_URI" == "" ] || [ "$DB_SERVICE_NAME" == "" ]; then
echo "Error: you must also pass -u and -d when deploying a postgres exporter."
echo
help
popd
exit 1
fi
deploy_pg_exporter "$ENV" "$DB_URI" "$DB_SERVICE_NAME"
fi
popd
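The option handling in `deploy.sh` can be exercised on its own. The sketch below wraps an equivalent `getopts` loop in a hypothetical function, `parse_deploy_args`, so it can be run without deploying anything. It also handles a missing option argument (the `:` case), which the script as written does not; the sample URI and service name are assumptions.

```shell
# Hypothetical sketch of the option handling in deploy.sh, wrapped in a
# function so it can be exercised without deploying anything.
parse_deploy_args() {
    DEPLOY="" ENV="" DB_URI="" DB_SERVICE_NAME=""
    OPTIND=1
    while getopts ":ap:u:d:" option; do
        case $option in
            a) DEPLOY="plg" ;;
            p) ENV=$OPTARG; DEPLOY="pg-exporter" ;;
            u) DB_URI=$OPTARG ;;
            d) DB_SERVICE_NAME=$OPTARG ;;
            :) echo "Error: -$OPTARG requires an argument" >&2; return 1 ;;
            \?) echo "Error: invalid option -$OPTARG" >&2; return 1 ;;
        esac
    done
    # -p is only valid together with -u and -d.
    if [ "$DEPLOY" = "pg-exporter" ] && { [ -z "$DB_URI" ] || [ -z "$DB_SERVICE_NAME" ]; }; then
        echo "Error: -p also requires -u and -d" >&2
        return 1
    fi
}

# Sample invocation with made-up values.
parse_deploy_args -p dev -u "postgres://user:pass@host:5432/db" -d tdp-db-dev
echo "$DEPLOY $ENV $DB_SERVICE_NAME"
```

Returning a non-zero status on bad input lets a calling pipeline stop instead of treating a usage error as a successful deploy.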
21 changes: 10 additions & 11 deletions tdrs-backend/plg/grafana/custom.ini
@@ -1,13 +1,12 @@
# TODO: Update the server config based on where we want PLG to live and how we specify the domain.

##################### Grafana Configuration Defaults #####################
#
# Do not modify this file in grafana installs
#

# possible values : production, development
app_mode = production

# instance name, defaults to HOSTNAME environment variable value or hostname if HOSTNAME var is empty
instance_name = ${HOSTNAME}
instance_name = grafana

#################################### Paths ###############################
[paths]
@@ -38,17 +37,17 @@ min_tls_version = ""
http_addr =

# The http port to use
http_port = 9400
http_port = 8080

# The public facing domain name used to access grafana from a browser
domain = localhost
domain = app.cloud.gov

# Redirect to correct domain if host header does not match domain
# Prevents DNS rebinding attacks
enforce_domain = false

# The full public facing url
root_url = %(protocol)s://%(domain)s:%(http_port)s/grafana/
root_url = %(protocol)s://%(domain)s:%(http_port)s/grafana

# Serve Grafana from subpath specified in `root_url` setting. By default it is set to `false` for compatibility reasons.
serve_from_sub_path = true
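To see what the new `root_url` resolves to, the interpolation can be reproduced by hand. The sketch below uses the `domain` and `http_port` values from this hunk; the `protocol` value is an assumption (Grafana's default is `http`), since that line is not shown in the diff.

```shell
# domain and http_port are taken from the [server] values in this hunk;
# protocol is an assumption (Grafana's default is http).
protocol="http"
domain="app.cloud.gov"
http_port="8080"

# Mirrors root_url = %(protocol)s://%(domain)s:%(http_port)s/grafana
root_url="${protocol}://${domain}:${http_port}/grafana"
echo "$root_url"
```

If Grafana ends up behind the platform router on a standard port, the advertised URL may need to drop `:8080`; the TODO at the top of the file already flags the server config as unsettled.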
@@ -421,7 +420,7 @@ data_keys_cache_cleanup_interval = 1m
enabled = true

# snapshot sharing options
external_enabled = true
external_enabled = false
external_snapshot_url = https://snapshots.raintank.io
external_snapshot_name = Publish to snapshots.raintank.io

@@ -851,7 +850,7 @@ enabled = true
# 3. Composed by at least 1 lowercase character
# 4. Composed by at least 1 digit character
# 5. Composed by at least 1 symbol character
password_policy = false
password_policy = true

#################################### Auth Proxy ##########################
[auth.proxy]
@@ -1520,7 +1519,7 @@ enabled = true
#################################### News #############################
[news]
# Enable the news feed section
news_feed_enabled = true
news_feed_enabled = false

#################################### Query #############################
[query]
@@ -1938,7 +1937,7 @@ read_only_toggles =
#################################### Public Dashboards #####################################
[public_dashboards]
# Set to false to disable public dashboards
enabled = true
enabled = false

###################################### Cloud Migration ######################################
[cloud_migration]