Commit
Merge pull request #369 from zendesk/liveness-probe-avoids-loading-config-files

Avoid loading config files when executing the liveness probe check
deepredsky committed May 30, 2024
2 parents 5af1912 + 25d02ca commit 164921f
Showing 5 changed files with 112 additions and 15 deletions.
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -2,6 +2,8 @@

## Unreleased

* Allow the liveness probe command to skip loading config files

## 2.11.0.beta3

* Fix bug with domain socket support
30 changes: 21 additions & 9 deletions README.md
@@ -542,25 +542,30 @@ While `maxSurge` should always be 0, `maxUnavailable` can be increased to reduce
Racecar comes with a built-in liveness probe, primarily for use with Kubernetes, but useful for any deployment environment where you can periodically run a process to check the health of your consumer.

To use this feature:
- set the `liveness_probe_enabled` config option to true.
- configure your Kubernetes deployment to run `$ racecarctl liveness_probe`
1. Set the `liveness_probe_enabled` config option to true.
2. Configure your Kubernetes deployment's liveness probe to run this command: `$ racecarctl liveness_probe`


When enabled (see config) Racecar will touch the file at `liveness_probe_file_path` each time it finishes polling Kafka and processing the messages in the batch (if any).
When enabled (see config) Racecar will touch the file at the specified path each time it finishes polling Kafka and processing the messages in the batch (if any).

The modified time of this file can be observed to determine when the consumer last exhibited 'liveness'.

Running `racecarctl liveness_probe` will return a successful exit status if the last 'liveness' event happened within an acceptable time, `liveness_probe_max_interval`.
Running `racecarctl liveness_probe` will return a successful exit status if the last 'liveness' event happened within an acceptable time, which you can set as `liveness_probe_max_interval`.

`liveness_probe_max_interval` should be long enough to account for both the Kafka polling time of `max_wait_time` and the processing time of a full message batch.

On receiving `SIGTERM`, Racecar will gracefully shut down and delete this file, causing the probe to fail immediately after exit.

You may wish to tolerate more than one failed probe run to accommodate for environmental variance and clock changes.
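
The mechanism described above (touch a file on each poll, then compare its modified time against a maximum interval) can be sketched in a few lines of Ruby. This is a minimal illustration under stated assumptions, not Racecar's actual implementation: `alive_within_interval?` and the temporary file name are hypothetical, and the 5-second interval mirrors the default of `liveness_probe_max_interval`.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper, not part of Racecar's API: returns true when the file
# at `path` exists and was modified within the last `max_interval` seconds.
def alive_within_interval?(path, max_interval, now: Time.now)
  return false unless File.exist?(path)

  (now - File.mtime(path)) <= max_interval
end

# Simulate the consumer touching the file after a poll/process cycle.
liveness_file = File.join(Dir.tmpdir, "example-liveness-#{Process.pid}")
FileUtils.touch(liveness_file)

# Checked right away, the file is fresh.
fresh = alive_within_interval?(liveness_file, 5)

# Checked 60 seconds "later", the same file is stale.
stale = alive_within_interval?(liveness_file, 5, now: Time.now + 60)

# Simulate a graceful shutdown: the file is deleted, so the check fails.
FileUtils.rm_f(liveness_file)
missing = alive_within_interval?(liveness_file, 5)

puts "fresh=#{fresh} stale=#{stale} missing=#{missing}"
```

A real probe would turn the boolean into a process exit status, which is what Kubernetes inspects when it runs an `exec` probe.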

See the [Configuration section](https://github.com/zendesk/racecar#configuration) for the various ways the liveness probe can be configured, environment variables being one option.
See the [Configuration section](https://github.com/zendesk/racecar#configuration) for the various ways the liveness probe can be configured. (We recommend environment variables.)

##### Slow racecar.rb / racecar.yml? Skip config files!

If your config files need to do something expensive, such as load Rails, you can enable `RACECAR_LIVENESS_PROBE_SKIP_CONFIG_FILES`. The liveness probe command will then skip loading your configuration and execute quickly.

Here is an example Kubernetes liveness probe configuration:
Most other configuration values can be set via the environment; we recommend you do this for liveness probe settings.
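
As a rough illustration of configuring the probe purely from the environment, the sketch below reads the relevant settings from an environment-like hash. `probe_settings_from_env` is a hypothetical helper, not part of Racecar, and the simple `== "true"` comparison is an assumption that may differ from how Racecar parses boolean values. The default interval of 5 matches the `liveness_probe_max_interval` default; the default file path is assumed for the purposes of the example.

```ruby
# Hypothetical helper, not Racecar's own parsing: builds probe settings
# from an environment-like hash, falling back to assumed defaults.
def probe_settings_from_env(env)
  {
    # Assumption: only the literal string "true" enables the option.
    skip_config_files: env.fetch("RACECAR_LIVENESS_PROBE_SKIP_CONFIG_FILES", "false") == "true",
    # Assumption: default path chosen for illustration.
    file_path: env.fetch("RACECAR_LIVENESS_PROBE_FILE_PATH", "/tmp/racecar-liveness"),
    # Integer() raises on malformed input instead of silently returning 0.
    max_interval: Integer(env.fetch("RACECAR_LIVENESS_PROBE_MAX_INTERVAL", "5")),
  }
end

settings = probe_settings_from_env({
  "RACECAR_LIVENESS_PROBE_SKIP_CONFIG_FILES" => "true",
  "RACECAR_LIVENESS_PROBE_MAX_INTERVAL" => "10",
})

puts settings.inspect
```

Because nothing here touches `config/racecar.rb` or `config/racecar.yml`, a probe configured this way has no reason to load them, which is the point of the skip option.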

##### Example Kubernetes Configuration

```yaml
apiVersion: apps/v1
@@ -576,17 +581,24 @@ spec:
- SomeConsumer
env:
# Skip config loading to run fast, only the following values are needed
- name: RACECAR_LIVENESS_PROBE_SKIP_CONFIG_FILES
value: "true"
- name: RACECAR_LIVENESS_PROBE_ENABLED
value: "true"
- name: RACECAR_LIVENESS_PROBE_FILE_PATH
value: "/tmp/racecar-liveness"
- name: RACECAR_LIVENESS_PROBE_MAX_INTERVAL
value: "5"
livenessProbe:
exec:
command:
- racecarctl
- liveness_probe
# Allow up to 10 consecutive failures before terminating Pod:
failureThreshold: 10
# Allow up to 3 consecutive failures before terminating Pod:
failureThreshold: 3
# Wait 30 seconds before starting the probes:
initialDelaySeconds: 30
3 changes: 3 additions & 0 deletions lib/racecar/config.rb
@@ -188,6 +188,9 @@ class Config < KingKonf::Config
desc "Used only by the liveness probe: Max time (in seconds) between liveness events before the process is considered not healthy"
integer :liveness_probe_max_interval, default: 5

desc "Allows the liveness probe command to skip loading config files. When enabled, configure liveness probe values via environmental variables. Defaults still apply. Only applies to the liveness probe command."
boolean :liveness_probe_skip_config_files, default: false

desc "Strategy for switching topics when there are multiple subscriptions. `exhaust-topic` will only switch when the consumer poll returns no messages. `round-robin` will switch after each poll regardless.\nWarning: `round-robin` will be the default in Racecar 3.x"
string :multi_subscription_strategy, allowed_values: %w(round-robin exhaust-topic), default: "exhaust-topic"

12 changes: 7 additions & 5 deletions lib/racecar/ctl.rb
@@ -36,12 +36,14 @@ def liveness_probe(args)
require "racecar/liveness_probe"
parse_options!(args)

if ENV["RAILS_ENV"] && File.exist?("config/racecar.yml")
Racecar.config.load_file("config/racecar.yml", ENV["RAILS_ENV"])
end
unless config.liveness_probe_skip_config_files
if File.exist?("config/racecar.rb")
require "./config/racecar"
end

if File.exist?("config/racecar.rb")
require "./config/racecar"
if ENV["RAILS_ENV"] && File.exist?("config/racecar.yml")
Racecar.config.load_file("config/racecar.yml", ENV["RAILS_ENV"])
end
end

Racecar.config.liveness_probe.check_liveness_within_interval!
80 changes: 79 additions & 1 deletion spec/integration/kubernetes_probes_spec.rb
@@ -68,6 +68,82 @@
end
end

describe "config loading" do
let(:ruby_config_file) { File.expand_path("config/racecar.rb", tmp_dir) }
let(:yaml_config_file) { File.expand_path("config/racecar.yml", tmp_dir) }
let(:tmp_dir) { File.expand_path("/tmp/your_racecar_project", Dir.pwd) }
let(:ruby_config_indicator_file) { "ruby_config_was_loaded.truth" }
let(:yaml_config_indicator_file) { "yaml_config_was_loaded.truth" }
let(:ruby_config_file_contents) { <<~RUBY }
`touch ruby_config_was_loaded.truth`
RUBY
let(:yaml_config_file_contents) { <<~YAML }
production:
client_id: client_id
group_id: anything
<% `touch yaml_config_was_loaded.truth` %>
YAML

before do
@original_directory = Dir.pwd
FileUtils.mkdir_p(File.dirname(ruby_config_file))
File.write(ruby_config_file, ruby_config_file_contents)
File.write(yaml_config_file, yaml_config_file_contents)
Dir.chdir(tmp_dir)
end

after do
Dir.chdir(@original_directory)
FileUtils.rm_rf(tmp_dir)
end

context "when config file loading is disabled" do
let(:env_vars) do
{
"RACECAR_LIVENESS_PROBE_SKIP_CONFIG_FILES" => "true",
"RAILS_ENV" => "production",
}
end

it "does not load config/racecar.rb or config/racecar.yml" do
run_probe

aggregate_failures do
expect(Dir.glob("*")).not_to include(ruby_config_indicator_file)
expect(Dir.glob("*")).not_to include(yaml_config_indicator_file)
end
end
end

context "when config file loading is enabled " do
let(:env_vars) { {} }

it "loads config/racecar.rb" do
run_probe

expect(Dir.glob("*")).to include(ruby_config_indicator_file)
end

context "when RAILS_ENV is set" do
let(:env_vars) { { "RAILS_ENV" => "production" } }

it "loads config/racecar.yml" do
run_probe

expect(Dir.glob("*")).to include(yaml_config_indicator_file)
end
end

context "when RAILS_ENV is not set" do
it "does not load config/racecar.yml" do
run_probe

expect(Dir.glob("*")).not_to include(yaml_config_indicator_file)
end
end
end
end

let(:file_path) { "/tmp/racecar-liveness-file-#{SecureRandom.hex(4)}" }
let(:max_interval) { 1 }
let(:racecar_cli) { Racecar::Cli.new([consumer_class.name.to_s]) }
@@ -88,8 +164,10 @@
}
end

let!(:racecarctl) { File.expand_path("exe/racecarctl", Dir.pwd) }

def run_probe
command = "exe/racecarctl liveness_probe"
command = "#{racecarctl} liveness_probe"
output, status = Open3.capture2e(env_vars, command)
$stderr.puts "Probe output: #{output}" if ENV["DEBUG"]
status.success?