Helm Chart Enhancements: Cluster Mode Deployment, Adding, Resharding, and Deleting Clusters #534

Open · wants to merge 14 commits into main
132 changes: 132 additions & 0 deletions charts/garnet/templates/add-nodes-job.yaml
@@ -0,0 +1,132 @@
{{- if .Values.cluster.enabled }}
{{- if .Values.cluster.initJob.enabled }}
apiVersion: batch/v1
kind: Job
metadata:
name: {{ include "garnet.fullname" . }}-cluster-add
labels:
{{- include "garnet.labels" . | nindent 4 }}
annotations:
"helm.sh/hook": post-upgrade
spec:
backoffLimit: {{ .Values.cluster.initJob.backoffLimit }}
activeDeadlineSeconds: 1800
ttlSecondsAfterFinished: 600
template:
spec:
restartPolicy: Never
containers:
- name: add-node
image: "{{ .Values.cluster.initJob.image.registry }}/{{ .Values.cluster.initJob.image.repository }}:{{ .Values.cluster.initJob.image.tag | default "latest" }}"
command: ["/bin/sh", "-c"]
args:
- |
garnet_host="{{ include "garnet.fullname" . }}-0.{{ include "garnet.fullname" . }}-headless.{{ .Release.Namespace }}.svc.cluster.local"
@Xizt Aug 9, 2024

What happens when pod-0 is down or in an unknown state?
With reshard-delete-job.yaml, if pod-0 restarts, how will it be able to reshard?

garnet_port="{{ .Values.containers.port }}"

echo "Starting node addition process..."

get_current_nodes() {
/usr/local/bin/redis-cli -h "$garnet_host" -p "$garnet_port" CLUSTER NODES | grep master | awk -F ',' '{print $2}' | awk '{print $1}'
}

check_node_ready() {
local node=$1
local retries=5
while [ $retries -gt 0 ]; do
if /usr/local/bin/redis-cli -h "$node" -p "$garnet_port" ping | grep -q "PONG"; then
return 0
else
echo "Node $node is not ready, retrying..."
retries=$((retries - 1))
sleep 5
fi
done
return 1
}

add_node_to_cluster() {
local node=$1
local retries=5
while [ $retries -gt 0 ]; do
/usr/local/bin/redis-cli --cluster add-node "$node:$garnet_port" "$garnet_host:$garnet_port"
if [ $? -eq 0 ]; then
echo "Successfully added node $node to the cluster"
return 0
else
echo "Failed to add node $node, retrying..."
retries=$((retries - 1))
sleep 5
fi
done
return 1
}


ensure_cluster_consistency() {
local consistent=false
while [ "$consistent" = false ]; do
output=$(/usr/local/bin/redis-cli --cluster check "$garnet_host:$garnet_port" 2>&1)
echo "$output"
if echo "$output" | grep -q "All nodes agree about slots configuration"; then
consistent=true
else
echo "Waiting for cluster consistency..."
sleep 10
fi
done
}

rebalance_cluster() {
local rebalanced=false
local attempts=0
local max_attempts=5
while [ "$rebalanced" = false ] && [ $attempts -lt $max_attempts ]; do
output=$(/usr/local/bin/redis-cli --cluster rebalance --cluster-use-empty-masters --cluster-yes "$garnet_host:$garnet_port" 2>&1)
echo "$output"
if echo "$output" | grep -q "ERR I don't know about node"; then
echo "Rebalancing encountered an error, retrying..."
sleep 10
attempts=$((attempts + 1))
else
rebalanced=true
fi
done

if [ "$rebalanced" = false ]; then
echo "Failed to rebalance the cluster after $max_attempts attempts."
exit 1
fi
}

get_desired_nodes() {
for i in $(seq 0 $(({{ .Values.cluster.statefulSet.replicas }} - 1))); do
printf "%s-%d.%s-headless.%s.svc.cluster.local " "{{ include "garnet.fullname" . }}" "$i" "{{ include "garnet.fullname" . }}" "{{ .Release.Namespace }}"
done
}

current_nodes=$(get_current_nodes)
desired_nodes=$(get_desired_nodes)

for node in $desired_nodes; do
if ! echo "$current_nodes" | grep -q "$node"; then
echo "Checking readiness of node $node"
if check_node_ready "$node"; then
echo "Node $node is ready, adding to the cluster"
add_node_to_cluster "$node"
else
echo "Failed to add node $node after multiple attempts"
exit 1
fi
fi
done

echo "Ensuring cluster consistency..."
ensure_cluster_consistency

echo "Rebalancing the cluster..."
rebalance_cluster
echo "Cluster rebalancing completed."

{{- end }}
{{- end }}
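A note on @Xizt's question above about pod-0: one possible mitigation, sketched here purely as an illustration and not part of this PR, is to probe each replica and use the first one that answers PING as the seed for cluster commands instead of hard-coding pod-0. The pick_seed_node helper below is hypothetical; it reuses the same headless-service naming and the $garnet_port variable from the job scripts in this PR.

    # Hypothetical sketch (not in this PR): pick the first reachable pod as the seed node.
    pick_seed_node() {
      for i in $(seq 0 $(({{ .Values.cluster.statefulSet.replicas }} - 1))); do
        candidate="{{ include "garnet.fullname" . }}-$i.{{ include "garnet.fullname" . }}-headless.{{ .Release.Namespace }}.svc.cluster.local"
        if /usr/local/bin/redis-cli -h "$candidate" -p "$garnet_port" ping 2>/dev/null | grep -q "PONG"; then
          echo "$candidate"
          return 0
        fi
      done
      return 1
    }

    garnet_host=$(pick_seed_node) || { echo "No reachable seed node found"; exit 1; }

This only changes which node the job talks to; it does not by itself make resharding safe when the node that owns the slots is the one that is down.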
13 changes: 13 additions & 0 deletions charts/garnet/templates/cmd-pod.yaml
@@ -0,0 +1,13 @@
apiVersion: v1
kind: Pod
metadata:
name: cmd
namespace: garnet
labels:
app: cmd
spec:
containers:
- name: cmd
image: redis:latest
Contributor

Consider allowing this image to be overridden, similar to the init-job configuration ("{{ .Values.cluster.initJob.image.registry }}/{{ .Values.cluster.initJob.image.repository }}:{{ .Values.cluster.initJob.image.tag | default "latest" }}").

Contributor

How was this addressed? I don't see a new commit. Thanks.

Contributor

Why does this say redis:latest? We have nothing to do with that image.

Author

That pod is used for testing, so I'll remove it from the PR.

Author

Also, to use redis-cli, don't we have to use the redis image?

Contributor

Isn't redis-tools sufficient for this?

command: ["/bin/sh"]
args: ["-c", "sleep 3600"]
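If the debug pod is kept in the chart, one way to pick up the image-override suggestion above would be to template the image the same way the init job does and gate the pod behind a flag. This is a sketch only; the .Values.cmdPod.* keys shown here do not exist in the chart today and are assumptions:

    {{- if .Values.cmdPod.enabled }}
    apiVersion: v1
    kind: Pod
    metadata:
      name: {{ include "garnet.fullname" . }}-cmd
      labels:
        app: cmd
    spec:
      containers:
        - name: cmd
          image: "{{ .Values.cmdPod.image.registry }}/{{ .Values.cmdPod.image.repository }}:{{ .Values.cmdPod.image.tag | default "latest" }}"
          command: ["/bin/sh", "-c", "sleep 3600"]
    {{- end }}

Any image that ships redis-cli would work here; it does not have to be the full redis:latest image.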
54 changes: 54 additions & 0 deletions charts/garnet/templates/init-job.yaml
@@ -0,0 +1,54 @@
{{- if .Values.cluster.enabled }}
{{- if .Values.cluster.initJob.enabled }}
apiVersion: batch/v1
kind: Job
metadata:
name: "{{ .Release.Name }}-manage-cluster"
labels:
{{- include "garnet.labels" . | nindent 4 }}
{{- with .Values.cluster.initJob.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
spec:
backoffLimit: {{ .Values.cluster.initJob.backoffLimit }}
ttlSecondsAfterFinished: {{ .Values.cluster.initJob.ttlSecondsAfterFinished }}
template:
metadata:
labels:
{{- include "garnet.labels" . | nindent 8 }}
spec:
containers:
- name: init
image: "{{ .Values.cluster.initJob.image.registry }}/{{ .Values.cluster.initJob.image.repository }}:{{ .Values.cluster.initJob.image.tag | default "latest" }}"
imagePullPolicy: {{ .Values.image.pullPolicy }}
command:
- /bin/sh
- -c
- |
echo "Waiting for DNS propagation..."
sleep 10

# Wait for the last Garnet pod to respond to PING (via redis-cli)
njnicko marked this conversation as resolved.
echo "Waiting for redis to be ready..."
until /usr/local/bin/redis-cli -h {{ include "garnet.fullname" . }}-{{ sub (int .Values.cluster.statefulSet.replicas) 1 }}.{{ include "garnet.fullname" . }}-headless.{{ .Release.Namespace }}.svc.cluster.local -p {{ .Values.containers.port }} ping; do
echo "Waiting for redis to be ready..."
sleep 10
done

# Check how many hash slots are reported OK
cluster_slots_ok=$(/usr/local/bin/redis-cli -h {{ include "garnet.fullname" . }}-0.{{ include "garnet.fullname" . }}-headless.{{ .Release.Namespace }}.svc.cluster.local -p {{ .Values.containers.port }} CLUSTER INFO 2>/dev/null | grep cluster_slots_ok | awk -F ':' '{print $2}')

# Create the cluster if it has not been created yet
if [ "$cluster_slots_ok" -ne 16384 ]; then
echo "Cluster is not fully created. Creating cluster..."
/usr/local/bin/redis-cli --cluster create {{- range $i := until (int .Values.cluster.statefulSet.replicas) }} {{ printf "%s-%d.%s-headless.%s.svc.cluster.local:%d " (include "garnet.fullname" $) $i (include "garnet.fullname" $) $.Release.Namespace (int $.Values.containers.port) }} {{- end }} --cluster-yes
else
echo "Cluster is already created and all slots are covered."
fi

# Additional wait time to ensure stability
sleep 10
restartPolicy: Never
{{- end }}
{{- end }}
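For reference, with illustrative values of 3 replicas, a fullname of garnet, namespace garnet, and port 6379 (all assumptions for this example, not defaults asserted by the chart), the ranged create command above would render to roughly the following (line breaks added for readability):

    /usr/local/bin/redis-cli --cluster create \
      garnet-0.garnet-headless.garnet.svc.cluster.local:6379 \
      garnet-1.garnet-headless.garnet.svc.cluster.local:6379 \
      garnet-2.garnet-headless.garnet.svc.cluster.local:6379 \
      --cluster-yes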
122 changes: 122 additions & 0 deletions charts/garnet/templates/reshard-delete-job.yaml
@@ -0,0 +1,122 @@
apiVersion: batch/v1
kind: Job
metadata:
name: {{ include "garnet.fullname" . }}-reshard
labels:
{{- include "garnet.labels" . | nindent 4 }}
annotations:
"helm.sh/hook": pre-upgrade
"helm.sh/hook-weight": "0"
spec:
backoffLimit: {{ .Values.cluster.initJob.backoffLimit }}
activeDeadlineSeconds: 1800
ttlSecondsAfterFinished: 600
template:
spec:
containers:
- name: delete-node
image: "{{ .Values.cluster.initJob.image.registry }}/{{ .Values.cluster.initJob.image.repository }}:{{ .Values.cluster.initJob.image.tag | default "latest" }}"
command: ["/bin/sh", "-c"]
args:
- |
garnet_host="{{ include "garnet.fullname" . }}-0.{{ include "garnet.fullname" . }}-headless.{{ .Release.Namespace }}.svc.cluster.local"
@Xizt Aug 9, 2024

This contains quite a lot of resharding logic. Is there a way we can unit test it? 🤔

garnet_port="{{ .Values.containers.port }}"
Review comment:

garnet_host and garnet_port are used across multiple files. Wondering if these can be centrally managed along with other configuration such as retries and sleep intervals.


LOG_FILE="/tmp/reshard.log"

echo "Starting resharding process..." | tee -a $LOG_FILE

get_current_nodes() {
/usr/local/bin/redis-cli -h "$garnet_host" -p "$garnet_port" CLUSTER NODES | grep master | awk -F ',' '{print $2}' | awk '{print $1}'
}

get_desired_nodes() {
for i in $(seq 0 $(({{ .Values.cluster.statefulSet.replicas }} - 1))); do
printf "%s-%d.%s-headless.%s.svc.cluster.local " "{{ include "garnet.fullname" . }}" "$i" "{{ include "garnet.fullname" . }}" "{{ .Release.Namespace }}"
done
}

get_node_id() {
local node=$1
/usr/local/bin/redis-cli -h "$node" -p "$garnet_port" CLUSTER NODES | grep myself | awk '{print $1}'
}

get_slots() {
local node_id=$1
/usr/local/bin/redis-cli -h "$garnet_host" -p "$garnet_port" CLUSTER NODES | grep "$node_id" | awk '{for(i=9;i<=NF;i++) { if ($i ~ /^[0-9]+$/ || $i ~ /^[0-9]+-[0-9]+$/) print $i }}'
}

reshard_slots() {
local from_node=$1
local to_node=$2
local slots=$3
/usr/local/bin/redis-cli --cluster reshard "$garnet_host:$garnet_port" --cluster-from "$from_node" --cluster-to "$to_node" --cluster-slots "$slots" --cluster-yes >> $LOG_FILE 2>&1
sleep 5
}

delete_node() {
local node_id=$1
/usr/local/bin/redis-cli --cluster del-node "$garnet_host:$garnet_port" "$node_id" >> $LOG_FILE 2>&1
}

current_nodes=$(get_current_nodes)
desired_nodes=$(get_desired_nodes)
echo "Desired nodes: $desired_nodes" | tee -a $LOG_FILE

nodes_to_remove=""
for node in $current_nodes; do
if ! echo "$desired_nodes" | grep -q "$node"; then
nodes_to_remove="$nodes_to_remove $node"
fi
done
echo "Nodes to remove: $nodes_to_remove" | tee -a $LOG_FILE

total_nodes_to_share_to="{{ .Values.cluster.statefulSet.replicas }}"

for node in $nodes_to_remove; do
node_id=$(get_node_id "$node")
slots=$(get_slots "$node_id")

echo "Slots: $slots" | tee -a $LOG_FILE

num_slots=0
for range in $slots; do
start_slot=$(echo $range | cut -d'-' -f1)
end_slot=$(echo $range | cut -d'-' -f2)
if [ -z "$end_slot" ]; then
end_slot=$start_slot
fi
num_slots=$((num_slots + end_slot - start_slot + 1))
echo "Range: $range, Start Slot: $start_slot, End Slot: $end_slot, Accumulated Slots: $num_slots" | tee -a $LOG_FILE
done
echo "Node $node_id manages $num_slots slots" | tee -a $LOG_FILE

if [ $num_slots -gt 0 ]; then
slots_to_be_distributed=$((num_slots / total_nodes_to_share_to))
remainder=$((num_slots % total_nodes_to_share_to))

for target_node in $desired_nodes; do
target_node_id=$(get_node_id "$target_node")
echo "Resharding $slots_to_be_distributed slots from $node_id to $target_node_id" | tee -a $LOG_FILE
reshard_slots "$node_id" "$target_node_id" "$slots_to_be_distributed"
done

while [ $remainder -ne 0 ]; do
for target_node in $desired_nodes; do
if [ $remainder -eq 0 ]; then
break
fi
target_node_id=$(get_node_id "$target_node")
echo "Resharding 1 slot from $node_id to $target_node_id" | tee -a $LOG_FILE
reshard_slots "$node_id" "$target_node_id" 1
remainder=$((remainder - 1))
done
done
fi
echo "Deleting node $node_id" | tee -a $LOG_FILE
delete_node "$node_id"
done
echo "Resharding process completed." | tee -a $LOG_FILE

restartPolicy: Never
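On the two review comments above about duplicated host/port logic and about testing the resharding script: one option, sketched here with hypothetical template names that are not part of this PR, is to define the shared pieces once in _helpers.tpl and include them in each job:

    {{/* charts/garnet/templates/_helpers.tpl (hypothetical additions) */}}
    {{- define "garnet.seedHost" -}}
    {{ include "garnet.fullname" . }}-0.{{ include "garnet.fullname" . }}-headless.{{ .Release.Namespace }}.svc.cluster.local
    {{- end }}

    {{- define "garnet.clusterEnv" -}}
    garnet_host="{{ include "garnet.seedHost" . }}"
    garnet_port="{{ .Values.containers.port }}"
    {{- end }}

Each job script could then start with {{ include "garnet.clusterEnv" . | nindent 14 }} instead of repeating those lines. Similarly, moving the script bodies into files/ in the chart and pulling them in with {{ .Files.Get "files/reshard.sh" | indent 14 }} would let them be linted and exercised with shellcheck or bats outside the cluster, which goes some way toward the unit-testing question.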