We have a Docker Swarm cluster running InfluxDB v2.7.8 on a single node, where we experience issues around midnight (00:00) on Sundays, recurring every 2 to 3 weeks.
We’ve noticed that memory usage suddenly begins to climb steadily, and InfluxDB struggles to write data to disk: some data is still being written, but most of it is not. I’ve reviewed the logs, including Docker logs, syslog, and docker.service logs, but haven’t found anything relevant. There are no error entries suggesting that InfluxDB is having any sort of problem; it is compacting and performing its normal tasks.
Our first attempt to resolve this issue was upgrading InfluxDB from version 2.7.5 to 2.7.8, as there was a changelog entry addressing an infinite write loop bug. Unfortunately, this didn’t resolve the issue, as we experienced the same problem again today.
We have Telegraf running on the server, which gathers metrics from InfluxDB; there we’ve seen a spike in the "queue active" metric around the time of the incident.
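For context, the metric-gathering part of our Telegraf setup looks roughly like the sketch below (URL and tags are illustrative, not our exact config). InfluxDB 2.x publishes its internal metrics in Prometheus format at `/metrics`, which Telegraf can scrape with the `prometheus` input; depending on the server configuration, that endpoint may also require an authorization token.

```toml
# Sketch only — endpoint URL and tag names are assumptions; adjust to your stack.
[[inputs.prometheus]]
  ## InfluxDB 2.x exposes internal metrics (compaction, cache, query queues, ...)
  ## in Prometheus exposition format on its main HTTP port.
  urls = ["http://localhost:8086/metrics"]
  ## Tag these points so they are easy to separate from application metrics.
  [inputs.prometheus.tags]
    source = "influxdb_internal"
```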
The memory-usage graph of the server over the same period shows the sudden, steady increase.
Restarting the Docker container resolves the problem, but doing so loses all the data InfluxDB still had in memory.
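Next time the issue occurs, something like the sketch below could capture evidence before the restart (the host/port and output paths are assumptions, not our exact setup). `influxd` serves the standard Go pprof endpoints on its HTTP port unless started with `--pprof-disabled`, so a heap profile and goroutine dump taken while memory is climbing would show where the allocations are going.

```shell
#!/bin/sh
# Sketch only — INFLUX_HOST default and file names are assumptions; adjust as needed.
HOST="${INFLUX_HOST:-localhost:8086}"
OUT="influx-debug-$(date +%Y%m%d-%H%M%S)"
mkdir -p "$OUT"

# Heap profile and goroutine dump from influxd's built-in pprof endpoints.
curl -fsS -o "$OUT/heap.pb.gz" "http://$HOST/debug/pprof/heap" \
  || echo "heap profile unavailable"
curl -fsS -o "$OUT/goroutines.txt" "http://$HOST/debug/pprof/goroutine?debug=2" \
  || echo "goroutine dump unavailable"

# Snapshot container memory/CPU usage at the moment of the incident.
docker stats --no-stream > "$OUT/docker-stats.txt" 2>/dev/null \
  || echo "docker stats unavailable"

echo "wrote diagnostics to $OUT"
```

With those files attached to the issue, the restart can then proceed as usual; the profiles survive even though the in-memory data does not.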
Environment info:
uname -srm:
Linux 5.15.0-113-generic x86_64
Docker version:
Docker version 24.0.7, build afdd53b
InfluxDB Docker image version: 2.7.8
The server is a VM running in VMware.
Logs:
The only potentially relevant log error I can find is that Telegraf failed to send metrics to our off-site InfluxDB at 00:00 and 00:15, even though the off-site instance still received some data from the server.