
Memory controller panics in some situations #138

Open
blt opened this issue Jul 18, 2024 · 1 comment · May be fixed by #139
Labels
bug (Something isn't working), needs-review

Comments

blt commented Jul 18, 2024

Describe the bug

In the lading project we're using cgroups-rs 0.3.4 to read memory and CPU controller data for processes in containers managed with Docker. We have discovered that unless --cgroupns host is set on the lading container, lading will crash with a panic at this line. The problem goes away if the host namespace is set, but it is unclear whether a panic is this crate's expected behavior.

Expected behavior

No panic, or a library function to determine whether reads from the memory controller are valid before making a potentially panicking call.
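As a sketch of what such a validity check could look like (the function name and directory layout here are hypothetical, not part of the cgroups-rs API):

```rust
use std::path::Path;

// Hypothetical pre-check, not an existing cgroups-rs function: report whether
// the memory controller's files are still present before calling into code
// that unwraps the read. "memory.high" is the cgroup v2 file whose absence
// triggered the panic reported in this issue.
fn memory_controller_readable(cgroup_dir: &Path) -> bool {
    cgroup_dir.join("memory.high").exists()
}

fn main() {
    // A path that does not exist stands in for a cgroup directory that
    // vanished when its container was torn down.
    let ok = memory_controller_readable(Path::new("/definitely/not/a/cgroup"));
    println!("{ok}"); // prints false
}
```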

Additional information

A backtrace from our project:

thread 'tokio-runtime-worker' panicked at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/cgroups-rs-0.3.4/src/memory.rs:587:34:
called `Result::unwrap()` on an `Err` value: Error { kind: ReadFailed("/sys/fs/cgroup/../docker-c73828602e1e6dc2514d805186a11769b03c72274355209b0b81885bf9da2cb4.scope/memory.high"), cause: Some(Os { code: 2, kind: NotFound, message: "No such file or directory" }) }
stack backtrace:
   0: rust_begin_unwind
   1: core::panicking::panic_fmt
   2: core::result::unwrap_failed
   3: cgroups_rs::memory::MemController::memory_stat
   4: lading::observer::linux::Sampler::sample::{{closure}}
   5: lading::observer::Server::run::{{closure}}
   6: tokio::runtime::task::raw::poll
   7: tokio::runtime::scheduler::multi_thread::worker::Context::run_task
   8: tokio::runtime::task::raw::poll
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
blt added the bug and needs-review labels Jul 18, 2024
@krlohnes
This PR looks like it would have fixed things, but stalled out.

alexman-stripe added a commit to alexman-stripe/cgroups-rs that referenced this issue Sep 10, 2024
When containers are terminated, cgroup v2 memory metrics under /sys/fs/cgroup may disappear.
Previously, kata-agent assumed these metrics always exist, leading to panics as reported in kata-containers#138.

This commit returns a default value (0) when memory metric files are missing.
This behaviour aligns with cgroup v1, which also defaults to 0 when memory metric files are missing:
- Memory.limit_in_bytes which maps to m.max https://github.com/kata-containers/cgroups-rs/blob/main/src/memory.rs#L635
- Memory.soft_limit_in_bytes which maps to m.low https://github.com/kata-containers/cgroups-rs/blob/main/src/memory.rs#L661
- MemSwap.fail_cnt: https://github.com/kata-containers/cgroups-rs/blob/main/src/memory.rs#L631

Submitting a new PR because kata-containers#116 does not handle MemSwap.fail_cnt, which will also cause panic.
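The default-to-zero behaviour described in the commit message above can be sketched as follows (`read_metric_or_zero` is a hypothetical helper for illustration, not the actual patch):

```rust
use std::fs;
use std::io::ErrorKind;
use std::path::Path;

// Hypothetical helper, not the actual patch: read a cgroup v2 memory metric,
// returning 0 when the file has disappeared (e.g. the container was torn
// down) instead of unwrapping the read and panicking.
fn read_metric_or_zero(path: &Path) -> u64 {
    match fs::read_to_string(path) {
        // Non-numeric contents such as "max" also fall back to 0 here;
        // a real implementation might treat that case separately.
        Ok(s) => s.trim().parse().unwrap_or(0),
        Err(e) if e.kind() == ErrorKind::NotFound => 0,
        // Other I/O errors could instead be propagated as a Result.
        Err(_) => 0,
    }
}

fn main() {
    // Stands in for a memory.high file that vanished with its container.
    let v = read_metric_or_zero(Path::new("/nonexistent/memory.high"));
    println!("{v}"); // prints 0
}
```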
alexman-stripe linked a pull request Sep 10, 2024 that will close this issue
alexman-stripe added a commit to alexman-stripe/cgroups-rs that referenced this issue Sep 11, 2024