Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: on low wasm memory hook #286

Draft
wants to merge 8 commits into
base: master
Choose a base branch
from
Draft

Conversation

ielashi
Copy link
Contributor

@ielashi ielashi commented Mar 22, 2024

This proposal is based on this forum post and has already been approved by motion proposal 106146.

Canister developers have to actively monitor their canisters to know if they are low on wasm memory. If detected too late, a canister can be completely stuck and forever un-upgradable.

To address this, we introduce a hook called on_low_wasm_memory. When defined, it is triggered whenever the canister's memory usage exceeds some developer-defined threshold.

Copy link

github-actions bot commented Mar 22, 2024

🤖 Here's your preview: https://oov64-qqaaa-aaaak-qcnsa-cai.icp0.io/docs

@ielashi
Copy link
Contributor Author

ielashi commented Mar 22, 2024

@mraszyk I didn't make any changes to the formal model in this PR yet, as I first wanted to get your feedback that the proposal as-is looks good.

spec/index.md Show resolved Hide resolved
@dsarlis
Copy link
Member

dsarlis commented May 13, 2024

@mraszyk I didn't make any changes to the formal model in this PR yet, as I first wanted to get your feedback that the proposal as-is looks good.

@ielashi @mraszyk I assume that we're fine with the current status of this PR so can the formal model be updated so we can move this to "FINAL" and be ready to start implementation at our convenience?

@ielashi
Copy link
Contributor Author

ielashi commented May 13, 2024

@mraszyk Is the formal model something that you'd update yourself or should we look into that on our side?

@mraszyk

This comment was marked as resolved.

@ielashi

This comment was marked as resolved.

@mraszyk

This comment was marked as resolved.

@ielashi

This comment was marked as resolved.

@mraszyk
Copy link
Contributor

mraszyk commented May 15, 2024

We discussed this PR in yesterday's spec meeting together with @dsarlis and we concluded that it'd make more sense to guarantee that enough cycles are reserved for the hook and that the hook runs as the first message after the wasm memory limit threshold is crossed and then it's fine to treat it as a system task (rather than, e.g., call_on_cleanup).

spec/index.md Outdated Show resolved Hide resolved
spec/index.md Outdated Show resolved Hide resolved
spec/index.md Outdated Show resolved Hide resolved
spec/index.md Outdated Show resolved Hide resolved
@dragoljub-duric
Copy link
Contributor

dragoljub-duric commented Jul 2, 2024

@mraszyk @ielashi @dsarlis
Question 1:
WDYT when this hook should be executed:

  1. In every round canister is scheduled - this approach seems non-optimal because when we do not have anything to execute there is no risk of lack of memory
  2. In every round canister executes a task or message - this approach is the safest since we will ensure that canister has enough memory before every execution, but we will have a bigger overhead when we execute hook with for example every heartbeat
  3. In every round canister executes a message - this approach seems like reasonable tradeoff, but I am not sure that decreasing overhead from previous point is worth the risk

Question 2:
If canister executes multiple messages (and/or tasks) in a single round should the hook be invoked before every execution?

@Dfinity-Bjoern
Copy link
Member

Dfinity-Bjoern commented Jul 2, 2024

As described above, we suggest keeping the old semantics (specifying a limit on allocated heap memory rather than "remaining" memory), in which case the scheduling is also much clearer (just execute once when an execution of the canister exceeds the limit, immediately after the execution that causes the hook to run).

@@ -1485,7 +1501,7 @@ The comment after each function lists from where these functions may be invoked:

- `F`: from `canister_inspect_message`

- `T`: from *system task* (`canister_heartbeat` or `canister_global_timer`)
- `T`: from *system task* (`canister_heartbeat` or `canister_global_timer` or `canister_on_low_wasm_memory`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we make this of type T, then a canister could update the threshold in a setting and allocate a new page, which should make the hook trigger again. So if we guarantee immediate execution, one can build a chain of calls to the hook. So we need to make sure that immediate may mean that it executes as part of the next execution round (just that nothing else runs in between).

@dragoljub-duric
Copy link
Contributor

dragoljub-duric commented Jul 30, 2024

@Dfinity-Bjoern @mraszyk If we start keeping information about the hook as {"Condition is not satisfied", "Executed", "Ready to be executed"}, and based on that we use to determine if we will run hook, do you think we should persist that information in snapshots?
Screenshot 2024-07-30 at 15 15 20

github-merge-queue bot pushed a commit to dfinity/ic that referenced this pull request Sep 24, 2024
This PR is part of the OnLowWasmMemoryHook effort. Here we implement
checking for hook condition and propagating hook status to SystemState.
Condition for executing hook, as agreed in interface spec
[discussion](dfinity/interface-spec#286 (comment))
is:

1. In the case with memory_allocation `min(memory_allocation -
used_stable_memory, wasm_memory_limit) - used_wasm_memory`
2. Without memory allocation `wasm_memory_limit - used_wasm_memory` 

Hook status can be one of:

1. Condition is not satisfied
3. Ready for execution
4. Executed

The hook condition is checked whenever additional execution (stable or
Wasm) memory allocation is requested.

After this change, we will have all the required information of
SystemState and it will be necessary only to schedule hook execution in
the following round.

This PR will be followed with:

1. [EXC-1724](https://dfinity.atlassian.net/browse/EXC-1724) In which we
will refactor `SystemState::task_queue` to be new struct with additional
field `on_low_wasm_memory_hook_status`, to ensure that whenever
available first task that will be poped form task queue is
`OnLowWasmMemoryHook` (excluding paused executions). For that reason
`SystemState::on_low_wasm_memory_hook status` is private, and we use
`set` and `get`, so it can be easier encapsulated in the future
`TaskQueue` struct.
2. [EXC-1725](https://dfinity.atlassian.net/browse/EXC-1725) Schedule
and execute `OnLowWasmMemoryHook`


[EXC-1724]:
https://dfinity.atlassian.net/browse/EXC-1724?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ
[EXC-1725]:
https://dfinity.atlassian.net/browse/EXC-1725?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ

---------

Co-authored-by: Alin Sinpalean <[email protected]>
frankdavid pushed a commit to dfinity/ic that referenced this pull request Sep 25, 2024
This PR is part of the OnLowWasmMemoryHook effort. Here we implement
checking for hook condition and propagating hook status to SystemState.
Condition for executing hook, as agreed in interface spec
[discussion](dfinity/interface-spec#286 (comment))
is:

1. In the case with memory_allocation `min(memory_allocation -
used_stable_memory, wasm_memory_limit) - used_wasm_memory`
2. Without memory allocation `wasm_memory_limit - used_wasm_memory` 

Hook status can be one of:

1. Condition is not satisfied
3. Ready for execution
4. Executed

The hook condition is checked whenever additional execution (stable or
Wasm) memory allocation is requested.

After this change, we will have all the required information of
SystemState and it will be necessary only to schedule hook execution in
the following round.

This PR will be followed with:

1. [EXC-1724](https://dfinity.atlassian.net/browse/EXC-1724) In which we
will refactor `SystemState::task_queue` to be new struct with additional
field `on_low_wasm_memory_hook_status`, to ensure that whenever
available first task that will be poped form task queue is
`OnLowWasmMemoryHook` (excluding paused executions). For that reason
`SystemState::on_low_wasm_memory_hook status` is private, and we use
`set` and `get`, so it can be easier encapsulated in the future
`TaskQueue` struct.
2. [EXC-1725](https://dfinity.atlassian.net/browse/EXC-1725) Schedule
and execute `OnLowWasmMemoryHook`


[EXC-1724]:
https://dfinity.atlassian.net/browse/EXC-1724?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ
[EXC-1725]:
https://dfinity.atlassian.net/browse/EXC-1725?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ

---------

Co-authored-by: Alin Sinpalean <[email protected]>
levifeldman pushed a commit to levifeldman/ic that referenced this pull request Oct 1, 2024
…ity#667)

This PR is part of the OnLowWasmMemoryHook effort. Here we implement
checking for hook condition and propagating hook status to SystemState.
Condition for executing hook, as agreed in interface spec
[discussion](dfinity/interface-spec#286 (comment))
is:

1. In the case with memory_allocation `min(memory_allocation -
used_stable_memory, wasm_memory_limit) - used_wasm_memory`
2. Without memory allocation `wasm_memory_limit - used_wasm_memory` 

Hook status can be one of:

1. Condition is not satisfied
3. Ready for execution
4. Executed

The hook condition is checked whenever additional execution (stable or
Wasm) memory allocation is requested.

After this change, we will have all the required information of
SystemState and it will be necessary only to schedule hook execution in
the following round.

This PR will be followed with:

1. [EXC-1724](https://dfinity.atlassian.net/browse/EXC-1724) In which we
will refactor `SystemState::task_queue` to be new struct with additional
field `on_low_wasm_memory_hook_status`, to ensure that whenever
available first task that will be poped form task queue is
`OnLowWasmMemoryHook` (excluding paused executions). For that reason
`SystemState::on_low_wasm_memory_hook status` is private, and we use
`set` and `get`, so it can be easier encapsulated in the future
`TaskQueue` struct.
2. [EXC-1725](https://dfinity.atlassian.net/browse/EXC-1725) Schedule
and execute `OnLowWasmMemoryHook`


[EXC-1724]:
https://dfinity.atlassian.net/browse/EXC-1724?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ
[EXC-1725]:
https://dfinity.atlassian.net/browse/EXC-1725?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ

---------

Co-authored-by: Alin Sinpalean <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants