Memory Reserve for Queries and Upgrades (#4158)
# Memory Reserve for Queries and Upgrades


## Current Situation
With the incremental GC, programs can scale to the full 4 GB memory space. As a consequence, update calls can allocate the entire memory space, such that even a simple query, a simple composite query, or simple upgrade logic can no longer succeed. This could also happen with classical non-copying GCs if deterministic time slicing is sufficiently extended.

## Provisional Solution 
This PR prevents update calls (including canister initialization, heartbeats, and timers) from allocating the full memory by leaving a reserve for queries, composite queries, and canister upgrades. During queries, composite queries, and canister upgrades, garbage collection is suspended, such that the reserve is available to the mutator code. Callbacks of composite queries can also use the memory reserve.
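
The policy described above can be sketched in a few lines of Rust. This is a simplified model, not the RTS code: `CallKind`, `keeps_reserve`, `allocation_limit`, and `can_grow_to` are hypothetical names, heartbeats and timers are treated as update calls as stated above, and the single 64 KiB Wasm page spared outside update calls is taken from the `grow_memory` change in this commit.

```rust
/// Hypothetical call kinds, named after the cases listed in the description.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum CallKind {
    Init,
    Update,
    Heartbeat,
    Timer,
    Query,
    CompositeQuery,
    Upgrade,
}

/// Size of the 32-bit Wasm address space in bytes.
pub const ADDRESS_SPACE: u64 = 1 << 32;
/// Reserve kept free during update-like calls (256 MiB, as in this commit).
pub const MEMORY_RESERVE: u64 = 256 * 1024 * 1024;
/// Wasm page size (64 KiB).
pub const WASM_PAGE_SIZE: u64 = 64 * 1024;

/// Update-like calls must leave the reserve untouched.
pub fn keeps_reserve(kind: CallKind) -> bool {
    matches!(
        kind,
        CallKind::Init | CallKind::Update | CallKind::Heartbeat | CallKind::Timer
    )
}

/// Highest address up to which memory may be grown for the given call kind.
pub fn allocation_limit(kind: CallKind) -> u64 {
    if keeps_reserve(kind) {
        ADDRESS_SPACE - MEMORY_RESERVE
    } else {
        // Queries, composite queries, and upgrades may use the reserve,
        // but still spare the last Wasm page.
        ADDRESS_SPACE - WASM_PAGE_SIZE
    }
}

/// Returns true if growing memory up to (but excluding) `ptr` is allowed.
pub fn can_grow_to(kind: CallKind, ptr: u64) -> bool {
    ptr <= allocation_limit(kind)
}
```

In this model, an update call that tries to grow past `0xF000_0000` is rejected, while the same request succeeds during a query or an upgrade.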

The current allocation limit for update calls is 3.75 GB and applies to all GCs. For the incremental GC, this gives an effective reserve of 224 MB, because the last 32 MB partition is left unallocated in the current design. The scheduling heuristics of the incremental and generational GCs need to account for the reduced capacity when detecting memory shortage.
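
The figures above follow from simple arithmetic. The sketch below recomputes them, together with the adjusted GC thresholds from this commit. The function and constant names are illustrative and do not appear in the RTS.

```rust
const MIB: u64 = 1024 * 1024;
/// Full 32-bit Wasm address space (4 GiB).
const WASM32_ADDRESS_SPACE: u64 = 4096 * MIB;
/// Reserve introduced by this commit (256 MiB).
const RESERVE_BYTES: u64 = 256 * MIB;
/// Last partition, which the incremental GC leaves unallocated (32 MiB).
const LAST_PARTITION_BYTES: u64 = 32 * MIB;

/// Allocation limit during update calls: 4 GiB minus the reserve.
pub fn update_allocation_limit() -> u64 {
    WASM32_ADDRESS_SPACE - RESERVE_BYTES
}

/// Reserve effectively usable under the incremental GC.
pub fn incremental_gc_reserve() -> u64 {
    RESERVE_BYTES - LAST_PARTITION_BYTES
}

/// Critical limit of the generational GC after subtracting the reserve.
pub fn generational_critical_limit() -> u64 {
    (4096 - 512) * MIB - RESERVE_BYTES
}

/// Critical heap limit of the incremental GC after subtracting the reserve.
pub fn incremental_critical_limit() -> u64 {
    u64::from(u32::MAX) - 768 * MIB - RESERVE_BYTES
}

/// Converts a byte count to whole MiB.
pub fn to_mib(bytes: u64) -> u64 {
    bytes / MIB
}
```

Here `to_mib(update_allocation_limit())` evaluates to 3840 (3.75 GB) and `to_mib(incremental_gc_reserve())` to 224.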

This PR can be viewed as a temporary measure until a memory reserve is implemented in the IC runtime system.

## Important Note
The memory reserve may **not** be sufficient for complex canister upgrade logic or a large amount of stable heap data. Moreover, the upgrade logic may exceed the instruction limit. Thorough upgrade testing is required for canisters in any case.
luc-blaeser authored Sep 15, 2023
1 parent c8b4125 commit 01e2f02
Showing 26 changed files with 275 additions and 28 deletions.
2 changes: 1 addition & 1 deletion rts/motoko-rts/src/gc/generational.rs
@@ -118,7 +118,7 @@ static mut OLD_GENERATION_THRESHOLD: usize = 32 * 1024 * 1024;
static mut PASSED_CRITICAL_LIMIT: bool = false;

#[cfg(feature = "ic")]
-const CRITICAL_MEMORY_LIMIT: usize = (4096 - 512) * 1024 * 1024;
+const CRITICAL_MEMORY_LIMIT: usize = (4096 - 512) * 1024 * 1024 - crate::memory::MEMORY_RESERVE;

#[cfg(feature = "ic")]
unsafe fn decide_strategy(limits: &Limits) -> Option<Strategy> {
5 changes: 3 additions & 2 deletions rts/motoko-rts/src/gc/incremental.rs
@@ -70,9 +70,10 @@ static mut LAST_ALLOCATIONS: Bytes<u64> = Bytes(0u64);
#[cfg(feature = "ic")]
unsafe fn should_start() -> bool {
use self::partitioned_heap::PARTITION_SIZE;
-use crate::memory::ic::partitioned_memory;
+use crate::memory::{ic::partitioned_memory, MEMORY_RESERVE};

-const CRITICAL_HEAP_LIMIT: Bytes<u32> = Bytes(u32::MAX - 768 * 1024 * 1024);
+const CRITICAL_HEAP_LIMIT: Bytes<u32> =
+Bytes(u32::MAX - 768 * 1024 * 1024 - MEMORY_RESERVE as u32);
const CRITICAL_GROWTH_THRESHOLD: f64 = 0.01;
const NORMAL_GROWTH_THRESHOLD: f64 = 0.65;

5 changes: 5 additions & 0 deletions rts/motoko-rts/src/memory.rs
@@ -7,6 +7,11 @@ use crate::types::*;

use motoko_rts_macros::ic_mem_fn;

+// Memory reserve in bytes ensured during update and initialization calls.
+// For use by queries and upgrade calls.
+#[cfg(feature = "ic")]
+pub(crate) const MEMORY_RESERVE: usize = 256 * 1024 * 1024;

/// A trait for heap allocation. RTS functions allocate in heap via this trait.
///
/// To be able to link the RTS with moc-generated code, we implement wrappers around allocating
19 changes: 14 additions & 5 deletions rts/motoko-rts/src/memory/ic.rs
@@ -7,6 +7,7 @@ pub mod partitioned_memory;

use super::Memory;
use crate::constants::WASM_PAGE_SIZE;
+use crate::memory::MEMORY_RESERVE;
use crate::rts_trap_with;
use crate::types::{Bytes, Value};
use core::arch::wasm32;
@@ -16,6 +17,7 @@ use motoko_rts_macros::*;
extern "C" {
fn get_heap_base() -> usize;
pub(crate) fn get_static_roots() -> Value;
+fn keep_memory_reserve() -> bool;
}

pub(crate) unsafe fn get_aligned_heap_base() -> usize {
@@ -35,12 +37,19 @@ unsafe extern "C" fn get_max_live_size() -> Bytes<u32> {
/// `Memory` implementation allocates in Wasm heap with Wasm `memory.grow` instruction.
pub struct IcMemory;

-/// Page allocation. Ensures that the memory up to, but excluding, the given pointer is allocated,
-/// with the slight exception of not allocating the extra page for address 0xFFFF_0000.
+/// Page allocation. Ensures that the memory up to, but excluding, the given pointer is allocated.
+/// Ensure a memory reserve of at least one Wasm page depending on the canister state.
unsafe fn grow_memory(ptr: u64) {
-debug_assert_eq!(0xFFFF_0000, usize::MAX - WASM_PAGE_SIZE.as_usize() + 1);
-if ptr > 0xFFFF_0000 {
-// spare the last wasm memory page
+const LAST_PAGE_LIMIT: usize = 0xFFFF_0000;
+debug_assert_eq!(LAST_PAGE_LIMIT, usize::MAX - WASM_PAGE_SIZE.as_usize() + 1);
+let limit = if keep_memory_reserve() {
+// Spare a memory reserve during update and initialization calls for use by queries and upgrades.
+usize::MAX - MEMORY_RESERVE + 1
+} else {
+// Spare the last Wasm memory page on queries and upgrades to support the Rust call stack boundary checks.
+LAST_PAGE_LIMIT
+};
+if ptr > limit as u64 {
rts_trap_with("Cannot grow memory")
};
let page_size = u64::from(WASM_PAGE_SIZE.as_u32());
48 changes: 42 additions & 6 deletions src/codegen/compile.ml
@@ -4371,7 +4371,7 @@ module Lifecycle = struct
| InPreUpgrade -> [Idle]
| PostPreUpgrade -> [InPreUpgrade]
| InPostUpgrade -> [InInit]
-| InComposite -> [Idle]
+| InComposite -> [Idle; InComposite]

let get env =
compile_unboxed_const (ptr ()) ^^
@@ -4398,6 +4398,10 @@ module Lifecycle = struct
set env new_state
)

+let is_in env state =
+get env ^^
+compile_eq_const (int_of_state state)

end (* Lifecycle *)


@@ -5542,6 +5546,22 @@ module RTS_Exports = struct
edesc = nr (FuncExport (nr rts_trap_fi))
});

+(* Keep a memory reserve when in update or init state.
+This reserve can be used by queries, composite queries, and upgrades. *)
+let keep_memory_reserve_fi = E.add_fun env "keep_memory_reserve" (
+Func.of_body env [] [I32Type] (fun env ->
+Lifecycle.get env ^^
+compile_eq_const Lifecycle.(int_of_state InUpdate) ^^
+Lifecycle.get env ^^
+compile_eq_const Lifecycle.(int_of_state InInit) ^^
+G.i (Binary (Wasm.Values.I32 I32Op.Or))
+)
+) in
+E.add_export env (nr {
+name = Lib.Utf8.decode "keep_memory_reserve";
+edesc = nr (FuncExport (nr keep_memory_reserve_fi))
+});

if !Flags.gc_strategy <> Flags.Incremental then
begin
let set_hp_fi =
@@ -8459,9 +8479,25 @@ module FuncDec = struct
| Type.Shared Type.Query ->
Lifecycle.trans env Lifecycle.PostQuery
| Type.Shared Type.Composite ->
-Lifecycle.trans env Lifecycle.Idle
+(* Stay in composite query state such that callbacks of
+composite queries can also use the memory reserve.
+The state is isolated since memory changes of queries
+are rolled back by the IC runtime system. *)
+Lifecycle.trans env Lifecycle.InComposite
| _ -> assert false

+let callback_start env =
+Lifecycle.is_in env Lifecycle.InComposite ^^
+G.if0
+(G.nop)
+(message_start env (Type.Shared Type.Write))
+
+let callback_cleanup env =
+Lifecycle.is_in env Lifecycle.InComposite ^^
+G.if0
+(G.nop)
+(message_cleanup env (Type.Shared Type.Write))

let compile_const_message outer_env outer_ae sort control args mk_body ret_tys at : E.func_with_names =
let ae0 = VarEnv.mk_fun_ae outer_ae in
Func.of_body outer_env [] [] (fun env -> G.with_region at (
@@ -8618,7 +8654,7 @@ module FuncDec = struct
(fun env -> compile_unboxed_const 0l)))
in
Func.define_built_in env reply_name ["env", I32Type] [] (fun env ->
-message_start env (Type.Shared Type.Write) ^^
+callback_start env ^^
(* Look up continuation *)
let (set_closure, get_closure) = new_local env "closure" in
G.i (LocalGet (nr 0l)) ^^
@@ -8634,12 +8670,12 @@ module FuncDec = struct
get_closure ^^
Closure.call_closure env arity 0 ^^

-message_cleanup env (Type.Shared Type.Write)
+callback_cleanup env
);

let reject_name = "@reject_callback" in
Func.define_built_in env reject_name ["env", I32Type] [] (fun env ->
-message_start env (Type.Shared Type.Write) ^^
+callback_start env ^^
(* Look up continuation *)
let (set_closure, get_closure) = new_local env "closure" in
G.i (LocalGet (nr 0l)) ^^
@@ -8656,7 +8692,7 @@ module FuncDec = struct
get_closure ^^
Closure.call_closure env 1 0 ^^

-message_cleanup env (Type.Shared Type.Write)
+callback_cleanup env
);

(* result is a function that accepts a list of closure getters, from which
2 changes: 1 addition & 1 deletion test/bench/ok/alloc.drun-run-opt.ok
@@ -1,6 +1,6 @@
ingress Completed: Reply: 0x4449444c016c01b3c4b1f204680100010a00000000000000000101
ingress Completed: Reply: 0x4449444c0000
-debug.print: (+268_435_456, 2_432_885_039)
+debug.print: (+268_435_456, 2_432_930_095)
ingress Completed: Reply: 0x4449444c0000
debug.print: (+268_435_456, 2_432_749_871)
ingress Completed: Reply: 0x4449444c0000
2 changes: 1 addition & 1 deletion test/bench/ok/alloc.drun-run.ok
@@ -1,6 +1,6 @@
ingress Completed: Reply: 0x4449444c016c01b3c4b1f204680100010a00000000000000000101
ingress Completed: Reply: 0x4449444c0000
-debug.print: (+268_435_456, 2_500_059_458)
+debug.print: (+268_435_456, 2_500_108_610)
ingress Completed: Reply: 0x4449444c0000
debug.print: (+268_435_456, 2_499_907_906)
ingress Completed: Reply: 0x4449444c0000
4 changes: 2 additions & 2 deletions test/bench/ok/bignum.drun-run-opt.ok
@@ -1,6 +1,6 @@
ingress Completed: Reply: 0x4449444c016c01b3c4b1f204680100010a00000000000000000101
ingress Completed: Reply: 0x4449444c0000
-debug.print: {cycles = 2_389_389; size = +59_652}
+debug.print: {cycles = 2_389_400; size = +59_652}
ingress Completed: Reply: 0x4449444c0000
-debug.print: {cycles = 102_989_128; size = +1_817_872}
+debug.print: {cycles = 102_989_425; size = +1_817_872}
ingress Completed: Reply: 0x4449444c0000
4 changes: 2 additions & 2 deletions test/bench/ok/bignum.drun-run.ok
@@ -1,6 +1,6 @@
ingress Completed: Reply: 0x4449444c016c01b3c4b1f204680100010a00000000000000000101
ingress Completed: Reply: 0x4449444c0000
-debug.print: {cycles = 2_493_575; size = +59_652}
+debug.print: {cycles = 2_493_587; size = +59_652}
ingress Completed: Reply: 0x4449444c0000
-debug.print: {cycles = 103_046_049; size = +1_817_872}
+debug.print: {cycles = 103_046_373; size = +1_817_872}
ingress Completed: Reply: 0x4449444c0000
4 changes: 2 additions & 2 deletions test/bench/ok/heap-32.drun-run-opt.ok
@@ -1,5 +1,5 @@
ingress Completed: Reply: 0x4449444c016c01b3c4b1f204680100010a00000000000000000101
ingress Completed: Reply: 0x4449444c0000
-debug.print: (50_227, +30_261_252, 620_394_287)
-debug.print: (50_070, +32_992_212, 671_304_488)
+debug.print: (50_227, +30_261_252, 620_399_314)
+debug.print: (50_070, +32_992_212, 671_309_966)
ingress Completed: Reply: 0x4449444c0000
4 changes: 2 additions & 2 deletions test/bench/ok/heap-32.drun-run.ok
@@ -1,5 +1,5 @@
ingress Completed: Reply: 0x4449444c016c01b3c4b1f204680100010a00000000000000000101
ingress Completed: Reply: 0x4449444c0000
-debug.print: (50_227, +30_261_252, 667_797_463)
-debug.print: (50_070, +32_992_212, 720_521_360)
+debug.print: (50_227, +30_261_252, 667_802_947)
+debug.print: (50_070, +32_992_212, 720_527_336)
ingress Completed: Reply: 0x4449444c0000
6 changes: 6 additions & 0 deletions test/run-drun-non-ci/memory-reserve-composite.drun
@@ -0,0 +1,6 @@
# INCREMENTAL-GC-ONLY
# SKIP ic-ref-run
install $ID memory-reserve-composite/memory-reserve-composite.mo ""
ingress $ID prepare1 "DIDL\x00\x00"
ingress $ID prepare2 "DIDL\x00\x00"
query $ID allocateInCompositeQuery "DIDL\x00\x00"
@@ -0,0 +1,39 @@
import Prim "mo:⛔";
actor {
stable var stableData = Prim.Array_tabulate<Nat>(1024 * 1024, func(index) { index });
var array0 : [var Nat] = [var];
var array1 : [var Nat] = [var];
var array2 : [var Nat] = [var];
var array3 : [var Nat] = [var];
Prim.debugPrint("Initialized " # debug_show (Prim.rts_memory_size()));

public func prepare1() : async () {
array0 := Prim.Array_init<Nat>(256 * 1024 * 1024, 0); // 1GB
array1 := Prim.Array_init<Nat>(256 * 1024 * 1024, 1); // 2GB
Prim.debugPrint("Prepared1 " # debug_show (Prim.rts_memory_size()));
};

public func prepare2() : async () {
array2 := Prim.Array_init<Nat>(256 * 1024 * 1024, 2); // 3GB
array3 := Prim.Array_init<Nat>(150 * 1024 * 1024, 3); // around 3.75GB
Prim.debugPrint("Prepared2 " # debug_show (Prim.rts_memory_size()));
};

public composite query func allocateInCompositeQuery() : async () {
ignore Prim.Array_init<Nat>(50 * 1024 * 1024, 4);
Prim.debugPrint("Composite query call " # debug_show (Prim.rts_memory_size()));
assert (Prim.rts_memory_size() > 3840 * 1024 * 1024);
await nestedQuery();
ignore Prim.Array_init<Nat>(5 * 1024 * 1024, 4);
Prim.debugPrint("Composite query callback " # debug_show (Prim.rts_memory_size()));
assert (Prim.rts_memory_size() > 3840 * 1024 * 1024);
};

public query func nestedQuery() : async () {
Prim.debugPrint("Nested query " # debug_show (Prim.rts_memory_size()));
};
};

//SKIP run
//SKIP run-ir
//SKIP run-low
6 changes: 6 additions & 0 deletions test/run-drun-non-ci/memory-reserve-query.drun
@@ -0,0 +1,6 @@
# INCREMENTAL-GC-ONLY
# SKIP ic-ref-run
install $ID memory-reserve-query/memory-reserve-query.mo ""
ingress $ID prepare1 "DIDL\x00\x00"
ingress $ID prepare2 "DIDL\x00\x00"
query $ID allocateInQuery "DIDL\x00\x00"
31 changes: 31 additions & 0 deletions test/run-drun-non-ci/memory-reserve-query/memory-reserve-query.mo
@@ -0,0 +1,31 @@
import Prim "mo:⛔";
actor {
stable var stableData = Prim.Array_tabulate<Nat>(1024 * 1024, func(index) { index });
var array0 : [var Nat] = [var];
var array1 : [var Nat] = [var];
var array2 : [var Nat] = [var];
var array3 : [var Nat] = [var];
Prim.debugPrint("Initialized " # debug_show (Prim.rts_memory_size()));

public func prepare1() : async () {
array0 := Prim.Array_init<Nat>(256 * 1024 * 1024, 0); // 1GB
array1 := Prim.Array_init<Nat>(256 * 1024 * 1024, 1); // 2GB
Prim.debugPrint("Prepared1 " # debug_show (Prim.rts_memory_size()));
};

public func prepare2() : async () {
array2 := Prim.Array_init<Nat>(256 * 1024 * 1024, 2); // 3GB
array3 := Prim.Array_init<Nat>(150 * 1024 * 1024, 3); // around 3.75GB
Prim.debugPrint("Prepared2 " # debug_show (Prim.rts_memory_size()));
};

public query func allocateInQuery() : async () {
ignore Prim.Array_init<Nat>(50 * 1024 * 1024, 4);
Prim.debugPrint("Query call " # debug_show (Prim.rts_memory_size()));
assert (Prim.rts_memory_size() > 3840 * 1024 * 1024);
};
};

//SKIP run
//SKIP run-ir
//SKIP run-low
6 changes: 6 additions & 0 deletions test/run-drun-non-ci/memory-reserve-update.drun
@@ -0,0 +1,6 @@
# INCREMENTAL-GC-ONLY
# SKIP ic-ref-run
install $ID memory-reserve-update/memory-reserve-update.mo ""
ingress $ID prepare1 "DIDL\x00\x00"
ingress $ID prepare2 "DIDL\x00\x00"
ingress $ID allocateInUpdate "DIDL\x00\x00"
@@ -0,0 +1,30 @@
import Prim "mo:⛔";
actor {
stable var stableData = Prim.Array_tabulate<Nat>(1024 * 1024, func(index) { index });
var array0 : [var Nat] = [var];
var array1 : [var Nat] = [var];
var array2 : [var Nat] = [var];
var array3 : [var Nat] = [var];
Prim.debugPrint("Initialized " # debug_show (Prim.rts_memory_size()));

public func prepare1() : async () {
array0 := Prim.Array_init<Nat>(256 * 1024 * 1024, 0); // 1GB
array1 := Prim.Array_init<Nat>(256 * 1024 * 1024, 1); // 2GB
Prim.debugPrint("Prepared1 " # debug_show (Prim.rts_memory_size()));
};

public func prepare2() : async () {
array2 := Prim.Array_init<Nat>(256 * 1024 * 1024, 2); // 3GB
array3 := Prim.Array_init<Nat>(150 * 1024 * 1024, 3); // around 3.75GB
Prim.debugPrint("Prepared2 " # debug_show (Prim.rts_memory_size()));
};

public func allocateInUpdate() : async () {
Prim.debugPrint("Update call " # debug_show (Prim.rts_memory_size()));
ignore Prim.Array_init<Nat>(50 * 1024 * 1024, 4);
};
};

//SKIP run
//SKIP run-ir
//SKIP run-low
6 changes: 6 additions & 0 deletions test/run-drun-non-ci/memory-reserve-upgrade.drun
@@ -0,0 +1,6 @@
# INCREMENTAL-GC-ONLY
# SKIP ic-ref-run
install $ID memory-reserve-upgrade/memory-reserve-upgrade.mo ""
ingress $ID prepare1 "DIDL\x00\x00"
ingress $ID prepare2 "DIDL\x00\x00"
upgrade $ID memory-reserve-upgrade/memory-reserve-upgrade.mo ""
@@ -0,0 +1,25 @@
import Prim "mo:⛔";
actor {
stable var stableData = Prim.Array_tabulate<Nat>(1024 * 1024, func(index) { index });
var array0 : [var Nat] = [var];
var array1 : [var Nat] = [var];
var array2 : [var Nat] = [var];
var array3 : [var Nat] = [var];
Prim.debugPrint("Initialized " # debug_show (Prim.rts_memory_size()));

public func prepare1() : async () {
array0 := Prim.Array_init<Nat>(256 * 1024 * 1024, 0); // 1GB
array1 := Prim.Array_init<Nat>(256 * 1024 * 1024, 1); // 2GB
Prim.debugPrint("Prepared1 " # debug_show (Prim.rts_memory_size()));
};

public func prepare2() : async () {
array2 := Prim.Array_init<Nat>(256 * 1024 * 1024, 2); // 3GB
array3 := Prim.Array_init<Nat>(150 * 1024 * 1024, 3); // around 3.75GB
Prim.debugPrint("Prepared2 " # debug_show (Prim.rts_memory_size()));
};
};

//SKIP run
//SKIP run-ir
//SKIP run-low
11 changes: 11 additions & 0 deletions test/run-drun/ok/memory-reserve-composite.drun.ok
@@ -0,0 +1,11 @@
ingress Completed: Reply: 0x4449444c016c01b3c4b1f204680100010a00000000000000000101
debug.print: Initialized 33_554_432
ingress Completed: Reply: 0x4449444c0000
debug.print: Prepared1 2_248_146_944
ingress Completed: Reply: 0x4449444c0000
debug.print: Prepared2 4_026_531_840
ingress Completed: Reply: 0x4449444c0000
debug.print: Composite query call 4_261_412_864
debug.print: Nested query 4_026_531_840
debug.print: Composite query callback 4_261_412_864
Ok: Reply: 0x4449444c0000