[PAL/vm-common] Check the "break" variable less frequently in delay()
Previously, the `delay()` function read the "break out of loop early"
variable `continue_gate` on every iteration of its busy-wait loop. This
variable is typically a global, which causes high contention on
multi-core workloads. This manifested, for example, in the Candle
Quantized LLaMA app.

This commit fixes this by checking the variable less frequently.
The current heuristic is to check it every 1 ms.

Signed-off-by: dimstav23 <[email protected]>
dimakuv authored and dimstav23 committed Jul 31, 2024
1 parent f3e58b4 commit 4a973e6
Showing 1 changed file with 7 additions and 2 deletions.
9 changes: 7 additions & 2 deletions pal/src/host/vm-common/kernel_time.c
@@ -58,9 +58,14 @@ int delay(uint64_t delay_us, bool* continue_gate) {
     uint64_t curr_tsc = get_tsc();
     uint64_t wait_until_tsc = curr_tsc + delay_us * g_tsc_mhz;

+    uint64_t next_gate_check_tsc = curr_tsc + 1000 * g_tsc_mhz; /* check every 1ms */
+
     while (curr_tsc < wait_until_tsc) {
-        if (continue_gate && __atomic_load_n(continue_gate, __ATOMIC_ACQUIRE))
-            break;
+        if (curr_tsc > next_gate_check_tsc) {
+            if (continue_gate && __atomic_load_n(continue_gate, __ATOMIC_ACQUIRE))
+                break;
+            next_gate_check_tsc = curr_tsc + 1000 * g_tsc_mhz;
+        }
         CPU_RELAX();
         curr_tsc = get_tsc();
     }
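
To get a rough feel for how much the 1 ms heuristic reduces gate traffic, the loop above can be replayed against a simulated clock. The sketch below is not part of the commit: `count_gate_loads`, the `check_period_us` parameter, and the "one TSC tick per iteration" clock are all hypothetical stand-ins for `get_tsc()` and `g_tsc_mhz`, chosen only to make the arithmetic easy to follow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of the commit: replays the rate-limited loop
 * against a simulated clock and returns how many times the gate was loaded. */
uint64_t count_gate_loads(uint64_t delay_us, uint64_t tsc_mhz,
                          uint64_t check_period_us, bool* continue_gate) {
    assert(tsc_mhz > 0 && check_period_us > 0);

    uint64_t curr_tsc = 0; /* simulated TSC; the real code calls get_tsc() */
    uint64_t wait_until_tsc = curr_tsc + delay_us * tsc_mhz;
    uint64_t next_gate_check_tsc = curr_tsc + check_period_us * tsc_mhz;
    uint64_t gate_loads = 0;

    while (curr_tsc < wait_until_tsc) {
        if (curr_tsc > next_gate_check_tsc) {
            if (continue_gate) {
                gate_loads++; /* one atomic load of the shared variable */
                if (__atomic_load_n(continue_gate, __ATOMIC_ACQUIRE))
                    break;
            }
            next_gate_check_tsc = curr_tsc + check_period_us * tsc_mhz;
        }
        curr_tsc++; /* assumption: each loop iteration costs exactly one tick */
    }
    return gate_loads;
}
```

Under these assumptions (1 tick per microsecond, 1000 us check period), a 5000 us delay with an unset gate loads the variable 4 times, whereas a loop that checks on every iteration would load it 5000 times. The sketch also shows a side effect worth keeping in mind: a delay shorter than the check period returns 0, meaning the gate is never consulted during that delay.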