Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: delay() function for marking key points in the flame graph #141

Open
andrewmarx opened this issue Mar 3, 2023 · 7 comments
Open

Comments

@andrewmarx
Copy link

andrewmarx commented Mar 3, 2023

I've written a function that might be worth including in this package (or something similar). I sometimes end up in a situation where it's very difficult to distinguish between parts of a function inside the flame graph (especially because the bulk of my code is just loops, assignments, and basic operators), so I have to add code to help visually separate sections of the function so I can figure out which part actually needs optimization. To help with that, I've created the following function that I can call at key points in the code I'm trying to profile:

  delay <- function(text, t = 1) {

    assign(text, function() {
      t0 <- Sys.time()
      while (Sys.time() - t0 < t) { }
    })

    do.call(text, list())
  }

The text parameter gets assigned as a function name that will then show up in the flame graph. Here's an example:
Screenshot from 2023-03-03 13-59-15

Basically, I called delay("CRW Lookup") and delay("Build TR") in places that delimit important sections. I thought my "CRW Lookup" section might be a possible bottleneck because it's a complex nested loop, but it turns out it's the opposite case, and there was no way I could ever figure this out based on the rest of the flame graph data.

Edit: calling it something like marker() might make more sense

@andrewmarx
Copy link
Author

Some extra thoughts to make this a more powerful feature:

  • If the flame graph viewer could automatically recognize the markers and replace them in the graph with visual breaks and ignore the time they take up when calculating the total run time
  • Instead of peppering the function code with function calls, make it a feature activated from within RStudio itself. Possibly something similar to RStudio's breakpoints.

@hadley
Copy link
Member

hadley commented Aug 21, 2024

I'm not sure I quite get this without more context. Why not just use a regular function call without adding any additional delay?

@andrewmarx
Copy link
Author

Here's an example of a "fast" call to one of my functions:
flamegraph

It has five major sections that potentially could be the focus of optimization, but at a high level, I can only really identify where the first one is and how long it takes to run. The other four are indistinguishable because they mainly consist of basic logic, loops, and math that don't show anything distinguishable on the graph, with just a few function calls that barely show. In order to figure out where the slow sections are, I spend a lot of time zooming in/out and dragging side-to-side to try and find useful landmarks for each section of code.

This is an example of using the delay function (colored in with red):
keypoints
By having the delay function run for a second, it's large enough that I can see where each of the five sections starts (the last cleanup section is basically nothing). If I use a function that just does nothing and exits, it wouldn't be visible at this scale. The tunable time parameter lets me make sure it's always easy to see at large time scales.

Now, if there was a function like this built in and there was a way for profvis to recognize it in the call graph, conceivably, it would automatically be able to scale it with the zoom level, then it would just need to run long enough for profvis to catch it and something like the time parameter wouldn't be needed

@hadley
Copy link
Member

hadley commented Aug 22, 2024

Could you try this?

p <- profvis(...)
print(p, aggregate = TRUE)

I think that's likely to give you a more useful display by pulling together the repeated function calls.

(It's not very well documented currently but fixing that is on my to do list)

@andrewmarx
Copy link
Author

aggregate

I could definitely see that being a useful feature.

But in a case like mine, the time and memory costs aren't in function calls. It's mostly in complicated loops calculating values for extremely large vectors without unnecessary memory allocations. Most of the optimization is in how indices are calculated, memory is accessed, and how the values are calculated. My delay function has been helpful because it gives me a way to quickly visualize which loops take the longest to run and need the most work.

@hadley
Copy link
Member

hadley commented Aug 23, 2024

If that's all you're using it for, isn't that a single that you should just pull out some named functions to make the structure of your code more clear to profvis? I'm not convinced that this is a general problem that profvis needs to provide help for.

@andrewmarx
Copy link
Author

In general, that would be an option. In my case, removing the chunks of code from their own functions was itself a significant optimization because of R's copy behavior. It's also now at a point where many bits and pieces are precalculated, and putting it back into a dedicated function would involve passing in about a dozen objects.

To be clear, I don't view this as a problem that needs to be solved for me. Rather, it's a feature suggestion that maybe some other people might find useful. My approach has been helpful for me, but the idea of injecting markers into the flamegraph/callstack could be way more elegant if it were a package feature. Something like a marker(text, color) function that doesn't need fiddling with a delay parameter because Profvis recognizes it as special and creates cleaner markers than what I'm doing. Like this:

flamegraph_markers

If you decide it's not worth implementing, that's fine. It's not a critical feature, and I wouldn't fight it; I'm just hoping I've managed to convey the idea properly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants