Regions have non-trivial overhead #382

Open
Kixiron opened this issue Apr 13, 2021 · 0 comments

Kixiron commented Apr 13, 2021

Regions have non-trivial overhead and cost far more than they should. Ideally a nested region (as opposed to a scope) should be nearly free apart from the cost of logging, but running each of the three versions of the example below shows dramatic performance differences between them:

| | Version 1 (2 regions) | Version 2 (1 region) | Version 3 (no regions) |
|---|---|---|---|
| 30,000 iterations | 12s | 8s | 5s |

I know there is extra work going on because of the .enter() and .leave() calls as well as the region subgraph operators themselves, but it's still a significant difference, and it becomes even more apparent in larger applications.
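To put the timings above in per-round terms, here is a rough back-of-the-envelope sketch. It assumes the 5s no-region run is a fair baseline and that the reported totals are rounded wall-clock times, so the results are estimates only. The helper name `overhead_per_round` is made up for illustration and is not part of timely.

```rust
use std::time::Duration;

/// Extra wall-clock time per round relative to a baseline run.
/// (Hypothetical helper for illustration; not a timely API.)
fn overhead_per_round(total: Duration, baseline: Duration, rounds: u32) -> Duration {
    total.saturating_sub(baseline) / rounds
}

fn main() {
    // Figures taken from the timings reported in this issue.
    let rounds = 30_000;
    let baseline = Duration::from_secs(5); // Version 3, no regions

    let runs = [
        ("1 region", Duration::from_secs(8)),   // Version 2
        ("2 regions", Duration::from_secs(12)), // Version 1
    ];

    for (label, total) in runs {
        let extra = overhead_per_round(total, baseline, rounds);
        println!("{}: ~{} µs extra per round", label, extra.as_micros());
    }
}
```

Under those assumptions this works out to roughly 100 µs of extra time per round for one region and roughly 233 µs per round for two nested regions, for a dataflow that otherwise only inspects and probes its input.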

use timely::dataflow::{
    operators::{Enter, Input, Inspect, Leave, Probe},
    InputHandle, ProbeHandle, Scope,
};

fn main() {
    timely::execute_from_args(std::env::args(), |worker| {
        let index = worker.index();
        let mut input = InputHandle::new();
        let mut probe = ProbeHandle::new();

        worker.dataflow(|scope| {
            let data = scope.input_from(&mut input);

            // Version 1: two nested regions
            // (keep only one of the three versions enabled per run)
            scope
                .region(|inner| {
                    let data = data.enter(inner);
                    inner.region(|inner2| data.enter(inner2).leave()).leave()
                })
                .inspect(move |x| println!("worker {}:\thello {}", index, x))
                .probe_with(&mut probe);

            // Version 2: a single region
            scope
                .region(|inner| {
                    data.enter(inner).leave()
                })
                .inspect(move |x| println!("worker {}:\thello {}", index, x))
                .probe_with(&mut probe);

            // Version 3: no regions (baseline)
            data
                .inspect(move |x| println!("worker {}:\thello {}", index, x))
                .probe_with(&mut probe);
        });

        for round in 0..30000 {
            if index == 0 {
                input.send(round);
            }

            input.advance_to(round + 1);
            while probe.less_than(input.time()) {
                worker.step_or_park(None);
            }
        }
    }).unwrap();
}

A potential solution I can see would be to make a truly specialized (and separate) version of Subgraph that doesn't do any of the progress or input/output management that Subgraph does, apart from the absolute minimum needed to keep logging intact.
