Benchmark Visualizer: Conjure Native vs Oxide #291

PedroGGBM · 2024-04-03T21:48:50Z

The following PR contains an interactive dashboard using Dash to display relevant statistics comparing solvers from Conjure Native versus Conjure Oxide:

Num of Nodes (SavileRow)
Time Elapsed (milliseconds)

All instructions on how to run app.py and the shell/Rust script that collects solver statistics for the exhaustive tests are included in README.md. Please refer to that file for Python environment setup.

Extensions:

Git submodule for exhaustive Conjure tests; be able to update according to changes to these tests
Inputting .essence files to create new models, and change .param file values to get relevant testing statistics for each solver
[Long term] A recommendation system based on which statistics favor which solver

Please note that, although the SavileRow node count is only meaningful when comparing the same solver across Oxide and Native (i.e. Minion), node counts for the other Conjure Native solvers are still included for when Oxide fully supports other solvers.
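As a rough illustration of the note above, the sketch below keeps only the node counts that exist for the same test and solver on both sides. The record layout and field names (`test`, `impl`, `solver`, `nodes`) are assumptions made for this example, not the schema used in this PR.

```python
from collections import defaultdict

# Hypothetical records; the field names are assumptions, not the PR's schema.
records = [
    {"test": "xyz", "impl": "native", "solver": "minion", "nodes": 12},
    {"test": "xyz", "impl": "oxide", "solver": "minion", "nodes": 12},
    {"test": "xyz", "impl": "native", "solver": "chuffed", "nodes": 30},
]


def comparable_node_counts(records):
    """Return {(test, solver): {"native": n, "oxide": m}} for pairs present on both sides."""
    grouped = defaultdict(dict)
    for record in records:
        key = (record["test"], record["solver"])
        grouped[key][record["impl"]] = record["nodes"]
    # Drop entries that only one implementation produced (e.g. solvers Oxide lacks).
    return {k: v for k, v in grouped.items() if v.keys() >= {"native", "oxide"}}


pairs = comparable_node_counts(records)
```

Here the `chuffed` record is dropped from `pairs` because there is no Oxide counterpart to compare it with.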

Examples:

(Based on Georgii's Issue #248 on a web UI for Native vs Oxide benchmarks)

niklasdewally · 2024-04-04T13:02:59Z

@PedroGGBM any change compared to the demo, or is this the same?

Also does it compile to a static webpage, or is a server required?

PedroGGBM · 2024-04-04T13:18:07Z

> @PedroGGBM any change compared to the demo, or is this the same?
>
> Also does it compile to a static webpage, or is a server required?

This is the same with some slight bug fixes and further documentation (look at ./tools/benchmarks-visualizer/README.md). It is server-required; Oz mentioned the possibility of hosting it on a GitHub page.

For now, if you want to visualize it before/after optimizing Minion Oxide, simply run the build.sh script to get new solution stats and go to http://127.0.0.1:8050/!

niklasdewally · 2024-04-04T13:41:12Z

> It is server-required; Oz mentioned the possibility of hosting it on a Github page.

To clarify, GitHub Pages hosts static files (i.e. serverless stuff) only. Probably not a problem, just something to consider.
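To make the static/server distinction concrete, here is a minimal sketch of a page that needs no server: a script renders the results into a single HTML file once, and any static host can serve it. It uses only the Python standard library; the column names, the sample rows, and the `render_static_page` helper are hypothetical and do not come from this PR.

```python
from html import escape


def render_static_page(rows, title="Benchmark results"):
    """Build a self-contained HTML document with one table row per result."""
    header = "<tr><th>Test</th><th>Solver</th><th>Nodes</th><th>Time (ms)</th></tr>"
    body = "\n".join(
        "<tr>" + "".join(f"<td>{escape(str(cell))}</td>" for cell in row) + "</tr>"
        for row in rows
    )
    return (
        "<!DOCTYPE html>\n"
        f'<html><head><meta charset="utf-8"><title>{escape(title)}</title></head>\n'
        f"<body><h1>{escape(title)}</h1>\n<table>\n{header}\n{body}\n</table></body></html>\n"
    )


# Made-up sample rows, for illustration only.
rows = [("xyz", "minion (native)", 12, 3.4), ("xyz", "minion (oxide)", 12, 2.9)]
page = render_static_page(rows)
# Writing `page` to a file (for example with pathlib.Path.write_text) would
# produce something a static host can serve as-is.
```

A Dash app, by contrast, relies on a running Python process to answer callbacks, which is why it cannot be hosted this way.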

ozgurakgun · 2024-04-04T15:33:47Z

I think the dynamic server side is a problem for this actually.

The main goal is checking the correctness of conjure-oxide (starting from conjure-oxide tests) with respect to conjure. This can be done as part of the existing tester and outputs shown as test results. Making a static dashboard page for easy exploration of the test results would be good as well. As it stands I think this PR tries to do too much and not enough at the same time. We need to iterate on the goals of this work before landing the PR.
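One way to picture the correctness check described above is an order-insensitive comparison of the solution sets the two tools produce for the same model. The sketch below assumes each solution is available as a JSON-like dict; that representation and the function names are assumptions for illustration, not how the existing tester works.

```python
import json


def canonical(solutions):
    """Turn a list of solution dicts into an order-insensitive set of strings.

    Note: duplicates collapse, so this compares sets of solutions, not counts.
    """
    return {json.dumps(solution, sort_keys=True) for solution in solutions}


def same_solutions(native, oxide):
    """True when both tools report exactly the same set of solutions."""
    return canonical(native) == canonical(oxide)


# Made-up sample data: same solutions, different ordering of keys and entries.
native = [{"x": 1, "y": 2}, {"x": 2, "y": 1}]
oxide = [{"y": 1, "x": 2}, {"x": 1, "y": 2}]
```

A result like `same_solutions(native, oxide)` could then be reported per test, with a dashboard only summarising those outcomes.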

niklasdewally · 2024-04-04T15:46:55Z

@ozgurakgun @PedroGGBM

If static hosting is a requirement, I think Quarto (what used to be Rmarkdown but now for Python too) has static dashboards now that are fairly plotting library agnostic (so might work with the existing plots)
https://quarto.org/docs/dashboards/deployment.html#static-dashboards

niklasdewally · 2024-04-04T15:50:16Z

It also has observableJS which might(?) be interactive and serverless, but even without interactivity it's generally a good tool for turning code notebooks (Jupyter or R) into reports and websites, so it is probably worth a look.

niklasdewally · 2024-04-04T15:54:50Z

https://jjallaire.github.io/gapminder-dashboard/
https://github.com/jjallaire/gapminder-dashboard/blob/main/gapminder.qmd

This seems to be plotly :)

PedroGGBM · 2024-04-04T20:44:18Z

Thank you @niklasdewally for the Quarto recommendation :) Just added support for the static HTML dashboard generation, and edited README.md and build.sh correspondingly; all works like a charm.

@ozgurakgun apologies for the delay in getting this in. I suppose now it has both a dynamic and a static dashboard... I'll include this in my final project report as well.

niklasdewally · 2024-04-04T23:12:50Z

Definitely a step in the right direction. Here are some of my thoughts about this in general - these are not necessarily problems with the PR or things that can be done this semester, just things that come to mind:

- You could put the plotly code inside the notebook itself instead of an image. It runs Jupyter under the hood, so it can run any code Jupyter does.
- Quarto also supports Jupyter widgets - I am fairly sure widgets exist that do the dynamic dropdown with a list of tests and so on, as Dash does. @ozgurakgun is a notebooks user so might know some useful widgets that can go on this? Maybe https://plotly.com/python/figurewidget-app/ ?
- Pretty sure R plotly can do dropdowns to select categories in Javascript, so there is probably a way in Python too (https://plotly.com/python/dropdowns/?).
- I don't want to generate all the data myself on my laptop - is there an example or any images?
- I would gitignore image files and data - it took a while to clone and the diff crashes my Github.
- This might be useful as well: https://quarto.org/docs/computations/parameters.html
- What's the deployment landscape for this? Also, is the data being cached somewhere?
- If we stick with Quarto for dashboards and things like that (I like it, but I am biased towards R things), it seems to have a well-defined publication flow for Github Actions we could steal from (https://quarto.org/docs/publishing/github-pages.html).

PedroGGBM · 2024-04-04T23:37:12Z

Thank you @niklasdewally for the thoughts! The whole Jupyter widgets extension might be a good project to do over the summer or next semester. The images are stored in ./tools/benchmarks-visualizer/figures. I just realized I didn't add this dir to the .gitignore, but I presume you can see them now.

As for the deployment landscape, right now it's limited to running build.sh, which lets the user regenerate the solution stats for each test with respect to the different solvers. All solution stats are stored in the ./data directory, and all tests (exhaustive tests originally in the Conjure repo) are in ./tests. Git submodules gave loads of problems, so I directly copied all the tests to conjure-oxide (I hope that's ok; they can be used as integration tests in the future). All of this should be detailed in the README.md file for easy follow-through.
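For readers who want a feel for what regenerating a timing statistic can involve, here is a minimal sketch that runs a command and records its elapsed time in milliseconds. It measures wall-clock time with `time.perf_counter`, which is an assumption and may differ from what build.sh actually records; the `time_command` helper and the stand-in command are hypothetical.

```python
import subprocess
import sys
import time


def time_command(cmd):
    """Run cmd and return (exit_code, elapsed wall-clock milliseconds)."""
    start = time.perf_counter()
    completed = subprocess.run(cmd, capture_output=True, text=True)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return completed.returncode, elapsed_ms


# Stand-in command; a real run would invoke the solver on a test instead.
code, elapsed_ms = time_command([sys.executable, "-c", "pass"])
```

Because wall-clock time includes process start-up and system noise, repeated runs would be needed before reading much into small differences.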

Thanks again! These are some really good further development ideas.

ozgurakgun · 2024-04-05T09:57:29Z

conjure_oxide/tests/cpu_time_tests.rs

@@ -0,0 +1,52 @@
+// // cpu_time_tests.rs


file entirely commented out. should it be here?

ozgurakgun · 2024-04-05T09:58:21Z

tools/benchmarks-visualizer/README.md

+quarto render ./html/dashboard.qmd
+```
+
+## RUN DYNAMIC DASHBOARD MANUALLY


I suggest we remove the dynamic dashboard.

ozgurakgun · 2024-04-05T09:59:23Z

tools/benchmarks-visualizer/figures/nodes_autogen_gen01.png

we should not commit the pngs to the repo. they should be generated as part of the static dashboard build and pushed to the gh-pages branch. ideally in a gh-action.

ozgurakgun · 2024-04-05T09:59:45Z

tools/benchmarks-visualizer/html/dashboard.html

ditto for the generated html.

ozgurakgun · 2024-04-05T10:01:30Z

tools/benchmarks-visualizer/tests/exhaustive/README.md

which subset of tests are we importing? not sure about wholesale import of a large number of tests before we are happy with the testing functionality itself. maybe import a few as part of this PR and we can easily add more later once the scaffolding is done?

PedroGGBM added 7 commits April 3, 2024 22:03

testing backframe for cpu time using instant [for all tests, attempted capture of stdout for cargo test]

3ba15a1

init changes base on samvit dashboard and draft for rust subprocess script

70d444f

large commit for initial conjure native solution scraper and python dashboard

8b4e4bf

[MIXED COMMIT] data scraping for both conjure native (based on exhaustive tests) and conjure oxide automated shell script cargo run (large because of different local mixed commits merged

eb3aa9b

[MIXED COMMIT] finalized dashboard using Dash for oxide vs native visualizer for node count and time elapsed. NOTE: commit includes several snippets from previous week branches!

531b337

large commit for init of dashboard for conjure native vs. oxide benchmark visualization

bf69915

clean up of ./data directory, eliminated env (accident), and misc tasks

23a5dc5

PedroGGBM requested review from niklasdewally and lixitrixi April 3, 2024 21:48

PedroGGBM added the kind::feature, dependencies::python, area::ci, and kind::testing labels Apr 3, 2024

PedroGGBM requested a review from ozgurakgun April 3, 2024 21:49

commented cpu_time_tests.rs testing script and added more xyz test variations

edd864b

niklasdewally approved these changes Apr 4, 2024

added support for static html dashboard by using Quarto for generatation

a305a17

ozgurakgun requested changes Apr 5, 2024

PedroGGBM added this to the Better integration testing milestone Apr 5, 2024

Merge branch 'main' into benchmark-visualizer

1a61cfd

PedroGGBM and others added 3 commits May 8, 2024 17:44

[FIX] removed dynamic dashboard, removed several exhaustive test of original conjure repo

7e11b86

[FIX & NEW] Added grouped bars for native vs oxide Minion solver comparison in Node graphs & fixed bugs

1d67c82

Merge branch 'main' into benchmark-visualizer

87628d7
