Benchmark Visualizer: Conjure Native vs Oxide #291

Open
wants to merge 13 commits into
base: main

PedroGGBM
Contributor

@PedroGGBM PedroGGBM commented Apr 3, 2024

The following PR contains an interactive dashboard using Dash to display relevant statistics comparing solvers from Conjure Native versus Conjure Oxide:

  • Number of nodes (SavileRow)
  • Time Elapsed (milliseconds)

Instructions for running app.py and for running the shell/Rust script that collects solver statistics for the exhaustive tests are in a README.md file. Please refer to that file for the Python environment setup.

Extensions:

  • A Git submodule for the exhaustive Conjure tests, so that they can be updated as those tests change
  • Inputting .essence files to create new models, and changing .param file values to get relevant testing statistics for each solver
  • [Long term] A recommender system based on which statistics favor which solver

Please note that, although the SavileRow node count is only relevant when comparing the same solver across Oxide and Native (i.e. Minion), node counts for the other Conjure Native solvers are still included for when Oxide fully supports other solvers.

Examples:
image

image

(Based on Georgii's Issue #248 on a web UI for Native vs Oxide benchmarks)

@PedroGGBM PedroGGBM added the kind::feature (New feature or request), dependencies::python (Automated pull requests that update Python code), area::ci (Related to CI, coverage, Github, etc.), and kind::testing (Testing and Correctness) labels Apr 3, 2024
@niklasdewally
Contributor

niklasdewally commented Apr 4, 2024

@PedroGGBM any change compared to the demo, or is this the same?

Also does it compile to a static webpage, or is a server required?

@PedroGGBM
Contributor Author

@PedroGGBM any change compared to the demo, or is this the same?

Also does it compile to a static webpage, or is a server required?

This is the same with some slight bug fixes and further documentation (look at ./tools/benchmark-visualizer/README.md). It is server-required; Oz mentioned the possibility of hosting it on a Github page.

For now, if you want to visualize it before/after optimizing Minion Oxide, simply run the build.sh script to get new solution stats and go to http://127.0.0.1:8050/.

@niklasdewally
Contributor

niklasdewally commented Apr 4, 2024

It is server-required; Oz mentioned the possibility of hosting it on a Github page.

To clarify, GitHub Pages hosts static files (i.e. serverless stuff) only. Probably not a problem, just something to consider.

@ozgurakgun
Contributor

I think the dynamic server side is a problem for this actually.

The main goal is checking the correctness of conjure-oxide (starting from conjure-oxide tests) with respect to conjure. This can be done as part of the existing tester and outputs shown as test results. Making a static dashboard page for easy exploration of the test results would be good as well. As it stands I think this PR tries to do too much and not enough at the same time. We need to iterate on the goals of this work before landing the PR.
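
As a rough illustration of that correctness check, here is a minimal Python sketch. It assumes each tool's solutions for a test are available as a list of JSON-like objects mapping variable names to values; that format, and the names `canonical` and `compare_solutions`, are hypothetical and not taken from the existing tester. The comparison ignores both solution order and key order.

```python
import json


def canonical(solutions):
    """Return an order-insensitive set of canonical JSON strings.

    Assumption: each solution is a JSON-serialisable dict of
    variable name -> value (hypothetical format).
    """
    return {json.dumps(solution, sort_keys=True) for solution in solutions}


def compare_solutions(native, oxide):
    """Compare two solution lists as sets and report any differences."""
    native_set = canonical(native)
    oxide_set = canonical(oxide)
    return {
        "match": native_set == oxide_set,
        "only_native": sorted(native_set - oxide_set),
        "only_oxide": sorted(oxide_set - native_set),
    }


if __name__ == "__main__":
    # Placeholder solutions, for illustration only.
    native = [{"x": 1, "y": 2}, {"x": 2, "y": 1}]
    oxide = [{"y": 1, "x": 2}, {"x": 1, "y": 2}]
    print(compare_solutions(native, oxide)["match"])
```

A report like this could be emitted per test and shown alongside the other test results; the `only_native` and `only_oxide` lists point at the solutions that differ.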

@niklasdewally
Contributor

niklasdewally commented Apr 4, 2024

@ozgurakgun @PedroGGBM

If static hosting is a requirement, I think Quarto (what used to be R Markdown, but now for Python too) has static dashboards now that are fairly plotting-library agnostic (so they might work with the existing plots):
https://quarto.org/docs/dashboards/deployment.html#static-dashboards

@niklasdewally
Contributor

niklasdewally commented Apr 4, 2024

It also has Observable JS, which might(?) be interactive and serverless, but even without interactivity it's generally a good tool for turning code notebooks (Jupyter or R) into reports and websites, so it is probably worth a look.

@niklasdewally
Contributor

niklasdewally commented Apr 4, 2024

@PedroGGBM
Contributor Author

Thank you @niklasdewally for the Quarto recommendation :) Just added support for the static HTML dashboard generation, and edited README.md and build.sh correspondingly; all works like a charm.

@ozgurakgun apologies for the delay in getting this in. I suppose now it has both a dynamic and a static dashboard... I'll include this in my final project report as well.

@niklasdewally
Contributor

niklasdewally commented Apr 4, 2024

Definitely a step in the right direction. Here are some of my thoughts about this in general. These are not necessarily problems with the PR or things that can be done this semester, just things that come to mind:

  • You could put the plotly code inside the notebook itself instead of an image. It runs Jupyter under the hood so can run any code Jupyter does.
  • Quarto also supports Jupyter widgets - I am fairly sure widgets exist that do the dynamic dropdown with list of tests and so on as Dash does. @ozgurakgun is a notebooks user so might know some useful widgets that can go on this? Maybe https://plotly.com/python/figurewidget-app/ ?
  • Pretty sure R plotly can do dropdowns to select categories in JavaScript, so there is probably a way in Python too (https://plotly.com/python/dropdowns/ ?).
  • I don't want to generate the data all myself on my laptop - is there an example or any images?
  • I would gitignore image files and data - it took a while to clone and the diff crashes my Github.
  • This might be useful as well https://quarto.org/docs/computations/parameters.html
  • What's the deployment landscape for this? Also, is the data being cached somewhere?
  • If we stick with Quarto for dashboards and things like that (I like it, but I am biased towards R things), it seems to have a well-defined publication flow for GitHub Actions we could steal from (https://quarto.org/docs/publishing/github-pages.html)
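
On the dropdown point, a minimal sketch of what a serverless test selector might look like in Python. It builds a Plotly figure as a plain dict (the JSON figure format), with one bar trace per test and an `updatemenus` dropdown that toggles trace visibility in the browser. The test names, numbers, and the helper names `build_figure_spec` and `write_static_html` are all made up for illustration; the real stats layout in this PR may differ.

```python
import json

# Hypothetical per-test statistics. Names and numbers are placeholders,
# not real benchmark results.
STATS = {
    "test_a": {"native": {"nodes": 10, "time_ms": 5.0},
               "oxide": {"nodes": 12, "time_ms": 7.5}},
    "test_b": {"native": {"nodes": 40, "time_ms": 20.0},
               "oxide": {"nodes": 40, "time_ms": 18.0}},
}


def build_figure_spec(stats, metric="time_ms"):
    """Build a Plotly figure dict with a dropdown that shows one test at a time."""
    tests = sorted(stats)
    if not tests:
        raise ValueError("no tests to plot")
    data, buttons = [], []
    for i, test in enumerate(tests):
        data.append({
            "type": "bar",
            "name": test,
            "x": ["Conjure Native", "Conjure Oxide"],
            "y": [stats[test]["native"][metric], stats[test]["oxide"][metric]],
            "visible": i == 0,  # only the first test is shown initially
        })
        buttons.append({
            "label": test,
            "method": "update",  # restyle the traces, then relayout the title
            "args": [
                {"visible": [j == i for j in range(len(tests))]},
                {"title.text": f"{metric}: {test}"},
            ],
        })
    layout = {
        "title": {"text": f"{metric}: {tests[0]}"},
        "updatemenus": [{"type": "dropdown", "active": 0, "buttons": buttons}],
    }
    return {"data": data, "layout": layout}


def write_static_html(spec, path):
    """Render the spec to a standalone HTML file (requires the plotly package)."""
    import plotly.graph_objects as go  # third-party, imported lazily
    go.Figure(spec).write_html(path, include_plotlyjs="cdn")


if __name__ == "__main__":
    print(json.dumps(build_figure_spec(STATS), indent=2))
```

Because the dropdown logic lives in the figure itself, the resulting HTML needs no server, so it should be usable on a static host or inside a Quarto page.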

@PedroGGBM
Contributor Author

Thank you @niklasdewally for the thoughts! The whole Jupyter widgets extension might be a good project to do over the summer or next semester. The images are stored in ./tools/benchmarks-visualizer/figures. I just realized I didn't add this dir to the .gitignore, but I presume you can see them now.

As for the deployment landscape, right now it's limited to running build.sh, which lets the user regenerate the solution stats for each test and solver. All solution stats are stored in the ./data directory, and all tests (the exhaustive tests originally in the Conjure repo) are in ./tests. Git submodules caused a lot of problems, so I copied all the tests directly into conjure-oxide (I hope that's ok; they can be used as integration tests in the future). All of this should be detailed in the README.md file for easy follow-through.

Thanks again! These are some really good further development ideas.

@@ -0,0 +1,52 @@
// // cpu_time_tests.rs
Contributor

file entirely commented out. should it be here?

quarto render ./html/dashboard.qmd
```

## RUN DYNAMIC DASHBOARD MANUALLY
Contributor

I suggest we remove the dynamic dashboard.

Contributor

we should not commit the pngs to the repo. they should be generated as part of the static dashboard build and pushed to the gh-pages branch. ideally in a gh-action.

Contributor

ditto for the generated html.

Contributor

which subset of tests are we importing? not sure about wholesale import of a large number of tests before we are happy with the testing functionality itself. maybe import a few as part of this PR and we can easily add more later once the scaffolding is done?

@PedroGGBM PedroGGBM added this to the Better integration testing milestone Apr 5, 2024
Labels

  • area::ci: Related to CI, coverage, Github, etc.
  • dependencies::python: Automated pull requests that update Python code.
  • kind::feature: New feature or request
  • kind::testing: Testing and Correctness
3 participants