Roadmap to automate ABTF benchmarking via CM #6

Open
22 of 34 tasks
gfursin opened this issue Apr 24, 2024 · 0 comments

Current tasks

The goal is to decompose the ABTF repository into reusable automation recipes that make benchmarking, evaluation and training of ABTF models more deterministic, portable and extensible to new models, frameworks and datasets.

We need to develop the following CM scripts (automation recipes) to support ABTF benchmarking with loadgen across different platforms, operating systems and hardware.

See the current CM-ABTF documentation/demos here.

Preparing ABTF demo

@gfursin helped prepare the first CM automation for ABTF; we now plan to delegate further development to dedicated engineers.

Test inference with ABTF model

  • Automate Cognata downloading with custom sub-sets
    • Download to CM cache
    • Import already downloaded dataset
    • Check download of individual files
      • Add variation for a demo (1 min video)
  • Download current ABTF model and register in CM cache
  • Download trained ABTF models via CM
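
The "check download of individual files" task above could be approached along the following lines. This is only a minimal sketch, not the CM script itself: the function names `sha256_of` and `check_subset` and the idea of a `{relative_path: sha256}` manifest are assumptions made for illustration, and nothing here reflects how Cognata actually distributes checksums.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_subset(root, expected):
    """Compare files under `root` with a {relative_path: sha256} manifest.

    The manifest format is hypothetical. Returns a dict mapping each
    problematic relative path to "missing" or "mismatch"; an empty dict
    means every listed file is present and intact.
    """
    problems = {}
    for rel_path, want in expected.items():
        path = Path(root) / rel_path
        if not path.is_file():
            problems[rel_path] = "missing"
        elif sha256_of(path) != want:
            problems[rel_path] = "mismatch"
    return problems
```

The same check could be run on an already downloaded dataset before importing it, so that only the files reported as missing or mismatched need to be fetched again.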

Export ABTF model to other formats

  • Export PyTorch model to ONNX

Evaluate ABTF model with Cognata sub-set

  • Sync with Rod to access server and test CM automation
  • "Decode" function for standalone evaluation of a given image (mAP) to be integrated with loadgen
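
For orientation, the per-image accuracy computation that such a standalone evaluation feeds into can be sketched as below. This is not the ABTF "Decode" function. It is a generic, assumed formulation: boxes are `(x1, y1, x2, y2)` tuples, detections are `(score, box)` pairs for a single class, matching is greedy at an IoU threshold of 0.5, and AP is the area under the all-point interpolated precision/recall curve. The actual ABTF evaluation may differ in any of these choices; mAP would then be the mean of this value over classes.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, ground_truth, iou_threshold=0.5):
    """AP for one class on one image (assumed formulation, see lead-in)."""
    if not ground_truth:
        return 0.0
    matched = set()
    tp_flags = []
    # Visit detections from most to least confident.
    for _score, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(ground_truth):
            if idx in matched:
                continue  # each ground-truth box can be matched only once
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            tp_flags.append(True)
        else:
            tp_flags.append(False)
    # Precision and recall after each detection.
    precisions, recalls, tp = [], [], 0
    for rank, is_tp in enumerate(tp_flags, start=1):
        tp += is_tp
        precisions.append(tp / rank)
        recalls.append(tp / len(ground_truth))
    # Interpolate: precision at a rank is the best precision at or after it.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum the area under the interpolated precision/recall curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap
```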

Automate training of ABTF model with Cognata sub-set

  • Sync with Rod to access server

Add Python harness for loadgen with ABTF model

  • Implement Python loadgen harness for ABTF model to measure performance (1 sample)
    • Pre-load and pre-process all samples from Cognata
  • Implement Python loadgen harness for ABTF model to measure accuracy (1 sample)
    • Pre-load and pre-process all samples from Cognata

See related CM script and simple Python harness.
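
The reason for pre-loading and pre-processing all samples before a performance run is that the timed region then covers inference only. The sketch below illustrates that idea with the standard library alone. It is not the loadgen harness referenced above and does not use the loadgen API; `measure_latency`, `preprocess` and `predict` are hypothetical names standing in for the real dataset and model code.

```python
import statistics
import time


def measure_latency(samples, preprocess, predict, repeats=10):
    """Pre-process every sample up front, then time `predict` only.

    `preprocess` and `predict` are placeholder callables (assumptions);
    a real harness would pass Cognata loading code and the ABTF model.
    """
    if not samples or repeats < 1:
        raise ValueError("need at least one sample and one repeat")
    # Pre-load and pre-process outside the timed region.
    prepared = [preprocess(s) for s in samples]
    latencies = []
    for _ in range(repeats):
        for item in prepared:
            start = time.perf_counter()
            predict(item)
            latencies.append(time.perf_counter() - start)
    return {
        "count": len(latencies),
        "mean_s": statistics.mean(latencies),
        "max_s": max(latencies),
    }
```

In an actual loadgen run the equivalent of `prepared` would be filled when loadgen asks for samples to be loaded, and the timing would be done by loadgen rather than by the harness.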

Generate/use Docker containers

  • Prepare examples of docker containers with CM: see examples

Demos

  • Prepare demo for live ABTF model evaluation
    • Download Cognata subset
    • Show live visualization of predictions
    • Document

For the next tasks we need more engineering resources.

MLCommons has committed to funding one CM engineer until the end of 2024 to modularize and automate MLPerf inference. ABTF colleagues should sync their developments with the MLPerf inference WG.

Improve performance

  • Add performance profiling, analysis and debugging
  • Current performance on an 8-core CPU and a laptop GPU is low (10 sec per frame for the 8M model and 3 sec per frame for the 3M model on CPU); further optimization is needed (quantization, hardware-specific optimizations, fine-tuning, etc.)
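
To put the CPU numbers above in throughput terms: 10 sec per frame is 0.1 frames per second and 3 sec per frame is about 0.33 frames per second. The small sketch below does this conversion and estimates the speedup needed for a given frame rate. The 10 FPS target is purely a hypothetical value chosen for illustration; the roadmap does not state a target.

```python
def frames_per_second(seconds_per_frame):
    """Convert a per-frame latency in seconds to frames per second."""
    if seconds_per_frame <= 0:
        raise ValueError("seconds_per_frame must be positive")
    return 1.0 / seconds_per_frame


def required_speedup(seconds_per_frame, target_fps):
    """Factor by which latency must shrink to reach `target_fps`."""
    return target_fps * seconds_per_frame


HYPOTHETICAL_TARGET_FPS = 10.0  # assumption, not taken from the roadmap

# CPU latencies reported in the task above.
for model, spf in (("8M", 10.0), ("3M", 3.0)):
    fps = frames_per_second(spf)
    speedup = required_speedup(spf, HYPOTHETICAL_TARGET_FPS)
    print(f"{model} model: {fps:.2f} FPS, "
          f"{speedup:.0f}x speedup needed for {HYPOTHETICAL_TARGET_FPS:.0f} FPS")
```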

Add C++ harness for loadgen with ABTF model

  • Develop C++ harness for loadgen with ONNX
  • Export PyTorch model to TFLite
    • Develop native C++ harness for loadgen test with TFLite model
  • Develop C++ harness for loadgen with PyTorch

Support other hardware

PyTorch native

  • Support ABTF demo on Nvidia GPU via CUDA
    • Generate Docker container for the demo

Cross-compilation

  • Samsung Exynos
    • Requires C++ loadgen harness implementation with cross-compilation
    • ONNX backend
    • TFLite backend

Automate ABTF model quantization

TBD

Developers

ABTF model

  • Radoyeh Shojaei

CM automation for ABTF model

  • @gfursin completed a prototype of the CM automation and MLPerf harness for the ABTF model in May 2024. Further development should be done by the MLCommons CM inference engineer.
gfursin self-assigned this Apr 24, 2024
gfursin changed the title from "Roadmap for CM developments to support ABTF benchmarking" to "Roadmap to automate ABTF benchmarking via CM" Apr 29, 2024
gfursin removed their assignment Jun 3, 2024
arjunsuresh added a commit that referenced this issue Jun 17, 2024