Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for Habana 1.17.1 #695

Open
wants to merge 24 commits into
base: main
Choose a base branch
from

Conversation

daniellutz
Copy link
Contributor

@daniellutz daniellutz commented Sep 2, 2024

Add support for Habana 1.17.1 based on Python 3.11, same structure from #533.

Description

directly link to image: quay.io/dlutz/workbench-images:habana-jupyter-1.17.1-ubi9-python-3.11-2024a_20240916

How Has This Been Tested?

This has been manually built and tested

Merge criteria:

  • The commits are squashed in a cohesive manner and have meaningful messages.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that the changes work

Copy link
Contributor

openshift-ci bot commented Sep 2, 2024

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@daniellutz
Copy link
Contributor Author

@atheo89 I did the squash already, so I don't know if the label will be required anymore.

@daniellutz
Copy link
Contributor Author

/retest

Copy link
Member

@atheo89 atheo89 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey Daniel, I found some minor changes that are needed.

habana/1.17.1/ubi9-python-3.11/Pipfile Outdated Show resolved Hide resolved
Copy link
Member

@jstourac jstourac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this. I put couple of comments.

habana/1.17.1/ubi9-python-3.11/README.md Outdated Show resolved Hide resolved
habana/1.17.1/ubi9-python-3.11/README.md Show resolved Hide resolved
habana/1.17.1/ubi9-python-3.11/README.md Outdated Show resolved Hide resolved
habana/1.17.1/ubi9-python-3.11/test/test_notebook.ipynb Outdated Show resolved Hide resolved
habana/README.md Outdated Show resolved Hide resolved
base/ubi9-python-3.11/Pipfile Outdated Show resolved Hide resolved
base/habana-python-3.11/Dockerfile Outdated Show resolved Hide resolved
Makefile Outdated Show resolved Hide resolved
base/habana-python-3.11/Dockerfile Outdated Show resolved Hide resolved
base/habana-python-3.11/Dockerfile Outdated Show resolved Hide resolved
@Xaenalt
Copy link
Member

Xaenalt commented Sep 3, 2024

Yay, glad to see someone looked at the approach in my other PR. Yeah, the intent is to make upgrading very smooth since as long as they remain on the same Python version, upgrades of the Gaudi software stack should be as easy as swapping the base image out

@atheo89
Copy link
Member

atheo89 commented Sep 9, 2024

Hey @daniellutz Could you please point me to the generated image for Habana 1.17 to take a look?
I can not find it here https://github.com/daniellutz/notebooks/pkgs/container/notebooks%2Fworkbench-images/versions?filters%5Bversion_type%5D=tagged

Moreover, you will need to open a PR on ocp-ci on *-main.yaml for now, and add build - push and test configurations (The links provided are examples of how we handled it in version 1.13.)

Copy link
Contributor

openshift-ci bot commented Sep 9, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from atheo89. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@daniellutz
Copy link
Contributor Author

Hey @daniellutz Could you please point me to the generated image for Habana 1.17 to take a look? I can not find it here https://github.com/daniellutz/notebooks/pkgs/container/notebooks%2Fworkbench-images/versions?filters%5Bversion_type%5D=tagged

Moreover, you will need to open a PR on ocp-ci on *-main.yaml for now, and add build - push and test configurations (The links provided are examples of how we handled it in version 1.13.)

They are available on the following quay, for my tests:

https://quay.io/repository/dlutz/workbench-images?tab=tags

I pushed the changes, so soon they might be available in the Quay for the notebooks repository images as well.

USER root

## Label the image with details required by ODH
LABEL name="odh-notebook-habana-jupyter-1.17.1-ubi9-python-3.11" \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder whether the naming isn't a bit confusing here; let's have the version right after habana word?

Suggested change
LABEL name="odh-notebook-habana-jupyter-1.17.1-ubi9-python-3.11" \
LABEL name="odh-notebook-habana-1.17.1-jupyter-ubi9-python-3.11" \

similarly for other cases?

# - https://github.com/HabanaAI/Setup_and_Install/blob/r1.17.1/dockerfiles/pytorch/Dockerfile.rhel9.4
FROM vault.habana.ai/gaudi-docker/1.17.1/rhel9.4/habanalabs/pytorch-installer-2.3.1:1.17.1-40 as base

LABEL name="odh-notebook-base-habana-python-3.11" \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we should include the habana version also in these labels here?

@jstourac
Copy link
Member

Thank you for the updates, Daniel. I commented just on some of the threads; other changes LGTM. I would like to give this a more time with one more full review again a bit later.

Also, to get rid of the CI failures, I think all what you need to do is to update the .github/workflows/build-notebooks.yaml as this advice you to do so.

@atheo89
Copy link
Member

atheo89 commented Sep 16, 2024

Hey Daniel! Looks better now in one single Dockerfile, with this we will have less maintenance in the future 🙂

Just two things:

  • Add on the PR description the latest Habana build image in order to inspect it (currently there is tag 15days ago and the last changes was last week 🤣 )
  • Will need to open a PR on ocp-ci on *-main.yaml for now, and add build - push and test configurations (The links provided are examples of how we handled it in version 1.13.) once it is ready we will move it on 2024b

- Add structure for Habana 1.17.1 image
- Fix reference to Python 3.10
- Add Habana 1.17.1 Python 3.11 base image to Makefile
- Fix references in README files from older Habana images
- Fix references to Habana official documentation
- Fix README for Habana 1.17.1
- Fix Habana Vault container image reference
- Fix Habana 1.17.1's Pipfile and Pipfile.lock files
- Add Habana 1.17.1 to Makefile's refresh-pipfilelock-files function
- Change entrypoint's chmod for Habana 1.17.1
- Fix Makefile call for Habana 1.17.1
- Add missing newline at end of Dockerfile
- Add Habana 1.17.1 images to Github Actions matrix jobs
- Add instructions to run the build and push commands for Habana 1.17.1
Copy link
Contributor

openshift-ci bot commented Sep 17, 2024

@daniellutz: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/runtimes-ubi9-e2e-tests 3856c94 link true /test runtimes-ubi9-e2e-tests
ci/prow/codeserver-notebook-e2e-tests 3856c94 link true /test codeserver-notebook-e2e-tests
ci/prow/notebooks-ubi9-e2e-tests 3856c94 link true /test notebooks-ubi9-e2e-tests
ci/prow/habana-notebooks-e2e-tests a450e92 link true /test habana-notebooks-e2e-tests
ci/prow/images a450e92 link true /test images

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants