rapids to 23.06 #307

Merged: 7 commits, Jul 10, 2023
4 changes: 2 additions & 2 deletions ci/Dockerfile

```diff
@@ -35,7 +35,7 @@ RUN wget --quiet https://repo.anaconda.com/miniconda/Miniconda3-py38_4.10.3-Linu
     && conda init
 
 # install cuML
-ARG CUML_VER=23.04
+ARG CUML_VER=23.06
 RUN conda install -c conda-forge mamba && \
-    mamba install -y -c rapidsai -c nvidia -c conda-forge cuml=$CUML_VER python=3.8 cuda-toolkit=11.5 \
+    mamba install -y -c rapidsai -c nvidia -c conda-forge cuml=$CUML_VER python=3.9 cuda-toolkit=11.5 \
     && mamba clean --all -f -y
```
4 changes: 2 additions & 2 deletions docker/Dockerfile

```diff
@@ -88,7 +88,7 @@ RUN wget --quiet \
 # install cuDF dependency, Fall back to use cudf 22.04 due to issue:
 # https://github.com/NVIDIA/spark-rapids-ml/issues/73
 ARG CONDA_CUDF_VER=22.04
-RUN conda install -c rapidsai -c conda-forge cudf=$CONDA_CUDF_VER python=3.8 -y
+RUN conda install -c rapidsai -c conda-forge cudf=$CONDA_CUDF_VER python=3.9 -y
 
 # Note: the raft verion is fixed to 22.12, do not modify it when updating the spark-rapids-ml version.
 # newer versions may fail the build process due to API incompatibility.
@@ -98,7 +98,7 @@ ENV RAFT_PATH=/raft
 
 ### END OF CACHE ###
 
-#ARG RAPIDS_ML_VER=23.04
+#ARG RAPIDS_ML_VER=23.06
 #RUN git clone -b branch-$RAPIDS_ML_VER https://github.com/NVIDIA/spark-rapids-ml.git
 COPY . /spark-rapids-ml
 WORKDIR /spark-rapids-ml/jvm
```
2 changes: 1 addition & 1 deletion docker/Dockerfile.pip

```diff
@@ -18,7 +18,7 @@ ARG CUDA_VERSION=11.8.0
 FROM nvidia/cuda:${CUDA_VERSION}-devel-ubuntu20.04
 
 ARG PYSPARK_VERSION=3.3.1
-ARG RAPIDS_VERSION=23.4.0
+ARG RAPIDS_VERSION=23.6.0
 
 # Install packages to build spark-rapids-ml
 RUN apt-get update -y \
```
4 changes: 2 additions & 2 deletions docker/Dockerfile.python

```diff
@@ -17,7 +17,7 @@
 ARG CUDA_VERSION=11.5.2
 FROM nvidia/cuda:${CUDA_VERSION}-devel-ubuntu20.04
 
-ARG CUML_VERSION=23.04
+ARG CUML_VERSION=23.06
 
 # Install packages to build spark-rapids-ml
 RUN apt update -y \
@@ -38,7 +38,7 @@ RUN wget --quiet https://repo.anaconda.com/miniconda/Miniconda3-py38_4.10.3-Linu
 
 # install cuML
 
-RUN conda install -y -c rapidsai -c nvidia -c conda-forge python=3.8 cuda-toolkit=11.5 cuml=$CUML_VERSION \
+RUN conda install -y -c rapidsai -c nvidia -c conda-forge python=3.9 cuda-toolkit=11.5 cuml=$CUML_VERSION \
     && conda clean --all -f -y
 
 # install python dependencies
```
2 changes: 1 addition & 1 deletion docs/source/conf.py

```diff
@@ -9,7 +9,7 @@
 project = 'spark-rapids-ml'
 copyright = '2023, NVIDIA'
 author = 'NVIDIA'
-release = '23.4.0'
+release = '23.6.0'
 
 # -- General configuration ---------------------------------------------------
 # https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration
```
2 changes: 1 addition & 1 deletion notebooks/aws-emr/init-bootstrap-action.sh

```diff
@@ -8,7 +8,7 @@ sudo chmod a+rwx -R /sys/fs/cgroup/devices
 sudo yum install -y gcc openssl-devel bzip2-devel libffi-devel tar gzip wget make mysql-devel
 sudo bash -c "wget https://www.python.org/ftp/python/3.9.9/Python-3.9.9.tgz && tar xzf Python-3.9.9.tgz && cd Python-3.9.9 && ./configure --enable-optimizations && make altinstall"
 
-RAPIDS_VERSION=23.4.0
+RAPIDS_VERSION=23.6.0
 
 # install scikit-learn
 sudo /usr/local/bin/pip3.9 install scikit-learn
```
8 changes: 4 additions & 4 deletions notebooks/databricks/init-pip-cuda-11.8.sh

```diff
@@ -1,10 +1,10 @@
 #!/bin/bash
 # set portion of path below after /dbfs/ to dbfs zip file location
 SPARK_RAPIDS_ML_ZIP=/dbfs/path/to/zip/file
-# IMPORTANT: specify RAPIDS_VERSION fully 23.4.0 and not 23.4
-# also RAPIDS_VERSION (python) fields should omit any leading 0 in month/minor field (i.e. 23.4.0 and not 23.04.0)
-# while SPARK_RAPIDS_VERSION (jar) should have leading 0 in month/minor (e.g. 23.04.0 and not 23.4.0)
-RAPIDS_VERSION=23.4.0
+# IMPORTANT: specify RAPIDS_VERSION fully 23.6.0 and not 23.6
+# also RAPIDS_VERSION (python) fields should omit any leading 0 in month/minor field (i.e. 23.6.0 and not 23.06.0)
+# while SPARK_RAPIDS_VERSION (jar) should have leading 0 in month/minor (e.g. 23.06.0 and not 23.6.0)
+RAPIDS_VERSION=23.6.0
 SPARK_RAPIDS_VERSION=23.04.0
 
 curl -L https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/${SPARK_RAPIDS_VERSION}/rapids-4-spark_2.12-${SPARK_RAPIDS_VERSION}.jar -o /databricks/jars/rapids-4-spark_2.12-${SPARK_RAPIDS_VERSION}.jar
```
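The comments above describe two calver spellings for the same release: the Python packages drop the leading zero in the month field (`23.6.0`), while the Spark jar keeps it (`23.06.0`). A minimal sketch of converting between the two forms; the helper names are hypothetical and not part of this repo:

```python
# Convert between the two RAPIDS calver spellings noted in the script above:
# Python package versions drop the leading zero in the month field (23.6.0),
# while the Spark jar version keeps it (23.06.0).

def to_python_version(version: str) -> str:
    """e.g. '23.06.0' -> '23.6.0' (strip any leading zero from the month)."""
    year, month, patch = version.split(".")
    return f"{year}.{int(month)}.{patch}"

def to_jar_version(version: str) -> str:
    """e.g. '23.6.0' -> '23.06.0' (zero-pad the month to two digits)."""
    year, month, patch = version.split(".")
    return f"{year}.{int(month):02d}.{patch}"
```

Note that for months October through December the two spellings coincide (`23.12.0` maps to itself), so the distinction only matters for single-digit months.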
2 changes: 1 addition & 1 deletion notebooks/dataproc/README.md

````diff
@@ -27,7 +27,7 @@ If you already have a Dataproc account, you can run the example notebooks on a D
 - Create a cluster with at least two single-gpu workers. **Note**: in addition to the initialization script from above, this also uses the standard [initialization actions](https://github.com/GoogleCloudDataproc/initialization-actions) for installing the GPU drivers and RAPIDS:
 ```
 export CUDA_VERSION=11.8
-export RAPIDS_VERSION=23.4
+export RAPIDS_VERSION=23.6
 
 gcloud dataproc clusters create $USER-spark-rapids-ml \
 --image-version=2.0.29-ubuntu18 \
````
2 changes: 1 addition & 1 deletion notebooks/dataproc/spark_rapids_ml.sh

```diff
@@ -1,6 +1,6 @@
 #!/bin/bash
 
-RAPIDS_VERSION=23.4.0
+RAPIDS_VERSION=23.6.0
 
 # patch existing packages
 mamba install "llvmlite<0.40,>=0.39.0dev0" "numba>=0.56.2"
```
6 changes: 3 additions & 3 deletions python/README.md

````diff
@@ -8,9 +8,9 @@ For simplicity, the following instructions just use Spark local mode, assuming a
 
 First, install RAPIDS cuML per [these instructions](https://rapids.ai/start.html).
 ```bash
-conda create -n rapids-23.04 \
+conda create -n rapids-23.06 \
     -c rapidsai -c nvidia -c conda-forge \
-    cuml=23.04 python=3.8 cudatoolkit=11.5
+    cuml=23.06 python=3.9 cudatoolkit=11.5
 ```
 
 **Note**: while testing, we recommend using conda or docker to simplify installation and isolate your environment while experimenting. Once you have a working environment, you can then try installing directly, if necessary.
@@ -19,7 +19,7 @@ conda create -n rapids-23.04 \
 
 Once you have the conda environment, activate it and install the required packages.
 ```bash
-conda activate rapids-23.04
+conda activate rapids-23.06
 
 # for development access to notebooks, tests, and benchmarks
 git clone --branch main https://github.com/NVIDIA/spark-rapids-ml.git
````
8 changes: 4 additions & 4 deletions python/benchmark/databricks/init-pip-cuda-11.8.sh

```diff
@@ -2,10 +2,10 @@
 # set portion of path below after /dbfs/ to dbfs zip file location
 SPARK_RAPIDS_ML_ZIP=/dbfs/path/to/spark-rapids-ml.zip
 BENCHMARK_ZIP=/dbfs/path/to/benchmark.zip
-# IMPORTANT: specify rapids fully 23.4.0 and not 23.4
-# also RAPIDS_VERSION (python) fields should omit any leading 0 in month/minor field (i.e. 23.4.0 and not 23.04.0)
-# while SPARK_RAPIDS_VERSION (jar) should have leading 0 in month/minor (e.g. 23.04.0 and not 23.4.0)
-RAPIDS_VERSION=23.4.0
+# IMPORTANT: specify rapids fully 23.6.0 and not 23.6
+# also RAPIDS_VERSION (python) fields should omit any leading 0 in month/minor field (i.e. 23.6.0 and not 23.06.0)
+# while SPARK_RAPIDS_VERSION (jar) should have leading 0 in month/minor (e.g. 23.06.0 and not 23.6.0)
+RAPIDS_VERSION=23.6.0
 SPARK_RAPIDS_VERSION=23.04.0
 
 curl -L https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/${SPARK_RAPIDS_VERSION}/rapids-4-spark_2.12-${SPARK_RAPIDS_VERSION}.jar -o /databricks/jars/rapids-4-spark_2.12-${SPARK_RAPIDS_VERSION}.jar
```
2 changes: 1 addition & 1 deletion python/benchmark/dataproc/init_benchmark.sh

```diff
@@ -8,7 +8,7 @@ function get_metadata_attribute() {
     /usr/share/google/get_metadata_value "attributes/${attribute_name}" || echo -n "${default_value}"
 }
 
-RAPIDS_VERSION=$(get_metadata_attribute rapids-version 23.4.0)
+RAPIDS_VERSION=$(get_metadata_attribute rapids-version 23.6.0)
 
 # patch existing packages
 mamba install "llvmlite<0.40,>=0.39.0dev0" "numba>=0.56.2"
```
4 changes: 2 additions & 2 deletions python/benchmark/dataproc/start_cluster.sh

```diff
@@ -14,7 +14,7 @@ fi
 
 BENCHMARK_HOME=${BENCHMARK_HOME:-${GCS_BUCKET}/benchmark}
 CUDA_VERSION=${CUDA_VERSION:-11.8}
-RAPIDS_VERSION=${RAPIDS_VERSION:-23.4.0}
+RAPIDS_VERSION=${RAPIDS_VERSION:-23.6.0}
 
 gpu_args=$(cat <<EOF
 --master-accelerator type=nvidia-tesla-t4,count=1
@@ -51,7 +51,7 @@ if [[ $? == 0 ]]; then
 else
     set -x
     gcloud dataproc clusters create ${cluster_name} \
-    --image-version=2.0.29-ubuntu18 \
+    --image-version=2.1-ubuntu18 \
     --region ${COMPUTE_REGION} \
     --master-machine-type n1-standard-16 \
     --num-workers 2 \
```
2 changes: 1 addition & 1 deletion python/pyproject.toml

```diff
@@ -1,6 +1,6 @@
 [project]
 name = "spark-rapids-ml"
-version = "23.4.0"
+version = "23.6.0"
 authors = [
   { name="Jinfeng Li", email="[email protected]" },
   { name="Bobby Wang", email="[email protected]" },
```
2 changes: 1 addition & 1 deletion python/src/spark_rapids_ml/__init__.py

```diff
@@ -13,4 +13,4 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 #
-__version__ = "23.4.0"
+__version__ = "23.6.0"
```
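A bump like this PR touches the same version string in several files (`python/pyproject.toml`, `python/src/spark_rapids_ml/__init__.py`, `docs/source/conf.py`), and a mismatch is easy to miss in review. A hedged sketch of a consistency check; the function names and regexes are illustrative, not part of this repo:

```python
# Extract and compare the version strings that this PR bumps in lockstep.
# The regexes mirror the assignments visible in the diffs above.
import re
from typing import Dict, Optional

VERSION_PATTERNS = {
    "pyproject.toml": r'version\s*=\s*"([^"]+)"',
    "__init__.py": r'__version__\s*=\s*"([^"]+)"',
    "conf.py": r"release\s*=\s*'([^']+)'",
}

def extract_version(name: str, text: str) -> Optional[str]:
    """Pull the version string out of one file's contents, or None if absent."""
    match = re.search(VERSION_PATTERNS[name], text)
    return match.group(1) if match else None

def all_consistent(versions: Dict[str, str]) -> bool:
    """True when every extracted version string is identical."""
    return len(set(versions.values())) <= 1
```

In practice such a check could run in CI: read each file, extract its version, and fail the build if `all_consistent` returns `False`.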