This section of the guide explains how to set up RAPIDS in the Deka GPU Portal Service. With RAPIDS, operations that would normally run on the CPU can be offloaded to the GPU, significantly improving performance in large-scale scenarios. The following are the steps for setting up RAPIDS in the Deka GPU Portal Service:
On the MLOps menu page on the left, select the Notebooks menu.
On the Notebooks page, press the +New Notebooks button.
When the New Notebooks page appears, fill in the name of the notebook and select JupyterLab.
In the Custom Notebooks section, you can choose between two options:
Image
To use an existing image, select "Image" and choose zhydnytrat/rapids:1.06.
Custom Image
To use a custom image, select "Advanced Options", check the "Custom Image" option, and enter the name of the image repository to be used along with its tag. For example, for an image repository from https://hub.docker.com/u/ndominic100, the custom image name would be something like ndominic100/rapids-23.08:latest. For further explanation on creating a custom image, see the guide in sub-chapter 4.13.2 How to make Custom Image.
In the CPU/RAM section, adjust the values to your needs. Make sure the Number of GPUs field in the GPUs section is set to 1 and the GPU Vendor is NVIDIA.
In the Workspace Volume section, set the size to be used. In the Access Mode section, select "ReadWriteMany" so that the volume can be shared across several notebooks, and in the Mount Path section, change the path to "/home/rapids/data".
In the Data Volumes section, make sure you have added a new volume, then press the LAUNCH button to continue the RAPIDS creation process in the Deka GPU Portal Service.
Wait until the notebook creation process is complete and ready to use.
After you have successfully created RAPIDS in the Cloudeka Portal Service, the next step is to add the data processing you want to use. On the Notebooks page in the Deka GPU Portal Service, press the CONNECT button to run the notebook you created earlier.
You will automatically be directed to the notebook server page, namely Jupyter; in the Notebook section, select Python 3.
You can then add a DataFrame to use in these notebooks.
RAPIDS provides several data-processing libraries; those covered in this guide are the following:
Data Processing-cuDF, used for operations such as merging and filtering data. The main advantage of cuDF is its ability to speed up big-data processing by leveraging the GPU, reducing data processing time significantly. For further explanation about Data Processing-cuDF, see this link. The following is an example of using Data Processing-cuDF:
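A minimal sketch of a merge-and-filter workflow. The table contents and column names are made up for illustration; because cuDF mirrors the pandas API, the snippet falls back to pandas when cuDF (and a GPU) is not available, and the same code works in both cases:

```python
try:
    import cudf as xdf  # GPU DataFrame library (requires an NVIDIA GPU)
except Exception:
    import pandas as xdf  # CPU fallback: cuDF mirrors the pandas API

# Hypothetical example data
sales = xdf.DataFrame({"id": [1, 2, 3, 4], "amount": [100, 250, 50, 300]})
customers = xdf.DataFrame({"id": [1, 2, 3, 4], "region": ["A", "B", "A", "B"]})

# Merge the two tables on "id", then filter rows with amount > 100
merged = sales.merge(customers, on="id")
large = merged[merged["amount"] > 100]
print(len(large))  # two rows satisfy the filter
```

On a RAPIDS notebook with a GPU attached, the cuDF branch is taken and the merge and filter run on the GPU without any code changes.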
Once your DataFrame uses cuDF, you can perform pandas operations as usual. Alternatively, you can use a magic line so that you do not need to rewrite all of your pandas code for cuDF; operations then run on the GPU where cuDF supports them and automatically fall back to the CPU where cuDF has no matching method, by running the syntax below.
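In current RAPIDS releases this behavior is provided by the cudf.pandas accelerator: in a notebook cell the magic line is `%load_ext cudf.pandas`, run before importing pandas. A hedged script-form sketch (the guard means the same code also runs on a machine without cuDF, where it simply uses plain pandas):

```python
try:
    import cudf.pandas
    cudf.pandas.install()  # later pandas imports run on the GPU where cuDF can
except Exception:
    pass  # no GPU/cuDF available: plain CPU pandas is used unchanged

import pandas as pd  # unchanged pandas code from here on

df = pd.DataFrame({"x": [1, 2, 3]})
total = int(df["x"].sum())  # GPU-accelerated when cuDF is active, CPU otherwise
print(total)
```

Any operation cuDF does not implement falls back to pandas on the CPU transparently, which is exactly the behavior described above.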
If you often work with large datasets and are already familiar with pandas, this command is very useful for improving your data processing performance by taking advantage of the GPU.
Data Processing-cuGraph, used to run large-scale graph analysis on the GPU. For further explanation about Data Processing-cuGraph, see this link. The following is an example of using Data Processing-cuGraph:
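A minimal sketch of running PageRank with cuGraph. The tiny edge list is made up for illustration; when cuGraph is unavailable (no GPU), the snippet falls back to a plain power-iteration PageRank on the CPU so the example stays runnable:

```python
# Tiny directed graph as an edge list (hypothetical example data)
edges = [(0, 1), (0, 2), (1, 2), (2, 0)]

try:
    import cudf, cugraph  # GPU path: requires an NVIDIA GPU with RAPIDS
    gdf = cudf.DataFrame({"src": [s for s, _ in edges],
                          "dst": [d for _, d in edges]})
    G = cugraph.Graph(directed=True)
    G.from_cudf_edgelist(gdf, source="src", destination="dst")
    pr = cugraph.pagerank(G)  # cuDF DataFrame with 'vertex' and 'pagerank'
    scores = dict(zip(pr["vertex"].to_arrow().to_pylist(),
                      pr["pagerank"].to_arrow().to_pylist()))
except Exception:
    # CPU fallback: plain power-iteration PageRank (damping factor 0.85)
    n, d = 3, 0.85
    out_deg = [0] * n
    for s, _ in edges:
        out_deg[s] += 1
    pr = [1.0 / n] * n
    for _ in range(100):
        nxt = [(1 - d) / n] * n
        for s, t in edges:
            nxt[t] += d * pr[s] / out_deg[s]
        pr = nxt
    scores = dict(enumerate(pr))

# Vertex 2 receives the most incoming links, so it ranks highest
print(max(scores, key=scores.get))
```

On real workloads the graph would be loaded from a large edge-list file, where the GPU path gives cuGraph its speed advantage.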
Data Processing-cuML, a GPU-based machine learning library developed by NVIDIA as part of the RAPIDS ecosystem. cuML is designed to speed up machine learning workflows by utilizing the parallelism of GPUs, enabling faster data processing and model training than on CPUs. It provides various algorithms such as linear regression, clustering, PCA, and many more. cuML is compatible with the Scikit-Learn API, so users can easily migrate existing code to take advantage of GPU acceleration. For further explanation about Data Processing-cuML, see this link. The following is an example of using Data Processing-cuML:
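A minimal linear-regression sketch in the scikit-learn-style API that cuML exposes. The training data is made up for illustration; when cuML is unavailable (no GPU), a tiny NumPy least-squares stand-in with the same fit/predict interface is used so the example still runs on CPU:

```python
import numpy as np

try:
    from cuml.linear_model import LinearRegression  # GPU, scikit-learn-style API
except Exception:
    class LinearRegression:
        """Minimal CPU stand-in with the same fit/predict interface."""
        def fit(self, X, y):
            A = np.c_[np.asarray(X), np.ones(len(X))]  # add intercept column
            w, *_ = np.linalg.lstsq(A, np.asarray(y), rcond=None)
            self.coef_, self.intercept_ = w[:-1], w[-1]
            return self
        def predict(self, X):
            return np.asarray(X) @ self.coef_ + self.intercept_

# Hypothetical training data following y = 2x + 1
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])

model = LinearRegression().fit(X, y)
pred = float(model.predict(np.array([[4.0]]))[0])
print(round(pred, 2))
```

Because the call signatures match Scikit-Learn, swapping an existing sklearn import for the cuml one is often the only change needed to move training onto the GPU.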