# Deploying an LLM with Deka GPU + NIM

{% hint style="warning" %}
Prerequisites

1. Contact <dekagpu.support@lintasarta.co.id> to obtain a license for NVIDIA AI Enterprise.
2. You need an NGC API key; follow [this guide](https://docs.cloudeka.ai/miscellaneous/reference/how-to-get-an-api-key-on-ngc) to obtain one.
3. Contact <dekagpu.support@lintasarta.co.id> to get root-privileged access.
{% endhint %}

## Download Helm Chart

After you have fulfilled the prerequisites above, the next step is to download the Helm chart by running the following command. The username is the literal string `$oauthtoken`; replace `<your token>` with your NGC API key.

```bash
helm fetch https://helm.ngc.nvidia.com/nim/charts/nim-llm-1.3.0.tgz --username='$oauthtoken' --password=<your token>
```

After the download completes, extract the archive by running the following command.

```bash
tar -xvzf nim-llm-1.3.0.tgz 
```

Enter the extracted folder by running the command below.

```bash
cd nim-llm
```

Open the values.yaml file and change the security context as shown below.

<figure><img src="https://lh7-rt.googleusercontent.com/docsz/AD_4nXeTiP-ZUHPnMUBisFmP5e_3dYwr2CGoEf1vX0fPSFJ1hpjha1TdAJgtqiz1yQ1ZsxJ2DKe1kLgiyPqwv0IewpnAmJE96bJCkWjIyKTx7YUV2EQ0_RCGYAtWoAkPIvWsbp-BZH_fJA?key=uZw7Ciu7pZmmC8Wc1kuhRFnS" alt=""><figcaption></figcaption></figure>
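As a rough sketch, the security context section of values.yaml looks similar to the fragment below; the field names and user/group IDs are assumptions and may differ in your chart version, so adjust them to match your environment.

```yaml
# Hypothetical sketch of the security context in values.yaml;
# verify field names and IDs against your chart version.
podSecurityContext:
  runAsUser: 1000
  runAsGroup: 1000
  fsGroup: 1000
```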

Determine the base model to use; this guide uses Meta Llama 8B.

<figure><img src="https://lh7-rt.googleusercontent.com/docsz/AD_4nXd2yTBHCwhxWaU11OnjS878SsS90AlSnB4KUc3Bm-ugTEHtfnr3Gk6ar9KTHgWwVhDxspnK_rYkQjJcz3Tg0K9Vb-KI5cbwg6da-TVGySI2VTgBX2bMgrEeYesMLmo8BUD3-gxn?key=uZw7Ciu7pZmmC8Wc1kuhRFnS" alt=""><figcaption><p>Add link model</p></figcaption></figure>
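In values.yaml, the model is selected through the container image repository and tag. Assuming the Meta Llama 3 8B Instruct NIM, the entry looks roughly like the sketch below; verify the exact repository path and tag against the NVIDIA API catalog before using them.

```yaml
# Sketch: image section of values.yaml pointing at a Llama 3 8B Instruct NIM.
# Repository path and tag are assumptions; check them on build.nvidia.com.
image:
  repository: nvcr.io/nim/meta/llama3-8b-instruct
  tag: 1.0.0
```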

{% hint style="success" %}
If you want to use another model, refer to the [NVIDIA](https://build.nvidia.com/) web page. The following steps show how to get the URL of the model you want to use.

* On the NVIDIA website, enter the name of the model in the Search API Catalog field.
* The search results list several models; select the one you want to use.
* On the selected model's page, open the Experience tab.
* On the right side of the Experience tab there are several options: Python, LangChain, Node, Shell, and Docker. Select Docker.
* In the Docker section, under "Pull and run the NVIDIA NIM with the command below. This will download the optimized model for your infrastructure.", the bottom row contains the model URL. Copy it.

<img src="https://2882153758-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fi9YWb69HFXLHYlXffReU%2Fuploads%2F1Xf0m60tJJIY5QxqzqqZ%2Fimage.png?alt=media&#x26;token=54a7ff5e-9327-47e9-a118-a1e628e5acb4" alt="" data-size="original">
{% endhint %}

Edit the GPU count based on your needs.

<figure><img src="https://lh7-rt.googleusercontent.com/docsz/AD_4nXeznjsM0jH56uO29wM34QIlGlvR_pQIx_p2kKCu80L19wrQUXsy53lRQPlutSHJ_TN7JJTUfnQ93WqavSK-PwPV7sh0z9MkhNCMMXLUAKVj1v-9fZBySL3wKyz0yZSMIsqIJM4Ecw?key=uZw7Ciu7pZmmC8Wc1kuhRFnS" alt=""><figcaption></figcaption></figure>

Put your NGC API key in the "ngcAPI Key" line.

<figure><img src="https://lh7-rt.googleusercontent.com/docsz/AD_4nXfA3w2wWiwZKjaFXp5ZfKT-bPbGDBZ6bOp2zrtvhW-RGgREWjrbxje3nVMCf2MRtEwF123maoeFrUEpAfeiaLme5_6wHnXKixyo08QI_LExo5yIEEmq6Edrxzw6SWJzjO5Uq6ny?key=uZw7Ciu7pZmmC8Wc1kuhRFnS" alt=""><figcaption></figcaption></figure>
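As a sketch, the key is typically set in values.yaml along the lines below; the field name and its location vary by chart version, so match whatever your values.yaml actually contains.

```yaml
# Sketch: NGC API key entry in values.yaml; the field name is an
# assumption based on the screenshot above, verify it in your chart.
model:
  ngcAPIKey: "<your NGC API key>"
```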

Edit the storage size based on your needs.

<figure><img src="https://lh7-rt.googleusercontent.com/docsz/AD_4nXdmUJSnWqn5nuLW0saYmP-Bd-q49tHXO2hpWT25yfo4zdYDrcY424WsK6MLCjTi8ZkUs8mNwMA83Sstuj8olFPpwE6-YDe79QpjKMBwusht9dLLzdU9JPZBdWdeRze1BEkFiuOE?key=uZw7Ciu7pZmmC8Wc1kuhRFnS" alt=""><figcaption><p>Edit Storage</p></figcaption></figure>
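For reference, the persistence section of values.yaml typically looks like the sketch below; the size and storage class shown are assumptions, so size the volume for the model you chose and use a storage class available in your cluster.

```yaml
# Sketch: persistence section of values.yaml; size and storageClass
# are assumptions, adjust them to your model and cluster.
persistence:
  enabled: true
  size: 50Gi
  # storageClass: <your storage class>
```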

After updating, save the file with CTRL + O and exit with CTRL + X.

## Add the NVIDIA Runtime

Open the deployment.yaml file in the files folder by running the following command.

```bash
nano files/deployment.yaml
```

After the deployment.yaml file opens, look for the `spec` section and add `runtimeClassName: nvidia` as shown in the image below.

<figure><img src="https://lh7-rt.googleusercontent.com/docsz/AD_4nXecWDoTXlnEo0KQBLTPyaekmktvQVNuyHcIJzEg7YVn4eQZ0GoFoOVOF9Z6aa3HHEzRTnbYbf5qb0kJSKC7KvlRm3DKf7Q7CvODijhaziBRyCNuuwKnJ3J4K-J8OWgN-L6UGg3ZDA?key=uZw7Ciu7pZmmC8Wc1kuhRFnS" alt=""><figcaption><p>Add runtimeClassName</p></figcaption></figure>
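For reference, the relevant part of files/deployment.yaml after the edit looks roughly like this (surrounding fields elided, container name assumed):

```yaml
# Pod template spec in files/deployment.yaml with the NVIDIA runtime class added.
spec:
  template:
    spec:
      runtimeClassName: nvidia
      containers:
        - name: nim-llm
          # ...existing container settings...
```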

After updating, save the file with CTRL + O and exit with CTRL + X.

## Install Helm Chart

Install the Helm chart by running the following command.

```bash
helm install nim . -f values.yaml 
```

<figure><img src="https://lh7-rt.googleusercontent.com/docsz/AD_4nXd61v-tTk0vSBT87ooDimIHtkHNcuY5TnNCjdwvax6CJ1FX8Tx1es29Sf_fOI5GEtwPfDArwgiFXTIIFiUSb2l6egLPMr8yV5Lud6X_L8Wl8hVTJ42NP1E6-hiuRM1XSL15KKKt?key=uZw7Ciu7pZmmC8Wc1kuhRFnS" alt=""><figcaption><p>Install Helm Chart</p></figcaption></figure>

Wait until the installation completes, then run the following command to list the running pods.

```bash
kubectl get pods
```

<figure><img src="https://lh7-rt.googleusercontent.com/docsz/AD_4nXcBcx9lIx9shcllsOPstd_Qd_J4w7VdbXy9EiBvluTXuoFh-oiaG5-izbBw4YcVeIrtq3ZKE1-rJRjshdiDoNp93i5fy_d61v5MaLx64k7C4WqH4ed88iKvSlsUduW6OOuQ2uP1?key=uZw7Ciu7pZmmC8Wc1kuhRFnS" alt=""><figcaption></figcaption></figure>

## Testing

When finished, run the following command to list the NIM services in the `nim` namespace.

```bash
kubectl get svc -n nim
```

<figure><img src="https://lh7-rt.googleusercontent.com/docsz/AD_4nXeQtlVHZwtNb28oj7hs0ERc5W6ea0Ccms94O9pOikBD4oEYt6ZLtQYVK44FUc6dTKzxT0VfVy5kr9WQmr0U73hxKgB7QcejoQs_sLxDudE2mbLm5tiJWrcL7vCtilSrP0lC5F46jA?key=uZw7Ciu7pZmmC8Wc1kuhRFnS" alt=""><figcaption></figcaption></figure>

In the image above, the nim-llm service has cluster IP 10.250.225.0 and is active. Run the following command to open a shell inside a container running in a pod (replace `multitoll` with the name of your pod).

```bash
kubectl exec -it multitoll -- bash
```

<figure><img src="https://2882153758-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fi9YWb69HFXLHYlXffReU%2Fuploads%2FrjnC3XD4ceFCnkFihYDc%2Fimage.png?alt=media&#x26;token=89ff1d79-43e5-428a-bc71-d52d75dc4ab3" alt=""><figcaption><p>Testing</p></figcaption></figure>
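Once inside the pod (or from any pod in the cluster), you can send a request to the NIM's OpenAI-compatible API to verify that it responds. The service name (`nim-llm`), port (`8000`), and model name (`meta/llama3-8b-instruct`) below are assumptions; take the actual values from `kubectl get svc -n nim` and the model you configured.

```shell
# List the models served by the NIM (service name and port assumed).
curl -s http://nim-llm:8000/v1/models

# Send a test chat completion request (model name assumed).
curl -s http://nim-llm:8000/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{
        "model": "meta/llama3-8b-instruct",
        "messages": [{"role": "user", "content": "Hello!"}],
        "max_tokens": 64
      }'
```

If both requests return JSON, the NIM is serving correctly; a connection error usually means a wrong service name, port, or namespace.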
