Deka GPU Documentations

Deployment LLM with Deka GPU + NIM

Last updated 1 month ago

Prerequisites

  1. Contact dekagpu.support@lintasarta.co.id to obtain an NVIDIA AI Enterprise license.

  2. Get an NGC API key. See How to Get an API Key on NGC.

  3. Contact dekagpu.support@lintasarta.co.id to get root privileged access.

Download Helm Chart

After you have fulfilled the prerequisites above, download the Helm chart by running the following command.

helm fetch https://helm.ngc.nvidia.com/nim/charts/nim-llm-1.3.0.tgz --username=\$oauthtoken --password=<your token> 

After the Helm chart has downloaded, extract the archive by running the following command.

tar -xvzf nim-llm-1.3.0.tgz 

Enter the extracted folder by running the command below.

cd nim-llm

Open the values.yaml file in a text editor (this guide uses nano) and change the security context.
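As a rough illustration of the security context edit, the fragment below assumes that values.yaml exposes a podSecurityContext block and that the pod should run as root, which would match the root privileged access requested in the prerequisites. Both the key name and the values are assumptions; check them against your own values.yaml and the access you were granted.

```yaml
# Sketch only: the key name and the IDs are assumptions, not taken from this guide.
podSecurityContext:
  runAsUser: 0    # assumed: run the container process as root
  runAsGroup: 0   # assumed: primary group is root
  fsGroup: 0      # assumed: group ownership applied to mounted volumes
```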

Determine the base model to use. This guide uses Meta Llama 8B. To get the URL of the model:

  • On the NVIDIA website, enter the name of the model in the Search API Catalog field.

  • From the models that the search returns, select the one you want to use.

  • On the selected model's page, open the Experience tab.

  • On the right side of the Experience tab, choose Docker from the available options (Python, LangChain, Node, Shell, and Docker).

  • In the Docker section, under "Pull and run the NVIDIA NIM with the command below. This will download the optimized model for your infrastructure.", copy the model URL from the bottom row.

Add the copied model URL to the values.yaml file.

Set the number of GPUs according to your needs.

Put your API key on the "ngcAPIKey" line.

Edit the storage settings according to your needs.

Save the file after updating: press CTRL + O to save and CTRL + X to exit.
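Taken together, the values.yaml edits described above might look roughly like the fragment below. This is a sketch under assumptions: the key names follow a typical nim-llm values.yaml and may differ in your chart version, and the image repository, tag, and storage size are hypothetical placeholders rather than values from this guide.

```yaml
# Sketch only: verify every key against the values.yaml shipped with your chart.
image:
  repository: nvcr.io/nim/meta/llama3-8b-instruct  # hypothetical model URL copied from the Docker tab
  tag: "1.0.0"                                     # hypothetical tag
model:
  ngcAPIKey: "<your NGC API key>"                  # placeholder; do not commit a real key
resources:
  limits:
    nvidia.com/gpu: 1                              # number of GPUs requested for the pod
persistence:
  enabled: true
  size: 50Gi                                       # assumed size; must fit the model weights
```

Keeping the API key out of version control is worth the extra care, since anyone with the key can pull images under your NGC account.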

Add Runtime NVIDIA

Open the deployment.yaml file in the files folder by running the following command.

nano files/deployment.yaml

After the deployment.yaml file opens, find the spec section and add runtimeClassName: nvidia.

Save the file after updating: press CTRL + O to save and CTRL + X to exit.
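For orientation, this is where runtimeClassName sits in a plain Kubernetes Deployment manifest: it is a field of the pod spec, next to containers. The chart's deployment.yaml is a Helm template, so the real file will look different; the fragment below is abbreviated and its container name is a hypothetical placeholder.

```yaml
# Sketch only: most fields are omitted and the container name is hypothetical.
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      runtimeClassName: nvidia   # run the pod with the NVIDIA container runtime
      containers:
        - name: nim-llm          # hypothetical container name
```

The field only takes effect if a RuntimeClass named nvidia exists in the cluster.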

Install Helm Chart

Install the Helm chart by running the following command.

helm install nim . -f values.yaml 

Wait until the installation is complete, then run the following command to list the pods.

kubectl get pods

Testing

When finished, run the following command to list the services in the nim namespace.

kubectl get svc -n nim

In this guide's output, the nim-llm service has the cluster IP 10.250.225.0 and is active. Run the following command to open a shell in a container running inside a pod in Kubernetes.

kubectl exec -it multitoll -- bash

If you want to use another model, refer to the NVIDIA website. The steps for getting the model URL are listed under Download Helm Chart.
