Deka GPU Documentations
  • Starter Guide
    • Introduction
    • Sign Up
    • Choose a Package
    • Top Up
    • Create a Virtual Machine
    • Download kubeconfig
    • Create a Deka LLM
    • Create a Deka Notebook
    • Conclusion
  • Service Portal
    • Introduction
    • Sign Up
    • Sign In
    • Sign Out
    • Forgot Password
    • Account Setting
      • Using MFA Google Authenticator
      • Using MFA Microsoft Authenticator
    • Project
      • Add Project
      • Delete Project
    • List Roles
    • Broadcast
    • Audit Log
    • Voucher
    • Security
      • AI Security AI Infrastructure Layer
      • AI Security AI Application Layer
    • Ticket
      • Create Ticket
      • Detail Ticket
    • Billing
      • Daily Cost Estimated
      • Monthly Cost
      • Invoice
      • Summary Monthly
    • Balance
      • Project Type: SME
        • GPU Merdeka
        • Choose Package
        • Top-Up
      • Project Type: Enterprise
      • History Balance
        • Balance
        • Transaction
      • Custom Resource Definition
  • Deka GPU
    • Deka GPU: Kubernetes
      • Introduction
      • GPU Type
      • Dashboard
        • Check Status Kubernetes
        • Download Kube Config
        • Access Console
      • Workloads
        • Pods
          • Create New Pod
          • Access Console
          • Configuration Pod
          • Delete Pod
          • How to Create a New Pod use CLI
        • Deployments
          • Create New Deployment
          • Configuring Deployment
          • Delete of a Deployment
          • How to Create a New Deployment use CLI
        • DaemonSets
          • Create a New DaemonSet
          • Configuring a DaemonSet
          • Delete DaemonSet
      • Services
      • Storages
        • Storage Class
        • Persistent Volume Claims
          • Create a New Persistent Volume Claim
          • How to Create a New Persistent Volume Claim use CLI
    • Deka GPU: VMs
      • Operating System
      • GPU Type
      • Machine Type
      • Namespace Type
      • Storage Class
      • How to Create a Virtual Machine on Service Portal
      • How to Manually Create a Virtual Machine
        • Download Kube Config
        • Running Kube Config
        • Configuration file dv.yaml
        • Configuration file vm.yaml
        • Configuration file svc.yaml
      • Feature Overview of Virtual Machine
        • Detail a Virtual Machine
        • Open Console
        • Turn Off a VM Instance
        • Turn On a VM Instance
        • Restart a Virtual Machine
        • How to Access Console
        • Show YAML File
      • Delete a Virtual Machine
    • Deka GPU: Registry
      • Create Registry
      • Quota
      • Detail Registry
        • Summary
        • Repository
        • Logs
        • Labels
        • Tag Immutability
        • Member
        • Resize Storage Registry
      • Delete Registry
    • Deka GPU: Security
      • Deka Guard
        • Introduction
        • Create Guard to Deny All Ingress
        • Create Guard to Allow Ingress
        • Create Guard to Allow Ingress with port
        • Create Guard to Allow Ingress with IP/CIDR
        • Create Guard to Deny All Egress
        • Create Guard to Allow Egress
        • Create guard to Allow Egress with Port
        • Create Guard to Allow Egress with IP/CIDR
    • Deka GPU: Service
      • Ingress
        • Install Ingress nginx
        • Install Cert Manager
        • Create Cluster Issuer
        • Create Ingress with TLS
    • Deka GPU: Autoscaling
      • Basic Autoscaling
    • Deka GPU: Network
      • Deka VPC
    • Deka GPU: MLOps
      • Introduction
      • Notebook
      • Tensorboards
      • Volumes
      • Endpoints
        • Create Endpoint
        • Delete Endpoint
      • Experiments (AutoML)
        • Create Experiments (AutoML)
        • Create Experiments (AutoML) using Python SDK
        • Get Experiments Results
      • Experiments (KFP)
        • Create Experiment
      • Pipelines
      • Runs
        • Create Run
        • Delete Active Run
      • Recurring Runs
        • Create Recurring Run
        • Delete Recurring Runs
        • Home
      • Artifacts
      • Executions
      • Manage Contributors
  • Deka LLM
    • Introduction
    • Check Project Type
    • Create a New LLM
    • Detail Deka LLM
      • Overview Tab
      • Keys Tab
        • Create a New Key
        • Detail a Key
        • Edit a Key
        • Get a Secret Key
        • Delete a Key
      • Usage Tab
      • Top Up Coin
    • API Deka LLM
      • Model Management
      • Completions
      • Embedding
    • Delete Deka LLM
    • How to Create Simple Prompt with Deka LLM
      • Create Deka LLM
      • Get URL API Deka LLM
      • Get Secret Key
      • Access API Deka LLM using Postman
      • Get Model
      • Post Chat Completions
  • Deka Notebook
    • Introduction
    • Namespace Type
    • Create a New Notebook
    • Detail Deka Notebook
      • Configuration Deka Notebook
      • Start Deka Notebook Service
      • Stop Deka Notebook Service
      • Get Token
      • Login Deka Notebook
      • Logout Deka Notebook
    • Delete Deka Notebook
  • Reference
    • How to use kubeconfig on Linux
    • How to use kubeconfig on Windows
    • Kubernetes Commands for Enhancing Security
    • How to add GPU in Kubernetes
    • How to Add GPU in VM
      • Download kubeconfig
      • Install kubectl
      • Add GPU
      • Install Driver NVIDIA
    • RAPIDS
      • How to Setup RAPIDS
      • How to make Custom Image
    • How to push image with Docker
    • Deployment LLaMA 3.1 70B with VLLM on Kubernetes
      • Getting the Hugging Face API Key
      • Requesting Access to the LLaMA Model
      • Connect Kubernetes on Computer
      • Create Namespace
      • Create PersistentVolumeClaim (PVC)
      • Create Secret for Hugging Face Token
      • Create Deployment
      • Create Service
      • Verify Deployment
      • Accessing the LLaMA Service
      • Troubleshooting
    • How to Get an API Key on NGC
    • Deployment LLM with Deka GPU + NIM
    • Deployment Deepseek R1 70B with VLLM on Deka GPU's Kubernetes
      • Prerequisites
      • Create Namespace
      • Create PersistentVolumeClaim (PVC)
      • Create Deployment
      • Create Service
      • Verify Deployment
      • Accessing the Deepsek Service
      • Troubleshooting
    • How to Upload and Download on FTP Web
  • Troubleshooting
    • Reinstall Driver NVIDIA on Linux
    • NVIDIA Driver Not Detected After Upgrade Kernel
Powered by GitBook
On this page
  • Autoscaling Based on Resources Utilization
  • Autoscaling based on resources value
  1. Deka GPU
  2. Deka GPU: Autoscaling

Basic Autoscaling

Autoscaling Based on Resources Utilization

Autoscaling based on resource utilization means that HPA (Horizontal Pod Autoscaler) adjusts the number of replica pods based on the percentage of resource usage (CPU or memory) relative to the requests specified in the Pod spec. You can first run the following syntax to create a YAML file named hpa.

nano hpa.yaml

Copy and paste the following YAML contents.

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-samplepod-cpu
  namespace: hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: samplepod
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: mem
        target:
          type: Utilization
          averageUtilization: 60
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 100
          periodSeconds: 15
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
        - type: Percent
          value: 100
          periodSeconds: 15
        - type: Pods
          value: 4
          periodSeconds: 15
      selectPolicy: Max

After pasting the syntax, save the file by pressing Ctrl + O, then press the Enter key and exit the editor by pressing Ctrl + X. The next step is to apply this configuration.

kubectl apply -f hpa.yaml

Autoscaling based on resources value

Autoscaling based on resource value means that HPA adjusts the number of replica pods based on the absolute value of the resources used (CPU/memory) without comparing it to the request. You can first run the following syntax to create a YAML file named hpa.

nano hpa.yaml

Copy and paste the following YAML contents.

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-samplepod-cpu
  namespace: hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: samplepod
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: AverageValue
          averageValue: 500m
    - type: Resource
      resource:
        name: mem
        target:
          type: AverageValue
          averageValue: 512Mi
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      poli

After pasting the syntax, save the file by pressing Ctrl + O, then press the Enter key and exit the editor by pressing Ctrl + X. The next step is to apply this configuration

kubectl apply -f hpa.yaml
PreviousDeka GPU: AutoscalingNextDeka GPU: Network

Last updated 2 months ago

Page cover image