# Create Service

Create a Service to expose the LLaMA deployment. If you are using a Linux operating system, run the following command to create the service.yaml file.

```bash
nano service.yaml
```

If you are using a Windows operating system, open a text editor such as Notepad or Notepad++.

<figure><img src="/files/BsjbwkBGhwTv6O7RQ0QD" alt="" width="375"><figcaption><p>Text Editor</p></figcaption></figure>

Enter the following manifest.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: llama-31-70b-instruct
  namespace: vllm
spec:
  ports:
  - name: http-vllm
    port: 80
    protocol: TCP
    targetPort: 8000
  selector:
    app: llama-31-70b-instruct
  sessionAffinity: None
  type: LoadBalancer
```

If you are using a **Linux** operating system, save the file and run the following command. If you are using a **Windows** operating system, save the file as service.yaml, then in CMD navigate to the folder that contains the service.yaml file and run the following command.

```bash
kubectl apply -f service.yaml
```

<figure><img src="/files/H51G2imbVG37PQzvQTJX" alt=""><figcaption><p>Success Apply service.yaml</p></figcaption></figure>
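After the manifest is applied, it can help to confirm that the load balancer has assigned an external address before sending traffic. The following is a minimal sketch, not part of the original procedure. The helper names `get_service_ip` and `service_url` are hypothetical, and the sketch assumes your load balancer reports an IP address (some providers report a hostname instead, in which case the JSONPath field would be `hostname`).

```bash
# Service name and namespace are taken from the manifest above.
SERVICE_NAME="llama-31-70b-instruct"
NAMESPACE="vllm"

# Hypothetical helper: print the external IP assigned to the Service.
# Prints nothing while the address is still pending.
get_service_ip() {
  kubectl get service "$SERVICE_NAME" -n "$NAMESPACE" \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
}

# Hypothetical helper: build the base URL of the Service.
# Port 80 is the Service port declared in the manifest.
service_url() {
  svc_ip="$(get_service_ip)"
  if [ -z "$svc_ip" ]; then
    echo "External IP is still pending" >&2
    return 1
  fi
  echo "http://${svc_ip}:80"
}
```

To use it, run `kubectl get service llama-31-70b-instruct -n vllm` to inspect the Service, then call `service_url` to print the address. If the deployment runs the vLLM OpenAI-compatible server, `curl "$(service_url)/v1/models"` should list the served model once the pods are ready.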

{% hint style="warning" %}
To delete the service.yaml configuration that has been applied, run the following command.

```bash
kubectl delete -f service.yaml -n [namespace]
```

**Replace \[namespace] with the namespace you created in the sub-chapter** [**Create Namespace**](/reference/deployment-llama-3.1-70b-with-vllm-on-kubernetes/create-namespace.md)**.**
{% endhint %}

