Accessing the LLaMA Service

Once the service is running and has been assigned an EXTERNAL-IP, you can reach it over HTTP at that address on port 80.
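If the service is exposed through a Kubernetes Service of type LoadBalancer, which the EXTERNAL-IP column suggests, the address can be read with kubectl instead of being copied by hand. The following is a minimal sketch under that assumption. The Service name `llama-service` and the helper functions `get_external_ip` and `base_url` are hypothetical and not part of this guide, so substitute the name shown by `kubectl get svc`.

```shell
# Hypothetical Service name (an assumption); replace it with your own.
SERVICE_NAME="${SERVICE_NAME:-llama-service}"

# Print the load balancer IP of the given Service.
# The output is empty until an address has been assigned.
# Some providers publish a hostname instead; the field is then
# .status.loadBalancer.ingress[0].hostname rather than .ip.
get_external_ip() {
  kubectl get svc "$1" -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
}

# Build the base URL for the API. Port 80 is the HTTP default,
# so it does not need to appear in the URL.
base_url() {
  printf 'http://%s' "$1"
}
```

With both functions defined, `BASE_URL="$(base_url "$(get_external_ip "$SERVICE_NAME")")"` yields a value that can stand in for `http://<EXTERNAL-IP>` in the commands in this section.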

List Models

To list the available models, send a GET request to /v1/models, replacing <EXTERNAL-IP> with the address of your service:

curl -X GET http://<EXTERNAL-IP>/v1/models
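In case the endpoint does not answer right after the address is assigned, the same request can be used as a readiness check. This is a sketch only: the function name `wait_for_models` and the default of 30 attempts at 10-second intervals are assumptions, not values from this guide.

```shell
# Poll GET /v1/models until it returns HTTP 200 or the attempts run out.
# Arguments: base URL, number of attempts (default 30), delay in seconds (default 10).
wait_for_models() {
  wfm_url="$1"
  wfm_attempts="${2:-30}"
  wfm_delay="${3:-10}"
  wfm_i=0
  while [ "$wfm_i" -lt "$wfm_attempts" ]; do
    # -w '%{http_code}' prints only the status code; the body is discarded.
    wfm_code="$(curl -s -o /dev/null --max-time 5 -w '%{http_code}' "$wfm_url/v1/models" || true)"
    if [ "$wfm_code" = "200" ]; then
      return 0
    fi
    wfm_i=$((wfm_i + 1))
    sleep "$wfm_delay"
  done
  return 1
}
```

For example, `wait_for_models "http://<EXTERNAL-IP>" && curl "http://<EXTERNAL-IP>/v1/models"` lists the models only once the endpoint responds successfully.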

Create a Completion

To create a completion, send a POST request with a JSON body to /v1/completions:

curl -X POST http://<EXTERNAL-IP>/v1/completions \
     -H "Content-Type: application/json" \
     -d '{
           "model": "meta-llama/Llama-3.1-70B-Instruct",
           "prompt": "Once upon a time",
           "max_tokens": 50
         }'
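When the model, prompt, or token limit comes from shell variables, the request body can be assembled by a small helper instead of being written inline. The sketch below is one way to do that. The function names `json_escape`, `completion_payload`, and `create_completion` are hypothetical, and the escaping covers only backslashes and double quotes, so prompts containing newlines or other control characters need a proper JSON tool.

```shell
# Escape backslashes and double quotes so a value can sit inside a JSON string.
# Limitation: newlines and other control characters are not handled.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Build the request body from a model name, a prompt, and a token limit.
completion_payload() {
  printf '{"model":"%s","prompt":"%s","max_tokens":%s}' \
    "$(json_escape "$1")" "$(json_escape "$2")" "$3"
}

# POST the body to /v1/completions on the given base URL.
# -f makes curl exit with a non-zero status on HTTP error responses.
create_completion() {
  curl -sS -f -X POST "$1/v1/completions" \
    -H "Content-Type: application/json" \
    -d "$(completion_payload "$2" "$3" "$4")"
}
```

Calling `create_completion "http://<EXTERNAL-IP>" "meta-llama/Llama-3.1-70B-Instruct" "Once upon a time" 50` sends the same fields as the curl command above.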

