Performance
The Performance menu monitors and analyzes LLM usage based on API request activity. The page shows metric summaries, analysis filters, and a detailed record of each request. To open it, go to the Deka LLM page and select the Performance tab.

Users can select a time range using the Last 1 Day or Custom options to adjust the performance analysis.
On the Performance page, navigate to the upper-right area of the metrics summary → click the Last 1 Day option to show data for the most recent day automatically.

To set a custom range instead, click Custom on the Performance page → click the Select Range (Max 2 Days) field at the upper-right area of the performance metrics → select the start date, then select the end date, keeping the range to a maximum of 2 days → click the Select button to apply the filter.
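
The two time-range options can also be pictured in code. The Python sketch below is illustrative only: the function names are hypothetical, and reading "Max 2 Days" as "the end date is at most two days after the start date" is an assumption, not documented behavior of Deka LLM.

```python
from datetime import datetime, timedelta

# Assumed meaning of the "Max 2 Days" limit on the Custom option.
MAX_CUSTOM_RANGE = timedelta(days=2)


def last_one_day(now: datetime) -> tuple[datetime, datetime]:
    """Hypothetical equivalent of the Last 1 Day option: (start, end)."""
    return now - timedelta(days=1), now


def is_valid_custom_range(start: datetime, end: datetime) -> bool:
    """Accept a range only if it runs forward and spans at most two days."""
    return start <= end and (end - start) <= MAX_CUSTOM_RANGE
```

Under these assumptions, a range from January 1 to January 3 would be accepted, while January 1 to January 4 would be rejected.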

At the top of the page, four main metrics are displayed: Avg TTFT, Avg TPS, Avg Latency, and Total Requests. These metrics provide an overview of the model’s response speed, token processing performance, and the total number of requests received within a specified time period.

Each metric on the performance dashboard is defined as follows:
Avg TTFT
Indicates the average time to first token (TTFT): the time the model takes to generate the first token after receiving a request.
Avg TPS
Measures the model’s average generation speed in tokens per second (TPS).
Avg Latency
Shows the average total time needed to process a single request, from when the request is received until the entire response is delivered.
Total Requests
Displays the total number of API requests received and processed within the selected time range or filter.
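
To make the four definitions concrete, here is a minimal Python sketch that derives them from a list of request records. The `RequestRecord` fields and the `summarize` function are hypothetical, and the per-request TPS formula (output tokens divided by total latency) is an assumption; the dashboard may calculate TPS differently.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class RequestRecord:
    # Hypothetical fields; the real per-request schema is not documented here.
    ttft_s: float       # seconds from request receipt to first token
    latency_s: float    # seconds from request receipt to complete response
    output_tokens: int  # number of tokens generated


def summarize(records: list[RequestRecord]) -> dict:
    """Compute the four summary metrics over the given requests."""
    if not records:
        return {"avg_ttft_s": 0.0, "avg_tps": 0.0,
                "avg_latency_s": 0.0, "total_requests": 0}
    # Assumption: per-request TPS is output tokens divided by total latency.
    tps = [r.output_tokens / r.latency_s for r in records if r.latency_s > 0]
    return {
        "avg_ttft_s": mean(r.ttft_s for r in records),
        "avg_tps": mean(tps) if tps else 0.0,
        "avg_latency_s": mean(r.latency_s for r in records),
        "total_requests": len(records),
    }


sample = [RequestRecord(0.2, 2.0, 100), RequestRecord(0.4, 4.0, 100)]
summary = summarize(sample)
```

For the two sample requests, this gives an average latency of 3.0 seconds, an average TPS of 37.5, and a total of 2 requests.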

Several filters are available to narrow the data being analyzed: Filter by API Key, Filter by Model, and Filter by Status. They display performance data for a specific API key, a specific model, or a specific response status.

On the Performance page, locate the filter section below the metrics summary → click the Filter by API Key dropdown → select one of the API keys you want to analyze.

Next, navigate to the Filter by Model dropdown located beside the API Key filter → click the dropdown to view the list of available models → select the model you want to analyze.

In the filter section, click the Filter by Status dropdown → select the request status you want to display, such as success, failed, or timeout.
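
The effect of combining the three filters can be sketched in Python as follows. This is only an illustration: the record keys, the sample values, and the assumption that filters combine with AND logic (a record must match every filter that is set) are not taken from the Deka LLM documentation.

```python
def filter_requests(records, api_key=None, model=None, status=None):
    """Return records matching every filter that is set (None = no filter)."""
    wanted = {"api_key": api_key, "model": model, "status": status}
    return [
        r for r in records
        if all(value is None or r.get(field) == value
               for field, value in wanted.items())
    ]


# Hypothetical request records; key names and values are placeholders.
records = [
    {"api_key": "key-a", "model": "model-x", "status": "success"},
    {"api_key": "key-a", "model": "model-y", "status": "failed"},
    {"api_key": "key-b", "model": "model-x", "status": "timeout"},
]
failed_for_key_a = filter_requests(records, api_key="key-a", status="failed")
```

Here `failed_for_key_a` contains only the second record, much as the page would list only failed requests for the selected API key.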

