Token Usage Tab

The Token Usage tab displays Total Input Tokens, Total Output Tokens, Total Tokens by API Key, and Total Tokens by Model.

At the top right of the page, a date filter lets users view token usage for a specific time period. Two main metrics are displayed visually: Total Input Tokens (369) and Total Output Tokens (10,317). This means the system produces far more output than the input users provide.

At the bottom of the Token Usage page, two horizontal bar charts visualize how token usage is distributed across API Keys and LLM models. The first chart, Total Tokens by API Key, shows that the API Key named Person2 uses the most tokens, followed by Person3 and Person1. This indicates that the highest level of interaction with the LLM models comes from Person2's account.

Meanwhile, the second chart, Total Tokens by Model, shows how token usage is spread across the available models. The Llama-4-Maverick-17B-128E-Instruct-quantized.w4a16 model dominates token usage, followed by Qwen/Qwen2.5-72B-Instruct, with BAAI/bge-multilingual-gemma2 contributing the least. This visualization is useful for analyzing the performance of the models in use, evaluating how effectively each API Key uses them, and understanding the workload of each model in the managed LLM system.
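Conceptually, both charts are simple group-by sums over per-request usage records. The sketch below is illustrative only: the record fields (`api_key`, `model`, `input_tokens`, `output_tokens`) and the sample values are assumptions, not the dashboard's actual schema or data.

```python
from collections import defaultdict

# Hypothetical per-request usage records; field names and values are
# assumptions for illustration, not the dashboard's real schema.
records = [
    {"api_key": "Person2", "model": "Llama-4-Maverick-17B-128E-Instruct-quantized.w4a16",
     "input_tokens": 150, "output_tokens": 5000},
    {"api_key": "Person3", "model": "Qwen/Qwen2.5-72B-Instruct",
     "input_tokens": 120, "output_tokens": 3200},
    {"api_key": "Person1", "model": "BAAI/bge-multilingual-gemma2",
     "input_tokens": 99, "output_tokens": 2117},
]

def totals_by(records, field):
    """Sum input + output tokens, grouped by the given record field."""
    totals = defaultdict(int)
    for r in records:
        totals[r[field]] += r["input_tokens"] + r["output_tokens"]
    return dict(totals)

tokens_by_api_key = totals_by(records, "api_key")  # drives the first chart
tokens_by_model = totals_by(records, "model")      # drives the second chart
```

The same `totals_by` grouping also yields the two headline metrics if applied to `input_tokens` and `output_tokens` separately.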