LiteLLM Performance Dashboard

This page displays performance metrics collected from LiteLLM model endpoints. Two test prompts are used: simple-prompt (a single-word response that establishes baseline latency) and complex-prompt (a multi-step planning task that exercises reasoning). Charts show latency, token consumption, and cost across models.
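The metrics behind these charts can be gathered by timing each completion call and reading the token count from the response. A minimal sketch is below; the helper name `measure_completion` is hypothetical, and the callable is assumed to return a litellm-style response with a `usage.total_tokens` field (in practice this would be `litellm.completion`).

```python
import time

def measure_completion(call_model, model, prompt):
    """Time one completion call and record its token usage.

    `call_model` is any callable (e.g. litellm.completion) returning an
    object with a `usage.total_tokens` attribute. Hypothetical helper for
    illustration, not part of LiteLLM itself.
    """
    start = time.perf_counter()
    response = call_model(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    latency = time.perf_counter() - start  # total request-to-response time
    return {
        "model": model,
        "latency_s": latency,
        "total_tokens": response.usage.total_tokens,
    }
```

Running this once per model with each of the two test prompts yields the latency and token rows the charts below are drawn from.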


Total Response Time

Total time from request submission to receipt of the full response. Lower is better for overall speed.


Token Usage Across Models


Token Comparison: Simple vs Complex Prompts
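One way to read this comparison is as a per-model ratio of complex-prompt tokens to simple-prompt tokens, which shows how much each model's consumption grows with task difficulty. A sketch, assuming usage data keyed by the two prompt names used on this page (the function name `token_ratio` is hypothetical):

```python
def token_ratio(usage):
    """Compute the complex/simple token ratio for each model.

    `usage` maps model name -> {"simple-prompt": tokens, "complex-prompt": tokens}.
    A higher ratio means token use grows faster on harder tasks.
    """
    return {
        model: counts["complex-prompt"] / counts["simple-prompt"]
        for model, counts in usage.items()
    }
```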


Efficiency Analysis: Latency vs Tokens

Scatter plot comparing latency against token usage. Models toward the bottom-left are both fast and token-efficient.
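The "bottom-left" models on this scatter plot are the Pareto-efficient ones: no other model is at least as fast and at least as token-light while being strictly better on one axis. A sketch of that selection (the function name `pareto_efficient` is hypothetical):

```python
def pareto_efficient(points):
    """Return names of models not dominated on (latency, tokens).

    `points` is a list of (name, latency, tokens) tuples. A model is
    dominated if another model is <= on both axes and strictly < on one.
    """
    efficient = []
    for name, lat, tok in points:
        dominated = any(
            (l2 <= lat and t2 <= tok) and (l2 < lat or t2 < tok)
            for n2, l2, t2 in points
            if n2 != name
        )
        if not dominated:
            efficient.append(name)
    return efficient
```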
