Latitude.sh AI Inference provides access to AI models through a unified, OpenAI-compatible API. Run inference on text generation, vision, code, and reasoning models from providers like Qwen, Meta, Google, Mistral, DeepSeek, and Moonshot AI.
AI Inference is currently available to select customers. Contact support to request access.

Features

  • API Keys: Generate and manage keys to authenticate your API requests
  • Models: Browse available models with pricing, context length, and capabilities
  • Playground: Test models interactively before integrating them
  • Overview: Monitor usage including requests, tokens, and costs

Managing API keys

1. Access API keys: Log in to the dashboard, select a project, and navigate to AI > API Keys.
2. Create a key: Click Create Key and provide a name for your key.
3. Copy your key: Copy the generated API key immediately. For security, the full key is only shown once.
API keys are prefixed with lat_ and can be deleted at any time from the API Keys page.
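Since every key shares the lat_ prefix, a quick client-side check can catch a truncated or mispasted key before a request fails. A minimal sketch; only the lat_ prefix is documented, so the non-empty-suffix check is a defensive assumption:

```python
def looks_like_latitude_key(key: str) -> bool:
    """Rough sanity check for a Latitude.sh AI Inference API key.

    Only the lat_ prefix is documented; requiring a non-empty
    suffix is an assumption, not a documented format rule.
    """
    return key.startswith("lat_") and len(key) > len("lat_")

print(looks_like_latitude_key("lat_abc123"))  # True
print(looks_like_latitude_key("sk-abc123"))   # False
```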

Browsing models

1. Access the models page: Log in to the dashboard, select a project, and navigate to AI > Models.
2. Filter models: Use the search bar to find models by name, or filter by provider or capability (text, vision, code, reasoning).
3. View model details: Each model shows its context length, input/output pricing per million tokens, and supported capabilities. Click a model to open a side panel with a ready-to-use curl command.
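Because pricing is quoted per million tokens, a request's cost can be estimated directly from its token counts. A sketch of that arithmetic; the prices below are placeholders, not real rates for any listed model:

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_m: float, output_price_per_m: float) -> float:
    """Estimate a request's cost in dollars from per-million-token prices."""
    return (input_tokens / 1_000_000) * input_price_per_m \
         + (output_tokens / 1_000_000) * output_price_per_m

# Example: 1,200 input and 300 output tokens at hypothetical
# rates of $0.50 (input) and $1.50 (output) per million tokens.
cost = estimate_cost(1200, 300, 0.50, 1.50)
print(f"${cost:.6f}")  # $0.001050
```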

Using the playground

1. Access the playground: Log in to the dashboard, select a project, and navigate to AI > Playground.
2. Configure your request: Select a model from the dropdown in the chat header. In the sidebar, enter your API key and optionally adjust the system prompt, temperature, and max tokens.
3. Send a message: Type your prompt and send it to see the model's response in real time.
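The playground settings map onto standard OpenAI-style chat-completion parameters, so a playground session can be reproduced in code. A sketch of that mapping; the default values and model ID here are illustrative, not documented defaults:

```python
def playground_to_request(model: str, system_prompt: str, user_prompt: str,
                          temperature: float = 0.7, max_tokens: int = 512) -> dict:
    """Build an OpenAI-style chat-completions payload mirroring the
    playground's model, system prompt, temperature, and max-tokens
    settings. The defaults are illustrative assumptions."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
        "temperature": temperature,
        "max_tokens": max_tokens,
    }

payload = playground_to_request(
    "moonshotai/Kimi-K2.5",
    system_prompt="You are a concise assistant.",
    user_prompt="Hello!",
)
print(payload["messages"][0]["role"])  # system
```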

Making API requests

The AI Inference API is available at https://api.lsh.ai and is fully compatible with the OpenAI SDK. You can also make direct HTTP requests.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.lsh.ai/v1",
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="moonshotai/Kimi-K2.5",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
Replace YOUR_API_KEY with your API key. You can find available model IDs on the Models page.
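For a direct HTTP request, the OpenAI-compatible shape means you only need a Bearer token header and a JSON body. A standard-library sketch that builds (but does not send) such a request; the /v1/chat/completions path is assumed from the OpenAI convention:

```python
import json
import urllib.request

def build_chat_request(api_key: str, model: str, content: str) -> urllib.request.Request:
    """Build a chat-completions HTTP request without sending it.
    The /v1/chat/completions path follows the OpenAI convention."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": content}],
    }).encode()
    return urllib.request.Request(
        "https://api.lsh.ai/v1/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request("lat_example", "moonshotai/Kimi-K2.5", "Hello!")
print(req.get_header("Authorization"))  # Bearer lat_example
# To actually send it: response = urllib.request.urlopen(req)
```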

Viewing metrics

The Overview page shows your AI Inference usage for the last 30 days:
  • Total Requests: Number of API requests made
  • Tokens Used: Total input and output tokens consumed
  • Total Cost: Cumulative cost of all API usage
Bar charts display requests and tokens by model, and a table shows usage metrics and costs for each model you’ve used.
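The per-model breakdown above amounts to grouping raw usage records by model ID and summing requests, tokens, and cost. A sketch of that aggregation with made-up records; real figures come from the Overview page:

```python
from collections import defaultdict

def usage_by_model(records: list[dict]) -> dict:
    """Aggregate request counts, token totals, and cost per model,
    mirroring the per-model table on the Overview page."""
    totals = defaultdict(lambda: {"requests": 0, "tokens": 0, "cost": 0.0})
    for r in records:
        t = totals[r["model"]]
        t["requests"] += 1
        t["tokens"] += r["input_tokens"] + r["output_tokens"]
        t["cost"] += r["cost"]
    return dict(totals)

# Hypothetical usage records, for illustration only.
records = [
    {"model": "moonshotai/Kimi-K2.5", "input_tokens": 1200, "output_tokens": 300, "cost": 0.0010},
    {"model": "moonshotai/Kimi-K2.5", "input_tokens": 800, "output_tokens": 200, "cost": 0.0008},
]
print(usage_by_model(records)["moonshotai/Kimi-K2.5"]["tokens"])  # 2500
```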