Kimi K2.6, served over an OpenAI-compatible API.

Pylo routes requests to Moonshot AI's Kimi K2.6 and returns the responses over an OpenAI-compatible API. Point your existing client at https://api.pylo.sh/v1 and set the model to moonshotai/kimi-k2.6.

Pylo is the routing and reliability layer in front of the model. Requests fail over across upstreams automatically, and every request carries one stable ID. The model produces the tokens. Pylo keeps them flowing.

Endpoint

API base URL
https://api.pylo.sh/v1
Model slug
moonshotai/kimi-k2.6
Auth
Authorization: Bearer <key>

Read the Quickstart or jump to Pricing.

Quickstart

Pylo speaks the OpenAI chat-completions API. If your code already talks to OpenAI, change the base URL and the model. Streaming works the same way.

curl

bash
curl https://api.pylo.sh/v1/chat/completions \
  -H "Authorization: Bearer $PYLO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "moonshotai/kimi-k2.6",
    "stream": true,
    "messages": [
      {"role": "user", "content": "Write a haiku about routing tables."}
    ]
  }'

Python (openai SDK)

python
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.pylo.sh/v1",
    api_key=os.environ["PYLO_API_KEY"],
)

stream = client.chat.completions.create(
    model="moonshotai/kimi-k2.6",
    stream=True,
    messages=[
        {"role": "user", "content": "Write a haiku about routing tables."},
    ],
)

for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
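
Clients that do not use the openai SDK can read the stream directly. The sketch below assumes Pylo's streaming responses use the OpenAI server-sent-events framing: one `data: <json>` line per chunk, ending with `data: [DONE]`. The source says only that streaming works the same way, so treat the framing as an assumption. `iter_deltas` is a hypothetical helper name, and the sample lines are canned, not captured from Pylo.

```python
import json
from typing import Iterable, Iterator


def iter_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield content deltas from OpenAI-style SSE lines.

    Assumes each event is a single `data: <json>` line and that the stream
    ends with `data: [DONE]` (the OpenAI convention, assumed here for Pylo).
    """
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        choices = chunk.get("choices") or []
        if not choices:
            continue  # a chunk with no choices carries no text
        delta = choices[0].get("delta", {}).get("content")
        if delta:
            yield delta


# Canned lines standing in for a live response body.
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Packets "}}]}',
    'data: {"choices": [{"delta": {"content": "find a way"}}]}',
    "data: [DONE]",
]
text = "".join(iter_deltas(sample))
print(text)
```

Against a live endpoint, feed the function the decoded lines of the HTTP response body instead of `sample`.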

Kimi K2.6

One model, one slug. Pylo exposes Kimi K2.6 with a 262,144-token context window, text in and text out.

Model card

Model slug
moonshotai/kimi-k2.6
Display name
Kimi K2.6
Context length
262,144 tokens
Input
text
Output
text
Supported features
reasoning, structured_outputs, tools
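
The model card lists tools as a supported feature. A minimal sketch of a tool-calling request body, assuming Pylo accepts the standard OpenAI chat-completions `tools` and `tool_choice` fields unchanged. The `get_route` tool and the `build_tool_request` helper are hypothetical and exist only for illustration.

```python
import json

# Hypothetical tool definition in the OpenAI chat-completions "tools" shape.
get_route_tool = {
    "type": "function",
    "function": {
        "name": "get_route",
        "description": "Look up the next hop for a destination prefix.",
        "parameters": {
            "type": "object",
            "properties": {
                "prefix": {
                    "type": "string",
                    "description": "CIDR prefix, e.g. 10.0.0.0/8",
                },
            },
            "required": ["prefix"],
        },
    },
}


def build_tool_request(user_message: str) -> dict:
    """Build a chat-completions body that offers one tool to the model."""
    return {
        "model": "moonshotai/kimi-k2.6",
        "messages": [{"role": "user", "content": user_message}],
        "tools": [get_route_tool],
        "tool_choice": "auto",  # let the model decide whether to call the tool
    }


body = build_tool_request("Which next hop serves 10.0.0.0/8?")
print(json.dumps(body, indent=2))
```

With the openai SDK client from the Quickstart, the body can be sent as `client.chat.completions.create(**body)`. In the OpenAI response shape, any tool calls appear under `choices[0].message.tool_calls`.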

Pricing

Item          Price (USD / 1M tokens)
Input         $0.90
Output        $3.90
Cache read    $0.18

Launch pricing, subject to change.
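
To turn the per-million prices into a per-request figure, a small estimator helps. This sketch uses the launch prices listed above. It assumes that cached tokens are a subset of the input tokens and are billed at the cache-read rate instead of the input rate; the source does not spell out that billing rule. `estimate_cost_usd` is a hypothetical helper.

```python
from decimal import Decimal

# Launch prices in USD per 1M tokens (subject to change).
PRICE_PER_MILLION = {
    "input": Decimal("0.90"),
    "output": Decimal("3.90"),
    "cache_read": Decimal("0.18"),
}


def estimate_cost_usd(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> Decimal:
    """Estimate the cost of one request in USD.

    Assumption: `cached_tokens` is the part of `input_tokens` served from
    cache, billed at the cache-read rate instead of the input rate.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    uncached = input_tokens - cached_tokens
    total = (
        uncached * PRICE_PER_MILLION["input"]
        + cached_tokens * PRICE_PER_MILLION["cache_read"]
        + output_tokens * PRICE_PER_MILLION["output"]
    )
    return total / Decimal(1_000_000)


# 100k input tokens (80k of them cached) and 2k output tokens.
cost = estimate_cost_usd(input_tokens=100_000, output_tokens=2_000, cached_tokens=80_000)
print(f"${cost:.4f}")  # prints $0.0402
```

Decimal arithmetic keeps the result exact, which matters when small per-request costs are summed over many requests.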

Reliability

Pylo runs redundant upstream backends behind a single endpoint. A per-upstream circuit breaker tracks time-to-first-token and error rate over a rolling window. When the primary trips, requests fail over to the fallback automatically.
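
The mechanism can be sketched in a few lines. This is an illustration, not Pylo's implementation: the window length, thresholds, minimum sample count, and the rule for combining the two signals are all assumptions, and the class and function names are hypothetical. A production breaker would usually add a half-open probing state, which is omitted here.

```python
import time
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Optional, Tuple


@dataclass
class CircuitBreaker:
    """Rolling-window breaker for one upstream. All thresholds are illustrative."""

    window_seconds: float = 30.0
    max_error_rate: float = 0.5
    max_mean_ttft: float = 2.0  # seconds to first token
    min_samples: int = 5
    # Each sample is (timestamp, ttft_seconds, ok).
    samples: Deque[Tuple[float, float, bool]] = field(default_factory=deque)

    def record(self, ttft: float, ok: bool, now: Optional[float] = None) -> None:
        """Store one request outcome and drop samples outside the window."""
        now = time.monotonic() if now is None else now
        self.samples.append((now, ttft, ok))
        self._evict(now)

    def _evict(self, now: float) -> None:
        # Drop samples that have aged out of the rolling window.
        while self.samples and now - self.samples[0][0] > self.window_seconds:
            self.samples.popleft()

    def is_open(self, now: Optional[float] = None) -> bool:
        """Return True when this upstream should be skipped."""
        now = time.monotonic() if now is None else now
        self._evict(now)
        if len(self.samples) < self.min_samples:
            return False  # not enough data to judge
        errors = sum(1 for _, _, ok in self.samples if not ok)
        mean_ttft = sum(t for _, t, _ in self.samples) / len(self.samples)
        return (errors / len(self.samples) > self.max_error_rate
                or mean_ttft > self.max_mean_ttft)


def pick_upstream(primary: CircuitBreaker, now: Optional[float] = None) -> str:
    """Route to the fallback while the primary's breaker is open."""
    return "fallback" if primary.is_open(now) else "primary"


primary = CircuitBreaker()
for i in range(5):
    primary.record(ttft=0.4, ok=True, now=float(i))
print(pick_upstream(primary, now=5.0))    # primary: healthy window

for i in range(5, 12):
    primary.record(ttft=0.4, ok=False, now=float(i))
print(pick_upstream(primary, now=12.0))   # fallback: 7 of 12 requests failed

print(pick_upstream(primary, now=100.0))  # primary: old samples aged out
```

Passing `now` explicitly keeps the example deterministic; in real use the breaker falls back to `time.monotonic()`.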

This section describes how failover works. It is not a numeric uptime guarantee.

Data handling

Pylo retains request metadata for abuse prevention and legal purposes and does not train on it. Pylo is not zero-data-retention. The audit log records request metadata, never your prompts or responses. See the Privacy page for the full policy.