How to Use GLM-5.2 with Claude Code (Step-by-Step)

You can run Claude Code against GLM-5.2 today by pointing two environment variables at Sota's API. No custom forks, no local GPU, no proprietary-model billing.

This guide walks through the full setup from a fresh Sota account to a working Claude Code session, with notes on switching to Kimi K2.7 Code and debugging common problems.

What you'll need

Claude Code installed (npm install -g @anthropic-ai/claude-code or the equivalent for your package manager)
A Sota account (free to create, no credit card required to start)
Five minutes

That's it. Claude Code supports custom base URLs out of the box, and Sota's endpoint accepts the request shape Claude Code produces, so no patching or wrapper scripts are needed.
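To confirm the first prerequisite before continuing, a small check along these lines can help. It is a minimal sketch: have_cmd is a hypothetical helper name, and it only tests whether a claude executable is on your PATH, not which version is installed.

```shell
# Hypothetical helper: succeeds if the named command is on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd claude; then
  echo "Claude Code found at: $(command -v claude)"
else
  echo "Claude Code not found; install it with: npm install -g @anthropic-ai/claude-code"
fi
```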

Step 1: Get a Sota API key

Sign in to Sota and open the dashboard.
Navigate to API Keys and create a new key. Name it something you'll recognize later (e.g., claude-code-dev).
Copy the key; it starts with sk-sota-. You won't be able to view it again after you close the modal.

Sota's Starter plan is $25/month and covers most individual developer workflows. It includes per-user daily and weekly spend ceilings, so your bill stays predictable even when Claude Code is running long agentic loops.

Step 2: Point Claude Code at Sota

Set these two environment variables in your terminal before launching Claude Code:

export ANTHROPIC_BASE_URL="https://trysota.xyspace.dev"
export ANTHROPIC_API_KEY="sk-sota-…"
# Optional: switch to Kimi K2.7 Code. Omit this line to keep the default, GLM-5.2.
export ANTHROPIC_MODEL="kimi-k2.7-code"
# Launch Claude Code with the settings above.
claude

The key points here:

ANTHROPIC_BASE_URL overrides where Claude Code sends its requests. Sota's endpoint accepts the same request shape Claude Code produces, so nothing else changes.
ANTHROPIC_API_KEY is your Sota key, not an Anthropic key. The variable name is the same because Claude Code reads it regardless of which provider is on the other end.
The ANTHROPIC_MODEL line above is optional. Leave it out to get GLM-5.2 (the default), or set it to switch models. See the Switching models section below.

If you want these settings to persist across sessions, add the export lines to your shell profile (~/.zshrc, ~/.bashrc, or equivalent). You can also scope them per project by putting them in a .env file and sourcing it before launching Claude Code.
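The per-project approach might look like the sketch below. It makes a few assumptions: a temporary directory stands in for your project root so the example is safe to run, the key stays as the placeholder from Step 2 (substitute your own), and the chmod line is an added precaution rather than something Sota requires.

```shell
# A temp directory stands in for your project root (assumption for this sketch).
project_dir="$(mktemp -d)"

# Write the export lines from Step 2 into a per-project .env file.
cat > "$project_dir/.env" <<'EOF'
export ANTHROPIC_BASE_URL="https://trysota.xyspace.dev"
export ANTHROPIC_API_KEY="sk-sota-…"
EOF

# Restrict the file to your user, since it holds a secret.
chmod 600 "$project_dir/.env"

# Source the file in the shell you will launch Claude Code from.
. "$project_dir/.env"
echo "Base URL for this shell: $ANTHROPIC_BASE_URL"
```

If you keep a file like this inside a repository, add it to .gitignore so the key is never committed.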

Step 3: Run Claude Code

With those variables set, launch Claude Code the same way you always have:

claude

Claude Code will connect to Sota's API and route your prompts to GLM-5.2. From Claude Code's perspective, the interaction is identical: it issues the same tool calls and receives the same response shape. The model running the inference is different, served from Cloudflare's global network (US, UK, Germany, Japan, Australia) rather than a proprietary Anthropic endpoint.

Try a quick sanity check: ask Claude Code to explain a file in your current project. If you get a coherent response, you're live.

Switching models

Sota's default model is GLM-5.2, a strong general-purpose coding model from Z.ai. It's fast enough for interactive use and reliable across the kinds of tasks Claude Code runs most often.

If you want to switch to Kimi K2.7 Code from Moonshot, set the model environment variable:

export ANTHROPIC_MODEL="kimi-k2.7-code"

Kimi K2.7 Code is purpose-built for agentic software engineering, with a larger context window and deeper investment in multi-step tool use. It's the better choice when your Claude Code sessions involve long agentic loops, large codebase analysis, or complex automated pipelines. For everyday coding work (function generation, refactoring, test writing, code explanation), GLM-5.2 is usually the faster and more cost-efficient path.

You can switch models per-session by setting the variable before launching Claude Code, or mid-project by restarting Claude Code with the new variable set. No account changes or API key rotation needed.
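If you switch often, a small wrapper can set the model for a single launch without touching the rest of your shell. This is a sketch, not part of Sota or Claude Code: claude_with_model and CLAUDE_BIN are hypothetical names, and CLAUDE_BIN exists only so the command being launched can be swapped out.

```shell
# Hypothetical helper: launch Claude Code with a model chosen for this run only.
# Usage: claude_with_model <model-name> [extra arguments passed to claude]
claude_with_model() {
  model="$1"
  shift
  # env sets ANTHROPIC_MODEL for the child process only, so the parent
  # shell's value (or lack of one) is left unchanged.
  env ANTHROPIC_MODEL="$model" "${CLAUDE_BIN:-claude}" "$@"
}
```

Running claude_with_model kimi-k2.7-code starts a Kimi K2.7 Code session. A plain claude in the same terminal afterwards still uses whatever ANTHROPIC_MODEL was already set, or GLM-5.2 if none was.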

For a deeper look at how GLM-5.2 compares with Claude Sonnet 4.6, see our GLM-5.2 vs Claude Sonnet 4.6 comparison.

Troubleshooting

Claude Code says "invalid API key". Double-check that the key starts with sk-sota- and that you copied the full string. Sota keys don't have trailing spaces, but copying and pasting can introduce them; run echo "[$ANTHROPIC_API_KEY]" to make stray whitespace visible.
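To run the same checks without printing the key to your terminal, a sketch like this may be useful. check_key is a hypothetical helper; it inspects only the prefix and whitespace described above and cannot tell whether Sota will actually accept the key.

```shell
# Hypothetical helper: report obvious formatting problems without echoing the key.
check_key() {
  key="${1-}"
  case "$key" in
    "")             echo "empty"; return 1 ;;
    *[[:space:]]*)  echo "contains whitespace"; return 1 ;;
    sk-sota-*)      echo "looks ok (${#key} characters)"; return 0 ;;
    *)              echo "missing sk-sota- prefix"; return 1 ;;
  esac
}

check_key "${ANTHROPIC_API_KEY-}" || echo "Fix the value and export it again."
```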

Requests are going to Anthropic instead of Sota. Confirm ANTHROPIC_BASE_URL is set in the same shell session where you launched Claude Code. If you added it to your shell profile, make sure you sourced the file (source ~/.zshrc) or opened a new terminal.
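A variable that is assigned but not exported is visible in your shell yet never reaches programs started from it, which produces exactly this symptom. The sketch below asks a child process what it inherits; inherited_base_url is a hypothetical helper name.

```shell
# Hypothetical helper: print the base URL a child process (such as Claude Code)
# would inherit from this shell. Prints nothing if the variable is not exported.
inherited_base_url() {
  sh -c 'printf "%s" "${ANTHROPIC_BASE_URL:-}"'
}

url="$(inherited_base_url)"
if [ -n "$url" ]; then
  echo "Child processes will see: $url"
else
  echo "ANTHROPIC_BASE_URL is not exported in this shell"
fi
```

If the second message appears even though echo $ANTHROPIC_BASE_URL prints a value, the assignment is missing the export keyword.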

Responses look truncated. Sota normalizes each model's native response format into the shape Claude Code expects, and a mismatch at that step can surface as cut-off replies. If you're seeing them, make sure ANTHROPIC_BASE_URL points at Sota's current API endpoint and that your Claude Code version is up to date.

Model isn't switching. If you set ANTHROPIC_MODEL but the behavior hasn't changed, restart Claude Code. Environment variables are read at startup; changing them in a running session won't take effect.

Rate limits or unexpected cost. Sota's plans include per-user spend ceilings. If you're hitting limits earlier than expected, check your plan's daily and weekly allowances in the dashboard. Upgrading to Pro ($125/month) raises those ceilings significantly.

Why route through Sota

The practical reasons for routing Claude Code through Sota rather than using native provider APIs:

Western inference. GLM-5.2's native API and Kimi K2.7 Code's native API both run inference in China. Sota routes both through Cloudflare's network (New York, London, Frankfurt, Tokyo, and Australian infrastructure), so your code doesn't leave Western infrastructure. For teams with data residency requirements, that is often a hard requirement.

Predictable billing. Native token-based billing for high-volume Claude Code use accumulates fast and varies with each session. Sota's flat monthly pricing with per-user ceilings makes the cost predictable enough to budget and explain to finance.

One API for multiple models. Sota lets you switch between GLM-5.2 and Kimi K2.7 Code by changing one environment variable, with no separate accounts, separate billing relationships, or separate API key management.

OpenAI compatibility. Sota's API works with any tool that speaks OpenAI's request format. Claude Code works. Your own scripts calling the API work. Nothing needs to be rewritten.

For a full cost comparison between native Claude billing and open models via Sota, see The Real Cost of Claude Code. For alternatives if you're evaluating other approaches, see Claude Code Alternatives.

Get started with Sota and have GLM-5.2 running in Claude Code in under five minutes.