How to verify and use your Google Gemini API key — the simplest LLM key to start with
Of the big three, Google Gemini has the friendliest on-ramp: an API key from Google AI
Studio, not the full Google Cloud OAuth dance. But its request shape is the odd one out
— the model lives in the URL, not the body — and there’s a 2026 rule that quietly 403s
brand-new keys. Here’s how to verify a Gemini key and send your first request cleanly.
I built a free tool, API Studio, to make this a two-second job: paste your key, click Verify, done. The key is saved only in your browser; running a request relays it once through a stateless proxy that stores nothing and logs nothing.
Get the key from AI Studio, not Cloud Console
The simple key — the one that starts with AIza — comes from Google AI Studio,
not the heavier Google Cloud project flow. (That simpler endpoint,
generativelanguage.googleapis.com, is the Gemini Developer API, distinct from
Vertex AI, which does use full GCP auth.) Create one, and you can call the API
immediately on the free tier.
Verify the key: list the models
curl https://generativelanguage.googleapis.com/v1beta/models \
-H "x-goog-api-key: $GEMINI_API_KEY"
A 200 lists every model your key can call (and which methods each supports, like
generateContent). A 400/403 means the key is bad — or unrestricted (see below).
In API Studio this is the Verify key button.
Put the key in the x-goog-api-key header, not the ?key= query string. Both work,
but the header keeps your key out of URLs, server logs, and referrer headers — a small
habit that saves you from leaking a key into a log file later.
The 2026 gotcha: restrict your key
As of mid-2026, Google can reject requests from fully-unrestricted standard API
keys. If verify comes back 403 on a key you know is real, this is the most likely
reason: open the key in AI Studio and add an API restriction (limit it to the Generative
Language API, and ideally an app/referrer). Then it works.
Sending your first request
Gemini’s shape is distinct from the OpenAI/Anthropic one. The model goes in the URL
path as …/models/<model>:generateContent, and the prompt is nested under
contents[].parts[]:
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"contents": [{ "parts": [{ "text": "In one short sentence, say hello." }] }],
"generationConfig": { "maxOutputTokens": 64 }
}'
A few shape notes that catch people:
- The reply text is at
candidates[0].content.parts[0].text— notchoicesorcontentblocks. - The output cap is
generationConfig.maxOutputTokens, notmax_tokens. - The system prompt is a top-level
systemInstructionobject, not asystemmessage. - Streaming uses a different verb:
:streamGenerateContent(add?alt=ssefor real Server-Sent-Events).
API Studio’s Generate content preset fills all of this in against a cheap model with a small cap, so a test costs almost nothing on your own account.
Which model should I use?
As of mid-2026, the current line-up:
| Model | Good for |
|---|---|
gemini-2.5-flash-lite | the cheapest, fastest probe |
gemini-2.5-flash | the everyday default |
gemini-2.5-pro | the heavier reasoning work |
gemini-3.5-flash | the newest fast model |
When in doubt, list the models with the verify call and pick from what your key actually returns.
One handy detail
If you already have OpenAI-shaped code, Google also exposes an OpenAI-compatible
endpoint at …/v1beta/openai/ (Bearer auth, /chat/completions semantics). It’s a
quick way to point an existing client at Gemini without rewriting the request shape —
though the native generateContent API gives you the full feature set.
Why a tool instead of curl?
A browser can’t call generativelanguage.googleapis.com directly (CORS), so API Studio
relays your request once through a stateless hop that keeps nothing — and if you’d
rather trust no one, it hands you the exact cURL, Node, or Python so you can run it
yourself and skip our servers entirely.
Verify your Gemini key in API Studio →
Getting the key working is the easy part. The real work is the integration behind it — the pipeline that moves your data between your tools and keeps itself running. That’s the day job; if you’d like a hand, send me a brief.