OpenAI API Integration Tutorial | 2026 Python SDK Complete Guide from Scratch
3 Lines of Python to Call GPT
Many people think integrating AI APIs is complex.
It's not.
OpenAI's Python SDK is designed to be very clean. Install the package, set the key, call the API -- three steps and you're done. This tutorial walks you through the entire process step by step, with copy-paste-ready code at every stage.
Whether you're a first-time AI API user or a developer migrating from another platform, this guide is for you.
Want to get OpenAI API? Purchase through CloudInsight -- no credit card issues, enterprise discounts and invoices included.

TL;DR
Install the openai package -> Set API Key environment variable -> Call client.chat.completions.create() and you're done. This tutorial covers text generation, multi-turn conversations, Streaming, image analysis, and Function Calling, with complete runnable code examples.
Environment Setup & OpenAI SDK Installation
Answer-First: You need Python 3.8+ and pip, plus a single pip install openai to get started.
Installation Steps
# Recommended: use a virtual environment
python -m venv openai-env
source openai-env/bin/activate # macOS / Linux
# openai-env\Scripts\activate # Windows
# Install OpenAI SDK
pip install openai
Verify installation:
python -c "import openai; print(openai.__version__)"
System Requirements
| Item | Requirement |
|---|---|
| Python | 3.8 or above (3.11+ recommended) |
| openai package | Latest 1.x version |
| OS | Windows / macOS / Linux |
| Network | Must be able to connect to api.openai.com |
Obtaining API Key & Security Configuration
Answer-First: Create an API Key at platform.openai.com, store it in environment variables, and never hardcode it in your source code.
Create an API Key
- Log in to platform.openai.com
- Click "API Keys" in the left sidebar
- Click "Create new secret key"
- Copy the generated key
For complete account registration steps, refer to OpenAI API Registration Complete Tutorial.
Securely Store the API Key
# macOS / Linux - add to ~/.bashrc or ~/.zshrc
export OPENAI_API_KEY="sk-your-key-here"
# Windows PowerShell
$env:OPENAI_API_KEY="sk-your-key-here"
In Python, the SDK automatically reads the OPENAI_API_KEY environment variable:
from openai import OpenAI
# Automatically reads API Key from environment variable
client = OpenAI()
# Or specify manually
client = OpenAI(api_key="sk-your-key-here") # Not recommended
Security reminders:
- Never commit API Keys to Git
- If using .env files, add them to .gitignore
- For production, use a Secret Manager (e.g., AWS Secrets Manager, GCP Secret Manager)
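A minimal fail-fast pattern for reading the key looks like this (the helper name is illustrative; if you use a .env file, the python-dotenv package can load it into the environment first):

```python
import os

def get_api_key():
    """Read the OpenAI key from the environment, failing fast if it's missing."""
    key = os.environ.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; export it before running")
    return key
```

Failing fast at startup beats a confusing authentication error deep inside your first API call.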
For more complete API Key security management practices, refer to API Key Management & Security Best Practices.
Text Generation: Your First API Call
Answer-First: Use client.chat.completions.create() with a model name and messages array to get AI text responses.
The Simplest Call
from openai import OpenAI
client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": "Explain what an API is in one sentence"}
    ]
)
print(response.choices[0].message.content)
That's it -- just a few lines of effective code.
Using System Prompt to Control AI Behavior
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a senior software engineer. Answer in a casual yet professional tone."},
        {"role": "user", "content": "What is a REST API?"}
    ],
    temperature=0.7,
    max_tokens=500
)
The system role message sets the AI's behavior pattern. A good System Prompt can significantly improve response quality.
Parameter Tuning Guide
| Parameter | Description | Recommended Value |
|---|---|---|
| temperature | Creativity level (0-2) | Translation 0.1, Q&A 0.7, Creative 1.2 |
| max_tokens | Maximum output length | Set based on needs; smaller saves money |
| top_p | Sampling range | Usually adjust either this or temperature, not both |
| frequency_penalty | Avoid repetition (-2 to 2) | 0.3-0.5 reduces repetition |
Multi-Turn Conversations
messages = [
    {"role": "system", "content": "You are a friendly assistant"},
    {"role": "user", "content": "Which is better for beginners, Python or JavaScript?"},
]
response = client.chat.completions.create(model="gpt-4o", messages=messages)
assistant_reply = response.choices[0].message.content
print(assistant_reply)
# Continue the conversation: add the AI's response back
messages.append({"role": "assistant", "content": assistant_reply})
messages.append({"role": "user", "content": "What if I want to build websites?"})
response = client.chat.completions.create(model="gpt-4o", messages=messages)
print(response.choices[0].message.content)
The key to multi-turn conversations: pass the complete conversation history with each call. This is how the AI understands context.
But be careful -- the longer the conversation, the more tokens consumed. When you exceed the Context Window limit, you'll need to truncate earlier messages.
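A simple truncation strategy is to keep the system prompt and only the most recent turns. This sketch counts messages rather than tokens; a token-accurate version could measure each message with a tokenizer such as tiktoken:

```python
def truncate_history(messages, keep_last=10):
    """Keep the system prompt plus only the most recent conversation turns."""
    system = [m for m in messages if m["role"] == "system"]
    recent = [m for m in messages if m["role"] != "system"][-keep_last:]
    return system + recent
```

Call this before each API request once conversations grow long; the system message survives truncation so the AI's behavior stays consistent.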
Purchase OpenAI API through CloudInsight for exclusive enterprise discounts and invoices. Learn about enterprise plans

Streaming Responses: Display AI Replies in Real Time
Answer-First: Add the stream=True parameter to display AI responses character by character in real time, significantly improving user experience.
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": "Recommend 5 must-visit night markets in Taiwan"}
    ],
    stream=True
)
for chunk in stream:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="", flush=True)
Streaming is particularly useful in these scenarios:
- Chatbots: Users don't have to stare at a blank screen
- Long responses: Users can start reading early during long text generation
- Real-time feel: Makes AI responses feel more like human conversation
Downside: In streaming mode, usage info (token consumption) isn't included by default. Set stream_options={"include_usage": True} and the usage arrives in the final chunk.
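Here is a sketch of collecting usage while streaming, wrapped as a helper function (the function name is illustrative, and an initialized client is assumed). Note that with include_usage enabled, the final chunk carries the usage object and an empty choices list, so the loop must guard against that:

```python
def stream_with_usage(client, messages, model="gpt-4o"):
    """Stream a reply, printing deltas; return the usage object from the final chunk."""
    stream = client.chat.completions.create(
        model=model,
        messages=messages,
        stream=True,
        stream_options={"include_usage": True},
    )
    usage = None
    for chunk in stream:
        # The final usage-bearing chunk has an empty choices list
        if chunk.choices and chunk.choices[0].delta.content is not None:
            print(chunk.choices[0].delta.content, end="", flush=True)
        if chunk.usage is not None:
            usage = chunk.usage
    return usage
```

After the stream finishes, usage.total_tokens gives you the consumption for billing or logging.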
Image Analysis: Vision API Usage
Answer-First: GPT-4o and GPT-5 support image input. Simply include an image URL or base64 encoding in the messages to let AI understand image content.
Sending Images via URL
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/photo.jpg"}
                }
            ]
        }
    ]
)
Sending Local Images via Base64
import base64

with open("receipt.jpg", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Please identify the amount on this receipt"},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{image_data}"}
                }
            ]
        }
    ]
)
Practical use cases:
- Receipt/invoice OCR recognition
- Product image classification
- UI screenshot analysis
- Chart data extraction
Note: Image analysis consumes significantly more tokens. One image is roughly equivalent to 85-1,700 tokens, depending on resolution.
Function Calling: Let AI Use Tools
Answer-First: Function Calling lets you define a list of functions. The AI determines when to call which function with the correct parameters -- ideal for building AI Agents that can query data and operate systems.
import json

# Define tools
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get real-time weather information for a specified city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {
                        "type": "string",
                        "description": "City name, e.g., Tokyo, New York"
                    }
                },
                "required": ["city"]
            }
        }
    }
]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What's the weather like in Tokyo today?"}],
    tools=tools
)

# Check if the AI wants to call a function
message = response.choices[0].message
if message.tool_calls:
    tool_call = message.tool_calls[0]
    function_name = tool_call.function.name
    arguments = json.loads(tool_call.function.arguments)
    print(f"AI wants to call: {function_name}({arguments})")
Function Calling workflow:
- You define a list of available functions
- User asks a question
- AI determines whether a function call is needed
- If needed, AI returns the function name and parameters
- You execute the function locally and get results
- You send results back to the AI, which responds to the user in natural language
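Steps 5 and 6 can be sketched as follows, assuming the assistant message from the previous example and a local get_weather implementation (the helper names and weather values here are hypothetical stand-ins):

```python
import json

def get_weather(city):
    """Stand-in for a real weather lookup; returns hypothetical values."""
    return {"city": city, "temp_c": 18, "condition": "cloudy"}

def answer_tool_calls(messages, assistant_message):
    """Execute each requested tool and append the results as 'tool' messages."""
    messages.append(assistant_message)
    for tool_call in assistant_message.tool_calls:
        args = json.loads(tool_call.function.arguments)
        result = get_weather(**args)
        messages.append({
            "role": "tool",
            "tool_call_id": tool_call.id,
            "content": json.dumps(result),
        })
    return messages
```

Sending the extended messages list back through client.chat.completions.create(..., tools=tools) then produces the natural-language answer of step 6.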
Error Handling & Best Practices
Answer-First: Production environments must handle common errors like Rate Limit (429), Timeout, and Invalid Request, with exponential backoff retry mechanisms for stability.
Complete Error Handling Example
from openai import OpenAI, APIError, RateLimitError, APIConnectionError
import time
client = OpenAI()
def call_openai_with_retry(messages, max_retries=3):
    for attempt in range(max_retries):
        try:
            response = client.chat.completions.create(
                model="gpt-4o",
                messages=messages
            )
            return response.choices[0].message.content
        except RateLimitError:
            wait_time = 2 ** attempt  # 1, 2, 4 seconds
            print(f"Rate limit exceeded, retrying in {wait_time} seconds...")
            time.sleep(wait_time)
        except APIConnectionError:
            print("Connection failed, retrying in 2 seconds...")
            time.sleep(2)
        except APIError as e:
            print(f"API error: {e}")
            raise
    raise Exception("Max retries exceeded")
Common Errors
| Error | Cause | Solution |
|---|---|---|
| RateLimitError (429) | Too many requests | Exponential backoff retry |
| AuthenticationError (401) | Invalid API Key | Verify key is correct |
| BadRequestError (400) | Malformed request | Check messages format and parameters |
| APIConnectionError | Network issues | Check network, retry later |
| InternalServerError (500) | OpenAI server issue | Wait a few seconds and retry |

Next Steps: From Examples to Products
You've learned the basics of OpenAI API Python integration.
What's next?
- Try different models: Use GPT-4o-mini for simple tasks to save money, GPT-4o or GPT-5 for complex tasks
- Learn the Assistants API: Build more complete AI assistants
- Understand cost control: Set Budget Limits, monitor token usage
- Build RAG systems: Combine with Embeddings API for knowledge base Q&A
For the complete features and enterprise plans of OpenAI API, check out GPT-5 & OpenAI API Complete Guide.
If you're curious about GPT-5's capabilities and pricing, What Is GPT-5? Latest Features & Tutorial has a more detailed analysis. If you're also interested in Google's AI API, Gemini API Complete Developer Guide is a great comparison reference. For a pricing summary across providers, check out AI API Pricing Comparison Guide.
Need enterprise-grade OpenAI API plans? CloudInsight offers bulk token purchasing discounts, invoices, and technical support. Get a quote for enterprise plans, or join our LINE official account for instant technical support.
FAQ
Q1: What's the difference between OpenAI API and Azure OpenAI? Which should enterprises pick?
They serve the same models but differ in contract, deployment, and data protection.
- OpenAI API (api.openai.com): direct contract with OpenAI; models update immediately (new models typically launch here first); transparent pricing; simple management. Fits startups and SaaS. The trade-off: data protection follows OpenAI's standard terms, with relatively fewer compliance certifications (SOC 2 Type 2 yes, but some large enterprises require more).
- Azure OpenAI: rent OpenAI models through Microsoft Azure; enterprise-grade contracts, optional data residency, HIPAA / FedRAMP High certifications, Microsoft 365 ecosystem integration, Azure AD / IAM, and Azure policy management. Cons: new models arrive 2-8 weeks later than on OpenAI; access requires an application (especially for GPT-4o, o1); pricing matches OpenAI but is more complex; the UI is enterprise-style and less usable than OpenAI direct.
How to choose:
- Personal / startup -> OpenAI API
- Microsoft enterprise customer (already using Azure AD, Office 365) -> Azure OpenAI
- Finance / healthcare / government with strong compliance needs -> Azure OpenAI (broader FedRAMP / HIPAA coverage)
- Want the latest models first -> OpenAI API
- Data sovereignty requirements -> Azure OpenAI (can specify the region)
Q2: With so many OpenAI models (GPT-4o, o1, o3, GPT-4.5, GPT-4 Turbo...), which to pick?
2025 rules of thumb:
- Daily tasks, first choice: GPT-4o or GPT-4o mini -- fast, cheap, multimodal, fits 90% of use cases. GPT-4o costs $2.50/$10.00 per 1M tokens (input/output), mini $0.15/$0.60 per 1M.
- Need reasoning: o1 / o1-mini / o3 -- complex reasoning, math, hard coding. o1 is expensive but noticeably effective ($15/$60 per 1M tokens).
- Extremely tight budget: GPT-3.5 Turbo or GPT-4o mini -- lower quality but cheap.
- Special needs: Realtime API for low-latency voice conversation; Whisper for speech-to-text; TTS for text-to-speech; DALL-E 3 for image generation; Embeddings (text-embedding-3-small/large) for RAG.
When to use o1 / o3: math proofs, complex coding (not boilerplate writing), multi-step planning -- whenever correctness matters more than speed and cost.
When to use GPT-4o: chatbots, content generation, summarization, translation; when you need speed (o1 is slow, 10-30 seconds on average); multimodal input (vision, audio).
Don't use GPT-4 Turbo -- it has been superseded by GPT-4o across the board. Dev recommendation: prototype with GPT-4o, then A/B test in production to decide whether to stick with it or upgrade to o1.
Q3: What's the difference between OpenAI's "Assistants API" and "Chat Completions API"? Which to use?
Assistants API is a stateful abstraction; Chat Completions is the stateless raw API.
- Chat Completions API (/v1/chat/completions): you send the complete message history with each request; stateless; you manage conversation memory yourself. Core features: tools (function calling), JSON mode, streaming. Fits cases where you want full control of the flow and have your own session management.
- Assistants API (/v1/assistants): OpenAI manages conversation threads for you; built-in tools include Code Interpreter (runs Python), File Search (RAG), and Function Calling; you can upload files and the assistant references them automatically. Fits chatbots, ChatGPT-like products, and anything needing persistent memory.
Comparison: simplicity favors Assistants (no history management); flexibility favors Chat Completions (you control everything); cost favors Chat Completions (Assistants' thread storage and file search all incur extra charges); debugging favors Chat Completions (pure input/output), while Assistants' abstraction layers make debugging harder.
How to choose:
- Beginner, or building a chatbot quickly -> Assistants API
- Need maximum control and cost optimization -> Chat Completions
- Need Code Interpreter (runs Python) or File Search (RAG) -> Assistants (DIY takes much longer)
- Production systems with complex flows -> Chat Completions (more work but controllable)
Note: the Assistants API is still in Beta, with possible major changes in 2025 -- use cautiously in production.
Q4: How to control OpenAI API costs? Will the monthly bill explode?
Yes, it can — but with techniques to prevent. Common explosion causes: (1) Loop calls — bug causing infinite API calls, burning $1,000 in one afternoon; (2) Context too long — conversation accumulating to 20,000 tokens, $0.05 per round; (3) Wrong model — using o1 for simple tasks (o1 is 10x more expensive); (4) Batch processing not using Batch API — missing 50% discount; (5) No per-user quotas — one user spamming consumes the whole month's budget. Control strategies (simple to advanced): (A) Set Hard Limit — OpenAI dashboard can set monthly spending cap, blocks on exceed; (B) Usage alerts — 50% / 80% threshold email alerts; (C) Monitor tokens per request — log each call's input/output tokens, investigate anomalies; (D) Caching — cache identical prompts in Redis for 30 minutes; (E) Cheaper model fallback — try GPT-4o mini first, upgrade to GPT-4o only if quality insufficient; (F) Truncate history — beyond N rounds, keep only last 10 + summary; (G) Batch API — non-real-time tasks in batch, 50% off; (H) Prompt engineering saves tokens — clear concise prompts save 30%+ tokens. Case study: customer's monthly bill from $5,000 → $800 after optimization: (1) GPT-4 Turbo → GPT-4o mini ($3,000 → $200); (2) Added Redis cache (40% savings); (3) Non-real-time tasks switched to Batch API (another 50% off). At $500+/month, take cost management seriously.
Q5: For OpenAI API in production, how to ensure reliability?
Multi-layer redundancy plus good error handling. OpenAI API actual availability: last year's effective SLA was roughly 99.5-99.8%, with some extended outages (a 3-hour outage in June 2024, a 2-hour one in November 2024). Availability strategies:
- Retry with exponential backoff: 429 rate limits and 5xx errors all need retries, at least 3 attempts
- Multi-provider fallback: primary on OpenAI, fall back to Anthropic Claude or Google Gemini; tools like OpenRouter or LiteLLM expose one API with multi-provider switching
- Status page monitoring: subscribe to status.openai.com for instant incident notifications
- Circuit breaker pattern: after 5 consecutive failures, pause for 5 minutes to avoid cascading failures
- Client-side timeout: don't wait indefinitely; a 30-second timeout is reasonable
- Graceful degradation: when OpenAI is down, show users "AI temporarily unavailable, please try again later" instead of a blank screen
- Cache last-known-good: cache results of common queries so basic responses remain available when the API is down
Architectural practice: wrap LLM calls in a single wrapper function with unified retry / fallback / logging; use OpenTelemetry to track each call's latency and error rate; alert when the error rate exceeds 5%; and reserve a manual switch so the dashboard can flip providers in an emergency. Azure OpenAI's advantages: a 99.9% SLA, technical support, and automatic failover across regions -- pick it if you need high availability.
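The circuit breaker pattern mentioned above can be sketched like this (a minimal in-process version; the class name and thresholds are illustrative, matching the 5-failures / 5-minutes suggestion):

```python
import time

class CircuitBreaker:
    """Pause outbound calls after repeated failures to avoid cascading errors."""

    def __init__(self, failure_threshold=5, cooldown_seconds=300):
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self.failures = 0
        self.opened_at = None

    def allow(self):
        """Return True if a request may be attempted right now."""
        if self.opened_at is None:
            return True
        if time.time() - self.opened_at >= self.cooldown_seconds:
            # Half-open: the cooldown elapsed, let one request through
            self.opened_at = None
            self.failures = 0
            return True
        return False

    def record_success(self):
        self.failures = 0

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = time.time()
```

Wrap each API call with allow() before calling and record_success()/record_failure() afterwards; when the breaker is open, return a cached or degraded response instead.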
References
- OpenAI -- Python SDK Documentation (https://platform.openai.com/docs/libraries/python-library)
- OpenAI -- Chat Completions API Reference (https://platform.openai.com/docs/api-reference/chat)
- OpenAI -- Vision Guide (https://platform.openai.com/docs/guides/vision)
- OpenAI Cookbook -- GitHub (https://github.com/openai/openai-cookbook)