Calling OpenAI from Node.js
TL;DR: npm install openai. TypeScript-first; matches the official SDK. Validate JSON with Zod. Retry with p-retry. Stream via EventSource parser or async iterators (next post). Same production concerns as Python: timeout, cost, rate limit, observability.
After Python, Node.js. Same shape, different ergonomics. Node SDK is 3.1.x as of Jan 2023.
Setup
npm install openai zod p-retry p-limit tiktoken
import { Configuration, OpenAIApi } from 'openai';

const config = new Configuration({
  apiKey: process.env.OPENAI_API_KEY,
});
const openai = new OpenAIApi(config);
Basic call
export async function complete(
  prompt: string,
  options: { maxTokens?: number; temperature?: number } = {}
): Promise<string> {
  const response = await openai.createCompletion({
    model: 'text-davinci-003',
    prompt,
    max_tokens: options.maxTokens ?? 300,
    temperature: options.temperature ?? 0,
  });
  return response.data.choices[0].text?.trim() ?? '';
}
JSON validation with Zod
import { z } from 'zod';

const TicketClass = z.object({
  category: z.enum(['billing', 'technical', 'account', 'other']),
  priority: z.enum(['low', 'medium', 'high']),
  confidence: z.number().min(0).max(1),
});
type TicketClass = z.infer<typeof TicketClass>;

export async function classifyTicket(text: string): Promise<TicketClass> {
  const prompt = `
Classify this support ticket as JSON.
Schema:
{
"category": "billing|technical|account|other",
"priority": "low|medium|high",
"confidence": 0.0 to 1.0
}
Ticket: ${text}
JSON:
`.trim();
  const raw = await complete(prompt, { maxTokens: 100 });
  const parsed = JSON.parse(raw);
  return TicketClass.parse(parsed);
}
Zod is the TypeScript equivalent of Pydantic. Compile-time types + runtime validation in one shape.
If the model returns invalid JSON, JSON.parse() throws; if the shape or an enum value is wrong, TicketClass.parse() throws. Catch both, then retry or fall back.
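One way to make the catch-and-fallback step concrete is a small wrapper that never throws. This is a minimal sketch, not part of any library: safeParseJson and ParseResult are hypothetical names, and the validate argument stands in for any throwing validator such as (v) => TicketClass.parse(v).

```typescript
// Outcome of parsing and validating model output, without throwing.
type ParseResult<T> =
  | { ok: true; value: T }
  | { ok: false; error: string };

// `validate` is any function that throws on bad input,
// for example (v) => TicketClass.parse(v).
function safeParseJson<T>(raw: string, validate: (value: unknown) => T): ParseResult<T> {
  try {
    // Models sometimes wrap the JSON in extra text; keep only the outermost {...}.
    const start = raw.indexOf('{');
    const end = raw.lastIndexOf('}');
    if (start === -1 || end <= start) {
      return { ok: false, error: 'no JSON object found' };
    }
    const parsed: unknown = JSON.parse(raw.slice(start, end + 1));
    return { ok: true, value: validate(parsed) };
  } catch (err) {
    // Covers both JSON syntax errors and validation errors.
    const message = err instanceof Error ? err.message : String(err);
    return { ok: false, error: message };
  }
}
```

The caller can then branch on result.ok and decide whether to retry the completion or return a default such as category "other". Zod's own safeParse() covers the validation half of this if you only need that part.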
Retries with p-retry
import pRetry from 'p-retry';

export async function completeWithRetry(
  prompt: string,
  opts: { maxTokens?: number; temperature?: number } = {}
): Promise<string> {
  return pRetry(() => complete(prompt, opts), {
    retries: 5,
    minTimeout: 2000,
    maxTimeout: 30000,
    onFailedAttempt: (err) => {
      console.warn(`OpenAI attempt ${err.attemptNumber} failed: ${err.message}`);
    },
  });
}
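p-retry handles exponential backoff. As written, this retries every failure, including 4xx errors. Don’t retry on most 4xx errors (the request itself is wrong); do retry on 429 rate limits, 5xx, and timeouts.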
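To see what the options above (retries: 5, minTimeout: 2000, maxTimeout: 30000) mean in wall-clock terms, here is a rough sketch of the wait schedule. It assumes the delay doubles on each attempt with no jitter; check the p-retry docs for the defaults in your version. backoffDelays is a hypothetical helper, not a p-retry function.

```typescript
// Delay before each retry, assuming exponential growth by `factor` and no jitter.
function backoffDelays(
  retries: number,
  minTimeout: number,
  maxTimeout: number,
  factor = 2
): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < retries; attempt++) {
    // Grow from minTimeout, but never wait longer than maxTimeout.
    delays.push(Math.min(minTimeout * Math.pow(factor, attempt), maxTimeout));
  }
  return delays;
}

const schedule = backoffDelays(5, 2000, 30000);
// schedule is [2000, 4000, 8000, 16000, 30000]

const worstCaseWaitMs = schedule.reduce((sum, d) => sum + d, 0);
// worstCaseWaitMs is 60000: a full minute of waiting before giving up
```

A minute of backoff on top of request time is long for a user-facing endpoint, so it is worth choosing these numbers per call site rather than globally.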
For more control:
import pRetry, { AbortError } from 'p-retry';

await pRetry(async () => {
  try {
    return await complete(prompt);
  } catch (err: any) {
    if (err.response?.status === 400) {
      throw new AbortError(err); // bad request: don't retry
    }
    throw err;
  }
}, { retries: 5 });
AbortError tells p-retry to stop.
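The status check can be pulled into one place so the retry policy is easy to read and test. A minimal sketch, assuming the error exposes the HTTP status as err.response?.status the way the snippet above does; isRetryable is a hypothetical helper name.

```typescript
// Decide from an HTTP status whether another attempt makes sense.
// `status` is undefined when no response arrived (timeout, connection reset).
function isRetryable(status: number | undefined): boolean {
  if (status === undefined) return true; // network error or timeout
  if (status === 429) return true;       // rate limited: back off and try again
  if (status >= 500) return true;        // server-side failure
  return false;                          // other 4xx: fix the request instead
}

// Inside the catch block above, the check would become:
//   if (!isRetryable(err.response?.status)) throw new AbortError(err);
//   throw err;
```

Treating 429 as retryable matters here: it is a 4xx status, but it is exactly the case backoff exists for.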
Concurrency control
import pLimit from 'p-limit';

const limit = pLimit(10); // 10 concurrent at most

export async function batchClassify(tickets: string[]): Promise<TicketClass[]> {
  return Promise.all(tickets.map(t => limit(() => classifyTicket(t))));
}
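p-limit caps in-flight requests. That helps stay under OpenAI’s rate limit, but a concurrency cap is not a rate cap: fast responses can still exceed a requests-per-minute quota.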
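A back-of-the-envelope way to pick the concurrency number is Little's law: steady-state request rate is roughly concurrency divided by average latency. The sketch below is an estimate under that assumption, with hypothetical helper names and made-up example figures; real quotas and latencies vary by account and model.

```typescript
// Rough steady-state request rate for a given concurrency cap.
function requestsPerMinute(concurrency: number, avgLatencyMs: number): number {
  return (concurrency * 60_000) / avgLatencyMs;
}

// Largest concurrency cap that keeps the estimated rate at or under rpmLimit.
// Always allows at least one request in flight.
function maxConcurrency(rpmLimit: number, avgLatencyMs: number): number {
  return Math.max(1, Math.floor((rpmLimit * avgLatencyMs) / 60_000));
}

// Example figures only: 10 in flight at 2 s per call is about 300 requests/minute.
const estimatedRpm = requestsPerMinute(10, 2000);

// Example figures only: a 60 requests/minute quota at 2 s per call allows 2 in flight.
const safeConcurrency = maxConcurrency(60, 2000);
```

If the estimate lands above your quota, lower the pLimit value or rely on the 429 handling in the retry layer to absorb the overflow.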
Token counting
import { encoding_for_model } from 'tiktoken';

const enc = encoding_for_model('text-davinci-003');

export function tokenCount(text: string): number {
  return enc.encode(text).length;
}

export function estimateCost(prompt: string, maxResponseTokens: number): number {
  const total = tokenCount(prompt) + maxResponseTokens;
  return total * 0.02 / 1000;
}
Same pattern as Python.
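Token counts are also what you need to keep user input inside the 4097-token context limit. Here is a minimal truncation sketch. The helper names are hypothetical, and the counter is passed in as a parameter so the block stands alone; in practice you would pass tokenCount from above. It assumes the token count never shrinks as the text gets longer, which holds closely enough for a safety cut.

```typescript
// Tokens left for the prompt once the response budget is reserved.
function promptBudget(maxResponseTokens: number, contextLimit = 4097): number {
  return Math.max(0, contextLimit - maxResponseTokens);
}

// Longest prefix of `text` whose token count fits `budget`.
// Uses binary search on the prefix length, so it calls countTokens
// about log2(text.length) times instead of once per character.
function truncateToBudget(
  text: string,
  budget: number,
  countTokens: (s: string) => number
): string {
  if (countTokens(text) <= budget) return text;
  let lo = 0;           // longest prefix length known to fit
  let hi = text.length; // shortest prefix length known not to fit
  while (hi - lo > 1) {
    const mid = Math.floor((lo + hi) / 2);
    if (countTokens(text.slice(0, mid)) <= budget) {
      lo = mid;
    } else {
      hi = mid;
    }
  }
  return text.slice(0, lo);
}

// Stand-in counter for illustration: one token per whitespace-separated word.
const countWords = (s: string): number => s.split(/\s+/).filter(Boolean).length;

const shortened = truncateToBudget('a b c d e', 3, countWords);
// shortened is 'a b c ' (three words; the trailing space is kept)
```

Remember to budget for the fixed part of the prompt too: the instructions and schema consume tokens before the ticket text does.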
Express middleware example
For a typical Express endpoint:
import express from 'express';

const app = express();
app.use(express.json());

app.post('/classify-ticket', async (req, res) => {
  const { text } = req.body;
  if (typeof text !== 'string' || text.length === 0 || text.length > 5000) {
    return res.status(400).json({ error: 'invalid text' });
  }
  try {
    const result = await classifyTicket(text);
    res.json(result);
  } catch (err: any) {
    console.error('classify failed', err);
    res.status(500).json({ error: 'classification failed' });
  }
});
Three guards:
- Input length cap (bounds token cost and keeps the prompt inside the context limit)
- Type validation
- Error handling with no internal-error leak
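The first two guards can live in a plain function, which makes them testable without spinning up Express. A minimal sketch; checkTicketBody and BodyCheck are hypothetical names, and the 5000-character cap is the one used in the handler above.

```typescript
type BodyCheck =
  | { ok: true; text: string }
  | { ok: false; error: string };

const MAX_TEXT_LENGTH = 5000; // same cap as the handler above

// Validate an untrusted request body before it reaches the model.
function checkTicketBody(body: unknown): BodyCheck {
  if (typeof body !== 'object' || body === null) {
    return { ok: false, error: 'invalid body' };
  }
  const text = (body as { text?: unknown }).text;
  if (typeof text !== 'string') return { ok: false, error: 'text must be a string' };
  if (text.length === 0) return { ok: false, error: 'text is empty' };
  if (text.length > MAX_TEXT_LENGTH) return { ok: false, error: 'text too long' };
  return { ok: true, text };
}
```

In the handler, a failed check maps to a 400 response with check.error, and a passing one hands check.text to classifyTicket.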
Streaming preview
The OpenAI Node SDK 3.x supports streaming via Server-Sent Events. Full pattern covered Friday in streaming responses, but quickly:
const response = await openai.createCompletion({
  model: 'text-davinci-003',
  prompt,
  stream: true,
}, { responseType: 'stream' });

// Note: this assumes every chunk contains only whole lines.
// A chunk that ends mid-line will make JSON.parse throw.
(response.data as any).on('data', (chunk: Buffer) => {
  const lines = chunk.toString().split('\n').filter(Boolean);
  for (const line of lines) {
    if (line.startsWith('data: ')) {
      const data = line.slice(6);
      if (data === '[DONE]') return;
      const json = JSON.parse(data);
      process.stdout.write(json.choices[0].text);
    }
  }
});
Awkward in 3.x. The 4.x SDK (later in 2023) cleans this up with async iterators.
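One thing the quick version glosses over: network chunks do not have to end on line boundaries, so a data: line can arrive in two pieces. A small buffering parser handles that. This is a sketch under the assumption that events are newline-delimited data: lines ending with a [DONE] sentinel, as in the snippet above; createSseParser is a hypothetical helper, not an SDK function.

```typescript
// Returns a function to feed with text chunks. It buffers partial lines so a
// `data:` event split across two chunks is still delivered as one payload.
function createSseParser(onData: (payload: string) => void): (chunk: string) => void {
  let buffer = '';
  let done = false;
  return (chunk: string): void => {
    if (done) return; // ignore anything after the [DONE] sentinel
    buffer += chunk;
    const lines = buffer.split('\n');
    buffer = lines.pop() ?? ''; // the last piece may be an incomplete line
    for (const line of lines) {
      const trimmed = line.trim();
      if (!trimmed.startsWith('data:')) continue; // skip blanks and other fields
      const payload = trimmed.slice(5).trim();
      if (payload === '[DONE]') {
        done = true;
        return;
      }
      onData(payload);
    }
  };
}

// Usage with the stream above (sketch):
//   const feed = createSseParser((p) => process.stdout.write(JSON.parse(p).choices[0].text));
//   (response.data as any).on('data', (chunk: Buffer) => feed(chunk.toString()));
```

Calling chunk.toString() per chunk can still split a multi-byte character; Node's StringDecoder is the usual fix if your output is not plain ASCII.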
TypeScript advantages
Over Python, Node has two things going for it in 2023:
- Native streaming via fetch. Node 18’s fetch supports streams, which should eventually be cleaner than the axios-based 3.x SDK.
- Type-safe prompt templates. With typed template functions, you can define prompts that require specific variables.
type PromptVars = { schema: string; ticket: string };

const CLASSIFY_TICKET = ({ schema, ticket }: PromptVars) => `
Classify ...
Schema: ${schema}
Ticket: ${ticket}
JSON:`;
Compile-time guarantee that variables are passed correctly.
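If your prompts live as plain strings with {placeholder} markers rather than as functions, TypeScript's template literal types can give a similar guarantee. The following is a sketch of that idea, not an established library: Placeholders and fillPrompt are hypothetical names, and it assumes placeholder names are simple words.

```typescript
// Collect every {name} placeholder in a string literal type as a union of names.
type Placeholders<S extends string> =
  S extends `${string}{${infer Name}}${infer Rest}` ? Name | Placeholders<Rest> : never;

// Fill a template. The compiler requires a value for each placeholder it names.
function fillPrompt<T extends string>(
  template: T,
  vars: Record<Placeholders<T>, string>
): string {
  const lookup = vars as unknown as Record<string, string>;
  // Unknown placeholders are left untouched rather than replaced with "undefined".
  return template.replace(/\{(\w+)\}/g, (match, name: string) => lookup[name] ?? match);
}

const CLASSIFY_TEMPLATE = 'Classify this ticket.\nSchema: {schema}\nTicket: {ticket}\nJSON:';

const filled = fillPrompt(CLASSIFY_TEMPLATE, {
  schema: 'category, priority, confidence',
  ticket: 'Card was charged twice',
});

// fillPrompt(CLASSIFY_TEMPLATE, { schema: '...' }) would not compile:
// the 'ticket' property is missing.
```

The function form shown earlier is simpler and usually enough; this variant is mainly useful when templates are shared as string constants.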
Production observability
import pino from 'pino';

const log = pino();

export async function completeObserved(
  prompt: string,
  opts: { maxTokens?: number; temperature?: number } = {}
): Promise<string> {
  const start = Date.now();
  try {
    const result = await completeWithRetry(prompt, opts);
    log.info({
      duration_ms: Date.now() - start,
      tokens_in: tokenCount(prompt),
      tokens_out: tokenCount(result),
      // Upper bound: priced at the full response budget, not the tokens actually returned.
      cost_usd: estimateCost(prompt, opts.maxTokens ?? 300),
    }, 'openai_call');
    return result;
  } catch (err: any) {
    log.error({
      duration_ms: Date.now() - start,
      error: err.message,
    }, 'openai_call_failed');
    throw err;
  }
}
Common Pitfalls
API key in client-side code. A key bundled into browser code is stolen instantly. Keep it backend-only.
No retry. OpenAI’s API has bad days; your service shouldn’t share them.
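No timeout. The 3.x SDK is axios-based, and axios sets no timeout by default. Add one explicitly, for example by passing { timeout: 30000 } as the second argument to createCompletion.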
Parsing JSON without validation. Use Zod.
Synchronous work in handlers. It blocks the event loop and degrades the whole service. Keep handlers async and keep CPU-heavy work out of the request path.
Long prompts without truncation. Context limit (4097 tokens). User input may push past.
Stream parsing wrong. SDK 3.x stream events differ between completion and chat (coming later in 2023).
Wrapping Up
openai + Zod + p-retry + p-limit + tiktoken = working Node.js stack for OpenAI integration in Jan 2023. Friday: prompt engineering basics.