Calling OpenAI from Node.js

January 10, 2023 · 5 min read · by Muhammad Amal ai

TL;DR: npm install openai. The official SDK ships with TypeScript types. Validate JSON with Zod. Retry with p-retry. Stream by parsing server-sent events today, or with async iterators once the SDK supports them (next post). Same production concerns as Python: timeouts, cost, rate limits, observability.

After Python, Node.js. Same shape, different ergonomics. Node SDK is 3.1.x as of Jan 2023.

Setup

npm install openai@3 zod p-retry p-limit tiktoken

import { Configuration, OpenAIApi } from 'openai';

const config = new Configuration({
  apiKey: process.env.OPENAI_API_KEY,
});
const openai = new OpenAIApi(config);

Basic call

export async function complete(
  prompt: string,
  options: { maxTokens?: number; temperature?: number } = {}
): Promise<string> {
  const response = await openai.createCompletion({
    model: 'text-davinci-003',
    prompt,
    max_tokens: options.maxTokens ?? 300,
    temperature: options.temperature ?? 0,
  });
  return response.data.choices[0].text?.trim() ?? '';
}

JSON validation with Zod

import { z } from 'zod';

const TicketClass = z.object({
  category: z.enum(['billing', 'technical', 'account', 'other']),
  priority: z.enum(['low', 'medium', 'high']),
  confidence: z.number().min(0).max(1),
});

type TicketClass = z.infer<typeof TicketClass>;

export async function classifyTicket(text: string): Promise<TicketClass> {
  const prompt = `
Classify this support ticket as JSON.

Schema:
{
  "category": "billing|technical|account|other",
  "priority": "low|medium|high",
  "confidence": 0.0 to 1.0
}

Ticket: ${text}

JSON:
`.trim();

  const raw = await complete(prompt, { maxTokens: 100 });
  const parsed = JSON.parse(raw);
  return TicketClass.parse(parsed);
}

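Zod is the TypeScript equivalent of Pydantic: compile-time types and runtime validation from one schema definition.

Two things can throw here. If the model returns malformed JSON, JSON.parse throws a SyntaxError. If the JSON is well-formed but has the wrong shape or an unknown enum value, TicketClass.parse throws a ZodError. Catch both, then retry or fall back.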
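The "catch, then fall back" path can be sketched as a small wrapper. This is a minimal sketch, not code from the post's codebase: parseOrFallback, validateTicket, and FALLBACK are hypothetical names, and validateTicket is a hand-written stand-in so the example runs without Zod installed. In the real code you would pass TicketClass.parse as the validator.

```typescript
// Hypothetical helper: parse model output, validate it, and return a
// fallback value instead of throwing.
type Validator<T> = (value: unknown) => T; // e.g. TicketClass.parse from Zod

export function parseOrFallback<T>(raw: string, validate: Validator<T>, fallback: T): T {
  try {
    // JSON.parse throws on malformed JSON; the validator is expected
    // to throw on a schema mismatch.
    return validate(JSON.parse(raw));
  } catch {
    return fallback;
  }
}

// Stand-in validator (assumption: mirrors the TicketClass schema above).
type Ticket = { category: string; priority: string; confidence: number };
const CATEGORIES = ['billing', 'technical', 'account', 'other'];
const PRIORITIES = ['low', 'medium', 'high'];

export function validateTicket(value: unknown): Ticket {
  const v = value as Partial<Ticket> | null;
  if (
    typeof v !== 'object' || v === null ||
    typeof v.category !== 'string' || !CATEGORIES.includes(v.category) ||
    typeof v.priority !== 'string' || !PRIORITIES.includes(v.priority) ||
    typeof v.confidence !== 'number' || v.confidence < 0 || v.confidence > 1
  ) {
    throw new Error('invalid ticket classification');
  }
  return { category: v.category, priority: v.priority, confidence: v.confidence };
}

// A deliberately low-confidence default for unparseable output.
export const FALLBACK: Ticket = { category: 'other', priority: 'medium', confidence: 0 };
```

A fallback with confidence 0 lets downstream code route the ticket to a human instead of trusting a guess.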

Retries with p-retry

import pRetry from 'p-retry';

export async function completeWithRetry(
  prompt: string,
  opts: { maxTokens?: number; temperature?: number } = {}
): Promise<string> {
  return pRetry(() => complete(prompt, opts), {
    retries: 5,
    minTimeout: 2000,
    maxTimeout: 30000,
    onFailedAttempt: (err) => {
      console.warn(`OpenAI attempt ${err.attemptNumber} failed: ${err.message}`);
    },
  });
}

p-retry handles exponential backoff, but by default it retries every thrown error. Don't retry 4xx errors such as 400 or 401 (the request itself is wrong). Do retry 429 (rate limited), 5xx, and timeouts.
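To see what the settings above mean in wall-clock terms, here is a sketch of the delay schedule. It assumes a growth factor of 2 and no random jitter, which is my understanding of the defaults p-retry inherits; backoffDelays is a hypothetical helper, not a p-retry API.

```typescript
// Sketch of an exponential backoff schedule (assumed: factor 2, no jitter).
export function backoffDelays(
  retries: number,
  minTimeout: number,
  maxTimeout: number,
  factor = 2
): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < retries; attempt++) {
    // The delay grows geometrically, then is clamped at maxTimeout.
    delays.push(Math.min(minTimeout * factor ** attempt, maxTimeout));
  }
  return delays;
}

// Same numbers as completeWithRetry: 5 retries, 2 s minimum, 30 s maximum.
const schedule = backoffDelays(5, 2000, 30000);
// schedule is [2000, 4000, 8000, 16000, 30000]

// Total time spent waiting if every attempt fails.
const worstCaseWaitMs = schedule.reduce((sum, delay) => sum + delay, 0);
// worstCaseWaitMs is 60000
```

Under those assumptions a request that never succeeds waits a full minute before failing, on top of the time each attempt takes. Worth knowing before putting this behind a synchronous HTTP handler.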

For more control:

import pRetry, { AbortError } from 'p-retry';

await pRetry(async () => {
  try {
    return await complete(prompt);
  } catch (err: any) {
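    const status = err.response?.status;
    // 4xx other than 429 means the request itself is wrong: don't retry
    if (status >= 400 && status < 500 && status !== 429) {
      throw new AbortError(err);
    }
    throw err;  // 429, 5xx, network errors: let p-retry back off and retry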
  }
}, { retries: 5 });

AbortError tells p-retry to stop.

Concurrency control

import pLimit from 'p-limit';

const limit = pLimit(10);  // 10 concurrent at most

export async function batchClassify(tickets: string[]): Promise<TicketClass[]> {
  return Promise.all(tickets.map(t => limit(() => classifyTicket(t))));
}

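p-limit caps in-flight requests. That helps you stay under OpenAI's rate limits but does not guarantee it: the limits are counted per minute (requests and tokens), so fast responses can still exceed them at low concurrency. Keep retrying on 429.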
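For intuition about what "capping in-flight requests" means, here is a minimal limiter in the same spirit. It is a sketch for illustration only, not how p-limit is implemented; createLimiter is a hypothetical name, and p-limit remains the better choice in real code.

```typescript
// Minimal concurrency limiter sketch: at most maxConcurrent tasks run at once.
export function createLimiter(maxConcurrent: number) {
  let active = 0;
  const waiting: Array<() => void> = [];

  return async function run<T>(task: () => Promise<T>): Promise<T> {
    if (active < maxConcurrent) {
      active++; // a slot is free: take it immediately
    } else {
      // No free slot: wait until a finishing task hands its slot over.
      await new Promise<void>((resolve) => waiting.push(resolve));
    }
    try {
      return await task();
    } finally {
      const next = waiting.shift();
      if (next) {
        next(); // pass the slot straight to the next waiter; active is unchanged
      } else {
        active--; // nobody is waiting: free the slot
      }
    }
  };
}
```

Handing the slot directly to the next waiter, instead of decrementing and re-incrementing, avoids a window where a new caller could sneak in and push the count past the cap.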

Token counting

import { encoding_for_model } from 'tiktoken';

const enc = encoding_for_model('text-davinci-003');

export function tokenCount(text: string): number {
  return enc.encode(text).length;
}

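// Upper-bound estimate: assumes the model uses all of maxResponseTokens.
export function estimateCost(prompt: string, maxResponseTokens: number): number {
  const total = tokenCount(prompt) + maxResponseTokens;
  return total * 0.02 / 1000;  // $0.02 per 1K tokens for text-davinci-003
}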

Same pattern as Python.
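The same token counts can also answer "will this request fit?" before you send it. A small sketch with plain numbers, so it runs without tiktoken; checkBudget is a hypothetical helper, the 4,097-token limit is the text-davinci-003 figure mentioned under Common Pitfalls, and the price is the $0.02 per 1K tokens used above.

```typescript
const CONTEXT_LIMIT = 4097;        // text-davinci-003: prompt and completion combined
const PRICE_PER_1K_TOKENS = 0.02;  // USD, the rate used in estimateCost

// Hypothetical pre-flight check: does the request fit, and what is the most it can cost?
export function checkBudget(promptTokens: number, maxResponseTokens: number) {
  const total = promptTokens + maxResponseTokens;
  return {
    total,
    fitsContext: total <= CONTEXT_LIMIT,
    maxCostUsd: (total * PRICE_PER_1K_TOKENS) / 1000,
  };
}

// Worked example: a 500-token prompt with max_tokens 300.
const example = checkBudget(500, 300);
// total 800, fits the context window, costs at most about $0.016

// A 4,000-token prompt with the same max_tokens does not fit.
const tooLong = checkBudget(4000, 300);
```

In practice promptTokens would come from tokenCount(prompt). Rejecting or truncating when fitsContext is false is cheaper than letting the API return a 400.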

Express middleware example

For a typical Express endpoint:

import express from 'express';

const app = express();
app.use(express.json());

app.post('/classify-ticket', async (req, res) => {
  const { text } = req.body;
  if (typeof text !== 'string' || text.length === 0 || text.length > 5000) {
    return res.status(400).json({ error: 'invalid text' });
  }

  try {
    const result = await classifyTicket(text);
    res.json(result);
  } catch (err: any) {
    console.error('classify failed', err);
    res.status(500).json({ error: 'classification failed' });
  }
});

Three guards:

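Input length cap (bounds cost and keeps the prompt inside the context window; it does not stop prompt injection by itself)
Type validation
Error handling that logs details server-side and returns only a generic message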
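The first two guards are easier to unit test when pulled out of the handler into a pure function. A minimal sketch under that assumption; validateTicketBody and the Validation type are hypothetical names, and the 5000-character cap is the one used in the handler above.

```typescript
const MAX_TEXT_LENGTH = 5000; // same cap as the Express handler

type Validation = { ok: true; text: string } | { ok: false; error: string };

// Hypothetical helper: checks the request body without touching Express.
export function validateTicketBody(body: unknown): Validation {
  // Guard against a missing or non-object body before reading a property.
  if (typeof body !== 'object' || body === null) {
    return { ok: false, error: 'invalid text' };
  }
  const text = (body as Record<string, unknown>).text;
  if (typeof text !== 'string' || text.length === 0 || text.length > MAX_TEXT_LENGTH) {
    return { ok: false, error: 'invalid text' };
  }
  return { ok: true, text };
}
```

In the handler this becomes a single call: respond with 400 when ok is false, otherwise pass text to classifyTicket.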

Streaming preview

The OpenAI Node SDK 3.x supports streaming via Server-Sent Events. Full pattern covered Friday in streaming responses, but quickly:

const response = await openai.createCompletion({
  model: 'text-davinci-003',
  prompt,
  stream: true,
}, { responseType: 'stream' });

(response.data as any).on('data', (chunk: Buffer) => {
  const lines = chunk.toString().split('\n').filter(Boolean);
  for (const line of lines) {
    if (line.startsWith('data: ')) {
      const data = line.slice(6);
      if (data === '[DONE]') return;
      const json = JSON.parse(data);
      process.stdout.write(json.choices[0].text);
    }
  }
});

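Awkward in 3.x, and fragile: this quick version assumes each chunk ends on a line boundary, so an event split across two chunks makes JSON.parse throw. The 4.x SDK (later in 2023) cleans this up with async iterators.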
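Network chunks are not guaranteed to end on a line boundary, so a sturdier version of the parsing step keeps the unfinished tail of each chunk and prepends it to the next one. A minimal sketch, assuming the same "data: ..." line format and [DONE] sentinel as above; createSseLineParser is a hypothetical helper, not part of the SDK.

```typescript
// Hypothetical helper: buffers partial lines between chunks and emits
// the payload of each complete "data:" line.
export function createSseLineParser(onData: (payload: string) => void) {
  let buffer = '';
  let done = false;

  return function feed(chunk: string): void {
    if (done) return;
    buffer += chunk;
    const lines = buffer.split('\n');
    // The last element is either '' or an incomplete line: keep it for later.
    buffer = lines.pop() ?? '';
    for (const line of lines) {
      const trimmed = line.trim();
      if (!trimmed.startsWith('data:')) continue; // skips blank separator lines
      const payload = trimmed.slice('data:'.length).trim();
      if (payload === '[DONE]') {
        done = true;
        return;
      }
      onData(payload);
    }
  };
}

// Usage with the completion chunk shape from this post.
const pieces: string[] = [];
const feed = createSseLineParser((payload) => {
  pieces.push(JSON.parse(payload).choices[0].text);
});
feed('data: {"choices":[{"text":"Hel');  // chunk ends mid-JSON
feed('lo"}]}\n\ndata: {"choices":[{"text":" world"}]}\n\n');
feed('data: [DONE]\n\n');
// pieces.join('') is 'Hello world'
```

With the 3.x SDK you would call feed(chunk.toString()) inside the 'data' handler. One caveat this sketch does not handle: a multi-byte UTF-8 character split across two Buffers needs a streaming decoder rather than toString().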

TypeScript advantages

Over Python, Node has two things going for it in 2023:

Native streaming via fetch. Node 18's fetch exposes the response body as a stream, which is cleaner than going through the axios-based 3.x SDK.
Type-safe prompt templates. Wrap a template literal in a typed function and the compiler requires every variable the prompt needs.

type PromptVars = { schema: string; ticket: string };
const CLASSIFY_TICKET = ({ schema, ticket }: PromptVars) => `
Classify ...
Schema: ${schema}
Ticket: ${ticket}
JSON:`;

Compile-time guarantee that variables are passed correctly.

Production observability

import pino from 'pino';
const log = pino();

export async function completeObserved(
  prompt: string,
  opts: { maxTokens?: number; temperature?: number } = {}
): Promise<string> {
  const start = Date.now();
  try {
    const result = await completeWithRetry(prompt, opts);
    log.info({
      duration_ms: Date.now() - start,
      tokens_in: tokenCount(prompt),
      tokens_out: tokenCount(result),
      cost_usd: estimateCost(prompt, opts.maxTokens ?? 300),
    }, 'openai_call');
    return result;
  } catch (err: any) {
    log.error({
      duration_ms: Date.now() - start,
      error: err.message,
    }, 'openai_call_failed');
    throw err;
  }
}

Common Pitfalls

API key in client-side code. A key bundled into browser code can be read by anyone who opens dev tools. Keep it on the backend.

No retry. OpenAI’s API has bad days; your service shouldn’t share them.

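No timeout. The axios-based 3.x SDK sets no timeout by default (axios's default is 0, meaning wait indefinitely). Add one explicitly, for example by passing { timeout: 30000 } as the second argument to createCompletion.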

Parsing JSON without validation. Use Zod.

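Synchronous work in handlers. Anything that blocks the event loop stalls every other request. Await the API call and keep CPU-heavy work off the request path.

Long prompts without truncation. text-davinci-003 has a 4,097-token context limit shared between prompt and completion, and user input can push past it. Count tokens and truncate before sending.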

Stream parsing wrong. SDK 3.x stream events differ between completion and chat (coming later in 2023).
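For the timeout pitfall, a transport-independent guard is to race the call against a timer. A minimal sketch; withTimeout is a hypothetical helper, and it only stops waiting: it does not cancel the underlying HTTP request, so a transport-level timeout is still worth setting.

```typescript
// Hypothetical helper: reject if the wrapped promise takes longer than ms.
export function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => {
      const err = new Error(`timed out after ${ms} ms`);
      err.name = 'TimeoutError'; // lets callers tell timeouts from API errors
      reject(err);
    }, ms);
  });
  // Whichever settles first wins. Always clear the timer so it cannot
  // keep the process alive after the work has finished.
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}
```

Usage would look like withTimeout(complete(prompt), 30000). Because the rejection is an ordinary thrown error, wrapping this inside the p-retry callback means a timed-out attempt gets retried like any other transient failure.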

Wrapping Up

openai + Zod + p-retry + p-limit + tiktoken = working Node.js stack for OpenAI integration in Jan 2023. Friday: prompt engineering basics.
