# OpenAI

Lightweight, dependency-free, in-memory OpenAI REST API fake for testing code that uses the real `openai` Node.js/Python SDK (and the language-agnostic OpenAI REST API). All generated content is **deterministic** — text and embedding vectors are derived from a hash of the input so tests are repeatable.

Default port: `4747`

## Quick start

Start the server:

```js
import { OpenaiServer } from "./services/openai/src/server.js";

const server = new OpenaiServer(4747);
await server.start();
// ... run your app/tests ...
await server.stop();
```

Point the real `openai` client at it via `baseURL`:

```js
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "sk-parlel",
  baseURL: "http://127.0.0.1:4747/v1",
});

const completion = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello!" }],
});
// completion.choices[0].message.content => deterministic text for that prompt
```

## Implemented operations

All `/v1/*` routes require an `Authorization: Bearer <key>` header (any non-empty bearer token is accepted, matching how a local test key behaves). State is in-memory and ephemeral.

- `POST /v1/chat/completions` — chat completion. Supports `stream: true` (SSE chunks ending with `data: [DONE]`). Returns the `chat.completion` shape with `choices[].message`, `finish_reason`, `usage`.
- `POST /v1/completions` — legacy text completion (`text_completion`).
- `POST /v1/embeddings` — deterministic fixed-length embedding vectors (default 1536 dims, `dimensions` override supported). Accepts a string or array of strings.
- `GET /v1/models` — list models (`{ object: "list", data: [...] }`).
- `GET /v1/models/{id}` — retrieve one model.
- `POST /v1/images/generations` — image generation. Returns `url` or `b64_json` depending on `response_format`.
- `POST /v1/moderations` — content moderation. Returns `flagged`, `categories`, `category_scores`.

### Service & inspection operations (parlel extensions)

- `GET /` — service metadata.
- `GET /health` — health check (`{ status: "ok" }`).
- `POST /__parlel/reset` — reset all in-memory state.
- `GET /__parlel/requests` — list every captured request (`{ requests, count }`).
- `DELETE /__parlel/requests` — clear the captured request log.
- `OPTIONS *` — CORS preflight (`204`).

## SDK usage example

```python
from openai import OpenAI

client = OpenAI(api_key="sk-parlel", base_url="http://127.0.0.1:4747/v1")

stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Stream me"}],
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")
```

## Access via MCP / preview URL

HTTP services are auto-exposed at the Daytona preview URL. Send requests to the preview URL with the `x-daytona-preview-token` header for authentication. The OpenAI surface is mounted under `/v1`, so set `OPENAI_BASE_URL` to `<preview-url>/v1`.

## Surface coverage

This emulator faithfully replicates the API surface most application code and agents exercise. Anything below the supported lines is either an intentional design choice for a fast, zero-cost local emulator (✓ By design) or a candidate for a future release (⟳ Roadmap) — never a silent inaccuracy.

Legend: ✅ fully supported · ◐ accepted (stored, not strictly enforced) · ✓ by design · ⟳ on the roadmap.

| Feature | Status |
| --- | --- |
| `chat.completions` (+ streaming SSE) | ✅ Supported |
| `completions` (legacy) | ✅ Supported |
| `embeddings` (deterministic vectors, custom dims) | ✅ Supported |
| `models` list/retrieve | ✅ Supported |
| `images.generations` (url / b64_json) | ✅ Supported |
| `moderations` | ✅ Supported |
| Request inspection | ✅ Supported (parlel extension) |
| Real model inference / quality | ✓ By design — Deterministic stub output — repeatable assertions, no API spend |
| `tools` / function calling execution | ◐ Accepted, not executed (returns text) |
| Vision / audio / file uploads | ⟳ Roadmap |
| Token counts | ◐ Approximate word-based, not real BPE |
| Bearer-token validity / org scoping | ✓ By design — Any non-empty credential is accepted — no real secrets needed |
| Rate limiting (`429`) / quota | ✓ By design — Never throttles — local tests run at full speed, zero cost |

## Manifest

See `services/openai/manifest.json`:

- name: `openai`, image: `parlel/openai:1`
- port: `4747`, protocol: `http`, healthcheck: `/health`, startup ≈ 100ms
- env: `OPENAI_API_KEY`, `OPENAI_BASE_URL`