Now in public beta
Conversations that never lose the thread.
Longthread gives your AI applications infinite conversation memory through intelligent context compaction. Automatic, provider-agnostic, drop-in compatible.
$ curl https://api.llm.archi/v1/chat/completions \
-H "Authorization: Bearer ${LONGTHREAD_API_KEY}" \
-H "Content-Type: application/json" \
-d '{"model": "gpt-4o", "messages": [...]}'
How it works
Four layers of context,
one focused window.
Longthread automatically manages your conversation context through multi-level compaction. Your model always gets the most relevant information within its token window.
System Prompt
Your instructions and persona definition: always present, never compacted.
Section Summaries
Completed conversation sections are distilled into structured summaries capturing key facts, decisions, and open questions.
Ledger Paragraphs
Older turns compressed into concise ledger entries: the essential facts without the filler.
Recent Raw Turns
The latest messages preserved verbatim: full nuance, full context, no summarization.
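Conceptually, the four layers above amount to a budgeted context assembly. The sketch below is an illustration only, not Longthread's actual algorithm: the whitespace token counting, the priority order, and the function name are all assumptions.

```python
# Illustrative sketch of four-layer context assembly.
# Token counting (word count) and fill order are simplifications;
# Longthread's real compaction logic is not public.

def assemble_context(system_prompt, summaries, ledger, recent_turns, budget=60):
    def tokens(text):
        return len(text.split())  # crude stand-in for a real tokenizer

    used = tokens(system_prompt)  # the system prompt is always present

    # Recent raw turns are kept verbatim, newest first, until the budget is tight.
    kept_turns = []
    for turn in reversed(recent_turns):
        if used + tokens(turn) > budget:
            break
        kept_turns.insert(0, turn)
        used += tokens(turn)

    def take(items):
        # Keep whichever compacted entries still fit in the remaining budget.
        nonlocal used
        kept = []
        for item in items:
            if used + tokens(item) <= budget:
                kept.append(item)
                used += tokens(item)
        return kept

    kept_ledger = take(ledger)
    kept_summaries = take(summaries)

    # Window order matches the layers above: system prompt, section
    # summaries, ledger paragraphs, then the most recent raw turns.
    return [system_prompt] + kept_summaries + kept_ledger + kept_turns

window = assemble_context(
    system_prompt="You are a helpful assistant.",
    summaries=["Summary: user is planning a trip to Kyoto in April."],
    ledger=["Ledger: budget is $3000; prefers trains over flights."],
    recent_turns=["User: what about hotels?", "Assistant: here are options."],
)
```

Under this toy budget everything fits, so the window simply lists all four layers in order; as a conversation grows, the oldest raw turns would be the first content displaced into ledger entries and summaries.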
Features
Built for production
AI applications.
Zero-config compaction
Automatic, intelligent context management. No manual prompts, no chunking logic, no context window math. Just send messages.
Drop-in API
OpenAI-compatible REST API. Swap your base URL and add your API key. Existing SDKs work out of the box.
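Because the API mirrors the OpenAI chat-completions shape, any HTTP client can talk to it. A minimal stdlib sketch of the request being built (the endpoint is from this page; the payload fields and placeholder key follow the OpenAI convention and are assumptions):

```python
import json
import urllib.request

# Build (but don't send) the same request the curl example makes.
API_KEY = "lt_..."  # placeholder; use your real Longthread key

payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}],
}

request = urllib.request.Request(
    "https://api.llm.archi/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    method="POST",
)
```

With an existing OpenAI SDK, the equivalent change is pointing the client's base URL at `https://api.llm.archi/v1` and supplying the Longthread key; the rest of the integration stays the same.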
Provider-agnostic
Route to OpenAI, Anthropic, or any compatible provider. Switch models without changing your integration.
Context preservation
Facts, decisions, user intent, and open questions survive compaction. Your AI remembers what matters.
Metered billing
Pay only for the tokens you use. No per-seat charges, no upfront commitments. Start free, scale as you grow.
Edge-native
Built on Cloudflare Workers. Sub-millisecond cold starts, global distribution, automatically scales to zero.
Pricing
Start free.
Scale when ready.
Simple, transparent pricing based on the tokens you actually use.
Free
For prototyping and side projects
- 100K tokens/month
- All providers
- Level 1 compaction
- Community support
Pro
For production applications
- 5M tokens/month included
- All providers + priority routing
- Level 1 & 2 compaction
- Email support
- Usage dashboard
Enterprise
For teams with custom needs
- Unlimited tokens
- Dedicated infrastructure
- All compaction levels
- SLA guarantee
- Custom integrations
- Dedicated support
Ready to go infinite?
Get your API key in 30 seconds. No credit card required.
# Install the SDK
npm install @longthread/sdk
# Or just use curl
export LONGTHREAD_API_KEY="lt_..."
curl https://api.llm.archi/v1/chat/completions \
-H "Authorization: Bearer $LONGTHREAD_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello"}]}'