Build AI Agents That Remember: OpenSearch Developer Tier

OpenSearch isn't just a search engine anymore. Recent releases moved it into AI infrastructure: built-in agentic memory, Better Binary Quantization (BBQ) that compresses vectors 32x, token-usage tracking, and a one-command Observability Stack. Together they make a stack for building practical AI applications, not just for indexing.

The catch is that a production-sized OpenSearch cluster isn't where you want to prototype. Deploying one just to test how agentic memory changes a multi-turn conversation eval is overkill at the experiment phase.

That's what the new OpenSearch Developer tier on Aiven is for: a fully managed environment sized for prototyping.

What you get in the Developer tier for $40/month

Single-node cluster (2 vCPU, 4 GB RAM): Index documents, test search relevance, run small-scale analytics, and prototype vector search for RAG on real OpenSearch.
30 GB of disk storage: Room for millions of small documents, weeks of application logs, or a sizeable vector index. That is enough headroom to test against realistic data.
Always-on availability: Your cluster stays up between sessions. No power-off on inactivity, so your project keeps working when you walk away.
Aiven and third-party integrations: Pipe data directly from other Aiven services (for example, Apache Kafka) into OpenSearch indexes to power log analytics, observability pipelines, or full-text search, and export metrics to Prometheus, Datadog, or other observability tools.
Basic tier support: Get help via email, with best-effort responses the same or next business day.

Building a personal assistant agent

Take a personal assistant agent as an example, whether you're building it for yourself or shipping it to your users. The Developer tier lets you deploy a cluster, point an agent at it, and see what persistent memory does to the quality of the experience. Without it, every Monday morning the agent walks in cold. It re-asks for the timezone, re-learns that you don't take meetings before 10am, re-discovers which project "the May launch" refers to. The model is fine, but the interaction feels broken because the context resets at the session boundary.

Persistent memory closes that gap, and as of OpenSearch 3.3 you don't have to assemble it yourself. Agentic memory is built into the ml-commons plugin, exposed through a REST API, and works with any agent framework.

The core abstraction is a memory container: a logical home for everything an agent needs to remember about a user, a tenant, or a workflow. Each container holds four kinds of state, all managed by OpenSearch:

Sessions track each conversation as a distinct interaction.
Working memory stores the active turns and current context.
Long-term memory is what survives across sessions — extracted facts, preferences, and summaries.
History keeps an audit trail of every memory operation.

The interesting work happens in long-term memory. OpenSearch ships three built-in extraction strategies that an LLM applies to raw conversations on your behalf:

USER_PREFERENCE captures explicit preferences ("no meetings before 10am", "always cc Jane on marketing threads").
SEMANTIC pulls out facts the user mentioned in passing ("Sara leads the Stockholm office").
SUMMARY keeps a running, session-scoped summary so the agent doesn't have to re-read the full transcript.

You choose which strategies to enable when you create the container, and you define namespaces (arbitrary JSON keys like user_id) to scope memory access. Embedding, indexing, extraction, and retrieval all happen inside OpenSearch.
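One way to picture how strategies and namespaces fit together is a small toy model in plain Python. It calls no OpenSearch API. The matching rule it uses (a strategy applies only when every one of its namespace keys is present in the incoming namespace) is an assumption made for illustration, and the helper names applicable_strategies and memory_scope are hypothetical.

```python
# Toy model only: plain Python, no OpenSearch calls.
# Each strategy is mapped to the namespace keys it is scoped by.
STRATEGIES = {
    "USER_PREFERENCE": ["user_id"],
    "SEMANTIC": ["user_id"],
    "SUMMARY": ["user_id", "session_id"],
}


def applicable_strategies(namespace: dict) -> list:
    """Return the strategies whose namespace keys all appear in `namespace`.

    The "all keys must be present" rule is an assumption for illustration;
    check the ml-commons documentation for the actual behaviour.
    """
    return [
        name
        for name, keys in STRATEGIES.items()
        if all(key in namespace for key in keys)
    ]


def memory_scope(strategy: str, namespace: dict) -> dict:
    """Keep only the namespace values that the given strategy is scoped by."""
    return {key: namespace[key] for key in STRATEGIES[strategy]}


if __name__ == "__main__":
    turn = {"user_id": "u-123", "session_id": "monday-standup"}
    for strategy in applicable_strategies(turn):
        print(strategy, memory_scope(strategy, turn))
```

In this model, preferences and facts are keyed by user only, so they carry over to the next session, while the summary is keyed by user and session, which matches the session-scoped description of SUMMARY above.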

After registering an embedding model and an LLM in ml-commons, create a memory container with the strategies and namespace keys your agent needs by sending the following request:

POST /_plugins/_ml/memory_containers/_create
{
  "name": "personal-assistant",
  "description": "Memory container for the personal assistant",
  "configuration": {
    "embedding_model_type": "TEXT_EMBEDDING",
    "embedding_model_id": "<your-embedding-model-id>",
    "llm_id": "<your-llm-id>",
    "strategies": [
      { "type": "USER_PREFERENCE", "namespace": ["user_id"] },
      { "type": "SEMANTIC",        "namespace": ["user_id"] },
      { "type": "SUMMARY",         "namespace": ["user_id", "session_id"] }
    ]
  }
}
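If you would rather issue that request from a script than from a console, a minimal Python sketch using only the standard library might look like the following. The path and body mirror the request above. The function name build_create_container_request, the base URL, and the model IDs are placeholders for illustration, and authentication is deliberately left out because it depends on how your cluster is set up.

```python
import json
import urllib.request


def build_create_container_request(base_url, embedding_model_id, llm_id,
                                   name="personal-assistant"):
    """Build (but do not send) the memory container creation request.

    `base_url` is a placeholder for your cluster endpoint. The model IDs are
    the ones returned when you registered the models in ml-commons.
    """
    body = {
        "name": name,
        "description": "Memory container for the personal assistant",
        "configuration": {
            "embedding_model_type": "TEXT_EMBEDDING",
            "embedding_model_id": embedding_model_id,
            "llm_id": llm_id,
            "strategies": [
                {"type": "USER_PREFERENCE", "namespace": ["user_id"]},
                {"type": "SEMANTIC", "namespace": ["user_id"]},
                {"type": "SUMMARY", "namespace": ["user_id", "session_id"]},
            ],
        },
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_plugins/_ml/memory_containers/_create",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values; nothing is sent over the network here.
    req = build_create_container_request(
        "https://localhost:9200", "<your-embedding-model-id>", "<your-llm-id>"
    )
    print(req.get_method(), req.full_url)
    print(json.dumps(json.loads(req.data), indent=2))
```

Building the request separately from sending it makes the payload easy to inspect or log before it reaches the cluster. Once credentials are attached, passing the request to urllib.request.urlopen would submit it.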

The Aiven Developer tier is sized for exactly this kind of work: prototype a personal agent and its memory layer, run it against real conversation traces, watch how the agent's behaviour changes, and scale up only once you have real usage to size against. The interesting thing isn't the indexing code. It's that with each session, the agent feels like one that actually knows you.

Get building with Aiven's OpenSearch Developer tier.
