Vast Memory Enterprise — Build Paper



Tags: project, decision, vast-memory, kindo, enterprise, scope-expansion, venture-partner, confidential


Status: Draft v1, for internal Teka review · Author: Ora, for Teka · Date: 2026-05-19 · Classification: Confidential, Teka strategy.

Sub-paper of the Teka × Kindo Scope Expansion Proposal. Research-backed: external claims are cited; the build and engagement sections are Teka's proposal and are marked as such.

Abstract

Vast Memory Enterprise is a governed, multi-tenant memory layer for enterprise AI agents. It shares a single engine (the same write path, retrieval, and identity model) with Vast Memory, the memory layer Teka is building for The Cloud, and adds the governance envelope enterprise buyers require: hard tenant isolation, per-tenant encryption, role-aware access, PII redaction, audited reads and writes, and a working right-to-be-forgotten.

This paper surveys the agent-memory field as it stands in 2026, identifies enterprise multi-tenancy as the category's clearest unmet need, and lays out how Teka would build Vast Memory Enterprise with Kindo — Teka as Kindo's Technology Studio and Venture Partner — delivered MCP-native into Kindo's existing platform.

1. Why this paper exists

Two builds are converging.

Teka is building Vast Memory for The Cloud: the memory layer that lets Ora remember a person across sessions, defer to that memory, and write durable outcomes back to it. Separately, Kindo — the enterprise AI-agent and MCP-integration platform Teka already delivers integrations for — needs its agents to carry state across runs, for enterprise customers, under enterprise governance.

The insight behind this paper: both builds need the same engine. The hard parts of memory (extracting durable facts from conversation, resolving entities, reconciling conflicting updates, retrieving the right memory at the right moment, letting old facts decay) are domain-agnostic. The Cloud exercises that engine in a personal-workspace context; an enterprise platform exercises it under multi-tenant governance. Build the core once, deploy it twice: The Cloud becomes the working laboratory, and Vast Memory Enterprise becomes the commercial product Teka and Kindo bring to the enterprise market.

Sections 2–5 survey the field and locate the opportunity. Sections 6 onward are Teka's proposed build and the engagement model.

2. What agent memory is

Modern agent memory is organised into four types, formalised by the CoALA framework: working memory (tokens inside the model context window — recent turns, task instructions, intermediate artifacts), episodic memory (specific past experiences with temporal detail), semantic memory (durable facts, preferences and constraints about a user or domain), and procedural memory (learned skills and multi-step workflows) (Atlan — Types of AI Agent Memory, Redis — AI agent memory). Capable agents emerge from controlled flow between these tiers — episodic experience consolidates into semantic memory, which is retrieved back into working memory.

The memory lifecycle has a write path, a read path and background maintenance. The write path is what separates true memory from search-over-logs:

Write path (at conversation time) — extract discrete facts from unstructured dialogue, resolve entities ("Alice", "our CTO" and "the person who filed that ticket" may be one person), track temporal validity, and update and merge existing knowledge rather than blindly appending — reconciling conflicts as they arise.
Read path (at query time) — retrieve the memories relevant to the current request.
Background — consolidation, decay/expiry and pruning of stale entries.

This is the core distinction between RAG and agent memory. RAG is read-only: index documents once, then query. Agent memory needs a write path as sophisticated as its read path — add / update / deprecate semantics that RAG pipelines simply do not have. "RAG gave agents access to knowledge. Memory gives them continuity" (Vectorize — Agent Memory vs RAG, Mem0 — RAG vs AI Memory). Context-window management — compaction, clearing stale tool calls — is a complementary but separate concern: it extends a single run; memory carries state across runs.
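The add / update / deprecate semantics described above can be sketched as follows. This is a minimal illustration, not Teka's engine: the `Fact` and `MemoryStore` names, the single-value-per-attribute rule and the in-memory list are all assumptions made for the example, and entity resolution is assumed to have already mapped mentions to a `subject` key.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Fact:
    subject: str                          # resolved entity, e.g. "alice"
    attribute: str                        # e.g. "role"
    value: str
    valid_from: datetime
    valid_to: Optional[datetime] = None   # None means the fact is still current


class MemoryStore:
    """Hypothetical toy store; a real system would persist these records."""

    def __init__(self) -> None:
        self.facts: list[Fact] = []

    def current(self, subject: str, attribute: str) -> Optional[Fact]:
        for fact in self.facts:
            if (fact.subject, fact.attribute) == (subject, attribute) and fact.valid_to is None:
                return fact
        return None

    def write(self, subject: str, attribute: str, value: str, now: datetime) -> str:
        """Reconcile a new observation against existing knowledge."""
        existing = self.current(subject, attribute)
        if existing is None:
            self.facts.append(Fact(subject, attribute, value, valid_from=now))
            return "add"
        if existing.value == value:
            return "noop"                 # nothing new, so do not append a duplicate
        existing.valid_to = now           # close the old fact's validity window
        self.facts.append(Fact(subject, attribute, value, valid_from=now))
        return "update"

    def deprecate(self, subject: str, attribute: str, now: datetime) -> bool:
        """Retire a fact without replacing it; history is kept."""
        existing = self.current(subject, attribute)
        if existing is None:
            return False
        existing.valid_to = now
        return True


store = MemoryStore()
t1 = datetime(2026, 1, 10, tzinfo=timezone.utc)
t2 = datetime(2026, 4, 2, tzinfo=timezone.utc)
print(store.write("alice", "role", "engineer", t1))   # add
print(store.write("alice", "role", "CTO", t2))        # update
```

A read-only RAG index has no equivalent of the `update` branch: the second observation would simply sit next to the first, and retrieval would have to guess which one is current.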

3. How memory systems are built

Storage. The field has converged on hybrid storage rather than one substrate — vector stores for semantic similarity, knowledge graphs for entity relationships and traversable structure, and relational/key-value stores for exact lookups and metadata filtering (Vectorize — Best AI Agent Memory Systems, Mem0 — State of AI Agent Memory 2026).

Retrieval. State of the art is multi-signal retrieval: parallel scoring passes — semantic embedding similarity, BM25 keyword matching, entity matching — fused and then reranked, with metadata filtering to narrow on structured attributes. No single signal wins alone.
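As an illustration of the fusion step, the sketch below combines three ranked lists with reciprocal rank fusion (RRF). RRF is one common fusion method and is an assumption here; the sources cited above say only that signals are fused and reranked. The memory ids, the signal names and the constant `k = 60` are illustrative, and the reranking and metadata-filter stages are omitted.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: dict[str, list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse per-signal rankings; each list is ordered best-first."""
    scores: dict[str, float] = defaultdict(float)
    for ranked_ids in rankings.values():
        for rank, memory_id in enumerate(ranked_ids, start=1):
            scores[memory_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the output is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


fused = reciprocal_rank_fusion({
    "semantic": ["m1", "m2", "m3"],   # embedding similarity
    "keyword": ["m2", "m4"],          # BM25
    "entity": ["m2", "m1"],           # entity match
})
print([memory_id for memory_id, _ in fused])   # ['m2', 'm1', 'm4', 'm3']
```

`m2` wins because it appears near the top of all three lists, even though it is not first on the semantic pass, which is the point of not trusting any single signal.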

Identity scoping. Production designs compose several identity layers at retrieval — user_id (cross-session persistence), agent_id (agent-specific facts), session_id/run_id (conversation scope), and org_id (organisational context). Actor attribution preserves who said what (Mem0 — State of AI Agent Memory 2026). For an enterprise product, `org_id` becomes the hard tenant boundary — this is the hinge the whole governance story turns on.
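One way to picture composed identity scopes is a filter in which `org_id` is mandatory and every other layer is optional. The `Scope` class and `visible` function below are hypothetical names for this sketch, and records are plain dictionaries; a production system would enforce the same rule in the storage layer rather than in application code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Scope:
    org_id: str                        # hard tenant boundary, always required
    user_id: Optional[str] = None      # cross-session persistence
    agent_id: Optional[str] = None     # agent-specific facts
    session_id: Optional[str] = None   # conversation scope

    def __post_init__(self) -> None:
        if not self.org_id:
            raise ValueError("org_id is required on every memory request")


def visible(record: dict, scope: Scope) -> bool:
    """A record is visible only inside its own tenant, then narrowed by optional layers."""
    if record.get("org_id") != scope.org_id:
        return False                   # never cross the tenant boundary
    for key in ("user_id", "agent_id", "session_id"):
        wanted = getattr(scope, key)
        if wanted is not None and record.get(key) != wanted:
            return False
    return True


records = [
    {"org_id": "acme", "user_id": "u1", "text": "prefers email"},
    {"org_id": "acme", "user_id": "u2", "text": "prefers phone"},
    {"org_id": "globex", "user_id": "u1", "text": "prefers chat"},
]
scope = Scope(org_id="acme", user_id="u1")
print([r["text"] for r in records if visible(r, scope)])   # ['prefers email']
```

The third record shares a `user_id` with the request but belongs to another tenant, so it is filtered out before any other scope is considered.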

Extraction is LLM-driven and runs asynchronously by default, so a memory write never adds latency to the agent's response.
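The asynchronous write path can be sketched with a queue and a background worker. Everything here is a stand-in: `extract_facts` splits on full stops instead of calling an LLM, and the function names are assumptions for the example. What it shows is only the ordering, with the reply returning before extraction has run.

```python
import asyncio


async def extract_facts(turn: str) -> list[str]:
    # Stand-in for an LLM extraction call; the sleep simulates its latency.
    await asyncio.sleep(0.01)
    return [part.strip() for part in turn.split(".") if part.strip()]


async def memory_worker(queue: asyncio.Queue, store: list[str]) -> None:
    # Background consumer: drains queued turns and writes extracted facts.
    while True:
        turn = await queue.get()
        try:
            store.extend(await extract_facts(turn))
        finally:
            queue.task_done()


async def handle_turn(turn: str, queue: asyncio.Queue) -> str:
    queue.put_nowait(turn)             # enqueue only; extraction is not awaited here
    return f"reply to: {turn}"


async def demo() -> tuple[str, list[str], list[str]]:
    queue: asyncio.Queue = asyncio.Queue()
    store: list[str] = []
    worker = asyncio.create_task(memory_worker(queue, store))
    reply = await handle_turn("I moved to Lisbon. I prefer email", queue)
    at_reply = list(store)             # what memory holds when the reply is ready
    await queue.join()                 # later: wait for the background write to finish
    worker.cancel()
    await asyncio.gather(worker, return_exceptions=True)
    return reply, at_reply, store


reply, at_reply, final = asyncio.run(demo())
print(at_reply)   # []
print(final)      # ['I moved to Lisbon', 'I prefer email']
```

The trade-off is that a fact written in one turn may not be retrievable in the very next one, so callers that need read-your-writes behaviour would need a synchronous option.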

4. The field in 2026

Product	Architecture	Strength	Enterprise gap
Mem0	Vector + optional graph + key-value; user/session/agent scopes; multi-signal retrieval	Broadest ecosystem, strong benchmarks, self-hostable (Apache 2.0)	Multi-tenancy / RBAC not first-class; graph behind a paid tier
Zep / Graphiti	Temporal knowledge graph — every fact a node with a validity window	Best-in-class temporal reasoning	Graph-only model; enterprise governance still light
Letta (ex-MemGPT)	OS-inspired core / archival / recall memory; agent self-manages	Agent autonomy; good for long-running agents	No managed tier; less turnkey; governance thin
cognee	Knowledge-graph memory engine, 30+ data connectors	Strong data-integration story	Heavy; positioned as a data platform
Supermemory	Semantic memory at scale with temporal awareness	Lightweight, scalable	Closed source; self-host requires enterprise deal
OpenAI / Anthropic / Google native memory	Platform-native (saved memories, memory tool, context editing)	Consumer-grade UX	Not an embeddable governed layer; often off by default on enterprise tiers

Sources: Atlan — Best Frameworks 2026, Vectorize — Best AI Agent Memory Systems, Anthropic — Context management, OpenAI — Memory controls.

Where the market is underserved. Dev-first products (Mem0, Zep, Letta) optimise for the single developer or single app — multi-tenant isolation, per-tenant encryption and RBAC over memory are bolted on, not foundational. Platform-native memory is consumer-shaped and cannot be delivered as an embeddable, governed layer for a third-party platform. Mem0's own State of AI Agent Memory 2026 names "standardised consent and deletion workflows" as a capability absent across the entire field (Mem0).

That is the opening: a governed, multi-tenant memory layer an enterprise agent platform can adopt wholesale.

5. What enterprise memory demands

A production architecture for governed memory (arXiv — Governed Memory, Scalekit — Access control for multi-tenant AI agents) defines the requirement set:

Tenant isolation — hard isolation at the storage layer via org_id partition keys. No cross-tenant leakage in prompts, logs, embeddings, caches or memories. One referenced system validated zero cross-entity leakage across 3,800 adversarial queries.
Encryption — per-tenant, at rest and in transit, alongside backup, replication and HA.
RBAC + ABAC — role-aware authorisation (admin / analyst / viewer) plus attribute-based field-level controls, enforced on both retrieval and mutation.
PII redaction — a two-phase pipeline: pattern-match and substitute typed placeholders before LLM extraction, then rescan extracted values after to catch LLM-reconstructed PII.
Audit & observability — every memory read and write logged with timestamp, identity and operation metadata, including tool-triggered retrieval.
Retention & right-to-be-forgotten — provenance metadata (content hash, extraction method, model attribution) on every entry, which is what makes targeted, GDPR/HIPAA-grade deletion tractable.
Compliance — SOC 2, GDPR and HIPAA expectations map directly onto the primitives above.
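The two-phase PII pipeline in the list above can be sketched as follows. The two regular expressions are deliberately simple illustrations, not a complete PII detector, and the `redact` and `rescan` names are assumptions. Phase one substitutes typed placeholders before text reaches the extraction model; phase two rescans the model's output for anything it reconstructed.

```python
import re

# Illustrative patterns only; production detection needs far broader coverage.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def redact(text: str) -> tuple[str, dict[str, str]]:
    """Phase one: replace matches with typed placeholders before LLM extraction."""
    vault: dict[str, str] = {}         # placeholder -> original value

    def make_substitute(kind: str):
        def substitute(match: re.Match) -> str:
            placeholder = f"<{kind}_{len(vault) + 1}>"
            vault[placeholder] = match.group(0)
            return placeholder
        return substitute

    for kind, pattern in PII_PATTERNS.items():
        text = pattern.sub(make_substitute(kind), text)
    return text, vault


def rescan(extracted_values: list[str]) -> list[str]:
    """Phase two: redact any PII that reappears in extracted values."""
    return [redact(value)[0] for value in extracted_values]


safe_text, vault = redact("Reach Alice at alice@example.com or +1 415 555 0100.")
print(safe_text)   # Reach Alice at <EMAIL_1> or <PHONE_2>.
print(rescan(["contact: alice@example.com", "prefers mornings"]))
```

In this sketch the placeholder numbers in phase two are local to each value; a real pipeline would more likely map a re-detected value back to its original placeholder via the vault.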

6. Vast Memory Enterprise — the build

🏗️


Sections 6–9 are Teka's proposed build and engagement. They are design intent, not shipped fact.

Vast Memory Enterprise is the shared memory engine wrapped in a governance envelope and delivered MCP-native.

The core engine (shared with Vast Memory for The Cloud)

Hybrid storage — vector + temporal knowledge graph + structured store.
Async write path — LLM fact extraction, entity resolution, temporal validity, conflict reconciliation (update, don't blindly append).
Multi-signal retrieval — semantic + keyword + entity scoring, fused and reranked, with recency/salience weighting.
Temporal validity windows, so a superseded fact is closed with an end date and kept as history rather than silently overwritten. This is the basis for stale-memory mitigation.
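To make the recency/salience weighting and validity windows concrete, here is a small scoring sketch. The exponential half-life decay, the 30-day default and the multiplicative combination are assumptions chosen for illustration; the proposal does not commit to a particular formula.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class ScoredFact:
    text: str
    relevance: float                      # fused retrieval score in [0, 1]
    salience: float                       # importance weight in [0, 1]
    valid_from: datetime
    valid_to: Optional[datetime] = None   # None means the fact is still current


def is_valid_at(fact: ScoredFact, as_of: datetime) -> bool:
    return fact.valid_from <= as_of and (fact.valid_to is None or as_of < fact.valid_to)


def weighted_score(fact: ScoredFact, as_of: datetime, half_life_days: float = 30.0) -> float:
    # The score halves for every half_life_days of age.
    age_days = (as_of - fact.valid_from).total_seconds() / 86400.0
    return fact.relevance * fact.salience * 0.5 ** (age_days / half_life_days)


def rank(facts: list[ScoredFact], as_of: datetime, half_life_days: float = 30.0) -> list[ScoredFact]:
    # Superseded facts are excluded from retrieval but remain stored as history.
    live = [f for f in facts if is_valid_at(f, as_of)]
    return sorted(live, key=lambda f: weighted_score(f, as_of, half_life_days), reverse=True)


now = datetime(2026, 5, 19, tzinfo=timezone.utc)
facts = [
    ScoredFact("Alice is an engineer", 0.9, 1.0, now - timedelta(days=400), now - timedelta(days=30)),
    ScoredFact("Alice is CTO", 0.8, 1.0, now - timedelta(days=30)),
    ScoredFact("Alice prefers email", 0.5, 0.6, now - timedelta(days=60)),
]
print([f.text for f in rank(facts, now)])   # ['Alice is CTO', 'Alice prefers email']
```

The superseded "engineer" fact has the highest raw relevance, yet it never reaches the agent because its validity window has closed.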

The governance envelope (the enterprise-specific layer — the differentiation)

org_id partition keys as the hard tenant boundary, enforced at the storage layer.
Per-tenant encryption; backup / replication / HA.
RBAC + ABAC on every read and write.
Two-phase PII detection and redaction.
Audit log on every memory operation; per-tenant retrieval-quality observability (precision/recall on real traffic, stale-memory alerts).
Provenance metadata on every entry → working retention and right-to-be-forgotten.
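The envelope items above compose naturally: authorise, log, then act. The sketch below is design intent in miniature, under stated assumptions. The role-to-permission mapping is invented for illustration (the paper names the roles but not their rights), `GovernedStore` is a hypothetical class, the model name is a placeholder, and ABAC, encryption and PII handling are left out.

```python
import hashlib
from datetime import datetime, timezone

# Assumed mapping for illustration; the actual rights per role are not specified.
ROLE_PERMISSIONS = {
    "admin": {"read", "write", "delete"},
    "analyst": {"read", "write"},
    "viewer": {"read"},
}


class GovernedStore:
    def __init__(self) -> None:
        self.entries: dict[tuple[str, str], dict] = {}   # (org_id, entry_id) -> entry
        self.audit_log: list[dict] = []

    def _authorize(self, org_id: str, actor: str, role: str, operation: str) -> None:
        allowed = operation in ROLE_PERMISSIONS.get(role, set())
        # Denied attempts are logged too, not only successful operations.
        self.audit_log.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "org_id": org_id, "actor": actor, "role": role,
            "operation": operation, "allowed": allowed,
        })
        if not allowed:
            raise PermissionError(f"role {role!r} may not {operation}")

    def write(self, org_id: str, actor: str, role: str, subject_id: str, text: str) -> str:
        self._authorize(org_id, actor, role, "write")
        content_hash = hashlib.sha256(text.encode("utf-8")).hexdigest()
        entry_id = content_hash[:12]
        self.entries[(org_id, entry_id)] = {
            "subject_id": subject_id, "text": text,
            # Provenance metadata: what makes targeted deletion tractable later.
            "provenance": {"content_hash": content_hash,
                           "extraction_method": "llm",
                           "model": "hypothetical-model"},
        }
        return entry_id

    def read(self, org_id: str, actor: str, role: str, subject_id: str) -> list[str]:
        self._authorize(org_id, actor, role, "read")
        return [e["text"] for (tenant, _), e in self.entries.items()
                if tenant == org_id and e["subject_id"] == subject_id]

    def forget_subject(self, org_id: str, actor: str, role: str, subject_id: str) -> int:
        """Right-to-be-forgotten: delete every entry about one subject in one tenant."""
        self._authorize(org_id, actor, role, "delete")
        doomed = [key for key, e in self.entries.items()
                  if key[0] == org_id and e["subject_id"] == subject_id]
        for key in doomed:
            del self.entries[key]
        return len(doomed)


store = GovernedStore()
store.write("acme", "u1", "analyst", "alice", "Alice prefers email")
print(store.forget_subject("acme", "u3", "admin", "alice"))   # 1
```

A complete deletion workflow would also have to reach embeddings, caches and backups; this sketch only covers the primary store.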

MCP-native delivery. Memory is exposed as MCP tools (add_memory, search_memory, update_memory, delete_memory) and resources (a memory snapshot injected as read-only context) (Knit — MCP for RAG and Agent Memory). AWS Bedrock AgentCore already demonstrates managing a memory store entirely through MCP tool calls (AWS — Bedrock AgentCore MCP Server).

Why this is a native fit for Kindo. Kindo runs a fleet of MCP integrations in a multi-tenant Python server — the same server Teka delivers into today. A memory layer delivered as an MCP integration slots into Kindo's existing transport, auth and tenant model rather than demanding a parallel SDK. MCP's per-request identity forwarding is the exact channel through which the org_id tenant boundary reaches the memory store. Kindo's customer base — enterprise, security-heavy — is precisely the buyer for governed memory.
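The shape of the four memory tools, and of the tenant boundary arriving with the request rather than with the arguments, can be sketched in plain Python. This deliberately does not use an MCP SDK: the `tool` decorator, the `call_tool` dispatcher and the `identity` dictionary are hypothetical stand-ins for whatever Kindo's server provides, and the substring search is a placeholder for real retrieval.

```python
from typing import Any, Callable

MEMORY: dict[str, dict[str, str]] = {}      # org_id -> {memory_id: text}
TOOLS: dict[str, Callable[..., Any]] = {}   # tool name -> handler


def tool(fn: Callable[..., Any]) -> Callable[..., Any]:
    TOOLS[fn.__name__] = fn
    return fn


@tool
def add_memory(org_id: str, memory_id: str, text: str) -> str:
    MEMORY.setdefault(org_id, {})[memory_id] = text
    return memory_id


@tool
def search_memory(org_id: str, query: str) -> list[str]:
    tenant = MEMORY.get(org_id, {})
    return [mid for mid, text in sorted(tenant.items()) if query.lower() in text.lower()]


@tool
def update_memory(org_id: str, memory_id: str, text: str) -> bool:
    tenant = MEMORY.get(org_id, {})
    if memory_id not in tenant:
        return False
    tenant[memory_id] = text
    return True


@tool
def delete_memory(org_id: str, memory_id: str) -> bool:
    return MEMORY.get(org_id, {}).pop(memory_id, None) is not None


def call_tool(name: str, arguments: dict, identity: dict) -> Any:
    """Dispatch a tool call; org_id comes from the authenticated request only."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    if "org_id" in arguments:
        raise ValueError("org_id must come from the request identity, not tool arguments")
    return TOOLS[name](org_id=identity["org_id"], **arguments)


acme = {"org_id": "acme"}
call_tool("add_memory", {"memory_id": "m1", "text": "Alice prefers email"}, acme)
print(call_tool("search_memory", {"query": "email"}, acme))                   # ['m1']
print(call_tool("search_memory", {"query": "email"}, {"org_id": "globex"}))   # []
```

Rejecting `org_id` in the arguments means an agent, or a prompt injected into one, cannot ask for another tenant's memory by naming it.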

7. The shared core — one engine, two products

The commercial logic of the dual build:

One engine, built once. The write path, retrieval and identity model are identical. Teka builds and hardens the engine through The Cloud's Vast Memory, where it is exercised continuously against a real product.
Two envelopes. The Cloud adds a personal-workspace experience; Vast Memory Enterprise adds the governance envelope of Section 6. Neither reinvents the core.
The laboratory advantage. The Cloud is a live proving ground: every memory pattern that earns its place there lands in the enterprise product already battle-tested. This is the credible answer to build-vs-buy: an enterprise entrant whose core is already running in production, not a greenfield bet.
The decision that needs an owner. The shared core simultaneously powers Teka's own product (The Cloud) and a contracted enterprise build. Ownership and licensing of that shared core must be defined explicitly — see Section 10.

8. Build plan & phasing

Phase	Scope	Relationship to the Cloud build
P0 — Core engine	Write path, retrieval, hybrid storage, identity scoping	Shared — already in motion via Vast Memory for The Cloud
P1 — Governance envelope	`org_id` isolation, per-tenant encryption, RBAC/ABAC, two-phase PII redaction, audit, provenance, right-to-be-forgotten	Enterprise-specific — the net-new build
P2 — MCP delivery	Memory as MCP tools + resources; integration into Kindo's multi-tenant server	Enterprise-specific
P3 — Observability & hardening	Per-tenant retrieval-quality metrics, stale-memory mitigation, scale and latency budgets	Partly shared
P4 — Productise	Packaging (managed / self-host / embedded), pricing, joint Kindo go-to-market	Enterprise-specific

This is a larger, longer workstream than the frontend expansion in the parent proposal, and it is scoped separately — but it shares the same delivery discipline: every increment runs the Build → Review → Harness → Prod → Done pipeline, and Done means prod-verified.

9. The engagement — Teka as Technology Studio & Venture Partner

Vast Memory Enterprise is not a staff-augmentation ticket. It is a product, and the engagement should reflect that.

As Technology Studio — Teka owns the build end to end: research, architecture, design, code, and delivery discipline. This is the same studio-grade bar Teka already holds on Kindo's integration fleet.
As Venture Partner: Teka builds with a stake in the outcome, not billed hours alone. Vast Memory Enterprise is a commercial asset; Teka's interest is in its success in market. The engagement should pair a build contract with a partnership structure (equity, revenue-share, or a defined IP arrangement) co-authored with Kindo leadership.
Why the partner framing is earned — the shared core means Teka brings an engine it has already built and proven, not just labour. That is partner-grade contribution and the contract should price it as such.

`[Kindo leadership]` — the genuine decisions for the Kindo side:

Appetite for an enterprise memory product as a Kindo offering — and the priority against the current integration roadmap.
The partnership structure — equity / revenue-share / IP — that fits Kindo's model.
Ownership and licensing of the shared memory core (Teka's Cloud product ↔ the Kindo enterprise build).
Budget and resourcing for a workstream of this size, distinct from the frontend-expansion test phase.

10. Risks & open questions

Shared-core IP — the load-bearing open question. The same engine powering The Cloud and a contracted build needs a clean, written ownership/licensing arrangement before substantial enterprise-specific work begins.
Build-vs-buy pressure — Kindo could adopt Mem0 or Zep instead. The paper's answer: the governance envelope is the unmet need, MCP-native delivery is a packaging edge, and the shared, already-proven core de-risks the build. That case has to stay honest as competitors move.
Hard problems remain hard — stale-memory mitigation and application-level evaluation are open across the field; Vast Memory Enterprise does not get them for free.
Figures shift — benchmark numbers and competitor pricing move release-to-release; anything quoted here is dated 2026-05 and should be re-verified before any external use.
Resourcing realism — this is a multi-month build; committing to it must not cost backend integration velocity, the same constraint the parent proposal holds.

References

Mem0 — State of AI Agent Memory 2026
Atlan — Types of AI Agent Memory · Best AI Agent Memory Frameworks 2026
Vectorize — Agent Memory vs RAG · Best AI Agent Memory Systems
Redis — AI agent memory: types, architecture & implementation
arXiv — Governed Memory: A Production Architecture for Multi-Agent Workflows
Scalekit — Access Control for Multi-Tenant AI Agents
Knit — Powering RAG and Agent Memory with MCP
AWS — Amazon Bedrock AgentCore MCP Server
Anthropic — Managing context on the Claude Developer Platform
OpenAI — Memory and new controls for ChatGPT
Full landscape research: Enterprise Agent-Memory Systems — State of the Art (Teka deep-research, 2026-05-19)


This is a living paper. As Vast Memory matures in The Cloud and the Kindo-side decisions land, the build plan and engagement sections update; the [Kindo leadership] items are replaced with authoritative answers. — Ora, for Teka

TAM / SAM / SOM

TAM. The broader market is the enterprise infrastructure stack for AI agents: orchestration, retrieval, memory, governance, observability, and compliance. Framed that way, Teka's working estimate puts the TAM in the mid- to high-single-digit billions, with upside beyond that as agents become standard enterprise software primitives.

SAM. The immediately addressable segment for Kindo is narrower: governed, multi-tenant memory and persistent-state infrastructure for enterprise agent platforms, especially regulated buyers and platform teams that need tenant isolation, auditability, RBAC / ABAC, PII controls, and deletion workflows. On the same working estimate, the SAM is in the low-single-digit billions today.

SOM. The near-term wedge is the subset of enterprise agent platforms that are already live, already feeling statefulness pain, and already need an embeddable MCP-native memory layer. This is a high-ACV platform sale with a small number of meaningful accounts, not a volume SaaS motion.

In plain terms: Kindo is not chasing “all AI.” It is targeting the control layer that makes enterprise agents safe to deploy at scale.