The MCP Protocol and Memory Systems¶
~6 minute read
URL: CodiLime, MarkTechPost Type: memory / mcp / agents Date: February 2026 Collected: Ralph Research PHASE 5
Prerequisites: MCP vs Function Calling, Agent Memory Systems
Why This Matters¶
MCP (Model Context Protocol) solves the problem of integrating N models with M tools: without a standard you need N×M connectors, with MCP only N+M. It is a JSON-RPC protocol that unifies access to resources, prompts, and tools. MCP matters especially for memory systems: the agent gets a single interface to a vector DB, a graph DB, and an event log through one protocol, instead of a separate hand-written adapter for each store.
Part 1: Model Context Protocol (MCP) Explained¶
What is MCP?¶
Definition: JSON-RPC-based open standard that enables AI applications to discover and invoke tools uniformly, regardless of provider.
Problem Solved: The N×M Integration Problem
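The arithmetic behind the claim is easy to check. With, say, 5 model providers and 20 tools (illustrative numbers, not from the source), point-to-point integration needs 100 bespoke connectors, while the MCP approach needs 25 adapters:

```python
# Connector counts: point-to-point vs MCP-style integration.
# The inputs (5 models, 20 tools) are illustrative.
def point_to_point(n_models: int, m_tools: int) -> int:
    return n_models * m_tools  # one bespoke connector per (model, tool) pair

def with_mcp(n_models: int, m_tools: int) -> int:
    return n_models + m_tools  # one MCP client per model + one MCP server per tool

print(point_to_point(5, 20))  # 100 connectors without a standard
print(with_mcp(5, 20))        # 25 adapters with MCP
```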
MCP Architecture¶
```mermaid
graph LR
    CLIENT["MCP Client<br/>(AI App / LLM)"] <-->|"JSON-RPC"| SERVER["MCP Server<br/>(Data / Tool)"]
    SERVER --> R["Resources<br/>(read-only)"]
    SERVER --> P["Prompts<br/>(templates)"]
    SERVER --> T["Tools<br/>(actions)"]
    style CLIENT fill:#e8eaf6,stroke:#3f51b5
    style SERVER fill:#e8f5e9,stroke:#4caf50
    style R fill:#fff3e0,stroke:#ef6c00
    style P fill:#fff3e0,stroke:#ef6c00
    style T fill:#fff3e0,stroke:#ef6c00
```
3 MCP Primitives¶
| Primitive | Purpose | Example |
|---|---|---|
| Resources | Read-only data access | File contents, DB records, API responses |
| Prompts | Pre-defined templates | "Summarize this document", "Analyze code" |
| Tools | Actions with side effects | Execute code, send email, update DB |
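The separation between the three primitives can be made concrete with a toy registry (plain Python, deliberately not the real MCP SDK; all names here are illustrative). The point is the contract: resource reads never mutate state, tools may:

```python
# Toy registry separating the three MCP primitives (illustrative sketch).
class ToyMCPServer:
    def __init__(self):
        self.resources = {}  # uri -> zero-argument reader (read-only)
        self.prompts = {}    # name -> template string
        self.tools = {}      # name -> callable that may have side effects

    def read_resource(self, uri):
        return self.resources[uri]()  # reads never mutate state

    def get_prompt(self, name, **kwargs):
        return self.prompts[name].format(**kwargs)

    def call_tool(self, name, **kwargs):
        return self.tools[name](**kwargs)  # may mutate external state

server = ToyMCPServer()
sent = []  # stands in for an email backend
server.resources["file://notes.txt"] = lambda: "meeting at 10am"
server.prompts["summarize"] = "Summarize this document: {text}"
server.tools["send_email"] = lambda to, body: sent.append((to, body)) or "sent"

print(server.read_resource("file://notes.txt"))               # meeting at 10am
print(server.get_prompt("summarize", text="Q3 report"))
print(server.call_tool("send_email", to="a@b.c", body="hi"))  # sent
```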
Session Lifecycle¶
1. Initialization
   - Client → Server: `initialize` request
   - Server → Client: capabilities + version info
   - Client → Server: `initialized` notification
2. Operation
   - List resources/prompts/tools
   - Read resources
   - Call tools
   - Get prompt templates
3. Shutdown
   - Either side can close the session
Authorization¶
Standard: OAuth 2.1
- Dynamic client registration
- Token-based authentication
- Scope-based permissions
8 MCP Implementation Patterns¶
| Pattern | Description | Use Case |
|---|---|---|
| Prompt Library | Curated prompt templates | Coding assistants, documentation |
| SaaS Wrapper | API → MCP server | Slack, GitHub, Jira integration |
| RAG Context | Vector DB → Resources | Knowledge base search |
| File System | Local files → Resources | Code analysis, document processing |
| Database Connector | SQL/NoSQL → Resources | Data querying, analytics |
| API Gateway | Multiple APIs → Unified MCP | Multi-service orchestration |
| Agent Tool | LLM actions → Tools | Autonomous agents |
| Runtime Environment | Sandboxed execution | Code execution, calculations |
Popular MCP Servers (2026)¶
| Server | Type | Capabilities |
|---|---|---|
| Filesystem | Resources | Local file access |
| PostgreSQL | Resources/Tools | DB queries |
| GitHub | Resources/Tools | Repos, issues, PRs |
| Slack | Tools | Messages, channels |
| Puppeteer | Tools | Web scraping |
| Memory | Resources | Persistent conversation memory |
| Brave Search | Tools | Web search |
Code Example: MCP Client¶
```python
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def use_mcp_server():
    server_params = StdioServerParameters(
        command="python",
        args=["-m", "my_mcp_server"],
    )
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            # Initialize the session (capability negotiation)
            await session.initialize()

            # List tools
            tools = await session.list_tools()
            print(f"Available tools: {[t.name for t in tools.tools]}")

            # Call a tool
            result = await session.call_tool(
                "search_documents",
                arguments={"query": "machine learning"},
            )
            print(result.content)

asyncio.run(use_mcp_server())
```
MCP vs Alternatives¶
| Approach | Integration Cost | Standardization | Tool Discovery |
|---|---|---|---|
| MCP | N+M | Open standard | Automatic |
| LangChain Tools | N×M | Framework-specific | Manual |
| OpenAI Functions | N×M | Provider-specific | Manual |
| Custom API | N×M | None | Manual |
Part 2: LLM Memory Systems Comparison¶
Why Memory Matters for Agents¶
"Memory is not just storage—it's a systems problem with trade-offs across recall, consistency, latency, and cost."
Key Challenges:
1. Long conversations exceed context windows
2. Need to recall relevant past interactions
3. Must maintain consistency over time
4. Balance latency vs accuracy
6 Memory System Patterns (3 Families)¶
Family 1: Vector Memory¶
1. Plain Vector RAG
| Aspect | Value |
|---|---|
| Latency | ~50-100ms |
| Accuracy | Good for exact match |
| Weakness | Poor temporal reasoning, no relationships |
Best For: Simple retrieval, FAQs
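The retrieval loop behind plain vector RAG fits in a few lines. A self-contained sketch: the bag-of-words "embedding" below is a stand-in for a real embedding model, and the linear scan stands in for an ANN index:

```python
import math

# Minimal vector RAG sketch: "embed" text, rank stored chunks by cosine
# similarity, return the best match. Illustrative only.
def embed(text: str) -> dict:
    vec = {}
    for word in text.lower().split():
        vec[word] = vec.get(word, 0) + 1
    return vec

def cosine(a: dict, b: dict) -> float:
    dot = sum(a[w] * b.get(w, 0) for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

store = ["reset your password in settings",
         "billing happens on the first of the month",
         "contact support via email"]
index = [(embed(doc), doc) for doc in store]

query = embed("how do I reset my password")
best = max(index, key=lambda pair: cosine(query, pair[0]))[1]
print(best)  # reset your password in settings
```

The weaknesses in the table fall out of this structure: similarity scoring has no notion of time and no edges between chunks, which is exactly what the graph-based families add.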
2. Tiered Vector Memory (MemGPT / Letta)
```mermaid
graph TD
    WM["Working Memory<br/>Current context (fits in window)"] <-->|"Core memory manager<br/>moves items between tiers"| AS["Archive Store<br/>Long-term vector DB"]
    style WM fill:#fce4ec,stroke:#c62828
    style AS fill:#e8eaf6,stroke:#3f51b5
```
| Aspect | Value |
|---|---|
| Latency | Working: instant, Archive: 100-200ms |
| Accuracy | Better recall with tiering |
| Innovation | Self-managing memory hierarchy |
Best For: Long conversations, personal assistants
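A minimal sketch of the tiering idea (not the actual MemGPT/Letta implementation): a bounded working memory evicts its oldest items into an archive, and recall searches the archive. A real system would use embedding search over the archive rather than substring matching:

```python
from collections import deque

class TieredMemory:
    """Bounded working memory + unbounded archive (illustrative sketch)."""
    def __init__(self, working_capacity: int = 3):
        self.working = deque()  # fits in the context window
        self.archive = []       # stands in for a long-term vector DB
        self.capacity = working_capacity

    def add(self, item: str):
        self.working.append(item)
        while len(self.working) > self.capacity:
            # The memory manager moves the oldest item to the archive tier
            self.archive.append(self.working.popleft())

    def recall(self, keyword: str):
        # Archive lookup; real systems use semantic search here
        return [m for m in self.archive if keyword in m]

mem = TieredMemory(working_capacity=2)
for msg in ["user likes jazz", "user is in Berlin", "user asked about MCP"]:
    mem.add(msg)

print(list(mem.working))  # ['user is in Berlin', 'user asked about MCP']
print(mem.recall("jazz"))  # ['user likes jazz']
```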
Family 2: Graph Memory¶
3. Temporal Knowledge Graph (Zep/Graphiti)
```
Message 1 → Entity Extraction → Node Creation
Message 2 → Relation Detection → Edge Creation
        ↓
Knowledge Graph with Temporal Edges
        ↓
Query → Graph Traversal + Vector Search
```
| Aspect | Zep/Graphiti |
|---|---|
| DMR Accuracy | 94.8% vs 93.4% baseline |
| LongMemEval | 18.5% higher accuracy |
| Latency | 150-300ms |
| Innovation | Temporal edges track entity evolution |
Best For: Entity-focused tasks, relationship queries, user modeling
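The temporal-edge idea can be sketched in a few lines (illustrative, not Zep/Graphiti's actual data model): each fact carries `valid_from`/`invalid_at` timestamps, so a new assertion invalidates the old edge instead of overwriting it, and the graph can answer both "what is true now" and "what was true at time t":

```python
# Temporal edges (illustrative): facts are never overwritten, only invalidated.
class TemporalGraph:
    def __init__(self):
        self.edges = []  # (subject, relation, object, valid_from, invalid_at)

    def assert_fact(self, subj, rel, obj, t):
        # Close out any currently-valid edge for the same (subject, relation)
        for i, (s, r, o, vf, inv) in enumerate(self.edges):
            if s == subj and r == rel and inv is None:
                self.edges[i] = (s, r, o, vf, t)
        self.edges.append((subj, rel, obj, t, None))

    def current(self, subj, rel):
        for s, r, o, vf, inv in self.edges:
            if s == subj and r == rel and inv is None:
                return o

    def at(self, subj, rel, t):
        for s, r, o, vf, inv in self.edges:
            if s == subj and r == rel and vf <= t and (inv is None or t < inv):
                return o

g = TemporalGraph()
g.assert_fact("alice", "works_at", "Acme", t=1)
g.assert_fact("alice", "works_at", "Globex", t=5)
print(g.current("alice", "works_at"))  # Globex
print(g.at("alice", "works_at", 3))    # Acme
```

This is what "temporal edges track entity evolution" means in practice: the update does not destroy the history.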
4. Knowledge Graph RAG (GraphRAG)
```
Documents → Entity/Relation Extraction → Knowledge Graph
        ↓
Community Detection → Summaries
        ↓
Query → Graph Traversal + Summaries
```
| Aspect | Value |
|---|---|
| Strength | Multi-hop reasoning, global summaries |
| Latency | 500ms-2s (complex queries) |
| Weakness | Expensive graph construction |
Best For: Complex reasoning, research synthesis
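Multi-hop reasoning is exactly what the flat vector approach above cannot do. A sketch of a traversal over an extracted graph (entities and relations are made up for illustration):

```python
# Multi-hop traversal over an extracted entity graph (illustrative data).
graph = {
    ("paper_A", "cites"): ["paper_B"],
    ("paper_B", "authored_by"): ["dr_smith"],
    ("dr_smith", "affiliated_with"): ["mit"],
}

def hop(entities, relation):
    """Follow one relation from a set of entities."""
    out = []
    for e in entities:
        out.extend(graph.get((e, relation), []))
    return out

# "Which institution is behind the work that paper_A builds on?"
authors = hop(hop(["paper_A"], "cites"), "authored_by")
print(hop(authors, "affiliated_with"))  # ['mit']
```

No single chunk contains the answer, so similarity search alone cannot retrieve it; the chained hops are what justify GraphRAG's higher latency and construction cost.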
Family 3: Event Logs¶
5. Execution Logs/Checkpoints (ALAS, LangGraph)
```
Agent Action → Log Entry (input, output, state)
        ↓
Checkpoint Store
        ↓
Failure → Restore from Checkpoint → Retry
```
| Aspect | Value |
|---|---|
| Purpose | Ground truth, debugging, replay |
| Innovation | Deterministic replay |
| Weakness | Not semantic, just logs |
Best For: Debugging, failure recovery, audit trails
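A minimal sketch of the log-and-checkpoint pattern (illustrative, not the ALAS or LangGraph API): every action appends to an immutable log, checkpoints snapshot the state, and a failed step is retried from the last checkpoint while the log keeps the full history, including the failure:

```python
# Append-only execution log with checkpoint restore (illustrative sketch).
class AgentRun:
    def __init__(self):
        self.log = []  # ground truth: every step, never rolled back
        self.state = {"step": 0, "results": []}
        self.checkpoint = None

    def act(self, action, fn):
        entry = {"action": action, "input": dict(self.state)}
        result = fn(self.state)
        self.state["step"] += 1
        self.state["results"].append(result)
        entry["output"] = result
        self.log.append(entry)

    def save_checkpoint(self):
        self.checkpoint = {"step": self.state["step"],
                           "results": list(self.state["results"])}

    def restore(self):
        self.state = {"step": self.checkpoint["step"],
                      "results": list(self.checkpoint["results"])}

run = AgentRun()
run.act("fetch", lambda s: "data")
run.save_checkpoint()
run.act("transform", lambda s: "oops")   # imagine this step failed
run.restore()                            # roll back state to the checkpoint
run.act("transform", lambda s: "clean")  # retry
print(run.state["results"])  # ['data', 'clean']
print(len(run.log))          # 3: the log still records the failed attempt
```

This is the "not semantic, just logs" trade-off from the table: the log supports deterministic replay and audit, but nothing here understands what the entries mean.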
6. Episodic Long-Term Memory
| Aspect | Value |
|---|---|
| Purpose | Cross-task learning |
| Innovation | Generalizes across episodes |
| Weakness | Complex pattern matching |
Best For: Repeated task types, learning from experience
Part 3: Memory System Selection Guide¶
Decision Framework¶
```
Query Type?
├── Simple retrieval → Plain Vector RAG
├── Long conversations → Tiered Vector (MemGPT)
├── Entity relationships → Temporal KG (Zep)
├── Multi-hop reasoning → GraphRAG
├── Debugging/replay → Execution Logs
└── Pattern reuse → Episodic Memory
```
Comparison Table¶
| System | Latency | Accuracy | Complexity | Best Use Case |
|---|---|---|---|---|
| Plain Vector RAG | 50-100ms | Good | Low | Simple retrieval |
| Tiered Vector | 0-200ms | Better | Medium | Long conversations |
| Temporal KG (Zep) | 150-300ms | 94.8% | High | Entity-focused |
| GraphRAG | 500ms-2s | Best for multi-hop | Very High | Complex reasoning |
| Execution Logs | 0ms (write) | 100% | Low | Debugging, audit |
| Episodic Memory | Variable | Variable | High | Pattern reuse |
Zep vs Baseline Results¶
| Benchmark | Zep | Baseline | Improvement |
|---|---|---|---|
| DMR (Dialog Memory Recall) | 94.8% | 93.4% | +1.4 pp |
| LongMemEval | 68.3% | 57.6% | +18.5% (relative) |
| LOCOMO | 89.2% | 85.1% | +4.8% (relative) |
Part 4: MCP + Memory Integration¶
Pattern: MCP Memory Server¶
```python
# MCP server providing memory capabilities (low-level SDK sketch).
from mcp.server import Server
from mcp.types import Resource

server = Server("memory-server")

@server.list_resources()
async def list_memory_resources():
    return [
        Resource(uri="memory://conversations", name="Conversation History"),
        Resource(uri="memory://entities", name="Entity Knowledge Graph"),
        Resource(uri="memory://episodes", name="Episodic Memory"),
    ]

@server.read_resource()
async def read_memory(uri: str):
    if uri == "memory://conversations":
        return await get_conversation_history()
    elif uri == "memory://entities":
        return await get_entity_graph()
    # ...

@server.call_tool()
async def call_memory_tool(name: str, arguments: dict):
    # The low-level SDK routes all tool calls through one handler,
    # passing the tool name and its arguments
    if name == "search_memory":
        if arguments["memory_type"] == "vector":
            return await vector_search(arguments["query"])
        elif arguments["memory_type"] == "graph":
            return await graph_search(arguments["query"])
    # ...
```
Integration Architecture¶
```mermaid
graph TD
    AGENT["LLM / Agent"] -->|"MCP Protocol"| MCP["Memory MCP Server"]
    MCP --> VDB["Vector DB"]
    MCP --> GDB["Graph DB"]
    MCP --> EL["Event Log"]
    style AGENT fill:#f3e5f5,stroke:#9c27b0
    style MCP fill:#e8eaf6,stroke:#3f51b5
    style VDB fill:#e8f5e9,stroke:#4caf50
    style GDB fill:#e8f5e9,stroke:#4caf50
    style EL fill:#e8f5e9,stroke:#4caf50
```
Part 5: Interview-Relevant Numbers¶
MCP Statistics¶
| Metric | Value |
|---|---|
| Integration reduction | N×M → N+M |
| Protocol | JSON-RPC 2.0 |
| Authorization | OAuth 2.1 |
| Primitives | 3 (Resources, Prompts, Tools) |
Memory System Benchmarks¶
| Metric | Value |
|---|---|
| Zep DMR accuracy | 94.8% |
| Zep LongMemEval improvement | +18.5% |
| Plain Vector RAG latency | 50-100ms |
| GraphRAG latency | 500ms-2s |
| Temporal KG latency | 150-300ms |
Gotchas¶
MCP is not a replacement for an API gateway
MCP standardizes tool discovery and invocation for LLMs, but it does not replace an API gateway (rate limiting, auth, routing, load balancing). An MCP server is an adapter between the LLM and a specific service. Production needs MCP + an API gateway + observability.
Resources are read-only, Tools have side effects
A common mistake: using a Tool to read data, or a Resource to write it. Resources = safe reads (files, DB records). Tools = actions with side effects (sending email, writing to a DB). Mixing them breaks the security model -- read-only operations should not require the same permissions as writes.
OAuth 2.1 is mandatory for production MCP
Without authorization, an MCP server is an open door to data and actions. Every MCP server should: (1) require an OAuth 2.1 token, (2) check scope permissions for each Tool/Resource, (3) log every call. Otherwise any client can invoke any tool.
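The scope check can be sketched as a decorator (scope names and the token-as-dict shape are made up for illustration; a real deployment validates a signed OAuth 2.1 token instead of trusting a dict):

```python
import functools

audit_log = []  # every call is recorded, allowed or not

def require_scope(scope):
    """Reject a tool call unless the caller's token carries the scope."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(token, *args, **kwargs):
            audit_log.append((fn.__name__, token.get("sub"), scope))
            if scope not in token.get("scopes", []):
                raise PermissionError(f"missing scope: {scope}")
            return fn(token, *args, **kwargs)
        return wrapper
    return decorator

@require_scope("memory:write")
def add_memory(token, text):
    return f"stored: {text}"

token = {"sub": "client-1", "scopes": ["memory:read"]}
try:
    add_memory(token, "note")
except PermissionError as e:
    print(e)  # missing scope: memory:write
```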
Interview Q&A¶
Q: What problem does MCP solve?
Red flag: "MCP is a new REST API for AI"
Strong answer: "MCP solves the N×M integration problem. Without a standard: N models × M tools = N×M connectors. With MCP: N+M connections through a single JSON-RPC protocol. Three primitives: Resources (read-only data), Prompts (templates), Tools (actions with side effects). Analogy: USB-C for AI -- one standard instead of dozens of proprietary connectors. Authorization via OAuth 2.1."
Q: How would you implement an MCP server for a memory system?
Strong answer: "Three components: (1) Resources -- read-only access to memories (semantic search, temporal queries, user-specific filtering). (2) Tools -- write operations (add memory, update, delete, consolidate). (3) Backend -- a unified interface to a vector DB (semantic search), a graph DB (relationship queries), and an event log (temporal queries). Key points: per-user isolation via OAuth scopes, TTLs on memories for GDPR compliance, conflict resolution for contradictory memories."
Sources¶
- CodiLime — "Model Context Protocol (MCP) explained" (Feb 2, 2026)
- MarkTechPost — "Comparing Memory Systems for LLM Agents" (Nov 10, 2025)
- MCP Specification (Anthropic)
- Zep documentation
- MemGPT / Letta documentation (project renamed: github.com/cpacker/MemGPT -> github.com/letta-ai/letta)