
The MCP Protocol and Memory Systems


Sources: CodiLime, MarkTechPost
Type: memory / mcp / agents
Date: February 2026
Collected by: Ralph Research, PHASE 5


Prerequisites: MCP vs Function Calling, Agent Memory Systems

Why this matters

MCP (Model Context Protocol) addresses the problem of integrating N models with M tools: without a standard you need N×M connectors; with MCP, N+M. It is a JSON-RPC protocol that unifies access to resources, prompts, and tools. MCP matters especially for memory systems: the agent gets a single interface to a vector DB, a graph DB, and an event log through one protocol, instead of a separate adapter for each store.

Part 1: Model Context Protocol (MCP) Explained

What is MCP?

Definition: JSON-RPC-based open standard that enables AI applications to discover and invoke tools uniformly, regardless of provider.

Problem Solved: The N×M Integration Problem

Before MCP: N models × M tools = N×M integrations
After MCP:  N models + M tools = N+M connections
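The arithmetic behind these two lines can be checked in a few lines of Python. This is a minimal sketch: the function names and the example counts (4 models, 10 tools) are illustrative assumptions, not figures from the sources.

```python
def integrations_without_standard(n_models: int, m_tools: int) -> int:
    """Every model needs its own connector to every tool."""
    return n_models * m_tools


def connections_with_mcp(n_models: int, m_tools: int) -> int:
    """Each model ships one MCP client, each tool one MCP server."""
    return n_models + m_tools


n, m = 4, 10  # illustrative counts
print(integrations_without_standard(n, m))  # 40
print(connections_with_mcp(n, m))           # 14
```

The gap widens as either side grows, which is why the saving matters most for ecosystems with many tools.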

MCP Architecture

graph LR
    CLIENT["MCP Client<br/>(AI App / LLM)"] <-->|"JSON-RPC"| SERVER["MCP Server<br/>(Data / Tool)"]
    SERVER --> R["Resources<br/>(read-only)"]
    SERVER --> P["Prompts<br/>(templates)"]
    SERVER --> T["Tools<br/>(actions)"]

    style CLIENT fill:#e8eaf6,stroke:#3f51b5
    style SERVER fill:#e8f5e9,stroke:#4caf50
    style R fill:#fff3e0,stroke:#ef6c00
    style P fill:#fff3e0,stroke:#ef6c00
    style T fill:#fff3e0,stroke:#ef6c00

3 MCP Primitives

| Primitive | Purpose | Example |
| --- | --- | --- |
| Resources | Read-only data access | File contents, DB records, API responses |
| Prompts | Pre-defined templates | "Summarize this document", "Analyze code" |
| Tools | Actions with side effects | Execute code, send email, update DB |

Session Lifecycle

1. Initialization
   Client → Server: initialize request
   Server → Client: capabilities + version info
   Client → Server: initialized notification

2. Operation
   - List resources/prompts/tools
   - Read resources
   - Call tools
   - Get prompt templates

3. Shutdown
   Either side can close the session
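On the wire, the lifecycle above is a sequence of JSON-RPC 2.0 messages. The sketch below builds the client-side messages only; server responses are omitted. The method names follow the MCP specification, while the protocol version string is just one published revision and the client name is a hypothetical placeholder.

```python
import json
from itertools import count
from typing import Optional

_ids = count(1)  # requests carry a unique id; notifications carry none


def request(method: str, params: Optional[dict] = None) -> dict:
    """Build a JSON-RPC request (the server must answer it)."""
    msg = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        msg["params"] = params
    return msg


def notification(method: str) -> dict:
    """Build a JSON-RPC notification (no id, no response expected)."""
    return {"jsonrpc": "2.0", "method": method}


session = [
    # 1. Initialization
    request("initialize", {
        "protocolVersion": "2025-06-18",
        "capabilities": {},
        "clientInfo": {"name": "example-client", "version": "0.1.0"},  # hypothetical
    }),
    notification("notifications/initialized"),
    # 2. Operation
    request("tools/list"),
    request("tools/call", {
        "name": "search_documents",
        "arguments": {"query": "machine learning"},
    }),
]

for msg in session:
    print(json.dumps(msg))
```

In practice an SDK's session object generates these messages, so application code rarely builds them by hand.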

Authorization

Standard: OAuth 2.1
- Dynamic client registration
- Token-based authentication
- Scope-based permissions

8 MCP Implementation Patterns

| Pattern | Description | Use Case |
| --- | --- | --- |
| Prompt Library | Curated prompt templates | Coding assistants, documentation |
| SaaS Wrapper | API → MCP server | Slack, GitHub, Jira integration |
| RAG Context | Vector DB → Resources | Knowledge base search |
| File System | Local files → Resources | Code analysis, document processing |
| Database Connector | SQL/NoSQL → Resources | Data querying, analytics |
| API Gateway | Multiple APIs → Unified MCP | Multi-service orchestration |
| Agent Tool | LLM actions → Tools | Autonomous agents |
| Runtime Environment | Sandboxed execution | Code execution, calculations |

| Server | Type | Capabilities |
| --- | --- | --- |
| Filesystem | Resources | Local file access |
| PostgreSQL | Resources/Tools | DB queries |
| GitHub | Resources/Tools | Repos, issues, PRs |
| Slack | Tools | Messages, channels |
| Puppeteer | Tools | Web scraping |
| Memory | Resources | Persistent conversation memory |
| Brave Search | Tools | Web search |

Code Example: MCP Client

import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


async def use_mcp_server():
    # Launch the server as a subprocess and talk to it over stdin/stdout.
    # "my_mcp_server" is a placeholder module name.
    server_params = StdioServerParameters(
        command="python",
        args=["-m", "my_mcp_server"],
    )

    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            # Initialize: performs the handshake from "Session Lifecycle"
            await session.initialize()

            # List tools
            tools = await session.list_tools()
            print(f"Available tools: {[t.name for t in tools.tools]}")

            # Call a tool (the server must expose "search_documents")
            result = await session.call_tool(
                "search_documents",
                arguments={"query": "machine learning"}
            )
            print(result.content)


if __name__ == "__main__":
    asyncio.run(use_mcp_server())

MCP vs Alternatives

| Approach | Integration Cost | Standardization | Tool Discovery |
| --- | --- | --- | --- |
| MCP | N+M | Open standard | Automatic |
| LangChain Tools | N×M | Framework-specific | Manual |
| OpenAI Functions | N×M | Provider-specific | Manual |
| Custom API | N×M | None | Manual |

Part 2: LLM Memory Systems Comparison

Why Memory Matters for Agents

"Memory is not just storage—it's a systems problem with trade-offs across recall, consistency, latency, and cost."

Key Challenges:
1. Long conversations exceed context windows
2. Need to recall relevant past interactions
3. Must maintain consistency over time
4. Balance latency vs accuracy

6 Memory System Patterns (3 Families)

Family 1: Vector Memory

1. Plain Vector RAG

Query → Embedding → Vector Search → Top-K Chunks → Context

| Aspect | Value |
| --- | --- |
| Latency | ~50-100ms |
| Accuracy | Good for exact match |
| Weakness | Poor temporal reasoning, no relationships |

Best For: Simple retrieval, FAQs
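The Query → Embedding → Vector Search → Top-K → Context pipeline can be sketched without any external services. In this toy version the embedding is a bag-of-words count vector, an assumption made to keep the example self-contained; a real system would use a learned embedding model and a vector database. All function names are illustrative.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: bag-of-words term counts (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, chunks: list, k: int = 2) -> list:
    """Rank chunks by similarity to the query and keep the best k."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


chunks = [
    "MCP uses JSON-RPC messages",
    "Vector search returns the nearest chunks",
    "Graph memory stores entities and relations",
]

# The selected chunks become the context passed to the model.
context = "\n".join(top_k("how does vector search work", chunks, k=1))
print(context)
```

Each chunk is scored independently of the others, which is why this pattern has no notion of time or of relationships between chunks.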

2. Tiered Vector Memory (MemGPT / Letta)

graph TD
    WM["Working Memory<br/>Current context (fits in window)"] <-->|"Core memory manager<br/>moves items between tiers"| AS["Archive Store<br/>Long-term vector DB"]

    style WM fill:#fce4ec,stroke:#c62828
    style AS fill:#e8eaf6,stroke:#3f51b5

| Aspect | Value |
| --- | --- |
| Latency | Working: instant, Archive: 100-200ms |
| Accuracy | Better recall with tiering |
| Innovation | Self-managing memory hierarchy |

Best For: Long conversations, personal assistants
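The two-tier idea can be illustrated with a small in-memory model. This is a simplified sketch, not the MemGPT/Letta implementation: eviction here is plain FIFO and recall is a keyword match, whereas the diagram describes a memory manager that decides what to move and an archive backed by a vector DB. The class and method names are hypothetical.

```python
from collections import deque


class TieredMemory:
    """Working memory with a fixed budget; overflow moves to an archive."""

    def __init__(self, working_limit: int = 3):
        self.working = deque()   # what currently fits in the context window
        self.archive = []        # stand-in for a long-term vector DB
        self.working_limit = working_limit

    def add(self, message: str) -> None:
        self.working.append(message)
        # Evict the oldest items once the working tier is over budget.
        while len(self.working) > self.working_limit:
            self.archive.append(self.working.popleft())

    def recall(self, keyword: str) -> list:
        """Look up archived items; keyword match stands in for vector search."""
        return [m for m in self.archive if keyword.lower() in m.lower()]


memory = TieredMemory(working_limit=2)
for msg in ["My name is Ada", "I like graphs", "Book a flight", "Make it Friday"]:
    memory.add(msg)

print(list(memory.working))   # the two most recent messages
print(memory.recall("name"))  # an older fact retrieved from the archive
```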


Family 2: Graph Memory

3. Temporal Knowledge Graph (Zep/Graphiti)

Message 1 → Entity Extraction → Node Creation
Message 2 → Relation Detection → Edge Creation
Knowledge Graph with Temporal Edges
Query → Graph Traversal + Vector Search

| Aspect | Zep/Graphiti |
| --- | --- |
| DMR Accuracy | 94.8% vs 93.4% baseline |
| LongMemEval | 18.5% higher accuracy |
| Latency | 150-300ms |
| Innovation | Temporal edges track entity evolution |

Best For: Entity-focused tasks, relationship queries, user modeling
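To make "temporal edges track entity evolution" concrete, here is a minimal sketch of a graph whose edges carry a validity interval. It is not the Zep or Graphiti API; the class names, fields, and integer timestamps are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TemporalEdge:
    subject: str
    relation: str
    obj: str
    valid_from: int                 # e.g. message index or timestamp
    valid_to: Optional[int] = None  # None means "still true"


class TemporalGraph:
    def __init__(self):
        self.edges = []

    def assert_fact(self, subject: str, relation: str, obj: str, at: int) -> None:
        # A new value for the same (subject, relation) closes the old edge
        # instead of overwriting it, so history stays queryable.
        for e in self.edges:
            if e.subject == subject and e.relation == relation and e.valid_to is None:
                e.valid_to = at
        self.edges.append(TemporalEdge(subject, relation, obj, at))

    def query(self, subject: str, relation: str, at: int) -> Optional[str]:
        """Return the object that was valid at time `at`, if any."""
        for e in self.edges:
            if (e.subject == subject and e.relation == relation
                    and e.valid_from <= at
                    and (e.valid_to is None or at < e.valid_to)):
                return e.obj
        return None


g = TemporalGraph()
g.assert_fact("alice", "works_at", "Acme", at=1)
g.assert_fact("alice", "works_at", "Globex", at=5)
print(g.query("alice", "works_at", at=3))  # Acme
print(g.query("alice", "works_at", at=7))  # Globex
```

A plain vector store would return both "works at Acme" and "works at Globex" chunks with no way to tell which one is current; the interval on the edge is what resolves that.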

4. Knowledge Graph RAG (GraphRAG)

Documents → Entity/Relation Extraction → Knowledge Graph
                    Community Detection → Summaries
                    Query → Graph Traversal + Summaries

| Aspect | Value |
| --- | --- |
| Strength | Multi-hop reasoning, global summaries |
| Latency | 500ms-2s (complex queries) |
| Weakness | Expensive graph construction |

Best For: Complex reasoning, research synthesis


Family 3: Event Logs

5. Execution Logs/Checkpoints (ALAS, LangGraph)

Agent Action → Log Entry (input, output, state)
              Checkpoint Store
Failure → Restore from Checkpoint → Retry

| Aspect | Value |
| --- | --- |
| Purpose | Ground truth, debugging, replay |
| Innovation | Deterministic replay |
| Weakness | Not semantic, just logs |

Best For: Debugging, failure recovery, audit trails
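The "Failure → Restore from Checkpoint → Retry" flow can be sketched as follows. This is an illustrative in-memory model, not the LangGraph or ALAS API: the `CheckpointStore` class, the `run` helper, and the use of `RuntimeError` as the retryable failure are all assumptions.

```python
import copy


class CheckpointStore:
    def __init__(self, initial_state):
        self.log = []                                    # append-only action log
        self.checkpoints = [copy.deepcopy(initial_state)]

    def record(self, action, output, state):
        self.log.append({"action": action, "output": output})
        self.checkpoints.append(copy.deepcopy(state))

    def restore(self):
        return copy.deepcopy(self.checkpoints[-1])


def run(steps, state, max_retries=1):
    store = CheckpointStore(state)
    for name, step in steps:
        for attempt in range(max_retries + 1):
            try:
                output = step(state)
            except RuntimeError:
                if attempt == max_retries:
                    raise
                state = store.restore()  # discard partial changes, then retry
            else:
                store.record(name, output, state)
                break
    return state, store


calls = {"flaky": 0}


def add_a(state):
    state["items"].append("a")
    return "added a"


def flaky_add_b(state):
    calls["flaky"] += 1
    state["items"].append("b")  # side effect happens before the failure
    if calls["flaky"] == 1:
        raise RuntimeError("transient failure")
    return "added b"


final_state, store = run([("add_a", add_a), ("flaky_add_b", flaky_add_b)], {"items": []})
print(final_state["items"])  # ['a', 'b']
```

Without the restore step, the failed first attempt would leave a stray "b" in the state and the retry would append a second one.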

6. Episodic Long-Term Memory

Task Episode → Pattern Extraction → Stored Pattern
New Task → Pattern Match → Reuse Solution

| Aspect | Value |
| --- | --- |
| Purpose | Cross-task learning |
| Innovation | Generalizes across episodes |
| Weakness | Complex pattern matching |

Best For: Repeated task types, learning from experience
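A minimal sketch of "New Task → Pattern Match → Reuse Solution" is shown below. Matching here is Jaccard overlap between word sets with a fixed threshold, which is a deliberately crude assumption; the sources do not specify a matching method, and the class name and example tasks are hypothetical.

```python
from typing import Optional


def tokens(text: str) -> set:
    return set(text.lower().split())


class EpisodicMemory:
    def __init__(self, threshold: float = 0.5):
        self.episodes = []          # (task description, solution) pairs
        self.threshold = threshold  # minimum similarity to reuse a solution

    def store(self, task: str, solution: str) -> None:
        self.episodes.append((task, solution))

    def recall(self, new_task: str) -> Optional[str]:
        """Return the solution of the most similar past task, if close enough."""
        best, best_score = None, 0.0
        new = tokens(new_task)
        for task, solution in self.episodes:
            old = tokens(task)
            union = old | new
            score = len(old & new) / len(union) if union else 0.0
            if score > best_score:
                best, best_score = solution, score
        return best if best_score >= self.threshold else None


memory = EpisodicMemory()
memory.store("resize images in folder", "loop over files and call a thumbnail function")
memory.store("parse csv report", "read rows with csv.DictReader")

print(memory.recall("resize images in directory"))  # reuses the first solution
print(memory.recall("send email to team"))          # None: nothing similar enough
```

The threshold is the trade-off named in the table: set it too low and unrelated solutions get reused, too high and nothing ever matches.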


Part 3: Memory System Selection Guide

Decision Framework

Query Type?
├── Simple retrieval → Plain Vector RAG
├── Long conversations → Tiered Vector (MemGPT)
├── Entity relationships → Temporal KG (Zep)
├── Multi-hop reasoning → GraphRAG
├── Debugging/replay → Execution Logs
└── Pattern reuse → Episodic Memory
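The decision tree above maps directly onto a lookup table. The sketch below encodes it as one; the key names and the function name are illustrative assumptions, while the values are the six systems from the tree.

```python
MEMORY_BY_NEED = {
    "simple_retrieval": "Plain Vector RAG",
    "long_conversations": "Tiered Vector (MemGPT)",
    "entity_relationships": "Temporal KG (Zep)",
    "multi_hop_reasoning": "GraphRAG",
    "debugging_replay": "Execution Logs",
    "pattern_reuse": "Episodic Memory",
}


def select_memory_system(need: str) -> str:
    """Return the memory pattern suggested for a given query type."""
    try:
        return MEMORY_BY_NEED[need]
    except KeyError:
        options = ", ".join(sorted(MEMORY_BY_NEED))
        raise ValueError(f"Unknown need {need!r}; expected one of: {options}") from None


print(select_memory_system("multi_hop_reasoning"))  # GraphRAG
```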

Comparison Table

| System | Latency | Accuracy | Complexity | Best Use Case |
| --- | --- | --- | --- | --- |
| Plain Vector RAG | 50-100ms | Good | Low | Simple retrieval |
| Tiered Vector | 0-200ms | Better | Medium | Long conversations |
| Temporal KG (Zep) | 150-300ms | 94.8% (DMR) | High | Entity-focused |
| GraphRAG | 500ms-2s | Best for multi-hop | Very High | Complex reasoning |
| Execution Logs | ~0ms (write) | 100% (verbatim record) | Low | Debugging, audit |
| Episodic Memory | Variable | Variable | High | Pattern reuse |

Zep vs Baseline Results

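| Benchmark | Zep | Baseline | Improvement |
| --- | --- | --- | --- |
| DMR (Deep Memory Retrieval) | 94.8% | 93.4% | +1.4 points (absolute) |
| LongMemEval | 68.3% | 57.6% | +18.5% (relative) |
| LOCOMO | 89.2% | 85.1% | +4.8% (relative) |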
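The improvement figures are easier to compare when absolute and relative gains are computed side by side. Working from the rounded scores in the table, +1.4 is an absolute difference in percentage points, whereas 18.5% and 4.8% are relative gains over the baseline. The short check below recomputes both; the helper name is an illustrative assumption.

```python
results = {
    "DMR": (94.8, 93.4),
    "LongMemEval": (68.3, 57.6),
    "LOCOMO": (89.2, 85.1),
}


def improvement(system: float, baseline: float) -> tuple:
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute = system - baseline
    relative = absolute / baseline * 100
    return round(absolute, 1), round(relative, 1)


for name, (zep, base) in results.items():
    points, percent = improvement(zep, base)
    print(f"{name}: +{points} points, +{percent}% relative")
```

From these rounded inputs the LongMemEval relative gain comes out at about 18.6%, close to the reported 18.5%; the small gap is consistent with the table scores themselves being rounded.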

Part 4: MCP + Memory Integration

Pattern: MCP Memory Server

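# MCP server providing memory capabilities (low-level Server API of the Python SDK).
# get_conversation_history, get_entity_graph, vector_search and graph_search are
# placeholders for the storage backends; they are not defined here.
from mcp.server import Server
from mcp.types import Resource, TextContent

server = Server("memory-server")

@server.list_resources()
async def list_memory_resources() -> list[Resource]:
    return [
        Resource(uri="memory://conversations", name="Conversation History"),
        Resource(uri="memory://entities", name="Entity Knowledge Graph"),
        Resource(uri="memory://episodes", name="Episodic Memory"),
    ]

@server.read_resource()
async def read_memory(uri) -> str:
    # The SDK passes the URI as a URL object, so compare its string form.
    uri = str(uri)
    if uri == "memory://conversations":
        return await get_conversation_history()
    elif uri == "memory://entities":
        return await get_entity_graph()
    # ...
    raise ValueError(f"Unknown resource: {uri}")

@server.call_tool()
async def call_tool(name: str, arguments: dict) -> list[TextContent]:
    # call_tool registers one dispatcher for all tools; "search_memory" is the
    # tool name. A matching @server.list_tools() handler that advertises it is
    # omitted for brevity.
    if name != "search_memory":
        raise ValueError(f"Unknown tool: {name}")
    query, memory_type = arguments["query"], arguments["memory_type"]
    if memory_type == "vector":
        hits = await vector_search(query)
    elif memory_type == "graph":
        hits = await graph_search(query)
    # ...
    else:
        raise ValueError(f"Unknown memory type: {memory_type}")
    return [TextContent(type="text", text=str(hits))]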

Integration Architecture

graph TD
    AGENT["LLM / Agent"] -->|"MCP Protocol"| MCP["Memory MCP Server"]
    MCP --> VDB["Vector DB"]
    MCP --> GDB["Graph DB"]
    MCP --> EL["Event Log"]

    style AGENT fill:#f3e5f5,stroke:#9c27b0
    style MCP fill:#e8eaf6,stroke:#3f51b5
    style VDB fill:#e8f5e9,stroke:#4caf50
    style GDB fill:#e8f5e9,stroke:#4caf50
    style EL fill:#e8f5e9,stroke:#4caf50

Part 5: Interview-Relevant Numbers

MCP Statistics

| Metric | Value |
| --- | --- |
| Integration reduction | N×M → N+M |
| Protocol | JSON-RPC 2.0 |
| Authorization | OAuth 2.1 |
| Primitives | 3 (Resources, Prompts, Tools) |

Memory System Benchmarks

| Metric | Value |
| --- | --- |
| Zep DMR accuracy | 94.8% |
| Zep LongMemEval improvement | +18.5% |
| Plain Vector RAG latency | 50-100ms |
| GraphRAG latency | 500ms-2s |
| Temporal KG latency | 150-300ms |

Gotchas

MCP is not a replacement for an API gateway

MCP standardizes tool discovery and invocation for LLMs, but it does not replace an API gateway (rate limiting, auth, routing, load balancing). An MCP server is an adapter between the LLM and a specific service. Production needs MCP + API gateway + observability.

Resources are read-only; Tools have side effects

A common mistake is using a Tool to read data or a Resource to write it. Resources = safe reads (files, DB records). Tools = actions with side effects (sending email, writing to a DB). Mixing them breaks the security model: read-only operations should not require the same permissions as writes.

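OAuth 2.1 is mandatory for production MCP

Without authorization, an MCP server is an open door to data and actions. Every MCP server exposed over a network should: (1) require an OAuth 2.1 token, (2) check scope permissions for every Tool/Resource, (3) log every call. Otherwise any client can call any tool. Note that the MCP specification defines OAuth-based authorization for HTTP transports; local stdio servers take credentials from their environment instead.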
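Requirements (2) and (3) can be sketched as a single gate that every call passes through. The example assumes the OAuth 2.1 token has already been validated (signature, expiry, audience) by an OAuth library and that only its scopes reach this function. The scope names, the target keys, and the `Forbidden` exception are hypothetical.

```python
import logging

logging.basicConfig(level=logging.INFO)
audit = logging.getLogger("mcp.audit")

# Hypothetical mapping from each Tool/Resource to the scope it requires.
REQUIRED_SCOPE = {
    "resource:memory://conversations": "memory:read",
    "tool:add_memory": "memory:write",
}


class Forbidden(Exception):
    """Raised when the token does not grant access to the target."""


def authorize(target: str, token_scopes: set, client_id: str) -> None:
    required = REQUIRED_SCOPE.get(target)
    # Targets without a declared scope are denied by default.
    allowed = required is not None and required in token_scopes
    # Log every call, allowed or not.
    audit.info("client=%s target=%s allowed=%s", client_id, target, allowed)
    if not allowed:
        raise Forbidden(f"{client_id} may not access {target}")


# A read-only token can read the resource but cannot call the write tool.
authorize("resource:memory://conversations", {"memory:read"}, "client-1")
try:
    authorize("tool:add_memory", {"memory:read"}, "client-1")
except Forbidden as exc:
    print(exc)
```

Keeping separate read and write scopes is what lets the Resources/Tools split translate into different permission levels.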


Interview Q&A

Q: What problem does MCP solve?

❌ Red flag: "MCP is a new REST API for AI"

✅ Strong answer: "MCP solves the N×M integration problem. Without a standard: N models × M tools = N×M connectors. With MCP: N+M connections over a single JSON-RPC protocol. Three primitives: Resources (read-only data), Prompts (templates), Tools (actions with side effects). Analogy: USB-C for AI, one standard instead of dozens of proprietary connectors. Authorization via OAuth 2.1."

Q: How would you implement an MCP server for a memory system?

✅ Strong answer: "Three components: (1) Resources: read-only access to memories (semantic search, temporal queries, user-specific filtering). (2) Tools: write operations (add memory, update, delete, consolidate). (3) Backend: a unified interface to a vector DB (semantic search), a graph DB (relationship queries), and an event log (temporal queries). Key points: per-user isolation via OAuth scopes, a TTL on memories for GDPR compliance, and conflict resolution for contradictory memories."
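Two of the key points in that answer, per-user isolation and TTL, can be sketched with a small in-memory store. This is an illustration under stated assumptions: `user_id` would come from the validated token, the class and method names are hypothetical, and the injected clock exists only to make expiry deterministic in the example.

```python
import time


class UserMemoryStore:
    def __init__(self, ttl_seconds: float, clock=time.time):
        self._items = {}         # user_id -> list of (expires_at, text)
        self._ttl = ttl_seconds
        self._clock = clock

    def add(self, user_id: str, text: str) -> None:
        expires_at = self._clock() + self._ttl
        self._items.setdefault(user_id, []).append((expires_at, text))

    def read(self, user_id: str) -> list:
        """Return only this user's unexpired memories; drop expired ones."""
        now = self._clock()
        live = [(exp, text) for exp, text in self._items.get(user_id, []) if exp > now]
        self._items[user_id] = live
        return [text for _, text in live]


fake_now = [0.0]  # mutable fake clock for a deterministic demo
store = UserMemoryStore(ttl_seconds=60, clock=lambda: fake_now[0])
store.add("alice", "likes tea")
store.add("bob", "likes coffee")

print(store.read("alice"))  # only Alice's memory, never Bob's
fake_now[0] = 61.0
print(store.read("alice"))  # empty: the memory has expired
```

Conflict resolution, the third point, needs a policy decision (latest wins, keep both with timestamps, ask the user) and is left out of this sketch.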


Sources

  1. CodiLime — "Model Context Protocol (MCP) explained" (Feb 2, 2026)
  2. MarkTechPost — "Comparing Memory Systems for LLM Agents" (Nov 10, 2025)
  3. MCP Specification (Anthropic)
  4. Zep documentation
  5. MemGPT / Letta documentation (project renamed: github.com/cpacker/MemGPT -> github.com/letta-ai/letta)