Function Calling & Tool Use 2025-2026¶
~5 minutes reading
LLM Tool Use, Function Calling, Multi-Agent Orchestration, ReAct Pattern. Sources: ArXiv 2024-2025, OpenAI Docs, LangChain/AutoGen Guides
1. Function Calling Overview¶
1.1 Что такое Function Calling¶
Function Calling is a mechanism that lets an LLM respond with structured JSON containing function names and parameters, enabling it to interact with external systems.
```mermaid
graph LR
    Q["User Query"] --> LLM["LLM"]
    LLM --> FC["Function Call (JSON)"]
    FC --> EX["Executor"]
    EX --> RES["Function Result"]
    RES --> LLM
    LLM --> FINAL["Final Response"]
    style LLM fill:#e8eaf6,stroke:#3f51b5
    style EX fill:#fff3e0,stroke:#ef6c00
    style FINAL fill:#e8f5e9,stroke:#4caf50
```
1.2 Пример Function Call¶
```python
# Function definition
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather for a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City and country, e.g., 'Paris, France'"
                },
                "unit": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"],
                    "description": "Temperature unit"
                }
            },
            "required": ["location"]
        }
    }
}]

# LLM response with function call
response = {
    "tool_calls": [{
        "id": "call_abc123",
        "type": "function",
        "function": {
            "name": "get_weather",
            "arguments": '{"location": "Paris, France", "unit": "celsius"}'
        }
    }]
}
```
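Completing the loop means parsing `arguments` (which arrives as a JSON string, not an object) and dispatching to a local implementation. A minimal sketch of that executor step; the `get_weather` stub and its return value are illustrative, not a real weather API:

```python
import json

def get_weather(location: str, unit: str = "celsius") -> dict:
    # Stub: a real implementation would call a weather API here.
    return {"location": location, "temperature": 18, "unit": unit}

# Registry mapping tool names to callables
TOOLS = {"get_weather": get_weather}

def execute_tool_calls(response: dict) -> list[dict]:
    """Run each tool call and build the messages to send back to the LLM."""
    messages = []
    for call in response.get("tool_calls", []):
        fn = TOOLS[call["function"]["name"]]
        args = json.loads(call["function"]["arguments"])  # arguments arrive as a JSON string
        result = fn(**args)
        messages.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(result),
        })
    return messages
```

The returned `tool` messages are appended to the conversation and sent back to the model, which then produces the final response.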
2. ReAct: Reasoning + Acting¶
2.1 ReAct Paradigm (Yao et al., 2022)¶
ReAct = Reasoning + Acting is a framework that interleaves explicit chain-of-thought reasoning with the execution of external actions.
```mermaid
graph TD
    T1["Thought: I need to find<br/>the population of Paris"]
    T1 --> A1["Action: Search<br/>Paris population 2024"]
    A1 --> O1["Observation: Paris has<br/>2.1M inhabitants..."]
    O1 --> T2["Thought: Now I have<br/>the information needed"]
    T2 --> ANS["Answer: Paris has ~2.1M<br/>inhabitants"]
    style T1 fill:#e8eaf6,stroke:#3f51b5
    style A1 fill:#fff3e0,stroke:#ef6c00
    style O1 fill:#e8f5e9,stroke:#4caf50
    style T2 fill:#e8eaf6,stroke:#3f51b5
    style ANS fill:#fce4ec,stroke:#c62828
```
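The Thought → Action → Observation cycle above can be sketched as a loop. The `scripted_model` here is a stand-in replaying the trace from the diagram, not a real LLM; the `Action:`/`Answer:` parsing convention is illustrative:

```python
def react_loop(model, tools, question, max_steps=5):
    """Alternate model reasoning with tool execution until an answer appears."""
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        step = model(transcript)  # model emits a Thought plus an Action or an Answer
        transcript += "\n" + step
        if step.startswith("Answer:"):
            return step[len("Answer:"):].strip()
        if "Action:" in step:
            tool_name, arg = step.split("Action:")[1].strip().split(" ", 1)
            observation = tools[tool_name](arg)
            transcript += f"\nObservation: {observation}"
    return None  # no answer within the step budget

# Scripted stand-in for an LLM, replaying the trace from the diagram
def scripted_model(transcript):
    if "Observation:" not in transcript:
        return "Thought: I need the population of Paris.\nAction: search Paris population 2024"
    return "Answer: Paris has ~2.1M inhabitants"
```

A real implementation would replace `scripted_model` with an LLM call and keep the growing transcript as its prompt.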
2.2 ReAct vs Chain-of-Thought¶
| Approach | Strengths | Weaknesses |
|---|---|---|
| Chain-of-Thought | Pure reasoning, no external access | Hallucination, error propagation |
| Act-only | Grounded in reality | No reasoning trace |
| ReAct | Best of both, interpretable | More tokens, slower |
2.3 ReAct Results (Original Paper)¶
| Benchmark | ReAct | CoT | Notes |
|---|---|---|---|
| HotpotQA | 27.4% | 29.4% | CoT slightly higher, but ReAct is better grounded |
| Fever | 60.9% | 56.3% | Less hallucination |
| ALFWorld | +34% vs baseline | — | Interactive tasks |
| WebShop | +10% vs baseline | — | Decision making |
3. Multi-LLM Agent Framework¶
3.1 Planner-Caller-Summarizer Pattern¶
Key Insight: Small LLMs struggle with all tool-use capabilities. Solution: Decompose into specialized roles.
```mermaid
graph LR
    P["PLANNER<br/>Task Planning"] --> C["CALLER<br/>Tool Invocation"]
    C --> S["SUMMARIZER<br/>Result Synthesis"]
    style P fill:#e8eaf6,stroke:#3f51b5
    style C fill:#fff3e0,stroke:#ef6c00
    style S fill:#e8f5e9,stroke:#4caf50
```
Each role = single LLM focused on one capability.
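A minimal sketch of the three-role pipeline, with each role stubbed as a plain function; in practice each would be a separate specialized LLM, and the `"name:argument"` step format is an assumption for illustration:

```python
def planner(task: str) -> list[str]:
    # Stub: a real planner LLM would decompose the task into steps.
    return [f"look_up:{task}"]

def caller(step: str, tools: dict) -> str:
    # Stub: a real caller LLM would emit a well-formed tool call for the step.
    tool_name, arg = step.split(":", 1)
    return tools[tool_name](arg)

def summarizer(task: str, results: list[str]) -> str:
    # Stub: a real summarizer LLM would synthesize a fluent answer.
    return f"{task}: " + "; ".join(results)

def multi_llm_agent(task: str, tools: dict) -> str:
    """Planner -> Caller -> Summarizer pipeline."""
    plan = planner(task)
    results = [caller(step, tools) for step in plan]
    return summarizer(task, results)
```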
3.2 Two-Stage Training¶
- Stage 1: Pre-train the backbone on the full dataset (comprehensive understanding)
- Stage 2: Specialize each component on its respective sub-task
3.3 Results (Small LLMs Are Weak Tool Learners)¶
| Approach | Success Rate | Efficiency |
|---|---|---|
| Single LLM (7B) | 45% | Low |
| Multi-LLM (3x 7B) | 68% | Higher |
| Single LLM (70B) | 72% | Low (highest cost) |
4. Agent Frameworks Comparison¶
4.1 Framework Overview (2025)¶
| Framework | Strength | Best For |
|---|---|---|
| LangChain | Swiss Army Knife | Production apps |
| LangGraph | State machines | Complex workflows |
| AutoGen | Multi-agent conversations | Enterprise |
| CrewAI | Role-based agents | Team simulation |
| AG2 | Community continuation of AutoGen | Modern patterns |
4.2 LangChain vs AutoGen vs CrewAI¶
| Feature | LangChain | AutoGen | CrewAI |
|---|---|---|---|
| Primary Focus | Tool orchestration | Agent conversations | Role-based teams |
| Learning Curve | Medium | Steep | Low |
| Multi-Agent | Via LangGraph | Native | Native |
| Human-in-Loop | Supported | Built-in | Limited |
| Production Ready | Yes | Yes | Growing |
4.3 AutoGen Orchestration Patterns¶
Sequential Pattern: agents hand off work in a fixed pipeline (A → B → C), each consuming the previous agent's output.
Hierarchical Pattern: a manager agent decomposes the task, delegates sub-tasks to worker agents, and aggregates their results.
5. STRIDE: When to Use Agentic AI¶
5.1 Three Modalities¶
| Modality | Description | When to Use |
|---|---|---|
| Direct LLM Call | Single inference | Static, simple tasks |
| Guided AI Assistant | Structured help | Semi-complex, predictable |
| Full Agentic AI | Autonomous goal pursuit | Dynamic, evolving context |
5.2 STRIDE Framework Components¶
- Task Decomposition — Break down complexity
- Dynamism Attribution — How much does context change?
- Self-Reflection Requirement — Does task need iterative improvement?
5.3 Agentic Suitability Score¶
Results:

- 92% accuracy in modality selection
- 45% reduction in unnecessary agent deployments
- 37% cost reduction
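The paper's exact scoring formula is not reproduced here; a hypothetical sketch of how the three STRIDE components might combine into a single suitability score. The weights and thresholds are illustrative assumptions, not values from STRIDE:

```python
def agentic_suitability(decomposition: float, dynamism: float, reflection: float,
                        weights=(0.4, 0.3, 0.3)) -> str:
    """Each input scored in [0, 1]; returns a recommended modality."""
    score = sum(w * x for w, x in zip(weights, (decomposition, dynamism, reflection)))
    if score < 0.3:
        return "direct_llm_call"      # static, simple task
    if score < 0.6:
        return "guided_assistant"     # semi-complex, predictable
    return "full_agentic"             # dynamic, evolving context
```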
6. Tool-Calling vs ReAct Agents¶
6.1 Architecture Comparison¶
Tool-Calling Agent: relies on the model's native function-calling API to select and fill tool calls in a single pass; the framework executes them and returns the results.
ReAct Agent: runs an explicit Thought/Action/Observation loop in the prompt; tools are parsed from the model's text output, iterating until an answer is reached.
6.2 When to Use Which¶
| Scenario | Recommended |
|---|---|
| Deterministic tool set | Tool-Calling |
| Unknown tools at runtime | ReAct |
| Need reasoning transparency | ReAct |
| Fast execution required | Tool-Calling |
| Complex multi-step tasks | ReAct |
7. Function Calling Best Practices¶
7.1 Schema Design¶
```python
# Good function definition
{
    "name": "search_products",
    "description": "Search for products in the catalog. Use when user asks about availability, price, or features.",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "Search query, e.g., 'wireless headphones under $100'"
            },
            "category": {
                "type": "string",
                "enum": ["electronics", "clothing", "home"],
                "description": "Product category to narrow search"
            },
            "max_price": {
                "type": "number",
                "description": "Maximum price in USD"
            }
        },
        "required": ["query"]
    }
}
```
7.2 Best Practices Summary¶
| Practice | Description |
|---|---|
| Clear names | Use descriptive function names |
| Detailed descriptions | Explain what, when, and how |
| Type constraints | Use enums for limited options |
| Required params | Mark essential parameters |
| Examples in descriptions | Help LLM understand usage |
| Error handling | Plan for failures |
7.3 Common Pitfalls¶
- Vague descriptions → Wrong function selected
- Missing required params → Incomplete calls
- Too many functions → Selection confusion
- No error handling → Silent failures
- Overly complex schemas → Parameter errors
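Several of these pitfalls (missing required params, enum violations) can be caught before execution. A minimal pure-Python validation sketch over schemas shaped like the one above; a production system would typically use a full JSON Schema validator instead:

```python
def validate_arguments(schema: dict, args: dict) -> list[str]:
    """Check parsed arguments against a function's parameter schema."""
    errors = []
    params = schema["parameters"]
    # Required parameters must be present
    for name in params.get("required", []):
        if name not in args:
            errors.append(f"missing required parameter: {name}")
    # Provided parameters must be known and respect enum constraints
    for name, value in args.items():
        spec = params["properties"].get(name)
        if spec is None:
            errors.append(f"unknown parameter: {name}")
        elif "enum" in spec and value not in spec["enum"]:
            errors.append(f"{name} must be one of {spec['enum']}")
    return errors
```

On validation failure, the error list can be fed back to the model as a tool message so it can retry with corrected arguments.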
8. Tool Selection & Routing¶
8.1 Tool Selection Strategies¶
| Strategy | Description | Use Case |
|---|---|---|
| LLM-based | Model decides from descriptions | Dynamic tool sets |
| Semantic search | Embed query, find similar tools | Large tool catalogs |
| Rule-based | Keywords/patterns → tools | Deterministic routing |
| Hybrid | Combine approaches | Production systems |
8.2 Tool Routing Architecture¶
```mermaid
graph LR
    Q["Query"] --> IC["Intent Classification<br/>(Semantic Match)"]
    IC --> TS["Tool Selection<br/>(Rank by relevance)"]
    TS --> EX["Execute"]
    style IC fill:#e8eaf6,stroke:#3f51b5
    style TS fill:#fff3e0,stroke:#ef6c00
    style EX fill:#e8f5e9,stroke:#4caf50
```
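A toy version of the semantic-match stage, ranking tools by word overlap between the query and each tool's description; a real system would embed both and rank by cosine similarity instead:

```python
def rank_tools(query: str, tools: dict[str, str]) -> list[tuple[str, float]]:
    """Score each tool by Jaccard overlap between query and description words."""
    q = set(query.lower().split())
    scored = []
    for name, description in tools.items():
        d = set(description.lower().split())
        score = len(q & d) / len(q | d) if q | d else 0.0
        scored.append((name, score))
    return sorted(scored, key=lambda x: x[1], reverse=True)
```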
9. Interview Questions¶
9.1 Concept Questions¶
Q: What is function calling in LLMs?
A: Function calling lets LLMs respond with structured JSON specifying
function names and parameters, enabling interaction with external
systems like APIs, databases, and tools.
Q: Explain the ReAct paradigm.
A: ReAct (Reasoning + Acting) alternates between:
- Thought: Chain-of-thought reasoning
- Action: External tool/environment interaction
- Observation: Result from action
Benefits: Better grounding, interpretable traces, error recovery
Q: When would you use agents vs direct LLM calls?
A: Use agents when:
- Task requires multiple steps
- Context changes dynamically
- Self-reflection/iteration needed
- External tools must be orchestrated
Use direct calls when:
- Single-shot tasks
- Context is static
- Speed is critical
9.2 Architecture Questions¶
Q: Design a multi-tool agent system.
A: Architecture:
1. Query Understanding → Intent classification
2. Tool Selection → Semantic search + LLM ranking
3. Tool Execution → Parallel where possible
4. Result Aggregation → Summarization LLM
5. Response Generation → User-facing output
Key considerations:
- Tool schema management
- Error handling & fallbacks
- Rate limiting
- Cost tracking
Q: Compare LangChain vs AutoGen for multi-agent systems.
A: LangChain:
- Better for tool orchestration
- LangGraph for state machines
- More production-ready
AutoGen:
- Native multi-agent conversations
- Built-in human-in-the-loop
- Better for research/enterprise
9.3 Implementation Questions¶
Q: How do you handle tool call failures?
A: Strategies:
1. Retry with modified parameters
2. Fallback to alternative tools
3. Ask user for clarification
4. Return partial results with error info
5. Log for improvement
Implementation:
- Timeout handling
- Rate limit backoff
- Circuit breaker pattern
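The retry-with-backoff and fallback strategies above can be sketched as follows; the delay values and blanket exception handling are illustrative, not a production policy:

```python
import time

def call_with_retries(tool, args, fallback=None, max_retries=3, base_delay=0.1):
    """Retry a tool call with exponential backoff, then try a fallback tool."""
    for attempt in range(max_retries):
        try:
            return tool(**args)
        except Exception as exc:
            last_error = exc
            time.sleep(base_delay * (2 ** attempt))  # exponential backoff
    if fallback is not None:
        return fallback(**args)  # strategy 2: fall back to an alternative tool
    return {"error": str(last_error), "partial": True}  # strategy 4: surface the error
```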
Q: Optimize LLM agent latency.
A: Techniques:
1. Plan reuse (AgentReuse: 93% latency reduction)
2. Tool call caching
3. Parallel tool execution
4. Smaller models for routing
5. Streaming responses
6. Speculative execution
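Parallel tool execution (technique 3) can be sketched with a thread pool, since most tool calls are I/O-bound API requests:

```python
from concurrent.futures import ThreadPoolExecutor

def run_tools_parallel(calls):
    """Execute independent tool calls concurrently; calls = [(fn, kwargs), ...]."""
    with ThreadPoolExecutor(max_workers=len(calls) or 1) as pool:
        futures = [pool.submit(fn, **kwargs) for fn, kwargs in calls]
        return [f.result() for f in futures]  # results in submission order
```

Only tool calls with no data dependencies between them should be batched this way.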
10. Key Papers & Resources¶
| Paper/Resource | Year | Key Contribution |
|---|---|---|
| ReAct | 2022 | Reasoning + Acting paradigm |
| Small LLMs Are Weak Tool Learners | 2024 | Planner-Caller-Summarizer |
| AgentGuard | 2025 | Safety evaluation framework |
| STRIDE | 2025 | When to use agentic AI |
| AgentReuse | 2025 | 93% latency reduction |
11. Formulas¶
Agentic Suitability Score (STRIDE)¶
Plan Reuse Speedup¶
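The formula itself is missing from this copy; a standard cache-speedup form (assuming \(t_{hit}\) and \(t_{miss}\) denote the latency of a reused vs. a freshly generated plan) would be:

```latex
\text{Speedup} = \frac{1}{(1 - \alpha) + \alpha \cdot \dfrac{t_{hit}}{t_{miss}}}
```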
Where \(\alpha\) = cache hit rate
Tool Selection Score¶
12. Sources & Links¶
- ReAct Paper - ArXiv 2210.03629
- OpenAI Function Calling Guide
- Small LLMs Are Weak Tool Learners - ArXiv 2401.07324
- STRIDE Framework - ArXiv 2512.02228
- Tool-Calling vs ReAct Agents - Medium
- LangChain vs AutoGen vs CrewAI - LinkedIn
- Function Calling Guide - PromptingGuide.ai
- 50 AI & LLM Engineer Interview Questions
See Also¶
- MCP vs Function Calling -- MCP protocol vs native function calling
- LLM Agents -- agent pipeline, plan reuse, security
- Prompt Engineering -- CoT, ReAct, ToT prompting patterns
- Multi-Agent Orchestration -- coordination patterns
- Agent Frameworks -- LangGraph, CrewAI, AutoGen