Interviewing at OpenAI and Anthropic: Process and Preparation¶
~9 min read
Prerequisites: Alignment methods | Coding interview preparation
OpenAI and Anthropic are two leading AI labs, each with an acceptance rate below 1%. OpenAI weights ML depth at 40% (214 questions on Glassdoor, 2026); Anthropic weights mission alignment at 30% (145 questions). The key difference: OpenAI is capability-first (AGI acceleration), Anthropic is safety-first (Constitutional AI). L5 compensation: $500-700K (OpenAI) vs $400-600K (Anthropic). Process: 4-8 weeks, 5-6 onsite rounds.
Sources: Glassdoor, InterviewQuery, Exponent, Harvard FAS, AI blogs
Overview¶
| Company | Acceptance Rate | Process Duration | Key Focus |
|---|---|---|---|
| OpenAI | <1% | 4-8 weeks | ML depth + engineering |
| Anthropic | <1% | 4-6 weeks | Mission fit + AI safety |
1. OpenAI Interview Process¶
Timeline (6 stages)¶
graph LR
A[Recruiter Screen<br/>30 min] --> B[Technical Phone<br/>60 min x2]
B --> C[Take-home<br/>optional]
C --> D[Onsite Loop<br/>5-6 interviews]
D --> E[Hiring Committee]
E --> F[Offer]
style A fill:#e8eaf6,stroke:#3f51b5
style B fill:#fff3e0,stroke:#ef6c00
style C fill:#fff3e0,stroke:#ef6c00
style D fill:#fce4ec,stroke:#c62828
style E fill:#f3e5f5,stroke:#9c27b0
style F fill:#e8f5e9,stroke:#4caf50
Stage Details¶
1. Recruiter Screen¶
- Mission alignment: "Why OpenAI?"
- Background review
- Role fit discussion
- Salary expectations
2. Technical Phone Screen¶
Format: 2 rounds, 60 min each
- Round 1: Coding (LeetCode Medium-Hard)
- Round 2: ML System Design

Sample questions:
- "Design a low-latency inference system for GPT"
- "Implement attention mechanism from scratch"
- "Optimize model for edge deployment"
3. Onsite Loop (Full Day)¶
| Interview | Focus | Duration |
|---|---|---|
| ML Deep Dive #1 | ML fundamentals | 60 min |
| ML Deep Dive #2 | Advanced ML | 60 min |
| CS Fundamentals | Algorithms/Systems | 60 min |
| System Design | Architecture | 60 min |
| Behavioral | Values/Culture | 45 min |
| Research Talk (if applicable) | Presentation | 45 min |
What OpenAI Evaluates¶
| Dimension | Weight | Focus |
|---|---|---|
| ML Depth | 40% | Understanding, not memorization |
| Engineering | 30% | Clean code, system design |
| Research | 20% | Publications, novel thinking |
| Mission Fit | 10% | AGI alignment, safety awareness |
OpenAI-Specific Questions¶
ML Questions:
1. "Explain how attention works. Implement it."
2. "Design training pipeline for a 100B parameter model"
3. "How would you improve GPT-4's reasoning?"
4. "Explain RLHF vs DPO vs Constitutional AI"
5. "Design a system to evaluate LLM safety"

System Design:
1. "Design ChatGPT at 1M QPS"
2. "Design a real-time model serving infrastructure"
3. "Design data pipeline for training next-gen model"

Behavioral:
1. "Why is AI safety important to you?"
2. "Describe a time you disagreed with research direction"
3. "How do you stay current with AI research?"
2. Anthropic Interview Process¶
Timeline (6 stages)¶
graph LR
A[Recruiter<br/>30 min] --> B[Take-home<br/>3-5 days]
B --> C[Live Coding<br/>60 min]
C --> D[Panel Interview<br/>3 interviewers]
D --> E[Final Onsite<br/>3-4 interviews]
E --> F[Offer]
style A fill:#e8eaf6,stroke:#3f51b5
style B fill:#fff3e0,stroke:#ef6c00
style C fill:#fff3e0,stroke:#ef6c00
style D fill:#fce4ec,stroke:#c62828
style E fill:#f3e5f5,stroke:#9c27b0
style F fill:#e8f5e9,stroke:#4caf50
Stage Details¶
1. Recruiter Conversation¶
- Mission alignment: "Why safe AI?"
- Background overview
- Role-specific questions
2. Technical Take-home¶
Typical tasks:
- Implement a transformer component
- Analyze model behavior
- Write a research proposal
Time: 3-5 days
3. Live Coding¶
- ML implementation
- Algorithm questions
- Code review exercise
4. Panel Interview (Work Simulation)¶
Format: 3 interviewers, collaborative problem-solving
Example scenarios:
- "We need to detect harmful outputs. Design the system."
- "Improve Claude's reasoning on math problems."
5. Final Onsite¶
| Interview | Focus |
|---|---|
| Technical Deep Dive | ML + Systems |
| Values Interview | Mission fit |
| Research Discussion | Technical depth |
| Hiring Manager | Team fit |
What Anthropic Evaluates¶
| Dimension | Weight | Focus |
|---|---|---|
| Mission Alignment | 30% | AI safety commitment |
| Technical Depth | 35% | ML understanding |
| Research Ability | 20% | Novel contributions |
| Collaboration | 15% | Team work style |
Anthropic-Specific Questions¶
Technical:
1. "How does Constitutional AI work?"
2. "Design a system to detect and mitigate harmful outputs"
3. "Explain the training process for Claude"
4. "How would you evaluate model alignment?"
5. "Implement a simple language model from scratch"

Values/Mission:
1. "Why is AI safety important?"
2. "What are the risks of advanced AI?"
3. "How should companies approach AGI development?"
4. "Describe your approach to responsible AI"
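"Implement a simple language model from scratch" is usually best answered by starting with a count-based model before reaching for neural nets. A minimal sketch (class name and add-one smoothing are illustrative choices, not Anthropic's rubric):

```python
from collections import Counter, defaultdict

class BigramLM:
    """Count-based bigram language model: P(w_t | w_{t-1}) with add-one smoothing."""

    def __init__(self, corpus):
        # corpus: list of sentences, each a list of tokens
        self.vocab = {w for sent in corpus for w in sent}
        self.counts = defaultdict(Counter)
        for sent in corpus:
            for prev, cur in zip(sent, sent[1:]):
                self.counts[prev][cur] += 1

    def prob(self, prev, cur):
        total = sum(self.counts[prev].values())
        # Laplace smoothing keeps unseen bigrams from getting probability 0
        return (self.counts[prev][cur] + 1) / (total + len(self.vocab))
```

In an interview, the natural follow-up is replacing the count table with an embedding + linear layer and a cross-entropy loss, which turns the same interface into a neural LM.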
3. Comparison: OpenAI vs Anthropic¶
| Aspect | OpenAI | Anthropic |
|---|---|---|
| Primary focus | AGI capability | AI safety |
| Technical depth | Higher | High |
| Mission questions | Moderate | Heavy |
| Research component | Strong | Very strong |
| Coding focus | High | Moderate |
| Timeline | 4-8 weeks | 4-6 weeks |
Culture Differences¶
| OpenAI | Anthropic |
|---|---|
| Product-driven | Research-driven |
| AGI acceleration focus | Safety-first approach |
| Commercial applications | Academic partnerships |
| Rapid iteration | Careful deployment |
4. Common Technical Topics¶
ML Fundamentals (Both Companies)¶
| Topic | Expectation |
|---|---|
| Transformers | Implement from scratch |
| Attention | All variants (MHA, MQA, GQA) |
| Tokenization | BPE, Unigram |
| Training | Loss functions, optimization |
| Fine-tuning | LoRA, QLoRA, RLHF |
| Evaluation | Benchmarks, metrics |
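"Implement from scratch" for BPE usually means the core of the training loop: count adjacent symbol pairs across the corpus and merge the most frequent pair. A minimal sketch (function names are illustrative):

```python
from collections import Counter

def most_frequent_pair(words):
    """One BPE training statistic: count adjacent symbol pairs.

    `words` maps a tokenized word (tuple of symbols) to its corpus frequency.
    """
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs.most_common(1)[0][0]

def merge_pair(words, pair):
    """Apply one merge: fuse every occurrence of `pair` into a single symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = freq
    return merged
```

Repeating pick-then-merge for N iterations yields the merge table; encoding applies the learned merges in order.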
System Design Topics¶
| Topic | Key Considerations |
|---|---|
| Model serving | Latency, throughput, cost |
| Training infrastructure | GPU allocation, FSDP/DeepSpeed |
| Data pipelines | Scale, quality, filtering |
| Evaluation systems | Automated vs human |
Coding Expectations¶
| Company | LeetCode Level | Focus |
|---|---|---|
| OpenAI | Medium-Hard | ML-focused implementation |
| Anthropic | Medium | Clean code, correctness |
5. Preparation Strategy¶
12-Week Roadmap¶
Weeks 1-4: Fundamentals
- Transformers paper ("Attention Is All You Need")
- Implement attention, BPE, backprop
- LeetCode: 50+ problems

Weeks 5-8: Advanced ML
- RLHF, DPO, Constitutional AI papers
- Training infrastructure (FSDP, DeepSpeed)
- Model evaluation techniques

Weeks 9-10: System Design
- LLM serving architecture
- Training pipeline design
- Data infrastructure

Weeks 11-12: Practice
- Mock interviews (3-5 sessions)
- Research talk preparation
- Behavioral stories (STAR format)
Resources¶
| Resource | Purpose |
|---|---|
| LeetCode | Coding practice |
| InterviewQuery | ML questions |
| System Design Handbook | Architecture |
| Company blogs | Recent developments |
| arXiv | Latest papers |
6. Behavioral Preparation¶
STAR Format Stories¶
Prepare 5-7 stories covering:
| Theme | Example Questions |
|---|---|
| Technical Challenge | "Hardest bug you've debugged?" |
| Leadership | "Led a team through uncertainty?" |
| Failure | "Project that didn't go as planned?" |
| Conflict | "Disagreement with colleague?" |
| Innovation | "Novel solution you proposed?" |
| AI Ethics | "Time you raised safety concerns?" |
Company-Specific Values¶
OpenAI values:
- AGI should benefit humanity
- Safety alongside capability
- Open collaboration (historically)
- Long-term thinking

Anthropic values:
- AI safety is paramount
- Transparent research
- Careful deployment
- Responsible scaling
7. Salary & Offers (2025-2026)¶
Compensation Ranges¶
| Level | OpenAI | Anthropic |
|---|---|---|
| L4 (Mid) | $350-450K TC | $300-400K TC |
| L5 (Senior) | $500-700K TC | $400-600K TC |
| L6 (Staff) | $700K-1M+ TC | $600-900K TC |
TC = Total Compensation (base + equity + bonus)
Equity Considerations¶
| Company | Equity Type | Liquidity |
|---|---|---|
| OpenAI | Profit Participation Units | Tender offers |
| Anthropic | Stock Options | Secondary markets |
8. Red Flags & Tips¶
Red Flags to Avoid¶
- Memorized answers without understanding
- Dismissing AI safety concerns
- Poor code quality in interviews
- Lack of research awareness
- Misaligned mission motivation
Tips for Success¶
- Know the papers — Read recent work from each company
- Practice implementation — Not just theory
- Show safety awareness — Both companies value this
- Ask good questions — Show genuine interest
- Be honest — Don't fake knowledge
9. Interview Questions Database¶
OpenAI Technical Questions¶
Coding:
```python
import math

import torch
import torch.nn.functional as F

# Implement scaled dot-product attention
def attention(Q, K, V, mask=None):
    d_k = Q.size(-1)
    scores = torch.matmul(Q, K.transpose(-2, -1)) / math.sqrt(d_k)
    if mask is not None:
        # Masked positions get -1e9 so softmax drives their weight to ~0
        scores = scores.masked_fill(mask == 0, -1e9)
    return torch.matmul(F.softmax(scores, dim=-1), V)
```
ML:
1. "Derive the gradient of softmax cross-entropy"
2. "Explain why LayerNorm is used over BatchNorm in transformers"
3. "How does LoRA reduce memory during fine-tuning?"
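The LoRA question has a one-line core: the frozen weight W is augmented with a trainable low-rank product, so gradients and optimizer state shrink dramatically. A NumPy sketch of the forward pass and the parameter arithmetic (dimensions are illustrative):

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=1.0):
    """y = x @ W + alpha * (x @ A) @ B.

    W (d_in x d_out) stays frozen; only A (d_in x r) and B (r x d_out)
    receive gradients, so trainable parameters drop from d_in * d_out
    to r * (d_in + d_out).
    """
    return x @ W + alpha * (x @ A) @ B

# Parameter-count arithmetic for one 4096x4096 layer at rank r = 8
d_in, d_out, r = 4096, 4096, 8
full_params = d_in * d_out        # 16,777,216 trainable without LoRA
lora_params = r * (d_in + d_out)  # 65,536 trainable with LoRA (~0.4%)
```

Since Adam keeps two extra buffers per trainable parameter, the optimizer-state saving is the main memory win; QLoRA additionally quantizes the frozen W.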
Anthropic Technical Questions¶
Coding:
```python
# Implement a simple word-level tokenizer with a greedy subword fallback
def subword_tokenize(word, vocab):
    # Greedy longest-match over substrings; anything unmatched maps to "<unk>"
    ids, i = [], 0
    while i < len(word):
        j = next((j for j in range(len(word), i, -1) if word[i:j] in vocab), i + 1)
        ids.append(vocab.get(word[i:j], vocab.get("<unk>", 0)))
        i = j
    return ids

def tokenize(text, vocab):
    tokens = []
    for word in text.split():
        if word in vocab:
            tokens.append(vocab[word])
        else:
            # Handle OOV by falling back to subword pieces
            tokens.extend(subword_tokenize(word, vocab))
    return tokens
```
Safety-focused:
1. "How would you detect if a model is being deceptive?"
2. "Design a red-teaming pipeline for LLM evaluation"
3. "What metrics would you use to measure alignment?"
Cross-references¶
Related source files for topic coverage:
- Attention/Transformers: mqa-gqa-внимание
- RLHF/DPO: прогресс-rlhf
- Constitutional AI: конституционный-ai
- System Design: паттерны-ml-system-design
- Model Serving: сравнение-движков-инференса
- AI Safety: безопасность-ai-alignment
- LLM Security: безопасность-owasp-llm

Related synthesis cheatsheets:
- Master Study Guide (interview strategy, top questions)
- Alignment & RLHF Cheatsheet (DPO, Constitutional AI)
- LLM Inference Cheatsheet (serving, optimization)
- ML System Design Cheatsheet (design patterns)
Misconception: OpenAI and Anthropic look for the same candidates
OpenAI weights ML depth at 40% with a High coding bar (LeetCode Medium-Hard). Anthropic weights mission alignment at 30% with a Moderate coding bar (LeetCode Medium). Anthropic's take-home is mandatory (3-5 days); OpenAI's is optional. Preparation must be company-specific.
Misconception: AI safety is a soft-skill topic you can improvise on
At Anthropic, mission alignment is 30% of the evaluation. You need to know Constitutional AI, RLHF vs DPO, red-teaming, and concrete risk scenarios. At OpenAI, safety awareness is only 10%, but "dismissing AI safety concerns" is an explicit red flag. Prepare 2-3 concrete examples of safety problems together with technical solutions.
Misconception: knowing transformer architecture at the theory level is enough
Both companies expect implementation from scratch: the attention mechanism, a BPE tokenizer, backpropagation. OpenAI will ask "Implement attention mechanism from scratch" and "Design training pipeline for a 100B model". Without hands-on coding practice, you will fail.
Interview Questions¶
Q: Explain RLHF vs DPO vs Constitutional AI -- when would you use each?
Red flag: "RLHF is when humans rate the model's answers, DPO is the same thing but simpler, Constitutional AI is Anthropic's approach."
Strong answer: "RLHF: a reward model is trained on human preferences, then PPO optimizes the policy -- expensive (separate reward model + PPO training) and unstable. DPO: direct policy optimization without a reward model, with an implicit reward via a log-ratio -- simpler and more stable, but less flexible. Constitutional AI (Anthropic): the model critiques and revises its own answers against a set of principles, then RLAIF -- scales without human labelers. Choice: DPO for fast alignment, RLHF for complex reward signals, Constitutional AI for scalable self-improvement."
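The RLHF-vs-DPO distinction in the strong answer is easiest to defend by writing the DPO objective out. A minimal sketch, assuming sequence-level log-probabilities are already computed:

```python
import math

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """DPO: -log sigmoid(beta * [(logp_w - ref_logp_w) - (logp_l - ref_logp_l)]).

    logp_* are total log-probs of the chosen (w) and rejected (l) responses
    under the policy; ref_logp_* come from the frozen reference model, which
    supplies the implicit reward that RLHF delegates to an explicit reward model.
    """
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

At zero margin the loss is log 2; it falls as the policy prefers the chosen response more strongly than the reference does, which is exactly the preference signal PPO extracts from a learned reward model in RLHF.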
Q: Design a harmful-output detection system for an LLM in production.
Red flag: "Put a classifier in front of the model's output that filters out bad responses."
Strong answer: "Multi-layer defence: (1) Input filters -- a classifier for prompt injection and jailbreak detection. (2) Guardrails -- rule-based plus ML classifiers over harm categories. (3) Output scoring -- a separate model assigns a safety score. (4) Constitutional self-check -- the model reviews its own answer. (5) Human-in-the-loop -- sampled responses go to monitoring. Metrics: false positive rate (UX), false negative rate (safety), latency overhead (<100ms). A/B testing with red-team evaluation."
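The layered design in the strong answer can be sketched as a pipeline where any stage may block; the filter and scorer names below are hypothetical placeholders, not a real moderation API:

```python
def moderate(prompt, response, input_filters, output_scorers, threshold=0.5):
    """Layered moderation sketch.

    input_filters: callables prompt -> bool (True = block), e.g. a jailbreak
    detector. output_scorers: callables response -> harm score in [0, 1].
    A production system adds a constitutional self-check stage and routes a
    sample of allowed traffic to human review.
    """
    if any(f(prompt) for f in input_filters):
        return "blocked_input"
    if any(s(response) > threshold for s in output_scorers):
        return "blocked_output"
    return "allowed"
```

The staged order matters for cost: cheap input filters run before the model is ever called, while output scorers add latency only to generated responses.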
Q: How would you optimize ChatGPT inference at 1M QPS?
Red flag: "Add more GPUs and use batching."
Strong answer: "Three levels: (1) Model-level -- KV-cache optimization, speculative decoding (a draft model proposes several tokens), quantization (INT8/FP8), continuous batching. (2) System-level -- model parallelism (tensor + pipeline), load balancing across GPUs, request routing by length. (3) Infrastructure -- geo-distributed serving, CDN for static content, auto-scaling on QPS. Key trade-offs: latency vs throughput (continuous batching), quality vs speed (quantization), cost vs redundancy (replicas). Target: p99 latency <2s, 1M QPS throughput at 99.9% availability."
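The KV-cache optimization mentioned in the strong answer works because past keys and values are stored instead of recomputed, so each decode step does O(t) attention work rather than re-encoding the whole prefix. A single-head NumPy sketch of one cached decode step (an illustrative simplification of real serving code):

```python
import numpy as np

def decode_step(q_t, k_t, v_t, cache):
    """One autoregressive decode step with a KV cache.

    q_t, k_t, v_t: (d,) vectors for the current token; `cache` accumulates
    keys/values of all previous tokens so the prefix is never re-encoded.
    """
    cache["K"].append(k_t)
    cache["V"].append(v_t)
    K = np.stack(cache["K"])                  # (t, d) all keys so far
    V = np.stack(cache["V"])                  # (t, d) all values so far
    scores = K @ q_t / np.sqrt(q_t.shape[0])  # (t,) scaled dot products
    w = np.exp(scores - scores.max())
    w /= w.sum()                              # softmax over past positions
    return w @ V                              # (d,) attention output
```

The cache's memory footprint (layers x heads x sequence length x head dim) is what continuous batching and paged-attention schemes in modern serving stacks are built to manage.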
10. Sources & Further Reading¶
Official Resources¶
- OpenAI Careers: https://openai.com/careers
- Anthropic Careers: https://www.anthropic.com/careers
Interview Guides¶
- InterviewQuery OpenAI Guide
- InterviewQuery Anthropic Guide
- Exponent AI Interview Course
- Harvard FAS "How to Ace OpenAI Interviews"
Glassdoor Data¶
- OpenAI: 214 interview questions (2026)
- Anthropic: 145 interview questions (2026)
Blogs¶
- "Ace Your OpenAI ML Interview: Top 25 Questions" (Medium, 2026)
- "Anthropic Interview Experience" (TryExponent, 2025)
- "Ultimate AI Research Engineer Guide" (Sundeep Teki)