ML System Design: Gaps

What interviewers ask that the 8 tasks do NOT cover. Under-covered topics for AI/ML/LLM Engineers. Updated: 2026-02-11

Current coverage (8 tasks)

Subcategory	Tasks	Coverage
Model Serving	1	Good
A/B Testing	1	Good
Drift Detection	1	Good
Model Calibration	1	Good
Ranking Metrics	1	Good
Trade-offs Quiz	1	Excellent (15 scenarios)
RecSys	1	Basic
LLM Production	1	Good

CRITICAL GAPS

1. Feature Stores (PARTIALLY FILLED)

Added in materials.md, section 9:
- Feature Store definition and why it matters
- Architecture components (Offline vs Online Store, Ingestion, Registry)
- Point-in-Time Correctness concept with SQL example
- Feature Store comparison table (Feast vs Tecton vs SageMaker)
- Python Feast example (Entity, FeatureView, get_online_features, get_historical_features)
- Key concepts table (Feature View, Entity, TTL, Materialization)
- Interview questions (4 Q&A)

Sources: Aerospike Blog (July 2025), Reintech.io Feature Store Comparison (Jan 2026)

Remaining:
- A dedicated practical task (ContentBlock)
- Hopsworks comparison
- Streaming feature pipeline details
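The point-in-time correctness concept listed in this section can be illustrated with a small sketch. It assumes feature rows are `(entity_id, feature_ts, value)` tuples and label events are `(entity_id, event_ts)` tuples; the function name `point_in_time_join` and this data layout are hypothetical and are not taken from materials.md or from Feast. For each label event, only the latest feature value observed at or before the event time is used, which is what keeps future data out of the training set.

```python
from bisect import bisect_right


def point_in_time_join(label_events, feature_rows):
    """Attach to each label event the latest feature value with feature_ts <= event_ts.

    label_events: list of (entity_id, event_ts)          # hypothetical layout
    feature_rows: list of (entity_id, feature_ts, value) # hypothetical layout
    """
    # Group feature rows per entity, ordered by timestamp.
    by_entity = {}
    for entity_id, feature_ts, value in sorted(feature_rows, key=lambda r: (r[0], r[1])):
        timestamps, values = by_entity.setdefault(entity_id, ([], []))
        timestamps.append(feature_ts)
        values.append(value)

    joined = []
    for entity_id, event_ts in label_events:
        timestamps, values = by_entity.get(entity_id, ([], []))
        # Number of feature rows whose timestamp is <= event_ts.
        idx = bisect_right(timestamps, event_ts)
        # None means no feature value was known yet at event time.
        joined.append((entity_id, event_ts, values[idx - 1] if idx else None))
    return joined


features = [("u1", 10, 1.0), ("u1", 20, 2.0), ("u2", 15, 5.0)]
events = [("u1", 15), ("u1", 25), ("u2", 5)]
training_rows = point_in_time_join(events, features)
```

A naive join on `entity_id` alone would attach the value written at time 20 to the event at time 15, which is the leakage this pattern avoids.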

2. ML Infrastructure (PARTIALLY FILLED)

Added in materials.md, section 10:
- Why ML Infrastructure matters (reproducibility, audit trail, compliance)
- Core components table (Experiment Tracking, Model Registry, Data Versioning, Pipeline Orchestration)
- Tool comparison (MLflow vs W&B vs DVC)
- MLflow architecture diagram (Tracking, Projects, Models, Registry)
- Python examples: experiment tracking with mlflow.start_run(), model registration, registry operations (transition_model_version_stage)
- Model Registry workflow (Experiment → Staging → Production → Archived)
- Decision framework table
- Interview questions (4 Q&A)

Sources: ML Journey "Model Versioning Strategies" (Sep 2025), Conduktor "Real-Time ML Pipelines" (Feb 2026)

Remaining:
- A dedicated practical task (ContentBlock)
- Kubeflow, ZenML deep dive
- CI/CD for ML specifics

3. Multi-Stage Recommender (PARTIALLY FILLED)

Added in materials.md, section 13:
- Funnel Architecture diagram (Retrieval → Pre-ranking → Ranking → Re-ranking)
- Two-Tower architecture with training code (in-batch negatives)
- ANN Indexes (FAISS, ScaNN) comparison table and implementation
- Ranking models (DIN, DCN, DeepFM, DCNv2) comparison
- Re-ranking with MMR (Maximal Marginal Relevance) code
- Production architecture (YouTube-scale) diagram
- Interview questions (6 Q&A)

Sources: Shaped.ai (May 2025), Fan Luo Blog (Oct 2025), arXiv Allegro (Jul 2025), YouTube paper

Remaining:
- A dedicated practical task (ContentBlock)
- Graph-based retrieval (PinSage, LightGCN)
- Real-time feature pipelines
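For the MMR re-ranking item listed in this section, a minimal sketch follows. It assumes relevance scores and item embeddings are given as plain dictionaries; the names `mmr_rerank`, `cosine`, and the trade-off parameter `lam` are illustrative choices, not the code from materials.md. At each step the item maximizing `lam * relevance - (1 - lam) * max_similarity_to_selected` is picked, so near-duplicates of already selected items are pushed down.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_rerank(relevance, embeddings, k, lam=0.7):
    """Greedy Maximal Marginal Relevance selection of k items.

    relevance:  dict item -> relevance score from the ranking stage
    embeddings: dict item -> embedding vector
    lam:        1.0 means pure relevance, lower values favor diversity
    """
    selected = []
    candidates = set(relevance)
    while candidates and len(selected) < k:
        def mmr_score(item):
            # Penalize similarity to the closest already selected item.
            max_sim = max(
                (cosine(embeddings[item], embeddings[s]) for s in selected),
                default=0.0,
            )
            return lam * relevance[item] - (1 - lam) * max_sim

        # sorted() makes tie-breaking deterministic.
        best = max(sorted(candidates), key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected


relevance = {"a": 0.9, "b": 0.85, "c": 0.5}
embeddings = {"a": [1.0, 0.0], "b": [1.0, 0.0], "c": [0.0, 1.0]}
diverse = mmr_rerank(relevance, embeddings, k=2, lam=0.5)
by_relevance = mmr_rerank(relevance, embeddings, k=2, lam=1.0)
```

In this toy data `b` is an exact duplicate of `a`, so with `lam=0.5` the less relevant but different item `c` takes the second slot, while `lam=1.0` reduces to sorting by relevance.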

MEDIUM GAPS

4. Online Learning (PARTIALLY FILLED)

Added in materials.md, section 12:
- Online vs Batch Learning comparison table
- Online Gradient Descent with Python code
- Regret framework with formula
- FTRL-Proximal for sparse high-dimensional features with Python implementation
- Concept Drift Detection methods (ADWIN, DDM, EDDM, Page-Hinkley)
- River library drift detection code example
- Hoeffding Trees (VFDT) with Hoeffding bound formula
- Flink ML Pipeline architecture and Java example
- Production considerations (challenges, best practices)
- Interview questions (5 Q&A)

Sources: ML Journey (Nov 2025), Conduktor (Feb 2026), Confluent Flink (Oct 2025), River docs

Remaining:
- A dedicated practical task (ContentBlock)
- Spark Streaming ML specifics
- Deep online learning methods
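The Hoeffding bound item in this section refers to the inequality used by Hoeffding Trees to decide when enough samples have been seen to split a node. The sketch below uses the standard form ε = sqrt(R² · ln(1/δ) / (2n)), where R is the range of the split metric, δ the allowed error probability, and n the number of samples seen at the leaf. The function names `hoeffding_bound` and `should_split` are hypothetical, and real implementations usually add a tie-breaking threshold that is omitted here.

```python
import math


def hoeffding_bound(value_range, delta, n):
    """Epsilon such that the true mean is within epsilon of the sample mean
    with probability at least 1 - delta, after n observations."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain, second_best_gain, value_range, delta, n):
    """Split when the observed gap between the two best attributes exceeds epsilon."""
    return (best_gain - second_best_gain) > hoeffding_bound(value_range, delta, n)


# Assumed example values: information gain with range 1.0, delta = 1e-7.
eps_1k = hoeffding_bound(1.0, 1e-7, 1_000)
eps_10k = hoeffding_bound(1.0, 1e-7, 10_000)
```

The bound shrinks as 1/sqrt(n), so a small gap between the two best attributes simply requires the leaf to wait for more samples before splitting.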

5. Model Compression (PARTIALLY FILLED)

Added in materials.md, section 21:
- Knowledge Distillation (temperature scaling, soft targets, distillation loss formula)
- Distillation Loss with Python code (hard loss + soft loss, KL divergence)
- Types of Distillation comparison (Response, Feature, Attention, Multi-Teacher)
- Neural Network Pruning (Magnitude, Structured, Global, Iterative)
- Lottery Ticket Hypothesis explanation
- Pruning implementation with torch.nn.utils.prune
- Iterative pruning pipeline with fine-tuning
- Hybrid Compression Pipeline (Pruning → Quantization → Distillation)
- Edge Deployment Optimization (TensorFlow Lite, ONNX export)
- Pruning benchmarks (ResNet-50, 90% sparsity, 4.2x speedup)
- Interview questions (4 Q&A)

Sources: LabelYourData "Knowledge Distillation" (2025), Arik Poz "PyTorch Distillation" (Apr 2025), Johal.in "Neural Network Pruning" (Nov 2025), Frontiers "Survey of Model Compression" (2025)

Remaining:
- A dedicated practical task (ContentBlock)
- Neural Architecture Search (NAS) deep dive
- Low-rank factorization
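The distillation loss item in this section (hard loss + soft loss with KL divergence) can be sketched in dependency-free Python for a single example. This is an assumption-laden illustration rather than the code in materials.md: `alpha` weights the hard cross-entropy term, `1 - alpha` weights the soft term, and the soft term is scaled by T² as is commonly done so that gradient magnitudes stay comparable across temperatures. Other weighting conventions exist.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with temperature scaling."""
    scaled = [z / temperature for z in logits]
    top = max(scaled)
    exps = [math.exp(z - top) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, true_label,
                      temperature=2.0, alpha=0.5):
    """alpha * CE(student, label) + (1 - alpha) * T^2 * KL(teacher_T || student_T)."""
    # Hard loss: cross-entropy of the student against the true label at T = 1.
    hard = -math.log(softmax(student_logits)[true_label])

    # Soft loss: KL divergence between temperature-softened distributions.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    soft = sum(pt * math.log(pt / ps) for pt, ps in zip(p_teacher, p_student) if pt > 0)

    return alpha * hard + (1 - alpha) * temperature ** 2 * soft


student = [2.0, 1.0, 0.0]
loss_matching_teacher = distillation_loss(student, [2.0, 1.0, 0.0], true_label=0)
loss_disagreeing_teacher = distillation_loss(student, [0.0, 1.0, 2.0], true_label=0)
```

When the student already reproduces the teacher's logits, the KL term vanishes and only the weighted hard loss remains; a disagreeing teacher increases the loss.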

6. Multi-Armed Bandits (PARTIALLY FILLED)

Added in materials.md, section 11:
- Exploration-Exploitation Dilemma with regret formula R(T)
- Algorithm Comparison table (ε-greedy, UCB, Thompson Sampling)
- ε-Greedy implementation with Python code
- UCB formula and Python implementation
- Thompson Sampling (Bayesian) with Beta distribution Python code
- Contextual Bandits (LinUCB) formula
- When to Use Bandits vs A/B Tests comparison table
- Production Use Cases: Netflix (3-tier architecture), Spotify AI DJ
- Interview questions (4 Q&A)

Sources: Philipp Dubach "Bandits and Agents" (Jan 2026), Statsig "Thompson Sampling" (June 2025), Russo et al. tutorial

Remaining:
- A dedicated practical task (ContentBlock)
- LinUCB detailed implementation
- Non-stationary bandits
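As a reminder of what the Thompson Sampling item in this section covers, here is a minimal simulation sketch for Bernoulli rewards. It assumes a uniform Beta(1, 1) prior per arm and simulated click-through rates; the function name `thompson_sampling` and the example rates are made up for illustration. Each round samples a plausible success rate per arm from its Beta posterior and plays the arm with the highest sample.

```python
import random


def thompson_sampling(true_rates, rounds, seed=0):
    """Simulate Beta-Bernoulli Thompson Sampling; returns pull counts per arm."""
    rng = random.Random(seed)
    n_arms = len(true_rates)
    successes = [0] * n_arms
    failures = [0] * n_arms
    pulls = [0] * n_arms

    for _ in range(rounds):
        # Posterior for each arm is Beta(successes + 1, failures + 1).
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        arm = max(range(n_arms), key=lambda a: samples[a])
        pulls[arm] += 1

        # Simulated environment: reward is 1 with probability true_rates[arm].
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return pulls


pulls = thompson_sampling(true_rates=[0.1, 0.8], rounds=2000)
```

Unlike a fixed 50/50 A/B split, traffic shifts toward the better arm as evidence accumulates, which is the trade-off behind the bandits vs A/B tests comparison.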

7. Causal Inference (PARTIALLY FILLED)

Added in materials.md, section 14:
- Treatment Effect definitions (ATE, ATT, ATC, CATE) table
- Key Assumptions (SUTVA, Unconfoundedness, Overlap, Consistency)
- Method 1: Propensity Score Matching with Python code
- Method 2: Difference-in-Differences (DiD) with formula and assumptions
- Method 3: Instrumental Variables (IV) requirements
- Method 4: Uplift Modeling (S-Learner, T-Learner, X-Learner) comparison
- When to Use Which Method decision table
- Interview questions (4 Q&A)

Sources: ML Journey (July 2025), Medium GrabNGoInfo, DoWhy/EconML docs

Remaining:
- A dedicated practical task (ContentBlock)
- Regression Discontinuity Design (RDD)
- DoWhy refutation methods deep dive
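The Difference-in-Differences item in this section rests on a simple estimator: the change in the treated group minus the change in the control group. The sketch below uses invented numbers and a hypothetical function name `did_estimate`; it is valid only under the parallel-trends assumption, i.e. that both groups would have moved the same way without the treatment.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """(treated_post - treated_pre) - (control_post - control_pre), on group means."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Made-up example: average order value before and after a feature launch.
effect = did_estimate(
    treated_pre=[10, 12], treated_post=[15, 17],  # treated group: +5
    control_pre=[9, 11], control_post=[11, 13],   # control group: +2
)
```

Here the treated group improved by 5 and the control group by 2, so 3 units are attributed to the treatment and 2 to the shared time trend.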

NEW TOPICS 2025-2026 (MISSING)

8. Foundation Models in Production (PARTIALLY FILLED)

Added in materials.md, section 19:
- Multi-tenant LLM serving architecture (Kubernetes, namespace isolation, ResourceQuotas)
- Prompt Caching economics (OpenAI 50% auto, Anthropic 90% explicit, KV cache reuse)
- Multi-layer caching implementation (L1: Exact, L2: Semantic, L3: Provider)
- Token economics (cost per token, compression techniques, context window management)
- Fallback strategies (Circuit Breaker, graceful degradation, retry with backoff)
- Interview questions (4 Q&A)

Sources: Collabnix "Multi-Tenant LLM Platform" (Dec 2025), Medium "Prompt Caching" (Jan 2026)

Remaining:
- A dedicated practical task (ContentBlock)
- Diffusion models serving specifics
- Multi-region deployment patterns

9. AI Agents in Production (PARTIALLY FILLED)

Added in materials.md, section 20:
- Agent definition and production criteria (autonomy, data, consequences)
- OWASP Top 10 for LLM Applications 2025 (Prompt Injection 73%, Sensitive Data Leakage, Excessive Agency)
- Defence-in-Depth Architecture (6 layers: Input Sanitization → Injection Detection → Agent Execution → Tool Call Interception → Output Validation → Observability)
- Tool Allowlist with Permission Gating (ToolGatekeeper class, schema validation, least privilege)
- Human-in-the-Loop (HITL) patterns with LangGraph interrupt() for pause/resume
- LLM-as-Judge Evaluation (JudgeVerdict model, few-shot prompting, chain-of-thought)
- Production Best Practices (10 commandments from UiPath/n8n)
- OpenTelemetry GenAI Semantic Conventions (gen_ai.system, gen_ai.usage tokens)
- Interview questions (4 Q&A)

Sources: RandomCommits "AI Agents in Production" (Jan 2026), Vertesia "Defence-in-Depth" (2025), UiPath "10 Commandments" (2025), LinkedIn Iain Harper, n8n Blog (Dec 2025)

Remaining:
- A dedicated practical task (ContentBlock)
- Multi-agent orchestration deep dive
- Agent memory systems

10. Vector Databases for ML (PARTIALLY FILLED)

Added in materials.md, section 15:
- ANN Index Types comparison table (HNSW, IVF, IVF-PQ, Flat)
- HNSW parameters (M, ef_construction, ef_search) with trade-offs
- IVF parameters (nlist, nprobe) with trade-offs
- Vector Database Comparison (Pinecone, Milvus, Weaviate, Qdrant, pgvector, Chroma)
- Python examples (Qdrant, Milvus)
- Hybrid Search (Vector + BM25) with RRF (Reciprocal Rank Fusion) code
- Index Refresh Strategies table (Full rebuild, Incremental, Dual index)
- Dual index pattern code
- Interview questions (5 Q&A)

Sources: JishuLabs (2026), Markaicode (2025), Medium IVF/HNSW guide (Jan 2026), TowardsAI Hybrid Search (Jan 2026)

Remaining:
- A dedicated practical task (ContentBlock)
- GPU-accelerated indexes
- Multi-tenancy patterns
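The hybrid search item in this section combines a vector ranking and a BM25 ranking with Reciprocal Rank Fusion. A minimal sketch, assuming each retriever returns an ordered list of document ids: every document scores the sum of `1 / (k + rank)` over the lists it appears in. The function name and document ids are illustrative; `k = 60` is the commonly used default and is an assumption here, not a value confirmed by materials.md.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc ids into one list, best first.

    rankings: list of lists, each ordered from most to least relevant
    k:        smoothing constant; larger k flattens differences between ranks
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by fused score (descending), then by id for deterministic ties.
    return sorted(scores, key=lambda d: (-scores[d], d))


vector_hits = ["d1", "d2", "d3"]  # hypothetical ANN results
bm25_hits = ["d3", "d1", "d4"]    # hypothetical keyword results
fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
```

RRF only uses ranks, so the cosine similarities and BM25 scores never need to be normalized onto a common scale, which is the main reason it is popular for hybrid search.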

Practical Gaps

11. Cost Optimization (PARTIALLY FILLED)

Added in materials.md, section 16:
- Inference cost breakdown (GPU 60-70%, Memory 15-20%)
- GPU utilization strategies (batching, dynamic batching with Triton)
- Spot instance strategy with preemption detection code
- Model right-sizing decision framework
- Cost per prediction formula with calculator
- Semantic caching for LLMs with Python implementation
- Auto-scaling with Kubernetes HPA
- Cost optimization decision matrix

Sources: Lambda Labs GPU Pricing (2026), Neptune.ai (Jan 2026), Runhouse (2025)

Remaining:
- A dedicated practical task (ContentBlock)
- Reserved instances vs spot optimization
- Multi-cloud cost arbitrage
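For the cost per prediction item in this section, one common back-of-the-envelope form is hourly instance cost divided by the number of predictions actually served per hour. The sketch below is an assumption about how such a calculator might look and may differ from the formula in materials.md; the function name, the price, and the throughput figures are invented for illustration.

```python
def cost_per_prediction(hourly_cost_usd, peak_throughput_rps, utilization):
    """Cost of one prediction in USD.

    hourly_cost_usd:     price of the serving instance per hour
    peak_throughput_rps: requests per second the instance can sustain
    utilization:         fraction of peak capacity actually used (0 < u <= 1)
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    predictions_per_hour = peak_throughput_rps * 3600 * utilization
    return hourly_cost_usd / predictions_per_hour


# Hypothetical GPU instance: $3.60/hour, 100 req/s at peak, half utilized.
half_utilized = cost_per_prediction(3.60, 100, 0.5)
fully_utilized = cost_per_prediction(3.60, 100, 1.0)
```

The utilization term is why batching and auto-scaling appear in the same section: doubling utilization halves the cost per prediction without touching the model.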

12. Multi-Model Serving (PARTIALLY FILLED)

Added in materials.md, section 17:
- Why Multi-Model (different tasks, cost optimization, redundancy, A/B testing)
- Routing Strategies Comparison table (Weighted, Latency-Based, Cost-Aware, Confidence-Based, Cascade, ML-Based)
- Weighted Round-Robin with Python implementation
- Confidence-Based Routing (Cascade) with code
- Latency-Aware Routing with rolling averages and fairness band
- Fallback Chain with Circuit Breaker pattern (CLOSED/OPEN/HALF_OPEN states)
- A/B Testing Between Models with deterministic user assignment
- Model Router Decision Matrix (scenario → strategy → why)
- Interview questions (4 Q&A)

Sources: TrueFoundry "LLM Load Balancing" (2025), LogRocket "LLM Routing in Production" (2026), arXiv "Universal Model Routing" (2025)

Remaining:
- A dedicated practical task (ContentBlock)
- ML-based routing with learned policies
- Multi-cloud model routing specifics
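The fallback chain with a circuit breaker mentioned in this section can be sketched as a small state machine. The class and function names, the thresholds, and the injectable `clock` parameter are all assumptions made for illustration, not the implementation in materials.md. The breaker opens after a number of consecutive failures, rejects calls while OPEN, and moves to HALF_OPEN after a timeout so that probe requests can test whether the model has recovered.

```python
import time


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock  # injectable so tests can control time
        self.state = "CLOSED"
        self.failures = 0
        self.opened_at = None

    def allow_request(self):
        # After the timeout, let probe traffic through in HALF_OPEN.
        if self.state == "OPEN" and self.clock() - self.opened_at >= self.reset_timeout:
            self.state = "HALF_OPEN"
        return self.state != "OPEN"

    def record_success(self):
        self.state = "CLOSED"
        self.failures = 0

    def record_failure(self):
        self.failures += 1
        # A failed probe, or too many consecutive failures, opens the breaker.
        if self.state == "HALF_OPEN" or self.failures >= self.failure_threshold:
            self.state = "OPEN"
            self.opened_at = self.clock()


def call_with_fallback(models, breakers, prompt):
    """Try models in priority order, skipping those whose breaker is OPEN.

    models:   list of (name, callable) pairs, highest priority first
    breakers: dict name -> CircuitBreaker
    """
    for name, model in models:
        breaker = breakers[name]
        if not breaker.allow_request():
            continue
        try:
            result = model(prompt)
        except Exception:
            breaker.record_failure()
            continue
        breaker.record_success()
        return name, result
    raise RuntimeError("all models unavailable")
```

A usage sketch under the same assumptions: register one breaker per model name and pass the models in priority order, for example `call_with_fallback([("primary", primary_fn), ("backup", backup_fn)], breakers, prompt)`, where `primary_fn` and `backup_fn` are hypothetical callables wrapping the model endpoints.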

13. Data Quality for ML (PARTIALLY FILLED)

Added in materials.md, section 18:
- Data Quality Dimensions (Accuracy, Completeness, Consistency, Timeliness, Relevance, Uniqueness, Validity)
- Validation Types by Pipeline Stage (Ingestion, Preparation, Training, Production)
- Schema Validation with Great Expectations (expectations, type checks, range checks, completeness)
- TensorFlow Data Validation (TFDV) with drift detection
- Schema Evolution Strategies (Additive, Backward/Forward Compatible, Dual-Write)
- Data Lineage Tracking with Python implementation (LineageNode, DataLineageTracker)
- Data Quality Tools Comparison (Great Expectations, TFDV, Deepchecks, Pandera, Evidently AI)
- Interview questions (4 Q&A)

Sources: Uplatz "Data Validation and Quality in MLOps" (Nov 2025), ML Journey "Data Lineage Tracking" (Sep 2025), Gartner Data Quality Report (2025)

Remaining:
- A dedicated practical task (ContentBlock)
- Real-time streaming data validation
- PII detection and anonymization

Underspecified Topics

14. Monitoring & Observability (PARTIALLY FILLED)

Added in materials.md, section 22:
- Four Pillars of Observability (Metrics, Logs, Traces, Dashboards)
- Key Metrics tables (Model Performance, Data Quality, System)
- Prometheus instrumentation with Python code (Counter, Gauge, Histogram)
- Prometheus configuration (prometheus.yml)
- Alerting rules for ML (accuracy drop, latency spike, data drift, error rate)
- Grafana dashboard design (hierarchy, PromQL queries)
- Structured logging with structlog (JSON format, prediction logging)
- Drift detection implementation (KS test, PSI score with Python code)
- Multi-level alerting strategy (P0-P3 severity levels)
- Monitoring stack comparison (Prometheus, Grafana, MLflow, Evidently AI)
- Interview questions (4 Q&A)

Sources: ML Journey (Sep 2025), Johal.in "MLOps Monitoring" (Sep 2025), Diousoft "Model Monitoring & Logging" (2025), Grafana Labs "Observability Survey" (Mar 2025)

Remaining:
- A dedicated practical task (ContentBlock)
- Distributed tracing with Jaeger deep dive
- SRE practices for ML (SLOs, SLIs, error budgets)
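The PSI score named in the drift detection item of this section compares the binned distribution of a feature in production against a reference sample: PSI = Σ (aᵢ − eᵢ) · ln(aᵢ / eᵢ). The sketch below makes several assumptions that may differ from materials.md: equal-width bins taken from the reference range, out-of-range values clamped into the edge bins, and a small `eps` floor to avoid division by zero. The function name `psi` is illustrative.

```python
import math


def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index of `actual` relative to `expected`."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # guard against a constant reference feature

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Clamp values outside the reference range into the edge bins.
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor empty bins at eps so the logarithm is defined.
        return [max(c / len(values), eps) for c in counts]

    e = proportions(expected)
    a = proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


reference = list(range(100))                # hypothetical training-time feature values
production = [v + 30 for v in range(100)]   # the same feature, shifted in production
no_drift = psi(reference, reference)
drift = psi(reference, production)
```

A widely quoted rule of thumb treats PSI below 0.1 as stable, 0.1 to 0.25 as worth watching, and above 0.25 as significant drift; these thresholds are conventions, so alerting levels should be tuned per feature.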

15. Security for ML (PARTIALLY FILLED)

Added in materials.md, section 23:
- Attack Taxonomy table (Evasion, Poisoning, Extraction, Inversion, MIA)
- Adversarial Attacks with FGSM/PGD formulas and Python code
- Model Extraction Attacks with defense code (rate limiting, output perturbation)
- Model Inversion Attacks (confidence score attacks, attribute inference)
- Membership Inference Attacks (MIA) defense
- Differential Privacy with DP-SGD implementation (gradient clipping, Gaussian noise)
- Defense Summary table (attack → primary defense → secondary defense)
- Multi-Layer Defense Architecture (5 layers: Data, Training, Access, Output, Monitoring)
- Interview questions (4 Q&A)

Sources: SentinelOne "Model Inversion" (Jan 2026), YASH Technologies "Adversarial Defenses" (Jan 2026), ByteJournal "Model Security" (Feb 2025), NIST Adversarial ML Taxonomy (Mar 2025)

Remaining:
- A dedicated practical task (ContentBlock)
- Model watermarking deep dive
- Federated learning security specifics

Cross-References Missing

Links worth adding:

mlsd_001_model_serving -> llm_006_quantization (inference optimization)
mlsd_002_ab_testing -> stat_015_ab_test_sample_size (statistics)
mlsd_003_drift_detection -> stat_012_hypothesis_testing (KS test)
mlsd_007_recsys -> dl_010_attention (two-tower with attention)
mlsd_008_llm_prod -> llm_012_prompt_injection (security)

Final Coverage Assessment

Current ML System Design coverage: ~95% for ML Engineer, ~85% for Senior+

materials.md has 23 sections:
- 1-8. Core ML System Design (Model Serving, A/B Testing, Drift Detection, Calibration, Ranking, RecSys, Trade-offs, LLM Production)
- 9-15. Critical Gaps (Feature Stores, ML Infrastructure, Bandits, Online Learning, Multi-Stage RecSys, Causal Inference, Vector DBs)
- 16-23. Production & Emerging (Cost Optimization, Multi-Model Serving, Data Quality, Foundation Models, AI Agents, Compression, Monitoring, Security)

Main gaps that remained:
1. ~~Feature stores~~ ✅ Filled (Section 9)
2. ~~ML infrastructure/platform~~ ✅ Filled (Section 10)
3. ~~Multi-stage recommender systems~~ ✅ Filled (Section 13)
4. ~~Online learning~~ ✅ Filled (Section 12)
5. ~~Causal inference~~ ✅ Filled (Section 14)

Remaining for 100%:
- Practical tasks (ContentBlock) for each topic
- SRE practices (SLOs, SLIs, error budgets)
- Distributed tracing deep dive
- GPU-accelerated vector indexes
- Multi-tenancy patterns

What is already well covered

Topic	Coverage	Why it is good
A/B Testing	Good	Sample size, significance, pitfalls
Drift Detection	Good	PSI, KS test, code examples
Model Calibration	Good	Platt, Isotonic, Brier score
Trade-offs Quiz	Excellent	15 production scenarios
LLM Production	Good	Guardrails, OWASP Top 10

Updated: 2026-02-11

Gap	Difficulty	Task
ML Infrastructure	Hard	`mlsd_010_ml_infrastructure`
Multi-Armed Bandits	Medium	`mlsd_013_bandits`
Causal Inference	Hard	`mlsd_014_causal_inference`