News Feed Ranking System Metrics


Prerequisites: Problem Definition, Components

Facebook optimizes the feed for ~50 engagement signals simultaneously, balancing short-term metrics (clicks, likes) against long-term metrics (user retention, session quality). Naively optimizing for clicks leads to clickbait inflation and DAU loss within 2-3 months. The right approach is multi-objective optimization with guardrail metrics, where the primary metric is meaningful social interactions (MSI), not raw engagement.

Metric hierarchy

Level	Metrics	Targets
Business	DAU, session time, revenue	DAU retention > 95% MoM
User experience	Content diversity, satisfaction surveys	NPS > 60
Engagement	Likes, comments, shares, watch time	MSI growth
Model	AUC, NDCG, calibration	NDCG@10 > 0.4
System	Latency, throughput	p99 < 200ms, 500K QPS

Business Metrics

Core Metrics

Metric	Formula	Target	Warning threshold
DAU / MAU ratio	daily_active / monthly_active	> 0.65	< 0.60
Session duration	avg(session_end - session_start)	33 min (FB)	< 28 min
Sessions per day	total_sessions / DAU	> 5	< 4
Revenue per user (ARPU)	total_revenue / MAU	Growing QoQ	-5% QoQ
Content creator retention	active_creators(m) / active_creators(m-1)	> 90%	< 85%

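from dataclasses import dataclass


@dataclass
class Interaction:
    type: str         # "comment", "share", "like", ...
    count: int        # number of events of this type in the record
    session_id: str


def compute_msi(interactions: list[Interaction]) -> dict:
    """
    Meaningful social interactions (MSI): Facebook's primary metric
    post-2018 (after the misinformation crisis).
    The weights below are illustrative.
    """
    weights = {
        "comment": 5.0,         # Highest value (conversation)
        "share": 4.0,           # Distribution to network
        "reaction_love": 3.0,   # Emotional response
        "reaction_other": 2.0,
        "like": 1.0,            # Low effort
        "click": 0.5,           # Passive consumption
        "impression": 0.1,      # Scroll past
    }

    msi_score = sum(weights.get(i.type, 0.0) * i.count for i in interactions)
    n_sessions = len({i.session_id for i in interactions})
    # Ratios are event-weighted so they stay consistent with msi_score
    n_events = sum(i.count for i in interactions)
    n_comments = sum(i.count for i in interactions if i.type == "comment")

    return {
        "msi_score": msi_score,
        # Guards avoid ZeroDivisionError on empty input
        "msi_per_session": msi_score / n_sessions if n_sessions else 0.0,
        "comment_ratio": n_comments / n_events if n_events else 0.0,
    }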

Model Metrics

Ranking Quality

Metric	Formula	Target	What it measures
NDCG@K	Normalized Discounted Cumulative Gain	> 0.4	Ranking relevance
AUC (per action)	Per-action prediction AUC	> 0.75	Prediction quality
Calibration	pred_P(action) / actual_P(action)	0.95-1.05	Score accuracy
Multi-objective Pareto	Pareto front coverage	Growing	Balance of objectives
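
The table names NDCG@K without spelling out the computation. A minimal sketch follows, assuming graded relevance labels per shown item (for example, derived from the user's actual interactions), linear gain, and the usual log2 position discount. The function names `dcg_at_k` and `ndcg_at_k` are illustrative, not taken from any particular library.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k items (log2 position discount)."""
    return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k):
    """NDCG@k: DCG of the shown order divided by the DCG of the ideal order."""
    ideal_dcg = dcg_at_k(sorted(relevances, reverse=True), k)
    if ideal_dcg == 0:
        # No relevant items at all: define NDCG as 0 instead of dividing by zero
        return 0.0
    return dcg_at_k(relevances, k) / ideal_dcg


# Relevance labels of the items in the order the feed showed them
shown_order = [1, 3, 0, 2]
score = ndcg_at_k(shown_order, k=10)
```

A feed that already shows items in descending relevance scores 1.0; any swap that pushes a relevant item down lowers the score.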

Per-Action Prediction

import numpy as np
from sklearn.metrics import log_loss, roc_auc_score


def compute_multi_action_metrics(y_true_dict, y_pred_dict):
    """Separate metrics for each action type."""
    results = {}
    actions = ["like", "comment", "share", "click", "watch_30s", "hide", "report"]

    for action in actions:
        y_true = np.asarray(y_true_dict[action])
        y_pred = np.asarray(y_pred_dict[action], dtype=float)

        base_rate = y_true.mean()
        if base_rate in (0.0, 1.0):
            # AUC and the calibration ratio are undefined with a single class
            continue

        results[action] = {
            "auc": roc_auc_score(y_true, y_pred),
            "logloss": log_loss(y_true, y_pred),
            "base_rate": base_rate,
            # Mean predicted probability over observed rate; 1.0 = well calibrated
            "calibration": y_pred.mean() / base_rate,
        }

    return results

# Target AUCs:
# like:      0.75 (common action, easier)
# comment:   0.70 (rarer, harder)
# share:     0.68 (rarest, hardest)
# click:     0.72
# hide:      0.65 (negative signal, sparse data)

Diversity & Freshness

Metric	Description	Target
Content type diversity	Entropy of content types in feed	> 1.5 bits
Source diversity	% unique content creators in feed	> 30%
Freshness	Median age of shown content	< 4 hours
Echo chamber score	Viewpoint diversity in political content	> 0.6
Repeat content rate	% items seen before	< 5%
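
To make the first two rows concrete, here is a small sketch of how content type entropy and source diversity could be computed for a single feed. It assumes each shown item carries a content-type label and a creator id. The function names are hypothetical.

```python
import math
from collections import Counter


def content_type_entropy(content_types):
    """Shannon entropy (in bits) of the content-type distribution in one feed."""
    total = len(content_types)
    if total == 0:
        return 0.0
    counts = Counter(content_types)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def source_diversity(creator_ids):
    """Share of unique creators among the shown items."""
    if not creator_ids:
        return 0.0
    return len(set(creator_ids)) / len(creator_ids)


feed_types = ["video", "photo", "link", "text"]
entropy_bits = content_type_entropy(feed_types)           # uniform over 4 types
unique_share = source_diversity(["a", "a", "b", "c"])     # 3 unique creators of 4 items
```

Because the entropy of n equally likely types is log2(n), a target above 1.5 bits requires at least three content types in the feed (log2(3) ≈ 1.585).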

Online Metrics (A/B Test)

Primary Metrics

Metric	MDE	Test duration
MSI per session	+0.5%	2 weeks
Session duration	+1%	1 week
DAU (28-day)	+0.1%	4 weeks
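
The link between MDE and the traffic a test needs can be sketched with the standard normal-approximation sample-size formula. This is a simplified illustration that assumes a two-sided z-test on a per-user mean, equal group sizes, and a known standard deviation. The function name, parameters, and example numbers are hypothetical.

```python
import math
from statistics import NormalDist


def users_per_group(baseline_mean, baseline_std, relative_mde, alpha=0.05, power=0.8):
    """Approximate users per group needed to detect a relative lift of `relative_mde`."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided significance level
    z_power = NormalDist().inv_cdf(power)           # desired statistical power
    delta = baseline_mean * relative_mde            # absolute effect to detect
    n = 2 * ((z_alpha + z_power) * baseline_std / delta) ** 2
    return math.ceil(n)


# Hypothetical metric: mean 10.0 per user, std 10.0, detect a +1% lift
n_one_percent = users_per_group(10.0, 10.0, 0.01)
n_half_percent = users_per_group(10.0, 10.0, 0.005)
```

With these made-up inputs the +1% case needs roughly 157,000 users per group. Required sample size scales with 1 / MDE², so halving the MDE roughly quadruples it, which is one reason the smallest MDE in the table is paired with the longest test.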

Guardrail Metrics

Metric	Constraint	Rationale
Content diversity	No decrease > 5%	Prevent echo chambers
Misinformation exposure	No increase > 1%	Trust & safety
Clickbait ratio	No increase > 2%	Content quality
Creator impression equity	Gini < 0.7	Creator ecosystem health
Ad revenue	No decrease > 0.5%	Business viability
Negative actions (hide, report)	No increase > 3%	User satisfaction
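
A guardrail check can be sketched as follows. The sketch assumes the percentage constraints above are relative changes between control and treatment, and it computes the Gini coefficient of impressions per creator with the standard sorted-rank formula. The names `gini` and `violated_guardrails` and the metric keys are hypothetical.

```python
def gini(values):
    """Gini coefficient of non-negative values: 0 = equal, near 1 = concentrated."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Sorted-rank formula: G = 2 * sum(rank * x) / (n * sum(x)) - (n + 1) / n
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def violated_guardrails(control, treatment, limits):
    """limits maps metric name -> (direction, max relative change)."""
    violations = []
    for name, (direction, limit) in limits.items():
        change = (treatment[name] - control[name]) / control[name]
        if direction == "increase" and change > limit:
            violations.append(name)
        elif direction == "decrease" and -change > limit:
            violations.append(name)
    return violations


limits = {"clickbait_ratio": ("increase", 0.02), "ad_revenue": ("decrease", 0.005)}
control = {"clickbait_ratio": 0.10, "ad_revenue": 100.0}
treatment = {"clickbait_ratio": 0.105, "ad_revenue": 99.8}
violations = violated_guardrails(control, treatment, limits)
```

In this made-up example the clickbait ratio rises by 5% relative, which exceeds the 2% limit, while the 0.2% revenue drop stays within the 0.5% limit. A launch decision would typically require an empty violation list in addition to a win on the primary metric.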

System Metrics

Metric	Target
Feed generation p50	< 100ms
Feed generation p99	< 200ms
Throughput	500K QPS
Feature freshness	< 5 min
Model freshness	< 6 hours
Cache hit rate	> 80%

Misconception: optimizing click-through rate

Optimizing CTR leads to clickbait. In 2018 Facebook switched its primary metric from engagement time to MSI (meaningful social interactions). The result: session time dropped by 5%, but DAU retention rose and long-term engagement stabilized. Mentioning MSI in an interview is a strong signal: it shows you think about long-term effects, not only proxy metrics.

Misconception: a single model predicts 'engagement'

Engagement is not one thing. Like (low effort), comment (high effort), share (distribution), and watch_time (passive consumption) are different actions with different value. Production systems predict P(action) for each action separately, then combine the predictions into a single score: \(\sum w_i \cdot P(\text{action}_i)\). The weights \(w_i\) are a product decision, not an ML decision, and they can be changed without retraining.
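
The combined score can be illustrated with a short sketch. It assumes each candidate already carries per-action probabilities from the prediction models. The weight values, the dictionary layout, and the function names are hypothetical and chosen only to show that changing weights reorders the feed without touching the models.

```python
def combined_score(predictions, weights):
    """Weighted sum of per-action probabilities: sum_i w_i * P(action_i)."""
    return sum(weights.get(action, 0.0) * p for action, p in predictions.items())


def rank_feed(candidates, weights):
    """Order candidates by combined score, highest first."""
    return sorted(
        candidates,
        key=lambda c: combined_score(c["predictions"], weights),
        reverse=True,
    )


candidates = [
    {"id": "a", "predictions": {"like": 0.5, "comment": 0.0}},
    {"id": "b", "predictions": {"like": 0.0, "comment": 0.25}},
]

# Same model outputs, two different product decisions about the weights
likes_first = rank_feed(candidates, {"like": 1.0, "comment": 1.0})
comments_first = rank_feed(candidates, {"like": 1.0, "comment": 5.0})
```

With equal weights item "a" ranks first (0.5 vs 0.25); raising the comment weight to 5 puts item "b" first (1.25 vs 0.5). Negative signals such as hide or report could enter the same sum with negative weights.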

Interview section

Question: "What metrics would you use for a news feed?"

Weak answer: "CTR and time on the platform."

Strong answer: "Multi-level. Business: DAU/MAU ratio (> 0.65), session duration, ARPU, creator retention. User experience: MSI (meaningful social interactions: comments > shares > likes > clicks), content diversity (entropy > 1.5 bits), freshness (median < 4h). Model: per-action AUC (like 0.75, comment 0.70, share 0.68), NDCG@10, calibration per action. Guardrails: misinformation exposure, clickbait ratio, creator impression equity (Gini < 0.7), negative action rate. Key insight: naive CTR optimization leads to clickbait, which leads to DAU loss within 2-3 months. That is exactly why Facebook switched to MSI in 2018. The right combined score is \(\sum w_i \cdot P(\text{action}_i)\), where the weights are a product decision."