
Metrics and Success Measurement

This section covers metric frameworks, selection criteria, and common interview questions about product measurement.

Purpose of Product Metrics

| Purpose | Description |
| --- | --- |
| Alignment | Establish a shared definition of success |
| Decision-making | Replace opinion-based debates with data |
| Accountability | Measure outcomes, not just output |

Metric Frameworks

AARRR (Pirate Metrics)

User journey-based framework for consumer products with clear funnels.

| Stage | Definition | Example Metrics |
| --- | --- | --- |
| Acquisition | How users find the product | Signups, installs, traffic sources |
| Activation | First value experience | Onboarding completion, first action |
| Retention | Return usage | DAU/MAU, cohort retention |
| Referral | User-driven growth | Invite rate, viral coefficient |
| Revenue | Monetization | ARPU, conversion rate, LTV |

Applicability: Consumer apps, e-commerce, freemium SaaS

Limitations: Does not map well to B2B with sales-driven acquisition or enterprise products with long sales cycles.
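The AARRR stages above form a funnel, so the most useful view is usually stage-to-stage conversion rather than raw counts. A minimal sketch with invented counts (the stage labels follow the framework; the numbers are illustrative):

```python
# Illustrative AARRR funnel: (stage, user count). Counts are invented.
funnel = [
    ("acquisition", 10_000),  # signups
    ("activation", 4_000),    # completed onboarding
    ("retention", 1_800),     # returned in week 2
    ("referral", 360),        # sent at least one invite
    ("revenue", 240),         # converted to paid
]

def stage_conversion(funnel):
    """Return the conversion rate from each stage to the next."""
    rates = {}
    for (name_a, count_a), (name_b, count_b) in zip(funnel, funnel[1:]):
        rates[f"{name_a}->{name_b}"] = count_b / count_a
    return rates

print(stage_conversion(funnel))
# e.g. acquisition->activation = 0.4 (4,000 of 10,000 activated)
```

The sharpest drop-off between adjacent stages is typically where to focus next.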

HEART Framework

Google-developed framework for user experience measurement.

| Dimension | Definition | Signal Examples | Metric Examples |
| --- | --- | --- | --- |
| Happiness | User satisfaction | Survey responses | NPS, CSAT, satisfaction rating |
| Engagement | Usage intensity | Sessions, actions | DAU, time in app, actions per session |
| Adoption | New user/feature uptake | Feature discovery | Feature adoption rate, new user activation |
| Retention | Continued usage | Return visits | D7/D30 retention, churn rate |
| Task Success | Goal completion | Task attempts | Completion rate, error rate, time to complete |

GSM approach: For each dimension, define Goal, identify Signal, select Metric.

Applicability: Internal tools, developer platforms, B2B products where AARRR does not fit naturally.
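The GSM approach can be made concrete as a simple mapping from each HEART dimension to its Goal, Signal, and Metric. A sketch for two dimensions (the entries are illustrative, not a prescribed schema):

```python
# Illustrative Goal -> Signal -> Metric (GSM) mapping for two HEART
# dimensions. Goals and signals here are example assumptions.
heart_gsm = {
    "task_success": {
        "goal": "Users complete the core workflow without errors",
        "signal": "Task attempts and their outcomes",
        "metric": "completion_rate",
    },
    "engagement": {
        "goal": "Users find the product valuable enough to use often",
        "signal": "Sessions and in-app actions",
        "metric": "actions_per_session",
    },
}

def completion_rate(attempts: int, completions: int) -> float:
    """Task Success metric: share of attempts that finished successfully."""
    return completions / attempts if attempts else 0.0

print(completion_rate(attempts=200, completions=150))  # 0.75
```

Working through GSM explicitly keeps the metric tied to a goal, which guards against picking whatever happens to be logged.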

North Star Metrics

Single metric capturing core value delivery.

| Company | North Star | Rationale |
| --- | --- | --- |
| Airbnb | Nights booked | Captures both host and guest value |
| Spotify | Time spent listening | Direct measure of user value |
| Slack | Messages sent by DAU | Measures actual usage, not just logins |
| Uber | Rides completed | End-to-end success metric |

North Star criteria:

  • Directly tied to customer value
  • Leading indicator for revenue
  • Difficult to game without improving product

Poor North Star examples:

  • "Total users" (includes churned users)
  • "Page views" (does not indicate value delivery)

Metric Categories

Leading vs Lagging Indicators

| Type | Definition | Examples | Use Case |
| --- | --- | --- | --- |
| Lagging | Outcome metrics measured after the fact | Revenue, churn, market share | Evaluate results |
| Leading | Predictive metrics indicating future outcomes | Activation rate, weekly active usage, NPS | Early warning, iteration |

Recommendation: Use a leading indicator as the primary metric, then validate results with a lagging indicator.

Counter Metrics (Guardrails)

Every primary metric requires a guardrail to prevent gaming.

| Primary Metric | Guardrail | Rationale |
| --- | --- | --- |
| Notification CTR | Unsubscribe rate | Prevent spammy tactics |
| Session length | Task completion | Prevent confusion-driven time |
| Signup rate | Activation rate | Prevent low-quality signups |
| Revenue | Refund rate | Prevent aggressive sales |
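The primary/guardrail pairing can be expressed as a simple ship decision: a primary-metric win only counts if the guardrail did not degrade beyond a tolerance. A minimal sketch with an assumed tolerance value:

```python
# Sketch of a guardrail check. The -1% tolerance is an illustrative
# assumption; real thresholds come from the team's experiment policy.

def ship_decision(primary_lift: float, guardrail_delta: float,
                  guardrail_tolerance: float = -0.01) -> str:
    """Decide whether a change ships.

    primary_lift: relative change in the primary metric (positive = better).
    guardrail_delta: change in guardrail health (negative = degraded),
        e.g. unsubscribe rate rising 3 points -> guardrail_delta = -0.03.
    """
    if primary_lift <= 0:
        return "no win"
    if guardrail_delta < guardrail_tolerance:
        return "blocked by guardrail"
    return "ship"

# Notification CTR up 8%, but unsubscribes worsened 3 points:
print(ship_decision(primary_lift=0.08, guardrail_delta=-0.03))
# -> "blocked by guardrail"
```

The point is that the guardrail has veto power: gaming the primary metric at the guardrail's expense fails the check.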

Metric Trees

Decompose high-level metrics to identify specific drivers.

Use metric trees to diagnose why a metric changed and identify actionable levers.
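As a concrete example, revenue decomposes into active users × paid conversion × average revenue per paying user; holding two drivers fixed and varying the third shows which branch of the tree moved. A sketch with invented figures:

```python
# Sketch of a one-level metric tree for revenue. All figures invented.

def revenue(users: int, paid_conversion: float, arppu: float) -> float:
    """Revenue = active users x paid conversion x avg revenue per payer."""
    return users * paid_conversion * arppu

before = dict(users=100_000, paid_conversion=0.05, arppu=20.0)
after = dict(users=100_000, paid_conversion=0.04, arppu=20.0)

print(revenue(**before))  # 100000.0
print(revenue(**after))   # 80000.0
# Traffic and ARPPU are flat, so the 20% revenue drop traces to
# paid conversion -- that branch of the tree is the lever to pull.
```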

Interview Question Types

"How would you measure success for X?"

Response structure:

  1. State feature/product goal
  2. Define success criteria
  3. Identify primary metric with rationale
  4. Add secondary metrics for context
  5. Specify guardrail metrics

Example: "How would you measure success for Instagram collaborative stories?"

| Component | Response |
| --- | --- |
| Goal | Increase engagement, strengthen social connections |
| Primary metric | Collaborative stories created per week |
| Secondary metrics | Collaborators per story, views vs. regular stories, repeat usage |
| Guardrail | Regular story creation (ensure no cannibalization) |

"This metric dropped 10%. What do you do?"

Structured diagnosis approach:

| Step | Action | Purpose |
| --- | --- | --- |
| 1. Validate data | Check tracking, pipeline, definition changes | Eliminate false positives |
| 2. Assess scope | All users or specific segment? | Narrow investigation |
| 3. Check timing | What changed when the drop started? | Identify correlation |
| 4. Segment analysis | Platform, geography, user type | Isolate root cause |
| 5. Form hypotheses | Internal vs external factors | Prioritize investigation |
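The segment-analysis step amounts to comparing the metric per segment before and after the drop and ranking segments by relative decline. A minimal sketch with invented platform-level activation rates:

```python
# Sketch of segment analysis for a metric drop. Rates are invented;
# segments could equally be geographies or user types.
before = {"ios": 0.50, "android": 0.50, "web": 0.48}
after = {"ios": 0.49, "android": 0.35, "web": 0.47}

def biggest_drop(before: dict, after: dict) -> str:
    """Return the segment with the largest relative decline."""
    return min(before, key=lambda s: (after[s] - before[s]) / before[s])

print(biggest_drop(before, after))
# -> "android": the drop is isolated to one platform, which points
# toward an Android-specific cause (e.g. a recent release) rather
# than a product-wide change.
```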

"Should we optimize for DAU or MAU?"

Selection criteria:

| Metric | Use When |
| --- | --- |
| DAU | Product designed for daily use (social media, messaging, news) |
| MAU | Product used episodically (travel booking, job hunting, tax software) |
| WAU | Product with weekly natural cadence (planning tools, grocery apps) |

Examples:

  • Instagram: DAU (designed for daily engagement)
  • Airbnb: MAU (travel is episodic)
  • Pinterest: WAU (planning and inspiration use case)
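A quick empirical check on which cadence fits is the DAU/MAU "stickiness" ratio: the fraction of monthly users who show up on a typical day. A sketch with illustrative numbers:

```python
# Sketch: DAU/MAU stickiness ratio. User counts are illustrative.

def stickiness(dau: int, mau: int) -> float:
    """DAU/MAU: fraction of monthly users active on a given day."""
    return dau / mau if mau else 0.0

print(round(stickiness(dau=500_000, mau=1_000_000), 2))  # 0.5
print(round(stickiness(dau=50_000, mau=1_000_000), 2))   # 0.05
```

A high ratio suggests a daily habit (optimize DAU); a low ratio suggests episodic use, where pushing DAU would fight the product's natural cadence.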

Common Metric Errors

| Error | Description | Impact |
| --- | --- | --- |
| Measuring the measurable | Tracking what is easy to measure, not what matters | False confidence in the wrong direction |
| Metric overload | 50+ metrics on a dashboard | No actionable insights |
| No guardrails | Optimizing one metric while damaging others | Unintended negative consequences |
| Ignoring segments | Using averages that hide important patterns | Missing root causes |
| Vanity metrics | "10M registered users" with no activity context | Misleading stakeholders |

Metric Manipulation Awareness

| Technique | Example | Detection |
| --- | --- | --- |
| Cherry-picking timeframes | Compare to the worst historical month | Request consistent comparison periods |
| Survivorship bias | "Retained users love it" | Request full cohort analysis |
| Definition changes | Redefine "active" to an easier threshold | Track definition changes over time |
| Favorable segmentation | Report only power-user metrics | Request full user breakdown |

Company Metric Approaches

| Company | Approach | Key Insight |
| --- | --- | --- |
| Facebook | Shifted from engagement to "meaningful social interactions" | Metrics should evolve with understanding of user impact |
| Airbnb | "Nights booked" captures both marketplace sides | A good North Star balances supply and demand |
| Spotify | Balanced playlist metrics with artist ecosystem health | Metrics without context can mislead |
| Pinterest | Shifted from MAU to "weekly active pinners" | Match metrics to the product's natural cadence |
| Amazon | Input metrics (selection, price, convenience) | Control inputs to drive outputs |

Metric Selection Criteria

| Criterion | Question |
| --- | --- |
| Actionable | Can we influence this metric? |
| Aligned | Does it reflect business and user goals? |
| Understandable | Can stakeholders interpret it? |
| Timely | Is the feedback loop fast enough? |
| Resistant to gaming | Does improvement require real value delivery? |