Metrics and Success Measurement
This section covers metric frameworks, selection criteria, and common interview questions about product measurement.
Purpose of Product Metrics
| Purpose | Description |
|---|---|
| Alignment | Establish shared definition of success |
| Decision-making | Replace opinion-based debates with data |
| Accountability | Measure outcomes, not just output |
Metric Frameworks
AARRR (Pirate Metrics)
User journey-based framework for consumer products with clear funnels.
| Stage | Definition | Example Metrics |
|---|---|---|
| Acquisition | How users find the product | Signups, installs, traffic sources |
| Activation | First value experience | Onboarding completion, first action |
| Retention | Return usage | DAU/MAU, cohort retention |
| Referral | User-driven growth | Invite rate, viral coefficient |
| Revenue | Monetization | ARPU, conversion rate, LTV |
Applicability: Consumer apps, e-commerce, freemium SaaS
Limitations: Does not map well to B2B with sales-driven acquisition or enterprise products with long sales cycles.
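To make the funnel concrete, here is a minimal sketch over invented stage counts; the stage definitions in the comments (week-2 return, paid conversion) are illustrative assumptions, not benchmarks:

```python
# Minimal AARRR funnel sketch. Stage counts are invented for illustration;
# stage-to-stage conversion shows where the funnel leaks.
funnel = [
    ("Acquisition", 100_000),  # signups
    ("Activation",   42_000),  # completed onboarding
    ("Retention",    18_000),  # returned in week 2
    ("Referral",      2_700),  # sent at least one invite
    ("Revenue",       1_400),  # converted to paid
]

name, count = funnel[0]
print(f"{name:<12} {count:>7,}")
for (stage, count), (_, prev) in zip(funnel[1:], funnel):
    print(f"{stage:<12} {count:>7,}  ({count / prev:.1%} of previous stage)")
```

Reading conversion between adjacent stages, rather than against the top of the funnel, points at the specific stage to fix.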
HEART Framework
Google-developed framework for user experience measurement.
| Dimension | Definition | Signal Examples | Metric Examples |
|---|---|---|---|
| Happiness | User satisfaction | Survey responses | NPS, CSAT, satisfaction rating |
| Engagement | Usage intensity | Sessions, actions | DAU, time in app, actions per session |
| Adoption | New user/feature uptake | Feature discovery | Feature adoption rate, new user activation |
| Retention | Continued usage | Return visits | D7/D30 retention, churn rate |
| Task Success | Goal completion | Task attempts | Completion rate, error rate, time to complete |
GSM approach: for each dimension, define the Goal, identify observable Signals, then select the Metrics that quantify them (sketched below).
Applicability: Internal tools, developer platforms, B2B products where AARRR does not fit naturally.
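As a sketch of GSM in practice, the chain can be written down as data before any instrumentation exists. The product (an internal expense tool) and every goal, signal, and metric below are hypothetical:

```python
# GSM (Goals-Signals-Metrics) sketch for two HEART dimensions.
# The product, goals, signals, and metrics are invented for illustration.
from dataclasses import dataclass

@dataclass
class GSM:
    dimension: str
    goal: str    # what success looks like for this dimension
    signal: str  # observable user behavior indicating progress
    metric: str  # quantified, trackable version of the signal

heart_plan = [
    GSM("Task Success",
        goal="Users can file an expense report without help",
        signal="Reports submitted without errors or support tickets",
        metric="First-attempt completion rate"),
    GSM("Adoption",
        goal="New users discover report templates",
        signal="Template picker opened during the first session",
        metric="Share of new users applying a template in week 1"),
]

for row in heart_plan:
    print(f"{row.dimension}: {row.goal} -> {row.metric}")
```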
North Star Metrics
Single metric capturing core value delivery.
| Company | North Star | Rationale |
|---|---|---|
| Airbnb | Nights booked | Captures both host and guest value |
| Spotify | Time spent listening | Direct measure of user value |
| Slack | Messages sent by daily active users | Measures actual usage, not just logins |
| Uber | Rides completed | End-to-end success metric |
North Star criteria:
- Directly tied to customer value
- Leading indicator of revenue
- Difficult to game without genuinely improving the product
Poor North Star examples:
- "Total users" (includes churned users)
- "Page views" (does not indicate value delivery)
Metric Categories
Leading vs Lagging Indicators
| Type | Definition | Examples | Use Case |
|---|---|---|---|
| Lagging | Outcome metrics measured after the fact | Revenue, churn, market share | Evaluate results |
| Leading | Predictive metrics indicating future outcomes | Activation rate, weekly active usage, NPS | Early warning, iteration |
Recommendation: the primary metric should be a leading indicator, validated against a lagging indicator.
Counter Metrics (Guardrails)
Every primary metric requires a guardrail so that it cannot be gamed or improved at the expense of the rest of the product (a launch-check sketch follows the table).
| Primary Metric | Guardrail | Rationale |
|---|---|---|
| Notification CTR | Unsubscribe rate | Prevent spammy tactics |
| Session length | Task completion | Prevent confusion-driven time |
| Signup rate | Activation rate | Prevent low-quality signups |
| Revenue | Refund rate | Prevent aggressive sales |
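A minimal launch-check sketch, as referenced above, pairing a primary metric with guardrail thresholds. All metric names, deltas, and thresholds are invented; real deltas would come from an A/B test with significance testing:

```python
# Launch-check sketch: ship only if the primary metric improves and no
# guardrail regresses past its threshold. All names, deltas, and
# thresholds are invented for illustration.
primary = {"metric": "signup_rate", "delta": +0.04}  # +4% relative lift
guardrails = [
    {"metric": "activation_rate", "delta": -0.06, "max_drop": 0.02},
    {"metric": "refund_rate",     "delta": +0.01, "max_rise": 0.02},
]

violations = [
    g["metric"] for g in guardrails
    if g["delta"] < -g.get("max_drop", float("inf"))
    or g["delta"] > g.get("max_rise", float("inf"))
]

if violations:
    print("Hold: guardrail regression on " + ", ".join(violations))
elif primary["delta"] > 0:
    print("Ship: primary improved and guardrails held.")
else:
    print("Hold: no improvement in the primary metric.")
```

In this invented example the signup rate rose but activation fell past its threshold, exactly the low-quality-signups failure mode the table describes.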
Metric Trees
Decompose a high-level metric into its component drivers, e.g. revenue = visitors × conversion rate × average order value. Use the tree to diagnose why a metric changed and to identify the most actionable lever, as in the sketch below.
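A minimal worked example, assuming the revenue decomposition above with fabricated weekly numbers:

```python
# Metric-tree sketch: weekly revenue decomposed into three drivers.
# All numbers are invented; the point is isolating which branch moved.
def revenue(visitors, conversion, aov):
    return visitors * conversion * aov

last_week = dict(visitors=200_000, conversion=0.025, aov=40.0)
this_week = dict(visitors=210_000, conversion=0.019, aov=41.0)

print(f"Revenue: {revenue(**last_week):,.0f} -> {revenue(**this_week):,.0f}")

# First-order attribution: move one driver at a time while holding the
# others at last week's values. (Driver deltas need not sum exactly to
# the total change because of interaction effects.)
for driver in last_week:
    moved = {**last_week, driver: this_week[driver]}
    delta = revenue(**moved) - revenue(**last_week)
    print(f"  {driver:<10} alone: {delta:+,.0f}")
```

Here the conversion branch accounts for the bulk of the decline, so investigation starts there rather than at traffic or pricing.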
Interview Question Types
"How would you measure success for X?"
Response structure:
- State feature/product goal
- Define success criteria
- Identify primary metric with rationale
- Add secondary metrics for context
- Specify guardrail metrics
Example: "How would you measure success for Instagram collaborative stories?"
| Component | Response |
|---|---|
| Goal | Increase engagement, strengthen social connections |
| Primary metric | Collaborative stories created per week |
| Secondary metrics | Collaborators per story, views vs. regular stories, repeat usage |
| Guardrail | Regular story creation (ensure no cannibalization) |
"This metric dropped 10%. What do you do?"
Structured diagnosis approach:
| Step | Action | Purpose |
|---|---|---|
| 1. Validate data | Check tracking, pipeline, definition changes | Eliminate false positives |
| 2. Assess scope | All users or specific segment? | Narrow investigation |
| 3. Check timing | What changed when the drop started? | Identify correlation |
| 4. Segment analysis | Platform, geography, user type | Isolate root cause |
| 5. Form hypotheses | Internal vs external factors | Prioritize investigation |
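A sketch of step 4 (segment analysis) on fabricated platform data: compare the metric before and after the drop within each segment to see where the change actually lives:

```python
# Segment-analysis sketch for a conversion-rate drop. All rows are
# fabricated (conversions, sessions) aggregates per platform and period.
rows = [
    ("ios",     "before", 4_100, 90_000), ("ios",     "after", 4_050, 89_000),
    ("android", "before", 3_900, 85_000), ("android", "after", 2_300, 86_000),
    ("web",     "before", 2_000, 60_000), ("web",     "after", 1_980, 61_000),
]

rates = {(seg, period): conv / sess for seg, period, conv, sess in rows}
for seg in ("ios", "android", "web"):
    before, after = rates[(seg, "before")], rates[(seg, "after")]
    change = (after - before) / before
    flag = "  <-- investigate" if change < -0.05 else ""
    print(f"{seg:<8} {before:.2%} -> {after:.2%} ({change:+.1%}){flag}")
```

In this invented example the drop is isolated to Android, which narrows the hypothesis space (e.g., a recent Android release) far faster than staring at the blended number.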
"Should we optimize for DAU or MAU?"
Selection criteria:
| Metric | Use When |
|---|---|
| DAU | Product designed for daily use (social media, messaging, news) |
| MAU | Product used episodically (travel booking, job hunting, tax software) |
| WAU | Product with weekly natural cadence (planning tools, grocery apps) |
Examples:
- Instagram: DAU (designed for daily engagement)
- Airbnb: MAU (travel is episodic)
- Pinterest: WAU (planning and inspiration use case)
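The DAU/MAU ratio ("stickiness") makes this selection measurable: it approximates how many days per month the average monthly user shows up. A sketch with invented numbers:

```python
# DAU/MAU ("stickiness") sketch: roughly, the fraction of days in a month
# the average monthly user shows up. Products and numbers are invented.
products = {
    "messaging app":  (60_000_000, 100_000_000),  # (DAU, MAU)
    "travel booking": (1_500_000,   30_000_000),
}

for name, (dau, mau) in products.items():
    print(f"{name:<15} stickiness = {dau / mau:.0%}")

# ~60% implies near-daily use (optimize DAU); ~5% implies episodic use,
# where MAU or WAU is the fairer target.
```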
Common Metric Errors
| Error | Description | Impact |
|---|---|---|
| Measuring what is easy | Tracking what is convenient to measure rather than what matters | False confidence in the wrong direction |
| Metric overload | 50+ metrics on dashboard | No actionable insights |
| No guardrails | Optimizing one metric while damaging others | Unintended negative consequences |
| Ignoring segments | Using averages that hide important patterns | Missing root causes |
| Vanity metrics | "10M registered users" with no activity context | Misleading stakeholders |
Metric Manipulation Awareness
| Technique | Example | Detection |
|---|---|---|
| Cherry-picking timeframes | Compare to worst historical month | Request consistent comparison periods |
| Survivorship bias | "Retained users love it" | Request full cohort analysis |
| Definition changes | Redefine "active" to easier threshold | Track definition changes over time |
| Favorable segmentation | Report only power user metrics | Request full user breakdown |
Company Metric Approaches
| Company | Approach | Key Insight |
|---|---|---|
| Facebook | Shifted from engagement to "meaningful social interactions" | Metrics should evolve with understanding of user impact |
| Airbnb | "Nights booked" captures both marketplace sides | Good North Star balances supply and demand |
| Spotify | Balanced playlist metrics with artist ecosystem health | Metrics without context can mislead |
| Pinterest | Shifted from MAU to "weekly active pinners" | Match metrics to natural product cadence |
| Amazon | Input metrics (selection, price, convenience) | Control inputs to drive outputs |
Metric Selection Criteria
| Criterion | Question |
|---|---|
| Actionable | Can we influence this metric? |
| Aligned | Does it reflect business and user goals? |
| Understandable | Can stakeholders interpret it? |
| Timely | Is feedback loop fast enough? |
| Resistant to gaming | Does improvement require real value delivery? |