Design a Fraud Detection System
Concepts tested: Class imbalance, anomaly detection, rule-based vs ML hybrid, velocity features, real-time scoring, precision-recall trade-offs, cost-sensitive learning, model explainability
Problem Statement
Design a fraud detection system for payments, account takeover, or fake account prevention.
Clarification Questions
| Question | Design Impact |
|---|---|
| Fraud type | Feature engineering approach |
| Available actions | Block, allow, review, step-up authentication |
| Latency requirement | Real-time vs batch architecture |
| Cost structure | Loss from fraud vs loss from false declines |
| Available data | Transaction history, device, behavioral, network |
Problem Characteristics
Class Imbalance
Fraud rates are typically low:
- Credit card fraud: 0.1% of transactions
- Account takeover: 0.01% of logins
- Fake accounts: 1-5% of signups
Implication: Standard accuracy metrics are misleading. A model that always predicts "not fraud" achieves 99.9% accuracy on credit card transactions while detecting zero fraud.
Adversarial Environment
Fraudsters actively adapt to detection methods:
- Study detection patterns and develop workarounds
- Share techniques in fraud communities
- Use automation (bots, scripts)
- Continuously evolve tactics
Error Costs
| Decision | Error Impact |
|---|---|
| Block legitimate user | Lost revenue, negative experience, churn |
| Allow fraudster | Direct financial loss |
| Manual review backlog | Operational cost, delayed decisions |
System Architecture
Real-Time Scoring Pipeline:
- Feature Enrichment: Add derived features to the transaction
- Rules Engine: Apply deterministic rules
- ML Models: Run supervised, anomaly detection, and graph-based models
- Output: Combined score with explanation
Decision Engine (thresholds sketched after this list):
- Score below 20: ALLOW the transaction
- Score 20-70: Send to REVIEW queue for manual inspection
- Score 70 or above: BLOCK the transaction
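A minimal sketch of this decision logic in Python; the cutoffs are the illustrative values above and in practice would be tuned against business costs (see Threshold Optimization):

```python
def decide(score: float) -> str:
    """Map a combined 0-100 fraud score to an action (illustrative thresholds)."""
    if score < 20:
        return "ALLOW"   # low risk: approve immediately
    if score < 70:
        return "REVIEW"  # ambiguous: queue for manual inspection
    return "BLOCK"       # high risk: decline the transaction
```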
Async Pipeline (batch analysis):
- Pattern Mining: Discover new fraud tactics from historical data
- Network Analysis: Identify fraud rings through connection analysis
- Label Feedback: Update models with confirmed fraud/non-fraud labels
Feature Engineering
Transaction Features
| Feature Category | Examples |
|---|---|
| Transaction details | Amount, merchant category, payment method |
| Amount patterns | Average transaction size, deviation from typical |
| Velocity | Transactions in last 1h/24h/7d |
| Geographic | Distance from typical location, high-risk countries |
| Temporal | Time of day, day of week, holiday |
User Behavior Features
| Feature | Signal |
|---|---|
| Device fingerprint | New vs known device |
| Login patterns | Unusual login times, locations |
| Session behavior | Click patterns, navigation speed |
| Account age | New accounts are higher risk |
| Historical fraud | Past fraud/disputes on account |
Aggregated Features
Velocity features are computed by aggregating a user's transaction history over sliding time windows (a sketch follows the table):
| Feature | Description |
|---|---|
| Transaction count (1h/24h) | Number of transactions in the last 1 hour or 24 hours |
| Transaction amount (1h) | Total transaction amount in the last hour |
| Distinct merchants (24h) | Number of unique merchants in last 24 hours |
| Distinct devices (7d) | Number of unique devices used in last 7 days |
| Average transaction amount (30d) | Historical average transaction size |
| Amount vs average ratio | Current transaction amount divided by 30-day average |
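A minimal pandas sketch of these aggregations, assuming a per-user history DataFrame with hypothetical columns timestamp, amount, merchant_id, and device_id, sorted by time with the current transaction as the last row. A production system would maintain these counters incrementally in a streaming store rather than rescanning history on every request:

```python
import pandas as pd

def velocity_features(history: pd.DataFrame, now: pd.Timestamp) -> dict:
    """Compute the windowed aggregates above for one user."""
    last_1h = history[history["timestamp"] > now - pd.Timedelta(hours=1)]
    last_24h = history[history["timestamp"] > now - pd.Timedelta(hours=24)]
    last_7d = history[history["timestamp"] > now - pd.Timedelta(days=7)]
    last_30d = history[history["timestamp"] > now - pd.Timedelta(days=30)]
    avg_30d = last_30d["amount"].mean()
    current = history["amount"].iloc[-1]  # transaction being scored
    return {
        "txn_count_1h": len(last_1h),
        "txn_amount_1h": last_1h["amount"].sum(),
        "distinct_merchants_24h": last_24h["merchant_id"].nunique(),
        "distinct_devices_7d": last_7d["device_id"].nunique(),
        "avg_amount_30d": avg_30d,
        # guard against an empty 30-day window (mean is NaN)
        "amount_vs_avg_ratio": current / avg_30d if avg_30d > 0 else 1.0,
    }
```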
Network/Graph Features
Fraudsters often operate in rings, reusing devices, addresses, and payment instruments across accounts (a sketch of these counts follows the table).
| Feature | Description |
|---|---|
| Shared device count | Number of users sharing same device |
| Shared address count | Number of accounts at same address |
| Shared payment method | Number of accounts using same card |
| Network cluster risk | Aggregate risk of connected entities |
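A sketch of the shared-entity counts in pandas, assuming a hypothetical accounts table with columns account_id, device_id, address, and card_hash:

```python
import pandas as pd

def shared_entity_counts(accounts: pd.DataFrame) -> pd.DataFrame:
    """Count, for each account, how many accounts share its device,
    address, and payment method; high counts hint at fraud rings."""
    out = accounts.copy()
    for col, feature in [("device_id", "shared_device_count"),
                         ("address", "shared_address_count"),
                         ("card_hash", "shared_payment_count")]:
        out[feature] = out.groupby(col)[col].transform("size")
    return out
```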
Model Approaches
1. Rule-Based System
A rule-based fraud detection system evaluates transactions against predefined conditions:
| Rule | Points | Trigger Condition |
|---|---|---|
| High amount new account | +30 | Account less than 7 days old AND transaction over $500 |
| Unusual location | +20 | Distance from typical location exceeds 1000 miles |
| High velocity | +25 | More than 5 transactions in the last hour |
The system returns the cumulative score and the list of triggered rule reasons, as in the sketch below.
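A minimal sketch of such a rules engine, assuming the transaction arrives as a dict with hypothetical keys like account_age_days and txn_count_1h:

```python
def rule_score(txn: dict) -> tuple[int, list[str]]:
    """Apply the illustrative rules above; return (points, reasons)."""
    score, reasons = 0, []
    if txn["account_age_days"] < 7 and txn["amount"] > 500:
        score += 30
        reasons.append("high_amount_new_account")
    if txn["distance_from_typical_miles"] > 1000:
        score += 20
        reasons.append("unusual_location")
    if txn["txn_count_1h"] > 5:
        score += 25
        reasons.append("high_velocity")
    return score, reasons
```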
| Characteristic | Assessment |
|---|---|
| Interpretability | High |
| Deployment speed | Fast |
| Training data required | None |
| Generalization | Limited |
| Evasion resistance | Low |
2. Supervised Learning
Train a classifier on labeled fraud and non-fraud examples.
Training approach (sketched after this list):
- Apply SMOTE (Synthetic Minority Over-sampling Technique) to handle class imbalance, generating synthetic fraud examples
- Train a Gradient Boosting Classifier with regularization parameters (max_depth, min_samples_leaf) to prevent overfitting
- Use predict_proba to get fraud probability scores for new transactions
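A minimal training sketch assuming scikit-learn and imbalanced-learn; the hyperparameter values are illustrative, and SMOTE should be applied only to the training split, never to evaluation data:

```python
from imblearn.over_sampling import SMOTE
from sklearn.ensemble import GradientBoostingClassifier

def train_fraud_model(X_train, y_train):
    """Oversample the minority (fraud) class, then fit a regularized GBM."""
    X_res, y_res = SMOTE(random_state=42).fit_resample(X_train, y_train)
    model = GradientBoostingClassifier(
        n_estimators=200,
        max_depth=4,          # shallow trees limit overfitting
        min_samples_leaf=50,  # require support behind each leaf
    )
    return model.fit(X_res, y_res)

# fraud probability for new transactions:
# p_fraud = model.predict_proba(X_new)[:, 1]
```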
Imbalance Handling Techniques:
| Technique | Method |
|---|---|
| Oversampling (SMOTE) | Generate synthetic minority examples |
| Undersampling | Reduce majority class examples |
| Class weights | Penalize minority misclassification more heavily |
| Anomaly detection | Train on normal behavior only |
3. Anomaly Detection
Detect unusual patterns without labeled fraud data.
Isolation Forest approach (sketched after this list):
- Train on normal (non-fraud) transactions only
- Set contamination parameter to expected anomaly rate (e.g., 1%)
- The model learns to isolate unusual patterns
- Score new transactions: negative scores indicate anomalies (more isolated points)
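A minimal scikit-learn sketch; X_normal and X_new are hypothetical feature matrices, and the contamination value is illustrative:

```python
from sklearn.ensemble import IsolationForest

iso = IsolationForest(contamination=0.01, random_state=42)
iso.fit(X_normal)  # fit on (predominantly) non-fraud transactions

# lower (more negative) decision_function values = more isolated = more anomalous
anomaly_scores = iso.decision_function(X_new)
flags = iso.predict(X_new)  # -1 = anomaly, 1 = normal
```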
| Method | Application |
|---|---|
| Isolation Forest | General anomaly detection |
| Autoencoders | High-dimensional data |
| DBSCAN clustering | Finding fraud clusters |
| Z-score | Simple threshold-based detection |
4. Ensemble Approach
Combine multiple signals for robust detection.
Scoring process (sketched after this list):
- Evaluate rule-based score and collect triggered reasons
- Get ML model probability and scale to 0-100
- Get anomaly score from Isolation Forest and normalize to 0-100
- Combine with weighted average: 30% rule score + 50% ML score + 20% anomaly score
- Return final score along with component scores and reasons for explainability
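A sketch of the blend, reusing rule_score from the rule-based sketch and the two models above; x is a hypothetical 1-D feature vector, and both the Isolation Forest normalization and the 30/50/20 weights are illustrative:

```python
import numpy as np

def ensemble_score(txn: dict, x: np.ndarray, ml_model, iso_model) -> dict:
    """Blend rule, ML, and anomaly signals into one 0-100 score."""
    rule_points, reasons = rule_score(txn)                     # already 0-100
    ml = ml_model.predict_proba(x.reshape(1, -1))[0, 1] * 100  # prob -> 0-100
    # decision_function is roughly in [-0.5, 0.5]; map more-negative
    # (more anomalous) values toward 100 -- a crude normalization
    raw = iso_model.decision_function(x.reshape(1, -1))[0]
    anomaly = float(np.clip(0.5 - raw, 0.0, 1.0)) * 100
    final = 0.3 * rule_points + 0.5 * ml + 0.2 * anomaly
    return {"score": final, "rule": rule_points, "ml": ml,
            "anomaly": anomaly, "reasons": reasons}
```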
Evaluation Metrics
Classification Metrics
| Metric | Definition | Business Meaning |
|---|---|---|
| Precision | TP / (TP + FP) | Percentage of blocked transactions that were fraud |
| Recall | TP / (TP + FN) | Percentage of fraud that was caught |
| False Positive Rate | FP / (FP + TN) | Percentage of legitimate users blocked |
Business Metrics
| Metric | Formula |
|---|---|
| Loss prevented | Total attempted fraud amount * recall |
| False decline cost | Legitimate blocked * avg value * margin |
| Review cost | Transactions reviewed * cost per review |
| Net benefit | Loss prevented - false decline cost - review cost |
Threshold Optimization
Cost-based threshold selection:
Define costs for each outcome: fraud loss (e.g., $100), false decline (e.g., $5), and review cost (e.g., $2).
For each candidate threshold:
- Calculate true positives (fraud correctly blocked), false positives (legitimate blocked), and false negatives (fraud missed)
- Compute net benefit: (fraud blocked * fraud loss) - (fraud missed * fraud loss) - (false declines * false decline cost); with a review band, also subtract (reviews * review cost)
- Select the threshold that maximizes net benefit
This approach optimizes for business outcomes rather than statistical metrics; a sketch follows.
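A minimal sketch of the sweep with a single block threshold; the cost values match the illustrative figures above, and a review band would add the review-cost term:

```python
import numpy as np

def best_threshold(scores: np.ndarray, y_true: np.ndarray,
                   fraud_loss: float = 100.0,
                   false_decline_cost: float = 5.0) -> tuple[float, float]:
    """Pick the block threshold that maximizes net benefit."""
    best_t, best_benefit = 0.0, -np.inf
    for t in np.linspace(0, 100, 101):
        blocked = scores >= t
        tp = np.sum(blocked & (y_true == 1))   # fraud blocked
        fp = np.sum(blocked & (y_true == 0))   # legitimate blocked
        fn = np.sum(~blocked & (y_true == 1))  # fraud missed
        benefit = tp * fraud_loss - fn * fraud_loss - fp * false_decline_cost
        if benefit > best_benefit:
            best_t, best_benefit = t, benefit
    return best_t, best_benefit
```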
Real-Time Considerations
Latency Requirements
| Use Case | Latency Requirement |
|---|---|
| Payment authorization | < 100ms |
| Account login | < 200ms |
| Account creation | < 500ms |
Feature Store Architecture
| Feature Type | Update Frequency | Examples |
|---|---|---|
| Batch features | Hourly/daily | 30-day average, historical fraud rate |
| Streaming features | Real-time | Transaction count last 1h, distinct merchants 24h |
| Request features | At request time | Distance from last transaction, device match |
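A sketch of how the three tiers might be merged at scoring time; batch_store and stream_store are hypothetical per-user lookups (e.g., a warehouse snapshot and a streaming aggregate keyed by user_id), and last_device_id is a hypothetical field:

```python
def assemble_features(txn: dict, batch_store: dict, stream_store: dict) -> dict:
    """Merge batch, streaming, and request-time features for one transaction."""
    user = txn["user_id"]
    features = {}
    features.update(batch_store.get(user, {}))   # e.g., avg_amount_30d
    features.update(stream_store.get(user, {}))  # e.g., txn_count_1h
    # request-time features computed from the transaction itself
    features["amount"] = txn["amount"]
    features["device_match"] = txn["device_id"] == txn.get("last_device_id")
    return features
```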
Model Lifecycle
Label Collection
| Source | Delay | Accuracy |
|---|---|---|
| Chargebacks | 30-90 days | High |
| Customer reports | 1-7 days | Medium |
| Manual review | Hours-days | High |
| Rule triggers | Immediate | Low-medium |
Retraining Triggers
- Performance degradation detected
- New fraud pattern identified
- Scheduled interval (weekly/monthly)
Explainability
Fraud decisions require explanation for manual review, customer disputes, and regulatory compliance.
SHAP (SHapley Additive exPlanations) approach (sketched after this list):
- Create a TreeExplainer for the trained model
- Calculate SHAP values for each feature in the transaction
- Identify top contributing features (e.g., transaction_velocity_1h contributed 25% to the score, new_device contributed 20%)
- Present explanations alongside the fraud score for transparency
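A minimal SHAP sketch, assuming the tree-based model from the supervised section, a hypothetical 1-D feature vector x, and a hypothetical feature_names list:

```python
import shap

explainer = shap.TreeExplainer(model)  # tree models are supported natively
shap_values = explainer.shap_values(x.reshape(1, -1))

# rank features by absolute contribution to this transaction's score
contributions = sorted(zip(feature_names, shap_values[0]),
                       key=lambda kv: abs(kv[1]), reverse=True)
top_reasons = contributions[:5]  # surfaced alongside the fraud score
```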
Summary
| Decision | Options | Recommendation |
|---|---|---|
| Approach | Rules, ML, Anomaly, Hybrid | Hybrid (rules + ML + anomaly) |
| Class imbalance | SMOTE, weights, undersampling | Combination (SMOTE + class weights) |
| Model | Logistic, GBM, Neural | GBM (interpretable, handles tabular) |
| Real-time features | Pre-compute, on-demand | Feature store (batch + streaming) |
| Threshold | Fixed, dynamic | Dynamic (optimize for business cost) |
| Explainability | None, SHAP, Rules | SHAP for ML, rule reasons for rules |