Top Percentile Fraud - Problem

The Leetcode Insurance Corp has developed an ML-driven predictive model to detect the likelihood of fraudulent claims. They allocate their most seasoned claim adjusters to address the top 5% of claims flagged by this model.

Write a solution to find the top 5 percentile of claims from each state. Return the result table ordered by state in ascending order, fraud_score in descending order, and policy_id in ascending order.

The Fraud table contains:

  • policy_id - unique policy identifier
  • state - state where the policy is issued
  • fraud_score - ML model fraud likelihood score

Table Schema

Fraud
Column Name Type Description
policy_id PK int Unique policy identifier
state varchar State where policy is issued
fraud_score int ML model fraud likelihood score
Primary Key: policy_id
Note: Each row represents a policy with its fraud risk assessment

Input & Output

Example 1 — Multiple States with Top 5%
Input Table:
policy_id state fraud_score
1 CA 95
2 CA 88
3 CA 75
4 NY 92
5 NY 89
6 NY 82
7 TX 98
Output:
policy_id state fraud_score
1 CA 95
4 NY 92
7 TX 98
💡 Note:

From CA (3 policies): top 5% includes policy 1 with highest score 95. From NY (3 policies): top 5% includes policy 4 with highest score 92. From TX (1 policy): policy 7 is automatically in top 5%. Results ordered by state ASC, fraud_score DESC, policy_id ASC.

Example 2 — Tied Fraud Scores
Input Table:
policy_id state fraud_score
1 FL 90
2 FL 90
3 FL 85
4 FL 80
Output:
policy_id state fraud_score
1 FL 90
2 FL 90
💡 Note:

Both policies 1 and 2 have the same fraud_score of 90, so they both get percentile rank 0.00 (tied for first place). Since 2 out of 4 policies = 50%, but they're tied at the top, both are included in the top 5% group.

Constraints

  • 1 ≤ policy_id ≤ 100000
  • 1 ≤ fraud_score ≤ 100
  • state consists of valid US state abbreviations
  • Each state has at least 1 policy

Visualization

Tap to expand
Top 5% Fraud Claims by StateInput Datapolicy_idstatescore1CA952CA883NY924NY89PERCENT_RANK()PARTITION BY stateORDER BY score DESCTop 5% Resultspolicy_idstatescore1CA953NY92Only policies with percentile_rank ≤ 0.05 selected
Understanding the Visualization
1
Input
Fraud table with policy_id, state, fraud_score
2
Percentile Rank
Calculate PERCENT_RANK per state
3
Filter
Keep top 5% (≤ 0.05) per state
Key Takeaway
🎯 Key Insight: Use PERCENT_RANK() to identify relative position within each group partition
Asked in
Amazon 15 Microsoft 12 Google 8
28.4K Views
Medium Frequency
~18 min Avg. Time
890 Likes
Ln 1, Col 1
Smart Actions
💡 Explanation
AI Ready
💡 Suggestion Tab to accept Esc to dismiss
// Output will appear here after running code
Code Editor Closed
Click the red button to reopen