Hopper Company Queries I - Problem
You are given three tables related to a ride-sharing company called Hopper:
- Drivers: Contains driver information and their join dates
- Rides: Contains all ride requests (both accepted and not accepted)
- AcceptedRides: Contains information about rides that were accepted by drivers
Write a SQL query to report the following statistics for each month of 2020:
- active_drivers: The number of drivers currently with the Hopper company by the end of each month
- accepted_rides: The number of accepted rides in that month
Return the result table ordered by month in ascending order, where month is the month's number (January is 1, February is 2, etc.).
Table Schema
Drivers
| Column Name | Type | Description |
|---|---|---|
driver_id
PK
|
int | Primary key, unique driver identifier |
join_date
|
date | Date when driver joined Hopper company |
Primary Key: driver_id
Rides
| Column Name | Type | Description |
|---|---|---|
ride_id
PK
|
int | Primary key, unique ride identifier |
user_id
|
int | User who requested the ride |
requested_at
|
date | Date when ride was requested |
Primary Key: ride_id
AcceptedRides
| Column Name | Type | Description |
|---|---|---|
ride_id
PK
|
int | Primary key, references ride_id from Rides table |
driver_id
|
int | Driver who accepted the ride |
ride_distance
|
int | Distance of the ride |
ride_duration
|
int | Duration of the ride |
Primary Key: ride_id
Input & Output
Example 1 — Basic Monthly Statistics
Input Tables:
Drivers
| driver_id | join_date |
|---|---|
| 10 | 2019-12-10 |
| 8 | 2020-01-13 |
| 5 | 2020-02-16 |
| 7 | 2020-03-08 |
| 4 | 2020-05-17 |
| 1 | 2020-10-24 |
| 6 | 2021-01-05 |
Rides
| ride_id | user_id | requested_at |
|---|---|---|
| 6 | 75 | 2019-12-09 |
| 1 | 54 | 2020-02-09 |
| 10 | 63 | 2020-03-04 |
| 19 | 39 | 2020-04-06 |
| 3 | 41 | 2020-06-11 |
| 13 | 52 | 2020-06-13 |
| 7 | 69 | 2020-07-16 |
| 17 | 70 | 2020-08-25 |
| 20 | 81 | 2020-11-02 |
| 5 | 57 | 2020-11-09 |
| 2 | 42 | 2020-12-09 |
| 11 | 68 | 2021-01-11 |
| 15 | 32 | 2021-01-17 |
| 12 | 11 | 2021-01-19 |
| 14 | 18 | 2021-02-17 |
AcceptedRides
| ride_id | driver_id | ride_distance | ride_duration |
|---|---|---|---|
| 10 | 10 | 63 | 38 |
| 13 | 10 | 73 | 96 |
| 7 | 8 | 100 | 28 |
| 17 | 7 | 119 | 68 |
| 20 | 1 | 121 | 92 |
| 5 | 7 | 42 | 101 |
| 2 | 4 | 6 | 38 |
| 11 | 8 | 37 | 43 |
| 15 | 8 | 108 | 82 |
| 12 | 8 | 38 | 34 |
Output:
| month | active_drivers | accepted_rides |
|---|---|---|
| 1 | 2 | 0 |
| 2 | 3 | 0 |
| 3 | 4 | 1 |
| 4 | 4 | 0 |
| 5 | 5 | 0 |
| 6 | 5 | 1 |
| 7 | 5 | 1 |
| 8 | 5 | 1 |
| 9 | 5 | 0 |
| 10 | 6 | 0 |
| 11 | 6 | 2 |
| 12 | 6 | 1 |
💡 Note:
For each month of 2020, we calculate:
- active_drivers: Cumulative count of drivers who have joined by the end of that month (including previous years)
- accepted_rides: Count of rides that were both requested and accepted in that specific month
For example, in January 2020, we have 2 active drivers (driver 10 from 2019 + driver 8 who joined in January), but 0 accepted rides in January.
Example 2 — No Activity Months
Input Tables:
Drivers
| driver_id | join_date |
|---|---|
| 1 | 2019-01-01 |
Rides
| ride_id | user_id | requested_at |
|---|---|---|
| 1 | 1 | 2020-01-01 |
AcceptedRides
| ride_id | driver_id | ride_distance | ride_duration |
|---|
Output:
| month | active_drivers | accepted_rides |
|---|---|---|
| 1 | 1 | 0 |
| 2 | 1 | 0 |
| 3 | 1 | 0 |
| 4 | 1 | 0 |
| 5 | 1 | 0 |
| 6 | 1 | 0 |
| 7 | 1 | 0 |
| 8 | 1 | 0 |
| 9 | 1 | 0 |
| 10 | 1 | 0 |
| 11 | 1 | 0 |
| 12 | 1 | 0 |
💡 Note:
Shows edge case where there are no accepted rides throughout 2020, but we still have 1 active driver who joined in 2019. All months show 1 active driver and 0 accepted rides.
Constraints
-
1 ≤ driver_id ≤ 10^6 -
1 ≤ ride_id ≤ 10^6 -
1 ≤ user_id ≤ 10^6 - All dates are valid and within reasonable ranges
- Each accepted ride exists in the Rides table
Visualization
Tap to expand
Understanding the Visualization
1
Input Tables
Drivers, Rides, and AcceptedRides tables
2
CTE Generation
Generate all 12 months and calculate statistics
3
Output
Monthly active drivers and accepted rides
Key Takeaway
🎯 Key Insight: Use CTE to generate complete time series and LEFT JOINs to handle missing data in monthly reporting
💡
Explanation
AI Ready
💡 Suggestion
Tab
to accept
Esc
to dismiss
// Output will appear here after running code