Find Candidates for Data Scientist Position II - Problem
You are given two tables: Candidates and Projects. Write a SQL query to find the best candidate for each project based on specific scoring criteria.
Requirements:
- Candidates must have all required skills for a project
- Calculate a score starting with 100 points
- Add 10 points for each skill where proficiency > importance
- Subtract 5 points for each skill where proficiency < importance
- No change when proficiency equals importance
- Return only the highest scoring candidate per project
- Break ties by choosing the candidate with the lower candidate_id
- Exclude projects with no suitable candidates
Table Schema
Candidates
| Column Name | Type | Description |
|---|---|---|
candidate_id
PK
|
int | Unique identifier for candidate |
skill
PK
|
varchar | Skill name (e.g., Python, SQL) |
proficiency
|
int | Skill proficiency level (1-5) |
Primary Key: (candidate_id, skill)
Projects
| Column Name | Type | Description |
|---|---|---|
project_id
PK
|
int | Unique identifier for project |
skill
PK
|
varchar | Required skill name |
importance
|
int | Skill importance level (1-5) |
Primary Key: (project_id, skill)
Input & Output
Example 1 — Multiple Projects and Candidates
Input Tables:
Candidates
| candidate_id | skill | proficiency |
|---|---|---|
| 101 | Python | 5 |
| 101 | Tableau | 3 |
| 101 | PostgreSQL | 4 |
| 101 | TensorFlow | 2 |
| 102 | Python | 4 |
| 102 | Tableau | 5 |
| 102 | PostgreSQL | 4 |
| 102 | R | 4 |
| 103 | Python | 3 |
| 103 | Tableau | 5 |
| 103 | PostgreSQL | 5 |
| 103 | Spark | 4 |
Projects
| project_id | skill | importance |
|---|---|---|
| 501 | Python | 4 |
| 501 | Tableau | 3 |
| 501 | PostgreSQL | 5 |
| 502 | Python | 3 |
| 502 | Tableau | 4 |
| 502 | R | 2 |
Output:
| project_id | candidate_id | score |
|---|---|---|
| 501 | 101 | 105 |
| 502 | 102 | 130 |
💡 Note:
For Project 501: Candidate 101 scores 100 + 10 (Python 5>4) + 0 (Tableau 3=3) + (-5) (PostgreSQL 4<5) = 105. For Project 502: Candidate 102 scores 100 + 10 (Python 4>3) + 10 (Tableau 5>4) + 10 (R 4>2) = 130. Candidate 103 doesn't have R skill, so is disqualified from Project 502.
Example 2 — No Qualified Candidates
Input Tables:
Candidates
| candidate_id | skill | proficiency |
|---|---|---|
| 201 | Python | 3 |
| 201 | SQL | 4 |
Projects
| project_id | skill | importance |
|---|---|---|
| 601 | Python | 4 |
| 601 | R | 3 |
| 601 | Machine Learning | 5 |
Output:
| project_id | candidate_id | score |
|---|
💡 Note:
Candidate 201 only has Python and SQL skills, but Project 601 requires Python, R, and Machine Learning. Since the candidate doesn't have all required skills, they are disqualified and no result is returned for this project.
Constraints
-
1 ≤ candidate_id ≤ 1000 -
1 ≤ project_id ≤ 1000 -
1 ≤ proficiency ≤ 5 -
1 ≤ importance ≤ 5 -
skillnames are case-sensitive varchar strings
Visualization
Tap to expand
Understanding the Visualization
1
Join & Filter
Match candidates with projects, ensure all skills present
2
Calculate Scores
Apply scoring rules based on proficiency gaps
3
Rank & Select
Choose highest scoring candidate per project
Key Takeaway
🎯 Key Insight: Use JOINs with HAVING to ensure complete skill coverage before applying complex scoring logic
💡
Explanation
AI Ready
💡 Suggestion
Tab
to accept
Esc
to dismiss
// Output will appear here after running code