
PM Interview Scorecard: What to Look For
A structured framework for evaluating PM candidates. Use this scorecard to standardize interviews and reduce bias in hiring decisions.
Why Use a Scorecard
Unstructured interviews are unreliable predictors of job performance. They favor candidates who interview well over those who work well.
Scorecards bring structure:
- Every interviewer evaluates the same dimensions
- Everyone applies consistent criteria
- Each rating is backed by documented, specific evidence
This makes debrief discussions productive and hiring decisions defensible.
Core Dimensions to Evaluate
| Dimension | What It Tests |
|---|---|
| Product Sense | Can they think through product problems from first principles? Do they consider users, alternatives, and tradeoffs? |
| Execution | Can they get things done? Evidence of shipping products, managing stakeholders, and overcoming obstacles? |
| Analytical Ability | Can they use data effectively? Do they think systematically about metrics and experiments? |
| Communication | Can they explain complex ideas clearly? Do they structure their thinking? Do they listen? |
| Collaboration | Do they work well with others? How do they handle conflict? Respect for other functions? |
| Culture Fit | Do they align with your company's values and working style? |
Rating Scale
Use a consistent scale across all dimensions:
| Rating | Meaning |
|---|---|
| 1 | Strong No |
| 2 | No |
| 3 | Leaning No |
| 4 | Leaning Yes |
| 5 | Yes |
| 6 | Strong Yes |
The Midpoint Matters
- Leaning No (3): You'd pass if forced to decide
- Leaning Yes (4): You'd hire if forced
This distinction helps calibrate close calls.
Avoid defaulting to the middle. A 3 on everything tells you nothing. Push yourself to take a position on each dimension.
Evidence-Based Scoring
Every rating needs evidence.
| ❌ Useless | ✅ Useful |
|---|---|
| "Gave a 5 on Product Sense" | "Gave a 5 on Product Sense because they identified three user segments, evaluated tradeoffs explicitly, and challenged my assumptions thoughtfully" |
Write It Down
Write your scorecard immediately after the interview, while it's fresh; memory degrades quickly. Specific quotes and examples are the best evidence.
Performance vs. Potential
Some candidates are clearly great today; others show potential that could develop.
Note which applies.
Interview-Specific Scoring
Assign Focus Areas
Not every interview evaluates every dimension:
- A product sense interview focuses on product sense
- An execution interview focuses on execution
Assign dimensions to interviews deliberately.
Redundancy
Each dimension should be assessed by at least two interviewers. This builds in a cross-check and surfaces disagreements worth discussing in debrief.
Interviewer Preparation
Interviewers should know their focus area in advance. This lets them ask relevant questions and evaluate thoroughly.
Sample Scorecard Template
CANDIDATE: [Name]
INTERVIEWER: [Name]
INTERVIEW TYPE: [Product Sense / Execution / Analytical / Culture]
DIMENSION RATINGS (1-6 with evidence):
- Product Sense: [rating] - [evidence]
- Execution: [rating] - [evidence]
- Analytical: [rating] - [evidence]
- Communication: [rating] - [evidence]
- Collaboration: [rating] - [evidence]
- Culture Fit: [rating] - [evidence]
OVERALL: [Hire / No Hire / Unsure]
KEY CONCERNS: [specific issues]
KEY STRENGTHS: [specific positives]
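The template above can also live as a small data structure, which makes scorecards easy to collect and compare across interviewers. A minimal sketch in Python (the field names and validation rules are illustrative assumptions, not a prescribed schema):

```python
from dataclasses import dataclass, field

# The six core dimensions from the scorecard.
DIMENSIONS = [
    "Product Sense", "Execution", "Analytical",
    "Communication", "Collaboration", "Culture Fit",
]

@dataclass
class Scorecard:
    candidate: str
    interviewer: str
    interview_type: str                            # e.g. "Product Sense"
    ratings: dict = field(default_factory=dict)    # dimension -> (rating, evidence)
    overall: str = "Unsure"                        # "Hire" / "No Hire" / "Unsure"
    concerns: list = field(default_factory=list)
    strengths: list = field(default_factory=list)

    def add_rating(self, dimension: str, rating: int, evidence: str) -> None:
        # Enforce the 1-6 scale and require written evidence for every rating.
        if dimension not in DIMENSIONS:
            raise ValueError(f"Unknown dimension: {dimension}")
        if not 1 <= rating <= 6:
            raise ValueError("Ratings use the 1-6 scale")
        if not evidence.strip():
            raise ValueError("Every rating needs evidence")
        self.ratings[dimension] = (rating, evidence)
```

Requiring non-empty evidence at entry time is one way to make the "every rating needs evidence" rule hard to skip.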
Using Scorecards in Debrief
Collect Before Discussion
Collect scorecards before the debrief discussion. This prevents anchoring—you don't want one person's opinion to influence others before they've formed their own.
Discuss Disagreements
In debrief, discuss dimensions where interviewers disagreed:
"I gave them a 5 on execution; you gave them a 3. Let's compare evidence."
Disagreement reveals signal.
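Finding those disagreements can be automated in debrief prep: flag any dimension where interviewers' ratings diverge by a meaningful gap. A hedged sketch, assuming one ratings dict per interviewer (the 2-point threshold is an assumption; tune it to your bar):

```python
def find_disagreements(scorecards, threshold=2):
    """Return dimensions where interviewers' ratings differ by at least
    `threshold` points, so debrief can focus on comparing evidence there.

    `scorecards` maps interviewer name -> {dimension: rating (1-6)}.
    """
    flagged = {}
    dimensions = {d for ratings in scorecards.values() for d in ratings}
    for dim in sorted(dimensions):
        scores = {who: r[dim] for who, r in scorecards.items() if dim in r}
        if len(scores) >= 2 and max(scores.values()) - min(scores.values()) >= threshold:
            flagged[dim] = scores
    return flagged
```

For the example in the quote above, a 5 and a 3 on Execution would be flagged for discussion, while matching 4s on other dimensions would not.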
Focus on Evidence
| ✅ Useful | ❌ Less Useful |
|---|---|
| "They couldn't explain what they specifically contributed to the project" | "I just had a bad feeling" |
"Bad feeling" is a data point, but it's less useful than specific evidence.
Common Scorecard Mistakes
Rating Inflation
Everyone's a 4 or 5. This fails to distinguish candidates.
Calibrate your ratings against your bar.
Halo Effect
One strong dimension colors the others: the candidate was great at product sense, so you rate communication highly even without strong evidence.
Recency Bias
The last thing they said weighs too heavily.
Write notes throughout the interview, not just at the end.
Skipping Evidence
Rating without explanation makes debrief and future calibration impossible.
Calibrating Over Time
Track Performance
Track how candidates you hired performed:
- Did your ratings predict success?
- Where were you wrong?
This calibrates future interviews.
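One concrete way to answer "did your ratings predict success?" is to correlate interview ratings with later performance scores for the same hires. A minimal sketch, assuming you record a single performance number per hire (pure Python, no libraries):

```python
import math

def rating_performance_correlation(ratings, performance):
    """Pearson correlation between interview ratings and on-the-job
    performance for the same hires. Values near 1 mean the interview
    predicted success; values near 0 mean it carried little signal."""
    n = len(ratings)
    if n != len(performance) or n < 2:
        raise ValueError("Need paired data for at least two hires")
    mr = sum(ratings) / n
    mp = sum(performance) / n
    cov = sum((r - mr) * (p - mp) for r, p in zip(ratings, performance))
    sr = math.sqrt(sum((r - mr) ** 2 for r in ratings))
    sp = math.sqrt(sum((p - mp) ** 2 for p in performance))
    if sr == 0 or sp == 0:
        raise ValueError("No variance in one of the series")
    return cov / (sr * sp)
```

With small samples this is noisy, so treat it as a directional check rather than a verdict; reviewing individual scorecards alongside the number matters more.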
Review Successful Hires
Review scorecards of successful hires:
- What patterns do you see?
- These become your bar for future candidates
Train New Interviewers
Have them shadow and score, then compare their ratings to experienced interviewers.
This transfers calibration.
Making the Final Decision
Scorecards Support, Don't Decide
A scorecard doesn't make the decision—you do. But it ensures you've collected the relevant evidence and considered each dimension systematically.
Strong Candidate Profile
- Score 4+ on most dimensions
- No 2s or below
A 2 in any core dimension is usually disqualifying.
Mixed Scorecards
When scorecards are mixed, dig into the disagreements:
- Sometimes one interviewer saw something others missed
- Sometimes their interview was just off
Understand why before deciding.