I needed to hire a new salesperson, and one resume stood out like a sore thumb. The applicant, Ari, was a math major and built robots in his spare time—clearly not the right skill set for sales. But my boss thought Ari looked interesting, so I called him in for an interview. Sure enough, he bombed it.
I reported back to my president that although Ari seemed like a nice guy, during the 45-minute interview, he didn’t make any eye contact. It was obvious that he lacked the social skills to build relationships with clients. I knew I was in trouble when my president started laughing. “Who cares about eye contact? This is a phone sales job.”
We invited Ari back for a second round. Instead of interviewing him again, I took a different approach that a colleague recommended, and it made clear that he would be a star. I hired Ari, and he ended up being the best salesperson on my team. I walked away with a completely new way of evaluating talent. Ever since, I’ve been working with organizations on rethinking their selection and hiring processes.
Interviews are terrible predictors of job performance. Consider a rigorous, comprehensive analysis of hundreds of studies of more than 32,000 job applicants over an 85-year period by Frank Schmidt and Jack Hunter. They covered more than 500 different jobs (including salespeople, managers, engineers, teachers, lawyers, accountants, mechanics, reporters, farmers, pharmacists, electricians, and musicians) and compared information gathered about applicants to the objective performance that they achieved in the job. After obtaining basic information about candidates’ abilities, standard interviews accounted for only 8% of the differences in performance and productivity. Think about it this way: imagine that you interviewed 100 candidates, ranked them in order from best to worst, and then measured their actual performance in the job. You’d be lucky if you put more than eight in the right spot.
Interviewer biases are one major culprit. When I dismissed Ari, I fell victim to two common traps: confirmation bias and similarity bias. Confirmation bias is what leads us to see what we expect to see—we look for cues that validate our preconceived notions while discounting or overlooking cues that don’t match our expectations. Since I had already concluded that Ari wasn’t cut out for sales, I zeroed in on his lack of eye contact as a signal that I was right. It didn’t occur to me that eye contact was irrelevant for a phone sales job—and I didn’t notice his talents in building rapport, asking questions, and thinking creatively. Once we expect a candidate to be strong or weak, we ask confirming questions and pay attention to confirming answers, which prevents us from gauging the candidate’s actual potential.
Why did I form this expectation in the first place? Extensive research shows that interviewers try to hire themselves: we naturally favor candidates with personalities, attitudes, values, and backgrounds similar to our own. I was a psychology major with hobbies of springboard diving, performing magic, and playing word games, and I had done the sales job the previous year. Ari was a robot-building math major, so he didn’t fit my mental model of a salesperson. He wasn’t Mini-Me.
After writing Blink, Malcolm Gladwell became so concerned about his own biases that he removed himself from the process of interviewing assistants altogether. And even if we take steps to reduce interviewer bias, there’s no guarantee that applicants will share information that accurately forecasts their performance. One challenge is impression management: candidates want to put their best foot forward, so they tend to give the answers that are socially desirable rather than honest. Another challenge is self-deception: candidates are notoriously inaccurate about their own capabilities. Consider these data points summarized by psychologist David Dunning and colleagues:
(1) High school seniors: 70% report having “above average” leadership skills, compared with 2% “below average,” and when rating their abilities to get along with others, 25% believe they are in the top 1% and 60% put themselves in the top 10%.
(2) College professors: 94% think they do above-average work.
(3) Engineers: in two different companies, 32% and 42% believe their performance is in the top 5% in their companies.
(4) Doctors, nurses, and surgeons: for treating thyroid disorders, handling basic life support tasks, and performing surgery, there is no correlation between what healthcare professionals say they know and what they actually know.
Overall, Dunning and colleagues estimate that employees’ self-ratings only capture about 8% of their objective performance. Also, the data show that the most unskilled candidates are the least aware of their own incompetence. The less you know in a given domain, the less qualified you are to judge excellence in that domain. The punch line: candidates are not reliable sources of information about their talents. As Timothy Wilson concludes in Strangers to Ourselves, “people often do not know themselves very well.”
The good news is that interviews can be improved. Schmidt and Hunter found that whereas an unstructured interview only explains 8% of the variance in job performance, a structured interview explains 24%. A structured interview involves asking every applicant the same questions and evaluating the responses using a standardized scoring system. If you ask different questions to each candidate, you’ll end up comparing apples and oranges—it’s like a teacher giving out five different math tests, grading them with different criteria, and then trying to figure out which students are the best at math. To form meaningful opinions about candidates, you need to give them all the same opportunity to perform.
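The two ingredients of a structured interview, a fixed question set and a standardized scoring system, can be sketched in a few lines of Python. This is only an illustration: the question names, the 1-to-5 rating scale, the simple averaging, and the candidate ratings are all assumptions made up for the example, not part of Schmidt and Hunter's work.

```python
from statistics import mean

# Hypothetical question set: every candidate is asked exactly these questions.
QUESTIONS = ["handle_objection", "prioritize_leads", "recover_lost_deal"]


def score_candidate(ratings):
    """Average the 1-5 rubric ratings over the fixed question set.

    Raises ValueError if a question was skipped or a rating is off the scale,
    so that no candidate is scored on a different test than the others.
    """
    missing = [q for q in QUESTIONS if q not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {missing}")
    for q in QUESTIONS:
        if not 1 <= ratings[q] <= 5:
            raise ValueError(f"rating for {q} must be between 1 and 5")
    return mean(ratings[q] for q in QUESTIONS)


def rank_candidates(all_ratings):
    """Return candidate names ordered from highest to lowest average score."""
    scores = {name: score_candidate(r) for name, r in all_ratings.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Made-up ratings, for illustration only.
example_ratings = {
    "candidate_a": {"handle_objection": 4, "prioritize_leads": 3, "recover_lost_deal": 5},
    "candidate_b": {"handle_objection": 2, "prioritize_leads": 3, "recover_lost_deal": 3},
    "candidate_c": {"handle_objection": 5, "prioritize_leads": 4, "recover_lost_deal": 4},
}

ranking = rank_candidates(example_ratings)
print(ranking)
```

Because every candidate is rated on the same questions with the same scale, the resulting scores can be compared directly, which is the whole point of the structure.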
The most effective questions are called situational judgment questions. Instead of asking candidates to describe how they handled a unique situation in a previous job or organization, it’s more fruitful to describe consistent situations that candidates could face in this job or organization, and ask them what they would do—or how they would reason. For example, in my own research, I wanted to assess whether it was possible to screen out applicants who tend to operate like takers (people who aim to get more than they give), and focus on hiring matchers (people who like trading favors evenly) and givers (those who enjoy helping without strings attached). If I asked applicants directly, few would admit to being takers, so I designed a situational judgment test. Here’s a sample question:
A few years ago, you helped a colleague named Jamie find a job. You’ve been out of touch since then. All of a sudden, Jamie sends an email introducing you to a potential business partner. What’s the most likely motivation behind Jamie’s email?
(a) Jamie genuinely wants to help me
(b) Jamie wants to pay me back
(c) Jamie wants to ask me for help again
Evidence shows that under ambiguity, we tend to project our own motivations onto others. When we encounter a behavior from a stranger that has multiple explanations, we choose the one that captures how we would behave. This means that we can indirectly assess taking, matching, and giving motives by asking applicants to make sense of others’ behavior. In this question, the most common choice is (b)—most people select matching as the default for professional exchange—and a few givers opt for (a). The takers are more likely to choose (c), believing that there’s a self-serving motive lurking behind this seemingly generous act. It works especially well because it’s unclear what the “right” answer is; takers believe that most people are fundamentally selfish. Of course, we can’t rely on a single question alone, any more than we could accurately assess someone’s knowledge of irrelevant facts with one Trivial Pursuit question. A good situational judgment test typically involves at least half a dozen questions, and sometimes as many as 25.
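To see how a multi-question test of this kind might be tallied, here is a minimal Python sketch. Only the mapping for the Jamie question comes from the discussion above; the five other questions, their option mappings, and the sample responses are hypothetical placeholders, and a real test would need validated items.

```python
from collections import Counter

# Each option on each question is mapped to the style it is assumed to signal.
# Only "jamie_email" reflects the sample question; the rest are placeholders.
ANSWER_KEY = {
    "jamie_email": {"a": "giver", "b": "matcher", "c": "taker"},
    "hypothetical_q2": {"a": "taker", "b": "giver", "c": "matcher"},
    "hypothetical_q3": {"a": "matcher", "b": "taker", "c": "giver"},
    "hypothetical_q4": {"a": "giver", "b": "taker", "c": "matcher"},
    "hypothetical_q5": {"a": "taker", "b": "matcher", "c": "giver"},
    "hypothetical_q6": {"a": "matcher", "b": "giver", "c": "taker"},
}


def tally_styles(responses):
    """Count how many of an applicant's choices map to each style."""
    counts = Counter()
    for question, choice in responses.items():
        counts[ANSWER_KEY[question][choice]] += 1
    return counts


def taker_share(responses):
    """Fraction of answered questions where the applicant chose the taker option."""
    counts = tally_styles(responses)
    total = sum(counts.values())
    return counts["taker"] / total if total else 0.0


# Made-up responses from one applicant.
sample_responses = {
    "jamie_email": "c",
    "hypothetical_q2": "a",
    "hypothetical_q3": "b",
    "hypothetical_q4": "a",
    "hypothetical_q5": "b",
    "hypothetical_q6": "b",
}

print(tally_styles(sample_responses))
print(taker_share(sample_responses))
```

Aggregating across six or more questions is what keeps a single ambiguous answer from labeling someone a taker.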
Interestingly, these situational questions don’t require interviews at all; they can be completed and scored online. Researchers have validated situational tests to assess applicant characteristics as diverse as integrity, personal initiative, emotional intelligence, and aggression. In one study with Dane Barnes of Optimize Hire, we gave salespeople a situational questionnaire about their tendencies to take initiative, and then tracked their annual revenue.
Here’s what we found:
- Low initiative: $53,798
- Moderate initiative: $118,808
- High initiative: $155,663
On average, employees with moderate initiative scores generated more than double the annual revenue of those with low initiative scores, and those with high initiative scores brought in about 30% more than the moderate scorers. The higher the initiative, the more proactive and persistent salespeople were in identifying new customer bases, learning about untapped needs, pitching new products, and overcoming barriers to sales. The high scorers were also more than five times less likely to quit than the low scorers. In another study, we gave several hundred satellite dish technicians an online aggression questionnaire. Over the next nine months, those who scored high in aggressive reasoning were more than twice as likely to quit or be terminated. This is consistent with evidence that aggressive employees are more inclined to justify withdrawal and less likely to feel responsible for staying with the organization.
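The revenue comparisons can be checked directly from the three figures listed above. The short Python snippet below does the arithmetic, assuming the "30% more" comparison is measured against the moderate-initiative group.

```python
# Average annual revenue by initiative score, as reported in the list above.
revenue = {"low": 53_798, "moderate": 118_808, "high": 155_663}

moderate_vs_low = revenue["moderate"] / revenue["low"]
high_vs_moderate = revenue["high"] / revenue["moderate"]
high_vs_low = revenue["high"] / revenue["low"]

print(f"moderate vs. low:  {moderate_vs_low:.2f}x")   # 2.21x, i.e. more than double
print(f"high vs. moderate: {high_vs_moderate:.2f}x")  # 1.31x, i.e. about 30% more
print(f"high vs. low:      {high_vs_low:.2f}x")       # 2.89x
```

Read this way, the high scorers brought in nearly three times the revenue of the low scorers.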
Rather than wasting a great deal of time and money interviewing applicants, you can learn a great deal about them from online situational judgment tests. To write a situational judgment test that’s tailored to your own arena, start by writing short descriptions of the types of situations that distinguish your stars from average and poor performers. Then, give the scenarios to some of your colleagues, and ask what they would do. You can create an answer key by scoring the responses from the star performers at the top, and the responses from the poor performers at the bottom. From there, you can test-drive the questions with a pool of applicants. See if you get a range of answers, and if the candidates with higher scores end up being better performers on the job. Once you have some evidence, use the test to weed out the applicants with the poorest results. Research by Rick Jacobs, an industrial-organizational psychologist and founder and CEO of EB Jacobs, suggests that the cost of a bad hire is often double the benefit of a good hire, so it’s most valuable to screen out the lowest-scoring candidates.
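The recipe above (collect responses from stars and poor performers, turn them into an answer key, then screen out the lowest scorers) can be sketched in Python. The scoring rule here is an assumption chosen for simplicity: an option earns +1 for each star performer who chose it and -1 for each poor performer who chose it. The 25% cutoff, the scenario names, and all the responses are likewise made up for illustration.

```python
from collections import Counter


def build_answer_key(star_responses, poor_responses):
    """Weight each option: +1 per star who chose it, -1 per poor performer."""
    key = {}
    for group, weight in ((star_responses, 1), (poor_responses, -1)):
        for person in group:
            for scenario, choice in person.items():
                key.setdefault(scenario, Counter())[choice] += weight
    return key


def score_applicant(key, answers):
    """Sum the key weights for the options an applicant picked."""
    return sum(key.get(s, {}).get(c, 0) for s, c in answers.items())


def screen_out_lowest(scores, fraction=0.25):
    """Return the names of the lowest-scoring fraction of applicants."""
    n_drop = int(len(scores) * fraction)
    ranked = sorted(scores, key=scores.get)  # lowest score first
    return set(ranked[:n_drop])


# Made-up responses from current employees to two hypothetical scenarios.
stars = [{"s1": "a", "s2": "b"}, {"s1": "a", "s2": "c"}]
poor = [{"s1": "c", "s2": "c"}, {"s1": "b", "s2": "c"}]
key = build_answer_key(stars, poor)

# Made-up applicant answers.
applicants = {
    "w": {"s1": "a", "s2": "b"},
    "x": {"s1": "a", "s2": "c"},
    "y": {"s1": "b", "s2": "b"},
    "z": {"s1": "c", "s2": "c"},
}
scores = {name: score_applicant(key, ans) for name, ans in applicants.items()}
dropped = screen_out_lowest(scores)
print(scores, dropped)
```

Before relying on a key like this, you would still want the validation step described above: checking that applicants with higher scores actually go on to perform better on the job.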
At that point, you can invite the higher-scoring applicants to visit your organization. But why spend your time interviewing them when there’s a better way to assess their potential? It’s called a work sample. Ask applicants to bring actual examples of tasks they’ve completed, products they’ve created, or data on results they’ve achieved. If you’re hiring salespeople, ask for their revenue data and either a presentation video or a slide deck. If you’re evaluating teachers, obtain their recent teaching evaluations. When selecting computer programmers or engineers, ask them for programs they’ve built or drawings they’ve done. Past behavior is a strong predictor of future behavior, so if candidates have worked in a similar job, these work samples are immensely useful. Consider an analogy to baseball: what most managers do in interviews is the equivalent of trying to forecast a pitcher’s performance by asking him about his throwing ability. As Moneyball reminds us, it’s easier to predict a pitcher’s performance in the pros by looking at his statistics from college or high school.
But sometimes the past work is confidential or not recorded—or we just need to hire people with no relevant experience. This was the challenge I faced with Ari: he had never done anything like sales. My colleague Brad Olson introduced me to the solution: I could create my own work sample by giving Ari a simulated sales task. Brad’s advice was to ask Ari, and the other candidates, to sell me a rotten apple. “If he can sell that, he can sell anything.” When I gave Ari the task, he demonstrated an impressive ability to think creatively and sell persuasively on the spot. “This may look like a rotten apple, but it’s actually an aged apple. You can use the seeds to plant a beautiful apple tree in your backyard—cheaper than the seeds normally are.” (This is the second-best apple pitch I’ve ever seen, behind only the apple that sold on eBay for thousands of dollars based on a claim that it had Tiger Woods’ DNA.)
Once we saw Ari sell, we had the chance to directly observe his motivation and ability. The evidence on simulations shows that if they capture job-relevant behaviors, they reveal a great deal about a candidate’s potential. And if you design them right, they can become remarkably efficient. Here’s an example inspired by Change to Strange by Dan Cable. Imagine that you’re GE, and you want to hire people to write technical manuals on how to put together a toaster oven. Some candidates have more experience than others, so you level the playing field by putting a working lawn mower in front of them. Their task is to write a manual for how to put a lawn mower together. It takes a typical applicant two hours—hardly a time-saving assessment. But then you notice that the candidates who go on to be your star writers did something different in the first five minutes than everyone else.
They took the lawn mower apart. Immediately, you know that they have the intuition to take the customer’s perspective, and you can cancel the rest of the simulation. Now, it takes you just five minutes to screen applicants on customer orientation. (For other excellent examples of simulations designed by some of the world’s greatest talent scouts—from Google and Facebook to the U.S. Special Forces and Teach For America—see The Rare Find by George Anders.)
Although scientific evidence supports the validity of situational judgment tests, work samples, and simulations, relatively few organizations use them. This creates a competitive advantage for those that are ahead of the curve: they’re better at spotting diamonds in the rough and screening out applicants who talk a good game but won’t ultimately deliver. But imagine a world in which every organization relied on these tools. Should we eliminate interviews altogether?
Maybe not. Interviews aren’t great for assessing job performance, but they can reveal quite a bit about cultural fit. When I work with organizations on redesigning selection systems, my advice is to start with situational judgment tests, work samples, and/or simulations. That should narrow the pool to a set of applicants with the attributes necessary to succeed in the job. At that point, an interview might be a useful way to figure out whether this person is a good match for the organization.
The legal implications of using scientific assessments vary from country to country. In the U.S., the law states that employers are obligated to hire on attributes that are relevant to job performance. Since situational judgment tests, work samples, and simulations are better predictors of performance than interviews, the law actually favors these approaches. In fact, the Supreme Court has ruled in support of aptitude and ability tests that are professionally developed.
Organizations place far too much weight on interviews. It’s time for the pendulum to swing in the other direction. Instead of assessing how well people talk, let’s observe how they work.
For more on assessing talent, see Adam’s new book Give and Take: A Revolutionary Approach to Success, a New York Times and Wall Street Journal bestseller.