It is a common mistake to assume the terms “reliability” and “validity” have the same meaning. While they are related, the two concepts are very different. In an effort to clear up any misunderstandings, I have defined each here for you.
Of the two terms, reliability is the simpler concept to explain and understand. If you are focusing on the reliability of a test, all you need to ask is: are the results of the test consistent? If I take the test today, a week from now, and a month from now, will my results be the same?
If an assessment is reliable, your results will be very similar no matter when you take the test. If the results are inconsistent, the test is not considered reliable.
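One common way to put a number on this kind of consistency is a test-retest correlation: give the same people the same test on two occasions and correlate the two sets of scores. The sketch below is only an illustration. The scores are made up, and the `pearson_r` helper is written here for the example rather than taken from any particular assessment tool.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores for the same five people on two occasions.
scores_today = [82, 75, 91, 68, 88]
scores_next_month = [80, 77, 93, 66, 87]

test_retest_r = pearson_r(scores_today, scores_next_month)
print(f"Test-retest correlation: {test_retest_r:.2f}")
```

A value close to 1 means people kept roughly the same standing from one sitting to the next, which is what a reliable test should show. A value near 0 would mean the results are inconsistent.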
Validity is a bit more complex because it is more difficult to assess than reliability. There are various ways to assess and demonstrate that an assessment is valid, but in simple terms, validity refers to how well a test measures what it is supposed to measure.
There are several approaches to determining the validity of an assessment, including evaluating its content, criterion-related, and construct validity.
- An assessment demonstrates content validity when the criteria it measures align with the content of the job. Part of determining content validity is judging the extent to which that content is essential to job performance, as opposed to merely useful to know.
For example, the ability to type quickly would likely be considered a much larger and more crucial part of the job for an executive secretary than for an executive. While the executive is probably required to type, that skill is not nearly as important to performing the job. Ensuring an assessment demonstrates content validity entails judging the degree to which test items and job content match each other.
- An assessment demonstrates criterion-related validity if the results can be used to predict a facet of job performance. Determining whether an assessment predicts performance requires that assessment scores be statistically evaluated against a measure of employee performance.
For example, an employer interested in understanding how well an integrity test identifies individuals who are likely to engage in counterproductive work behaviors might compare applicants’ integrity test scores to how many accidents or injuries those individuals have on the job, whether they engage in on-the-job drug use, or how many times they ignore company policies. The degree to which the assessment is effective in predicting such behaviors is the extent to which it exhibits criterion-related validity.
- An assessment demonstrates construct validity if it is related to other assessments measuring the same psychological construct--a construct being a concept used to explain behavior (e.g., intelligence, honesty).
For example, intelligence is a construct that is used to explain a person’s ability to understand and solve problems. Construct validity can be evaluated by comparing intelligence scores on one test to intelligence scores on other tests (e.g., the Wonderlic Cognitive Ability Test to the Wechsler Adult Intelligence Scale).
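Both the criterion-related and the construct checks described above come down to comparing two sets of numbers, and a correlation coefficient is a typical way to do that. Here is a rough sketch with invented data. The variable names, the sample values, and the `pearson_r` helper are all assumptions made for illustration, not real test results or a prescribed method.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Criterion-related validity: hypothetical integrity test scores compared
# with how many times each of the same six employees ignored a policy.
integrity_scores = [35, 42, 50, 58, 64, 71]
policy_violations = [6, 5, 3, 3, 1, 0]
criterion_r = pearson_r(integrity_scores, policy_violations)

# Construct validity: hypothetical scores for six people on two different
# tests that are both meant to measure the same construct.
test_a_scores = [21, 25, 28, 30, 33, 36]
test_b_scores = [95, 102, 108, 109, 118, 121]
construct_r = pearson_r(test_a_scores, test_b_scores)

print(f"Integrity score vs. violations: {criterion_r:.2f}")
print(f"Test A vs. Test B: {construct_r:.2f}")
```

In this made-up data, higher integrity scores go with fewer violations, so the first correlation is strongly negative, which is the direction you would hope for. The second is strongly positive, suggesting the two tests are tapping the same thing. Real data are rarely this tidy, and real validation studies use far larger samples.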
Reliable and Valid?
The tricky part is that a test can be reliable without being valid. However, a test cannot be valid unless it is reliable. An assessment can provide you with consistent results, making it reliable, but unless it is measuring what it is supposed to measure, it is not valid.
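To make the "reliable but not valid" case concrete, here is a small sketch using invented numbers. Two administrations of a hypothetical test agree almost perfectly with each other, yet the scores tell you nothing about a hypothetical job performance rating. The data were chosen by hand to show the pattern and do not come from any real assessment.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores for six people who took the same test twice.
first_administration = [10, 20, 30, 40, 50, 60]
second_administration = [11, 19, 31, 39, 51, 59]

# Hypothetical job performance ratings (1 to 5) for the same six people.
performance_ratings = [3, 5, 1, 1, 5, 3]

reliability_r = pearson_r(first_administration, second_administration)
validity_r = pearson_r(first_administration, performance_ratings)

print(f"Consistency across administrations: {reliability_r:.2f}")
print(f"Relationship to job performance: {validity_r:.2f}")
```

The first number comes out very close to 1, so the test is consistent. The second shows no relationship between test scores and performance, so whatever the test is measuring so reliably, it is not the thing you wanted it to measure.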
What are the biggest questions you have surrounding reliability and validity?