# Test Item Reliability

Test item reliability indicates how consistent the results produced by a test’s items are. Consistency can refer to the items’ stability over time (test-retest reliability) or to the items’ agreement with one another (internal consistency). If an item is unreliable, observed statistical relationships will be weaker than the true relationships, and inappropriate conclusions may be drawn about the relationships between variables.

Reliability is the extent to which an observed score (the true score plus or minus error) reflects the true score. Returning to the example in this week’s Introduction, if your true weight were 150 pounds and you stepped on the scale hundreds of times, it would sometimes show 149, sometimes 152, and sometimes 151. If you averaged all of those readings, you would come close to your true weight. If you looked at how much the readings varied, you would have a good measure of the scale’s error. The situation is similar with a psychological test: a score on an IQ test is an estimate of the theoretical “true” IQ, but that observed score also includes error.
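The scale example can be sketched as a small simulation. This is only an illustration: the function name `simulate_weighings`, the error standard deviation of 1.5 pounds, the count of 500 weighings, and the assumption that the error is normally distributed are all hypothetical choices, not values from the Learning Resources.

```python
import random
import statistics


def simulate_weighings(true_weight, error_sd, n, seed=0):
    """Return n observed readings, each equal to true weight plus random error."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    # Assumption: error is normally distributed around zero.
    return [true_weight + rng.gauss(0, error_sd) for _ in range(n)]


# Hypothetical values: true weight of 150 lb, error SD of 1.5 lb, 500 weighings.
readings = simulate_weighings(true_weight=150.0, error_sd=1.5, n=500)

mean_reading = statistics.mean(readings)  # estimate of the true score
spread = statistics.stdev(readings)       # estimate of the scale's error

print(f"Mean of readings: {mean_reading:.2f}")
print(f"Spread of readings (SD): {spread:.2f}")
```

The mean of the readings should land close to 150, and the spread should land close to the error that was built into the simulation, which mirrors the reasoning in the paragraph above.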

Researchers or test developers measure a test’s reliability with a reliability coefficient, generally a positive correlation coefficient that is somewhat less than 1.00. (A coefficient of 1.00 would indicate a test with no measurement error, which is theoretically impossible because every measurement contains some error.) Acceptable reliability coefficients for psychological tests or test items are generally at least .70. If you know a test’s reliability and the standard deviation of its scores, you can calculate its standard error of measurement, which gives a “plus or minus” band that indicates an interval likely to contain the true score.
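The “plus or minus” band can be illustrated with the standard error of measurement from classical test theory, SEM = SD × √(1 − reliability). The sketch below makes several assumptions that are not in the text above: a score standard deviation of 15 (a common IQ scaling), a reliability of .90, an observed score of 110, and a multiplier of 1.96 for an approximately 95% band centered on the observed score. The function names are hypothetical.

```python
import math


def standard_error_of_measurement(score_sd, reliability):
    """SEM = SD * sqrt(1 - reliability), per classical test theory."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return score_sd * math.sqrt(1.0 - reliability)


def score_band(observed, score_sd, reliability, z=1.96):
    """Return (low, high): observed score plus or minus z * SEM.

    Assumption: z = 1.96 gives a band of roughly 95% under normal error.
    """
    sem = standard_error_of_measurement(score_sd, reliability)
    return observed - z * sem, observed + z * sem


# Hypothetical example values, chosen only for illustration.
sem = standard_error_of_measurement(score_sd=15.0, reliability=0.90)
low, high = score_band(observed=110.0, score_sd=15.0, reliability=0.90)

print(f"SEM: {sem:.2f}")
print(f"Band around observed score: {low:.1f} to {high:.1f}")
```

Raising the reliability narrows the band, and lowering it widens the band, which is one practical way to see why a coefficient below .70 is usually considered too imprecise.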

For this week’s Discussion, think of a specific testing scenario in an organization. Then consider one reliable test item and one unreliable test item for that scenario. Consider how you might determine whether each item is reliable or unreliable.

With these thoughts in mind:

Post by Day 4 a brief description of a specific organizational testing scenario. Then describe one reliable test item and one unreliable item for that testing scenario. Finally, explain what determines whether an item is reliable or unreliable within the scenario you presented. Support your response using the Learning Resources and the current literature.

Be sure to support your postings and responses with specific references to the Learning Resources.