Questions: Inter-Rater Reliability and Observer Agreement

5 questions to test your understanding

Question 1 Multiple Choice

Two clinical raters independently assess 100 patients for depression in a clinic where 95% of patients are not depressed. Both raters always code 'not depressed.' What are their percent agreement and Cohen's kappa?

A. Percent agreement = 95%, kappa ≈ 0
B. Percent agreement = 95%, kappa ≈ 0.95
C. Percent agreement = 100%, kappa = 1.0
D. Percent agreement = 100%, kappa ≈ 0
Question 2 Multiple Choice

A researcher uses percent agreement to report inter-rater reliability for a coding scheme with three behavioral categories used roughly equally (≈33% each). How will the percent agreement most likely compare with Cohen's kappa computed on the same data?

A. Percent agreement will be lower than kappa, because it ignores systematic rater bias
B. Percent agreement will be higher than kappa, because kappa subtracts the expected chance agreement
C. Percent agreement and kappa will be equal, because equal base rates eliminate chance agreement
D. Percent agreement will be higher than kappa, because kappa penalizes raters for using more than two categories
Question 3 True / False

Cohen's kappa can be 0 even when two raters show high percent agreement, if that agreement is no greater than the agreement expected by chance given the raters' base rates.

T. True
F. False
Question 4 True / False

A kappa of .80 is widely accepted as indicating good inter-rater reliability and can be applied as a universal threshold across most measurement contexts.

T. True
F. False
Question 5 Short Answer

Why does the prevalence of the categories being rated affect the interpretation of Cohen's kappa, and what problem does this create for researchers using binary diagnostic categories with rare conditions?

Think about your answer, then reveal below.
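
To check your reasoning against numbers, the following is a minimal Python sketch of percent agreement and Cohen's kappa for two raters. The function names and the data are hypothetical and invented for illustration: 100 cases of a rare binary condition, with the raters agreeing on 90 negatives and 2 positives and disagreeing on 8 cases. It uses the standard definition kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance from each rater's marginal proportions.

```python
import math
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Proportion of cases on which the two raters give the same code."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters: (p_o - p_e) / (1 - p_e)."""
    n = len(rater_a)
    p_o = percent_agreement(rater_a, rater_b)

    # Chance agreement: for each category, multiply the two raters'
    # marginal proportions, then sum over categories.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    p_e = sum((counts_a[c] / n) * (counts_b[c] / n) for c in categories)

    # When chance agreement is 1 the formula is 0/0, so kappa is undefined.
    if math.isclose(p_e, 1.0):
        return float("nan")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical ratings (1 = condition present, 0 = absent), illustration only:
# 90 cases both absent, 2 both present, 4 only rater A present, 4 only rater B.
rater_a = [0] * 90 + [1] * 2 + [1] * 4 + [0] * 4
rater_b = [0] * 90 + [1] * 2 + [0] * 4 + [1] * 4

observed = percent_agreement(rater_a, rater_b)
kappa = cohen_kappa(rater_a, rater_b)
print(f"percent agreement = {observed:.2f}, kappa = {kappa:.2f}")
```

With these made-up ratings the raters agree on 92% of cases, yet kappa is only about 0.29, because two raters who each code roughly 94% of cases as negative would be expected to agree on about 89% of cases by chance alone.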