David's blog

Err and err and err but less and less and less

Month: January 2025

How Are P-values Distributed Under The Null?

I sometimes use this fun interview question for aspiring data scientists: How are p-values distributed assuming the null hypothesis is true? I’ve heard a lot of reasonable answers, including: All very reasonable and intuitive answers which I would probably, at some point, have given myself. They’re also all wrong. The (perhaps surprising) answer is that […]

Your Classifier Is Broken, But It Is Still Useful

When you run a binary classifier over a population, the share of cases it flags as positive gives you an estimate of the proportion of positives in that population, known as the prevalence. But that estimate is biased, because no classifier is perfect. For example, if your classifier tells you that 20% of cases are positive, but its precision […]
