If you've taken a modern occupational test, item response theory (IRT) was probably sitting beneath the bonnet making sense of the responses. Traditional tests count correct responses to give an estimate of your true ability: 30/40 means you ought to be better than if you'd scored 23/40. In contrast, IRT moves the unit of meaning from the test to the test item. Getting one question right or wrong gives us some predictor of ability right off the bat; a coarse one, admittedly, but increasingly accurate as further responses are given.
Let's say Item E is easy. Someone at or above average should get it, and those below average have a fighting chance. Item H is hard: the chances of a correct answer are low for most, but the chances rocket up for the sharpest. Each item has a different relationship between a test-taker's ability and the likelihood of them getting it right: these are the item parameters. (Because I love my readers, I've bodged up a visual example.) You don't need to understand the maths to appreciate that, armed with these parameters, it quickly becomes possible to home in on the true performance behind the item responses. Potent stuff.
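For readers who would rather poke at it than squint at a curve, here is a minimal sketch of the idea in Python. It assumes the common two-parameter logistic (2PL) form of IRT, which is one of several models and not necessarily the one behind any given test. The numbers for Item E and Item H are invented purely for illustration, and `p_correct` and `estimate_ability` are hypothetical helper names, not part of any library.

```python
import math

def p_correct(theta, a, b):
    """2PL item curve: chance of a correct answer at ability theta.

    a = discrimination (how sharply the item separates people),
    b = difficulty (the ability at which the chance is 50%).
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical item parameters (a, b), made up for this example.
# Ability is on a standard scale: 0 is average, +1 is one SD above.
ITEM_E = (1.0, -1.0)   # easy: most people have a decent chance
ITEM_H = (2.0, 1.5)    # hard: low chance for most, rising fast at the top

def estimate_ability(responses):
    """Estimate ability from a list of ((a, b), correct) pairs.

    Assumes a standard normal prior on ability and evaluates the
    posterior on a grid from -4 to +4. Returns (mean, sd).
    """
    thetas = [i / 10 for i in range(-40, 41)]
    weights = []
    for theta in thetas:
        w = math.exp(-0.5 * theta * theta)  # prior: most people are near average
        for (a, b), correct in responses:
            p = p_correct(theta, a, b)
            w *= p if correct else 1.0 - p  # how well this ability explains the answer
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(thetas, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(thetas, weights)) / total
    return mean, math.sqrt(var)

# One response gives a coarse estimate; more responses narrow it down.
print(estimate_ability([(ITEM_H, True)]))
print(estimate_ability([(ITEM_E, True), (ITEM_H, True), (ITEM_H, True)]))
```

The first number in each printed pair is the ability estimate and the second is its uncertainty. Real testing systems typically use many more items and more refined estimation, but the logic is the same: each answer reweights which abilities are plausible.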
Put Verona aside. The real issue investigated by Chakadee Waiyavutti, Wendy Johnson, and Ian Deary is whether individuals with low IQ respond to personality tests differently. Personality? Yep, IRT is used for these assessments too, in a slightly fiddlier way - item 'difficulty' and right/wrong binaries need to be translated - with the concepts remaining solid. Higher and lower IQ groups do show slight personality differences in aggregate. If these differences were because personality items were understood differently by these different groups, it would call into question the validity of making judgements about personality when testing across ranges of IQ, which would impact occupational testing in a profound way.
Waiyavutti's team drew on a large data set of 683 individuals born in 1936, categorised into two groups with a mean IQ difference of 21 points. Participants completed two personality tests, the NEO-FFI and the IPIP, both based on the Big 5 personality factors. The researchers produced parameters for each item in each group, and analysed whether averaged parameters across the groups were significantly different. They found that while the two groups did differ on average - in expected areas such as Intellect, Openness to Experience, and Emotional Stability - the personality test items operated similarly. This gives reassurance that the group differences reflect genuine differences in personality rather than differences in how the items were understood.
So: we can be more confident that personality tests (at least these) are operating in the same way in people of differing IQ, making it reasonable to use them to draw their intended insights. And along the way we've figured out something about how modern tests operate. If you want a fuller exploration of IRT, you may be interested in this open-access article in The Psychologist online.