Ann Intern Med

Lung cancer risk models beat USPSTF criteria

July 2, 2026

Clinical takeaway: Consider a validated risk model, rather than age and pack-year cutoffs alone, to decide whom to refer for low-dose CT screening. These models catch more future cancers per person screened, but they still understate risk in Black patients.

Screening guidelines determine who gets a low-dose CT. The 2021 U.S. Preventive Services Task Force (USPSTF) criteria use age and pack-years to draw the line. A promising alternative is to swap those cutoffs for individualized risk models, but nearly all of them were built and tested in primarily White populations.

This study ran 16 models across 641,830 US adults with a smoking history in four racial and ethnic groups. Every one of them identified patients for screening more efficiently than the USPSTF criteria and narrowed the gaps between groups, though all still understated risk in Black patients.

When each model was set to flag the same number of patients as the USPSTF criteria, all 16 improved estimated efficiency, measured as the number needed to screen to catch one cancer. PLCOm2012 and LYFS-CT performed best, at a mean estimated 36.5 and 40.1 patients screened per case, respectively, compared with 54.7 for the USPSTF criteria. Every model also narrowed the spread in efficiency across the four groups.
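To put those figures on a more familiar scale, the number needed to screen (NNS) can be inverted into cases found per 1,000 people screened. The sketch below is simple arithmetic on the three values reported above; the function names are illustrative, and because the reported figures are means across groups, the conversions are approximate rather than results from the study.

```python
# NNS values as reported in the summary above (mean estimates across groups).
NNS = {"USPSTF": 54.7, "PLCOm2012": 36.5, "LYFS-CT": 40.1}


def cases_per_1000(nns: float) -> float:
    """Cancers expected among 1,000 screened people, i.e. 1000 / NNS."""
    return 1000.0 / nns


def relative_nns_reduction(model_nns: float, reference_nns: float) -> float:
    """Fractional drop in NNS relative to a reference strategy."""
    return 1.0 - model_nns / reference_nns


for name, nns in NNS.items():
    reduction = relative_nns_reduction(nns, NNS["USPSTF"])
    print(f"{name}: {cases_per_1000(nns):.1f} cases per 1,000 screened, "
          f"NNS {reduction:.0%} lower than USPSTF")
```

On this back-of-envelope reading, screening 1,000 people selected by PLCOm2012 would find roughly 27 cancers, against roughly 18 under the USPSTF criteria.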

But the models still underestimated risk in Black patients. Eleven of 16 predicted fewer than 75 cases for every 100 that occurred, and 11 showed their worst calibration in this group. Discrimination, the ability to separate future cases from non-cases, was weakest in Asian patients for 13 of 16 models. One exception stood out: ALARM, built in a Chinese population, discriminated better among Asian patients than any model developed in the West.
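For readers less familiar with the two metrics, here is a minimal sketch of how calibration and discrimination are commonly summarized. It assumes a simple binary outcome and uses made-up toy numbers; the study itself may have used time-to-event methods, and the function names are hypothetical. A ratio of expected to observed cases below 0.75 corresponds to "fewer than 75 cases predicted for every 100 that occurred."

```python
def expected_observed_ratio(predicted_risks, outcomes):
    """Calibration summary: sum of predicted risks / number of observed cases."""
    observed = sum(outcomes)
    if observed == 0:
        raise ValueError("no observed cases; ratio is undefined")
    return sum(predicted_risks) / observed


def auc(predicted_risks, outcomes):
    """Discrimination summary: chance that a random case was assigned a higher
    predicted risk than a random non-case (ties count as half)."""
    cases = [p for p, y in zip(predicted_risks, outcomes) if y == 1]
    noncases = [p for p, y in zip(predicted_risks, outcomes) if y == 0]
    if not cases or not noncases:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in noncases:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(noncases))


# Toy data, invented for illustration only (not from the study).
toy_risks = [0.02, 0.05, 0.10, 0.20, 0.03, 0.15]
toy_outcomes = [0, 0, 1, 1, 0, 0]

eo = expected_observed_ratio(toy_risks, toy_outcomes)
print(f"Expected/observed ratio: {eo:.3f} (below 1 means risk is understated)")
print(f"AUC: {auc(toy_risks, toy_outcomes):.3f}")
```

The two measures are independent: a model can rank patients well (high AUC) while still predicting too few cancers overall, which is the pattern reported here for Black patients.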

The cohort study drew on adults aged 50 to 80 with a smoking history from 12 US cohorts in the Lung Cancer Cohort Consortium, including 6,390 Asian, 9,781 Hispanic, 39,872 non-Hispanic Black, and 585,787 non-Hispanic White participants. Researchers measured how well each model predicted the right number of cancers (calibration) and separated future cases from non-cases (discrimination), then set a threshold for each to flag the same share of patients as the USPSTF criteria and compared efficiency across groups.
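The threshold-matching step described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the data are invented, the helper names are hypothetical, and ties at the cutoff (which could flag slightly more people than intended) are ignored.

```python
def matched_threshold(risks, n_flag):
    """Risk cutoff that flags the n_flag highest-risk people (ties ignored)."""
    if not 0 < n_flag <= len(risks):
        raise ValueError("n_flag must be between 1 and the cohort size")
    return sorted(risks, reverse=True)[n_flag - 1]


def number_needed_to_screen(flags, outcomes):
    """People flagged for screening per cancer case among those flagged."""
    n_flagged = sum(1 for f in flags if f)
    flagged_cases = sum(y for f, y in zip(flags, outcomes) if f)
    if flagged_cases == 0:
        return float("inf")
    return n_flagged / flagged_cases


# Toy cohort of eight people, invented for illustration only.
criteria_eligible = [True, True, True, True, False, False, False, False]
toy_cohort_outcomes = [1, 0, 0, 0, 1, 0, 0, 0]
toy_cohort_risks = [0.09, 0.01, 0.02, 0.03, 0.08, 0.07, 0.005, 0.06]

# Flag exactly as many people with the model as the criteria would flag.
n_eligible = sum(1 for e in criteria_eligible if e)
cutoff = matched_threshold(toy_cohort_risks, n_eligible)
model_flags = [r >= cutoff for r in toy_cohort_risks]

print("Criteria NNS:", number_needed_to_screen(criteria_eligible, toy_cohort_outcomes))
print("Model NNS:", number_needed_to_screen(model_flags, toy_cohort_outcomes))
```

Holding the number screened constant is what makes the comparison fair: any difference in number needed to screen then reflects who was selected, not how many.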

The practical case for risk models is that they match screening to actual risk instead of a single smoking cutoff, which is why they caught more cancers per person screened. With reducing disparities as the goal, the authors backed PLCOm2012 and LYFS-CT for clinical use. But the choice of metric drives the conclusion: they optimized for efficiency, and prioritizing equal eligibility or equal sensitivity across groups would point to different models. No single model got everything right.

Asian and Hispanic patients were underrepresented, so estimates for those groups were less certain. In both groups, the models generally fell short of the eligibility and sensitivity they reached in Black and White patients.

The authors stop short of naming a single model for universal use. Building one that calibrates and discriminates well in every group would take a larger, more diverse development cohort than exists today, and questions of screening uptake, adherence, and access sit outside what any eligibility model can fix.

Source: Feng X, et al. (Jun 30 2026) Ann Intern Med. Performance of lung cancer risk prediction models in different racial and ethnic groups in the United States: results from the Lung Cancer Cohort Consortium