THE REFERENCE CASE
THE IMPLICATIONS FOR CLOSURE
INTRODUCTION
In the history of science it is difficult to identify an analytical framework that has dominated a
field for more than four decades while remaining fundamentally incompatible with the
standards of measurement and normal scientific inquiry. Yet this appears to be the position
occupied by the reference case in contemporary health technology assessment (HTA). For more
than 40 years the reference case has served as the accepted framework for evaluating therapies,
informing reimbursement decisions and guiding the allocation of healthcare resources. During
this period it has acquired the status of orthodoxy. Entire curricula, research programs,
professional organizations and regulatory agencies have been constructed around its assumptions
and methods.
The origins of the reference case are readily understood. Healthcare systems face a perennial
political and administrative problem: resources are finite while demands for healthcare are
effectively unlimited. Decision makers therefore sought a framework that would allow
competing claims for healthcare expenditure to be compared through a common metric. The
solution appeared to be the utility-based evaluation of health states, the construction of quality-
adjusted life years (QALYs) and the subsequent estimation of cost-effectiveness through
simulation modelling. What emerged was an apparently coherent framework capable of
generating numerical estimates that could be used to support decisions regarding coverage,
pricing and resource allocation.
The attraction of the reference case lay in its apparent simplicity. Diverse diseases, therapies and
patient populations could be translated into a common evaluative language. A single metric
appeared to offer a solution to the problem of comparing fundamentally different health
interventions. For policy makers, reimbursement agencies and academic researchers, the
framework promised consistency, transparency and analytical rigor. Over time these assumptions
became institutionalized within HTA education and practice.
The very feature that ensured the success of the reference case ultimately became its greatest
weakness. The framework rests upon the valuation of health states through preference-based
instruments that generate utility scores. These scores are then combined with time to create
QALYs and subsequently incorporated into simulation models to generate estimates of future
cost-effectiveness. The difficulty is that the measurement properties required for these operations
were never established. The framework begins with preference scores that fail the requirements
of ratio measurement and proceeds through a sequence of arithmetic operations that assume the
very measurement properties that have yet to be demonstrated.
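The dependence of the QALY arithmetic on ratio properties can be illustrated with a minimal sketch. The scores, durations and function names below are hypothetical and are not drawn from any instrument or study. The sketch assumes, for the sake of argument, that the scores have at most interval properties, in which case moving the zero point is an admissible transformation. Multiplying by time after such a shift reverses which therapy appears to yield more QALYs, whereas a change of unit alone (the only transformation admissible for a ratio scale) leaves the ranking intact.

```python
# Hypothetical numbers for illustration only; not taken from any instrument or study.

def qalys(score: float, years: float) -> float:
    """Multiply a health-state score by time, as the QALY construction does."""
    return score * years


def rescale(score: float, a: float, b: float) -> float:
    """Affine rescaling u' = a*u + b with a > 0, admissible for an interval scale."""
    return a * score + b


# Two hypothetical therapies: (health-state score, years spent in that state).
therapy_a = (0.9, 2.0)
therapy_b = (0.4, 5.0)


def preferred(a: float, b: float) -> str:
    """Return the therapy with more QALYs after rescaling the scores."""
    qa = qalys(rescale(therapy_a[0], a, b), therapy_a[1])
    qb = qalys(rescale(therapy_b[0], a, b), therapy_b[1])
    return "A" if qa > qb else "B"


original = preferred(1.0, 0.0)   # scores as given
shifted = preferred(1.0, -0.3)   # zero point moved; unit unchanged
scaled = preferred(2.0, 0.0)     # unit changed; zero point unchanged

print(original, shifted, scaled)  # B A B
```

The ranking survives the change of unit but not the change of origin, which is why multiplication by time presupposes a true zero.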
The consequence is profound. The problem is not simply that particular assumptions may be
questioned or that alternative modelling strategies may produce different results. The problem
lies at the foundations of the framework itself. If the quantities entering the reference case do not
satisfy the requirements of measurement, then the resulting claims cannot be rescued by
increasingly sophisticated statistical methods, more elaborate simulation models or revised
methodological guidelines. Measurement failure at the point of origin propagates throughout the
entire analytical structure.
For many years this weakness remained largely invisible because the reference case was judged
primarily in terms of administrative usefulness rather than scientific validity. Only recently has
attention turned to the relationship between measurement theory, representational measurement
and HTA practice. The result has been the recognition that the reference case does not merely
contain methodological weaknesses; it embodies a systematic form of measurement inversion in
which arithmetic precedes measurement and numerical constructions are mistaken for evidence.
The implications extend beyond methodology to education itself. If the reference case defines the
curriculum, then the concepts necessary to evaluate the reference case are inevitably
marginalized. Understanding how this occurred, and how HTA might move beyond it, requires a
reassessment not only of analytical methods but also of the educational foundations upon which
the discipline has been built.
MEASUREMENT INVERSION AND A FRAGMENTED CURRICULUM
For many years the reference case paradigm appeared largely immune to systematic criticism.
Although individual components of the framework were frequently debated, there was no
practical mechanism for evaluating the coherence of the paradigm as a whole or for assessing
whether the educational foundations supporting it were consistent with the standards of
measurement and normal science. The emergence of large language models (LLMs) changed this
situation fundamentally. For the first time it became possible to interrogate
extensive knowledge bases rapidly, consistently and at scale, allowing the assumptions
embedded within HTA organizations, research centers and educational programs to be examined
in a systematic manner.
The interrogations undertaken by Maimon Research were designed around two related
objectives. The first was to assess understanding of measurement. The second was to evaluate
curriculum content. Together these interrogations sought to determine whether the concepts
required to support lawful quantitative claims were present within the knowledge bases that
shape HTA education, research and policy.
The measurement interrogations focused on a series of canonical statements drawn from
representational measurement, scale theory, latent variable measurement and the philosophy of
science. Questions addressed topics such as scales of measurement, unidimensionality, ratio
measurement, admissible transformations, dimensional homogeneity, latent attribute possession
and scientific falsification. The objective was not simply to identify isolated areas of
misunderstanding but to determine whether a coherent understanding of measurement existed
within the HTA knowledge base. The results were striking. Across countries, agencies, academic
centers and professional organizations, the same pattern emerged repeatedly. Quantitative claims
were accepted while the principles required to justify those claims were either weakly
represented or absent altogether. This pattern was described as measurement inversion:
arithmetic preceding measurement rather than measurement preceding arithmetic.
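What it means for arithmetic to precede measurement can be shown with a small sketch based on the standard link between scale type and admissible transformations. The category codes and the recoding below are hypothetical and do not represent any real instrument. If responses are only ordinal, any strictly increasing recoding of the categories is admissible, yet a comparison of means can reverse under such a recoding while a comparison of medians does not.

```python
from statistics import mean, median

# Hypothetical ordinal category codes for two groups; illustration only.
group_1 = [1, 1, 5]
group_2 = [3, 3, 3]

# A strictly increasing recoding of the categories (order is preserved),
# which is an admissible transformation for an ordinal scale.
recode = {1: 1, 3: 2, 5: 10}
recoded_1 = [recode[v] for v in group_1]
recoded_2 = [recode[v] for v in group_2]


def higher(stat, x, y) -> str:
    """Return the name of the group with the larger value of the statistic."""
    return "group_1" if stat(x) > stat(y) else "group_2"


mean_before = higher(mean, group_1, group_2)
mean_after = higher(mean, recoded_1, recoded_2)
median_before = higher(median, group_1, group_2)
median_after = higher(median, recoded_1, recoded_2)

print(mean_before, mean_after)      # group_2 group_1
print(median_before, median_after)  # group_2 group_2
```

A conclusion that changes with an arbitrary choice of codes says nothing about the attribute itself, which is why the scale type has to be established before the arithmetic is applied.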
The curriculum interrogations provided a complementary perspective. Rather than focusing on
methodological claims, they examined the educational foundations from which those claims
emerged. The central question was straightforward: were students and researchers being exposed
to the concepts necessary to recognize and avoid measurement inversion? The results suggested
that they were not. Across multiple institutions there was limited evidence that curricula
systematically addressed representational measurement, scale theory, unidimensionality, manifest
and latent attributes, ratio measurement or the standards governing scientific claims. At the same
time, there was extensive emphasis on utilities, QALYs, simulation models and reference-case
methodology.
These findings suggest that measurement inversion is not simply a methodological anomaly. It
appears to be the predictable consequence of a fragmented curriculum shaped by the
requirements of the reference case. The concepts necessary to implement the reference case are
emphasized, while the concepts necessary to evaluate its validity are marginalized or absent. As
a consequence, generations of students and researchers have been trained to operate within the
framework without being exposed to the standards by which the framework itself should be
judged.
The significance of these findings extends beyond criticism of existing practice. They provide a
plausible explanation for the remarkable durability of the reference case despite its apparent
failure to satisfy the requirements of measurement and normal science. More importantly, they
point directly to the solution. If measurement inversion is rooted in curriculum design, then
reconstruction of HTA must begin with reconstruction of the curriculum itself.
A RECONSTRUCTED CURRICULUM
REFERENCES:
Langley P. The end of the Reference Case: Reconstructing HTA. Logit Working Paper No. 345, June 2026
Langley P. Health Technology Assessment - A 40-year legacy of measurement inversion for
manifest and latent attribute claims. Logit Working Paper No. 785
