HTA MEASUREMENT Q&A

A number of colleagues have asked whether certain terms could be clarified so that the implications of accepting representational measurement are more clear-cut. Here are responses to a composite series of questions that were received.

THE IMPORTANCE OF MEASUREMENT

1. WHAT DOES THE PHRASE ‘MEASUREMENT STRUCTURE’ MEAN FOR HEALTH TECHNOLOGY ASSESSMENT?

A measurement structure is not simply a set of numbers assigned to observations. It is the formal representation of an attribute such that numerical assignments preserve the empirical relationships observed in the real world. In other words, a measurement structure defines when numbers can legitimately stand in for quantities and support arithmetic operations in health technology assessment (HTA). This requirement is not met in HTA.

At its core, a measurement structure requires three elements. First, there must be a clearly defined attribute of interest. This attribute must be unidimensional: all observations relate to the same underlying property. Without unidimensionality, any numerical assignment collapses multiple attributes into a composite, and the resulting numbers cannot be interpreted as measures.

Second, there must be an ordering of observations that reflects the empirical structure of the attribute. This involves demonstrating that one observation represents more, less, or the same amount of the attribute as another. For manifest attributes, such as time or resource use, this ordering is often direct. For latent attributes, such as quality of life, the ordering is inferred from responses to items and must be tested for consistency.

Third, and most critically, there must be a mapping from empirical observations to numbers that preserves the structure of the attribute under permissible transformations. This is where representational measurement theory applies. The mapping must satisfy the axioms that define the scale: for a ratio scale, the presence of a true zero and constant unit; for a Rasch-derived scale, invariance of comparisons and meaningful interval differences. Only when these conditions are met do numerical differences correspond to differences in the attribute.
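
The role of permissible transformations can be illustrated with a minimal Python sketch. The numerical values and the function names `similarity` and `affine` are hypothetical, chosen only for illustration; they are not drawn from any HTA dataset. The sketch shows that a ratio of two values survives a change of unit on a ratio scale, whereas on an interval scale only ratios of differences survive a change of unit and origin.

```python
def similarity(x, a):
    """Permissible transformation for a ratio scale: change the unit, keep the zero."""
    return a * x


def affine(x, a, b):
    """Permissible transformation for an interval scale: change the unit and the origin."""
    return a * x + b


# Hypothetical hospital stays in days (a manifest attribute with a true zero).
days_a, days_b = 10.0, 5.0

# Converting days to hours is a similarity transformation, so the ratio is preserved.
ratio_in_days = days_a / days_b
ratio_in_hours = similarity(days_a, 24) / similarity(days_b, 24)

# Hypothetical interval-scale locations (the origin is arbitrary).
u, v, w = 1.0, 2.0, 4.0
tu, tv, tw = (affine(x, 2, 3) for x in (u, v, w))

# A ratio of raw values is NOT preserved under an affine transformation ...
raw_ratio_before = v / u
raw_ratio_after = tv / tu

# ... but a ratio of differences is preserved.
diff_ratio_before = (w - v) / (v - u)
diff_ratio_after = (tw - tv) / (tv - tu)

print("ratio scale:", ratio_in_days, ratio_in_hours)
print("interval scale, raw ratios:", raw_ratio_before, raw_ratio_after)
print("interval scale, difference ratios:", diff_ratio_before, diff_ratio_after)
```

Under these assumptions, a statement such as "twice as long" is meaningful for hospital days, while the corresponding statement for interval-scale values depends on the arbitrary choice of origin.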

A complete measurement structure therefore specifies: (i) the attribute, (ii) the empirical relational system (ordering and comparisons), and (iii) the numerical relational system (the scale and its permissible transformations). If any of these elements is missing, the result is not measurement but numerical labeling.

In HTA, this distinction is decisive. Claims about therapy impact require that the attributes being assessed, whether manifest (e.g., hospital days avoided) or latent (e.g., symptom burden), are embedded in a valid measurement structure. Without this, arithmetic operations such as aggregation, multiplication, or averaging do not yield interpretable quantities. The appearance of precision is not sufficient; it must be grounded in a structure that ensures numbers correspond to measurable properties.

Thus, a measurement structure is the necessary foundation for any quantitative claim. It determines not only whether numbers can be assigned, but whether those numbers can support inference, comparison, and empirical evaluation. HTA fails to meet this standard.

2. I WAS NEVER TAUGHT THAT MEASUREMENT MUST PRECEDE ARITHMETIC. WHY?

Because training in HTA and applied health economics has largely evolved around methods rather than foundations. Students are taught how to construct QALYs, estimate cost-effectiveness ratios, and implement simulation models, but not whether the quantities involved satisfy the conditions required for measurement. This reverses the scientific order. Arithmetic is treated as if it can create measurement, when in fact it presupposes it.

Measurement establishes the structure of a scale (unidimensionality, invariance, and, where required, a true zero) so that numerical operations have meaning. Without this, numbers are simply assigned labels. Their manipulation may produce results that appear precise, but those results are not anchored to measurable quantities. The omission persists because HTA methods have been institutionalized through teaching, journals, and policy frameworks without reference to measurement theory. Over time, the absence of measurement foundations has become normalized. The result is a discipline in which arithmetic operations are routinely applied without first establishing whether the underlying constructs can support them. The issue is not that the principle is controversial; it is that it has been overlooked, with unfortunate consequences for the validity of HTA claims.

3. WHY ARE THERE ONLY TWO CLASSES OF MEASUREMENT?

Because the requirements for meaningful arithmetic leave no intermediate category. For manifest attributes such as time or resource use, measurement requires a linear ratio scale with a constant unit and a true zero. Only under these conditions do multiplication and division have interpretable meaning. For latent attributes such as need fulfillment or symptom burden, no direct unit exists. Measurement therefore requires a transformation from ordinal observations to an invariant interval structure. The Rasch model is the only established framework that achieves this while satisfying the axioms of representational measurement. It creates a scale where differences are meaningful and comparisons are invariant across persons and items.
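
The invariance of comparisons attributed to the Rasch model can be sketched for the dichotomous case, in which the probability of an affirmative response depends only on the difference between a person location (theta) and an item location (b), both expressed in logits. The person and item values below are hypothetical, as are the function names. Under the model, the log-odds difference between two persons is the same whichever item is used to compare them.

```python
import math


def rasch_probability(theta, b):
    """Dichotomous Rasch model: P(X = 1) = exp(theta - b) / (1 + exp(theta - b))."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def log_odds(p):
    """Convert a probability to log-odds (logits)."""
    return math.log(p / (1.0 - p))


# Hypothetical person locations and item locations, in logits.
person_a, person_b = 1.5, 0.5
item_locations = [-1.0, 0.0, 2.0]

# Compare the two persons on each item separately.
differences = [
    log_odds(rasch_probability(person_a, b)) - log_odds(rasch_probability(person_b, b))
    for b in item_locations
]

# Each difference equals person_a - person_b up to floating-point rounding,
# regardless of the item location.
for b, d in zip(item_locations, differences):
    print(f"item location {b:+.1f}: log-odds difference = {d:.6f}")
```

The sketch illustrates the property only for data that fit the model; whether observed responses fit is an empirical question that has to be tested.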

There is no third category that allows ordinal scores, composite indices, or ad hoc transformations to support arithmetic. Either a scale meets the requirements for ratio or Rasch-based interval measurement, or it does not. The absence of an intermediate class is not a limitation of theory; it reflects the conditions under which numerical operations acquire meaning. Without these conditions, arithmetic produces numbers that cannot be interpreted as measures.
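
A minimal sketch of why ordinal scores cannot support averaging: any order-preserving recoding of the response categories is an equally legitimate labeling of an ordinal scale, yet it can reverse a comparison of means. The responses, the group names, and the recoding below are hypothetical.

```python
from statistics import mean

# Hypothetical ordinal responses on a five-category item, coded 1..5.
group_a = [3, 3, 3]
group_b = [1, 1, 5]

# An order-preserving recoding: the category order is unchanged,
# only the numerical label of the top category differs.
recode = {1: 1, 2: 2, 3: 3, 4: 4, 5: 10}

mean_a_original = mean(group_a)
mean_b_original = mean(group_b)

mean_a_recoded = mean(recode[x] for x in group_a)
mean_b_recoded = mean(recode[x] for x in group_b)

# With the original codes group A has the higher mean;
# with the recoded labels group B has the higher mean.
print("original codes:", mean_a_original, mean_b_original)
print("recoded labels:", mean_a_recoded, mean_b_recoded)
```

Because both codings respect the same ordering, neither mean comparison can be attributed to the attribute itself; the conclusion depends on the labels chosen.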

This is the failure in HTA: the absence of true measurement.

4. WHY DO WE NEED TO RECOGNIZE AND APPLY THE AXIOMS OF REPRESENTATIONAL MEASUREMENT?

Because the axioms, or rules, define when numbers correspond to quantities. They establish the conditions under which a numerical scale represents an attribute in a way that preserves its structure. These conditions include unidimensionality, invariance, and permissible transformations. Without them, numerical assignments are arbitrary. They may be consistent or convenient, but they do not constitute measurement.

The axioms ensure that differences on a scale are meaningful and that operations such as addition or multiplication preserve those meanings. This is what allows quantitative claims to support inference. If the axioms are not satisfied, then applying arithmetic does not yield interpretable results. It produces numerical outputs that cannot be linked back to the attribute of interest. In this sense, the axioms are not optional or philosophical; they are the formal requirements that distinguish measurement from labeling. Their absence cannot be compensated for by statistical sophistication or model complexity. If the scale does not meet the axioms, then the results cannot be considered measurement-valid, regardless of how they are used.

This is the situation HTA has been in for 40 years.

INTERROGATING MEASUREMENT STATUS AND CLAIMS

5. WHEN WE REFER TO A KNOWLEDGE BASE IN HTA, WHAT ARE WE DESCRIBING?

When we refer to a knowledge base in HTA, we are describing the body of concepts, methods, assumptions, and conventions that define how evidence is generated, interpreted, and applied within the field. This includes guidelines, reference cases, journal articles, modeling practices, teaching materials, and institutional frameworks. Importantly, the knowledge base is defined for a specific target: for example, an academic research group, an HTA agency such as the National Institute for Health and Care Excellence (NICE), a university department, or a journal. In each case, the knowledge base comprises the materials and practices that characterize that entity’s approach to HTA. These targeted knowledge bases are interrogated using a consistent framework, drawing on the corpus of material available to the large language model within defined boundaries. Each is assessed against the same criterion, namely whether it recognizes and applies the axioms of representational measurement, which allows comparisons across institutions and jurisdictions.

With the advent of large language models, it has become possible for the first time to assess HTA knowledge bases at this level of granularity. Rather than relying on selective review or anecdotal critique, the interrogation draws on a structured representation of the available corpus to evaluate patterns of endorsement across defined statements. This makes it possible to identify, systematically and reproducibly, whether the underlying beliefs that support HTA practice are consistent with the requirements of measurement.

6. WHY DOES YOUR DIAGNOSTIC ASSESSMENT INSTRUMENT MAKE CLEAR THE ABSENCE OF REPRESENTATIONAL MEASUREMENT AND ITS IMPLICATIONS FOR HTA?

Because the instrument tests directly whether the HTA knowledge base, or that of a particular agency or application, recognizes and applies the axioms of measurement. It does so through a structured set of canonical statements, some reflecting established measurement principles and others contradicting them. The pattern of responses is consistent across institutions and countries. Statements aligned with measurement theory are under-endorsed, while contradictory statements are strongly endorsed. This is not a matter of interpretation; it is a systematic inconsistency.

If the knowledge base does not recognize the conditions required for measurement, then it cannot ensure that the constructs it uses have the necessary scale properties. The implication follows directly. In HTA, arithmetic operations such as multiplication, aggregation, and averaging are applied to quantities whose measurement status has not been established. The framework therefore generates numbers, but not measures. Without measures, the resulting claims cannot be considered measurement-valid. This is why the issue is described as structural rather than technical. It is not a problem that can be corrected by refining individual models; it arises from the absence of measurement at the level of the framework itself.

This is the challenge facing HTA if it is to be recognized as a science.

7. WHY DOES YOUR DIAGNOSTIC ASSESSMENT IMPLY THAT CURRENT HTA BELIEFS ARE WRONG?

Because it identifies a mismatch between what is required for measurement and what is endorsed within the HTA knowledge base. The assessment does not evaluate individual researchers or studies; it evaluates the internal consistency of the framework. When a framework endorses propositions that violate the axioms of measurement, such as treating ordinal or composite constructs as if they support arithmetic, then the conclusions drawn from that framework cannot be measurement-valid. This is not a matter of opinion or preference. It follows from the formal requirements that govern the use of numbers. The consistency of the results across multiple jurisdictions indicates that this is not an isolated misunderstanding but a shared structure of belief in HTA at the global level. To say that these beliefs are “wrong” is therefore to say that they are inconsistent with the conditions required for measurement. The issue is not intent or competence, but whether the framework aligns with the rules that make quantitative claims possible.

The conclusion for HTA is that the beliefs that support present analytical standards, such as reference case simulations, do not align with the required rules.

8. WHY MUST FALSIFICATION BE THE ULTIMATE CRITERION FOR EVALUATING THERAPY IMPACT CLAIMS?

Because without the possibility of falsification, claims cannot be tested against reality. Scientific knowledge advances through the proposal and testing of claims that can, in principle, be shown to be false. This requires that claims be expressed in terms that allow empirical evaluation. Measurement is central to this process. If an attribute is not measured, then there is no basis for testing whether a claim about that attribute is correct. Similarly, if claims are derived from models that cannot be empirically evaluated, they are insulated from falsification. In such cases, they function as assertions rather than evidence. Falsification is therefore not an optional standard; it is the condition under which claims can be evaluated, challenged, and improved. In the context of HTA, this means that therapy impact claims must be grounded in measurable quantities and specified in a way that allows empirical testing. Without this, the framework cannot support the evolution of objective knowledge.

CLOSING STATEMENT

Measurement establishes what can be calculated; falsification establishes what can be known.