Distance Education Programs
INTRODUCTION
For fifty years, health technology assessment has practiced numerical storytelling by confusing numbers with measures. To function as a science, HTA must accept the axioms of representational measurement theory (RMT), a framework initiated by Stevens (1946), who tied arithmetic to scale type, and completed by Krantz, Luce, Suppes, and Tversky (1971) with representation and uniqueness theorems. In parallel, Rasch (1960) supplied the probabilistic bridge for latent traits; Wright (1977) showed how ordered responses can be transformed into a logit ruler with specific objectivity when the model fits. HTA could have adopted these foundations at any time; instead, fixation on QALYs and the valuation of multiattribute health-state descriptions, contrary to the requirement of unidimensionality and the other axioms of RMT, guaranteed comprehensive measurement failure.
Health technology assessment’s choice to value health-state descriptions, an approach that persists, has ensured that therapy-impact claims built on utilities, QALYs, and reference-case models fail basic measurement standards. By bypassing the axioms that license arithmetic (unidimensionality, additivity, solvability, the Archimedean property, cancellation, and invariance), the field forfeits dimensional homogeneity and any interval or ratio meaning. The essential point is simple: only when an empirical system satisfies these axioms can observations be mapped to numbers that function as measures and support falsifiable claims.
Two programs are available; these will be followed by further programs on implementing protocols to support specific therapy-impact claims, covering both objective physical claims (e.g., resource utilization) and latent traits (e.g., need fulfillment), and on developing formulary submission guidelines for new therapies. Each of the two programs comprises five modules, each with supporting questions and answers. Significant input from colleagues has encouraged a modular format so these materials can support graduate instruction, faculty seminars, and focused discussions on measurement failure and reform. Zoom seminars can be arranged on request. Participants are encouraged to download the material, which amounts to just over 100 pages for each program.
Dr Langley, the author of these programs, is an economist. He received his undergraduate training in the UK and his M.A. and Ph.D. in Canada. He has taught microeconomics, labor economics, and health economics in the UK, Canada, Australia, and the United States. His main interest is measurement theory, in particular the application of representational measurement to health-system claims of therapy impact. He holds an Adjunct Professor position in the College of Pharmacy, University of Minnesota, and is Director of Maimon Research (www.maimonresearch.com), a boutique consulting company in health technology assessment. He is based in Tucson, Arizona. Please direct communications regarding these two programs to langleylapaloma@gmail.com.
The link to registration and payment (US$65.00) for a program is provided at the end of each program description.
PROGRAM 1
NUMERICAL STORYTELLING: SYSTEMATIC MEASUREMENT FAILURE IN HEALTH TECHNOLOGY ASSESSMENT
HTA can be dismissed in a sentence: it confuses numbers with measures. In science, a string of numerals becomes a measure only when it preserves the empirical structure of an attribute and obeys the transformation rules set out by representational measurement theory. Those axioms (order, additivity, solvability/cancellation, invariance) are what license arithmetic. Without them, subtraction, averaging, ratios, and products are illegitimate. HTA's main artifacts ignore this gate. Utilities derived from preference tasks lack interval meaning; multiplying them by time to make QALYs violates dimensional homogeneity; disease-specific totals are summed scores that have never earned equal units; cost composites bundle heterogeneous quantities. Rasch modeling shows how latent attributes can be measured lawfully, but HTA rarely demands it. The result is numerical storytelling dressed as evaluation: outputs that look precise yet have no admissible arithmetic. Until HTA requires evidence that its numbers are measures, its claims are not science but policy theater.
MODULE 1: WHY STEVENS? THE CONTEXT OF 1946
Before 1946, measurement beyond physics lacked a warrant. Campbell's concatenation explained additivity for manifest quantities; psychophysics pursued sensation mappings; operationalism equated meaning with procedures; but none guaranteed that numerals preserved structure or licensed arithmetic, especially for latent traits. Stevens answered this by tying scale types (nominal, ordinal, interval, ratio) to allowable arithmetic and statistical operations. He did not solve how to establish unidimensional invariant rulers for latent attributes. Foundations of Measurement supplied the proof architecture; Rasch modeling supplied the method.
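As an illustration of Stevens' point, the short Python sketch below (with hypothetical ratings, not drawn from any HTA instrument) shows that a claim about means survives the affine transformations admissible for interval data but not the merely order-preserving transformations admissible for ordinal data.

```python
# A minimal sketch of Stevens' point: the statistics you may compute depend on
# the scale type, because a claim is meaningful only if it survives every
# admissible transformation. All values below are hypothetical.

def monotone(x):              # admissible for ORDINAL data: any order-preserving map
    return x ** 3

def affine(x, a=2.0, b=5.0):  # admissible for INTERVAL data: positive linear map
    return a * x + b

def mean(xs):
    return sum(xs) / len(xs)

group_a = [1, 2, 6]           # hypothetical ordinal ratings
group_b = [3, 3, 3]

# Claim: "mean(A) equals mean(B)". It is preserved by an affine rescaling,
# but not by a merely monotone rescaling, so the claim was never meaningful
# for ordinal data in the first place.
print(mean(group_a) == mean(group_b))                                   # True
print(mean([affine(x) for x in group_a]) ==
      mean([affine(x) for x in group_b]))                               # True
print(mean([monotone(x) for x in group_a]) ==
      mean([monotone(x) for x in group_b]))                             # False
```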
MODULE 2: AXIOMS OF REPRESENTATIONAL MEASUREMENT THEORY
From 1946 to 1971, measurement theory advanced from Stevens' typology to axioms. Suppes formalized extensive measurement, deriving additivity from concatenation. Luce and Tukey's conjoint measurement identified cancellation, solvability, and Archimedean conditions under which attributes admit additive representation without concatenation. Krantz, Luce, Suppes, and Tversky unified these results in Foundations of Measurement, proving representation and uniqueness theorems that tie meaning to admissible transformations of the mapping from empirical structures to numbers. In parallel, Rasch modeling operationalized latent measurement by producing logit rulers when data fit.
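The double-cancellation condition can be checked directly on a small two-factor table. The Python sketch below uses hypothetical 3 x 3 orderings, one consistent with an additive representation and one not; it is an illustration, not a published test protocol.

```python
# A minimal sketch of the double-cancellation test from conjoint measurement.
# table[i][j] holds the observed ordering value of the cell formed by row
# level a_i and column level b_j; the data are hypothetical.

def double_cancellation_holds(table):
    """Check Luce-Tukey double cancellation for all row triples (i, j, k) and
    column triples (p, q, r): if (a_i, b_q) >= (a_j, b_p) and
    (a_j, b_r) >= (a_k, b_q), then (a_i, b_r) >= (a_k, b_p) must hold."""
    n_rows, n_cols = len(table), len(table[0])
    for i in range(n_rows):
        for j in range(n_rows):
            for k in range(n_rows):
                for p in range(n_cols):
                    for q in range(n_cols):
                        for r in range(n_cols):
                            if (table[i][q] >= table[j][p] and
                                    table[j][r] >= table[k][q] and
                                    not table[i][r] >= table[k][p]):
                                return False
    return True

additive = [[1, 2, 3],   # consistent with separate row and column effects
            [2, 3, 4],
            [3, 4, 5]]
crossed  = [[1, 5, 3],   # hypothetical ordering that violates double cancellation
            [4, 2, 6],
            [3, 7, 2]]

print(double_cancellation_holds(additive))  # True: additive representation possible
print(double_cancellation_holds(crossed))   # False: no additive representation exists
```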
MODULE 3: SUSTAINED MEASUREMENT FAILURE: THE TIME TRADE-OFF (TTO) TECHNIQUE, THE EQ-5D-3L PREFERENCE ALGORITHM AND PREFERENCE UTILITIES
Time trade-off guaranteed HTA's measurement failure by valuing multiattribute health-state descriptions, producing numbers rather than a coherent basis for applying RMT axioms. TTO valuations are regressed on dummy-coded attribute levels, and the resulting coefficients form a utility algorithm that assigns a single "utility" to any profile. The result is a further series of numbers, not a measure: unidimensionality is violated by design, additivity is imposed without conjoint/cancellation tests, invariance collapses across protocols and countries, and task quirks manufacture negative "worse-than-dead" values. Because the axioms fail at the start, these utilities are guaranteed non-measures and, as such, cannot support arithmetic operations.
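The general shape of such an algorithm is easy to state. The Python sketch below reproduces only the additive, dummy-coded form; the dimension names follow the EQ-5D-3L labels, but every coefficient is an illustrative placeholder, not a published value set.

```python
# A minimal sketch of how a preference "utility" is typically assembled from a
# multiattribute profile: additive, dummy-coded decrements subtracted from 1.0.
# Every coefficient below is an illustrative placeholder, NOT a published tariff.

ILLUSTRATIVE_DECREMENTS = {
    "mobility":       {1: 0.00, 2: 0.05, 3: 0.15},
    "self_care":      {1: 0.00, 2: 0.05, 3: 0.15},
    "usual_activity": {1: 0.00, 2: 0.04, 3: 0.10},
    "pain":           {1: 0.00, 2: 0.06, 3: 0.20},
    "anxiety":        {1: 0.00, 2: 0.06, 3: 0.20},
}

def tariff_utility(profile, any_level_3_constant=0.05):
    """Assign a single number to a five-dimension profile by summing
    dummy-coded decrements. Additivity across dimensions is imposed, not tested."""
    utility = 1.0
    for dimension, level in profile.items():
        utility -= ILLUSTRATIVE_DECREMENTS[dimension][level]
    if any(level == 3 for level in profile.values()):
        utility -= any_level_3_constant   # an "any level 3" constant, also illustrative
    return utility

profile = {"mobility": 2, "self_care": 1, "usual_activity": 2, "pain": 3, "anxiety": 2}
print(round(tariff_utility(profile), 3))   # a single "utility" for the profile
```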
MODULE 4: SUSTAINED MEASUREMENT FAILURE—THE IMPOSSIBLE QALY AND THE CHIMERICAL REFERENCE CASE
The QALY multiplies time, a ratio measure, by utilities drawn from a multiattribute health-state algorithm. Again, these utilities are only numbers and have no status in measurement theory. They lack unidimensionality, equal units, invariance, and a defensible zero; using them to weight (and discount) time creates a construct, the QALY, that has no meaning. The reference case institutionalizes the error by mandating cost-per-QALY models and treating outputs as evidence. Thresholds and sensitivity analyses deliver precision without meaning because the QALY is not a measure. The reference case supports numerical storytelling by ignoring the requirements of fundamental measurement. Without the ability to support arithmetic and the full range of statistical operations on unidimensional claims, the reference case is just chimerical numerical storytelling.
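A small numerical illustration (hypothetical utilities and durations) makes the point: because a merely order-preserving rescaling of the utilities is admissible when they carry at most ordinal meaning, the "more QALYs" comparison can reverse under a transformation that should leave any genuine measure-based claim intact.

```python
# A minimal numerical sketch (hypothetical values) of why utility x time fails
# as a measure: if utilities have at most ordinal standing, any monotone
# rescaling is admissible, yet the QALY ranking it produces can reverse.

def qaly(utility, years):
    return utility * years

def monotone_rescale(u):   # order-preserving, hence admissible for ordinal data
    return u ** 3

# Therapy A: 0.9 "utility" for 4 years; Therapy B: 0.6 "utility" for 7 years.
a_u, a_t = 0.9, 4.0
b_u, b_t = 0.6, 7.0

print(qaly(a_u, a_t) > qaly(b_u, b_t))            # False: 3.6 < 4.2
print(qaly(monotone_rescale(a_u), a_t) >
      qaly(monotone_rescale(b_u), b_t))           # True: 2.916 > 1.512
# The "which therapy yields more QALYs" claim flips under an admissible
# transformation, so it was never a meaningful, falsifiable claim.
```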
MODULE 5: THE IDENTITY CRISIS OF HTA—NOTHING WITHOUT THE REFERENCE CASE
HTA faces an identity crisis because the reference case treats numbers as measures without first satisfying the axioms of representational measurement. It builds on utilities elicited from preference tasks and multiplies them by time to form QALYs, violating unidimensionality, additivity, invariance, and dimensional homogeneity. With a denominator that is not a measure, the cost-per-QALY ratio has no stable unit; its apparent precision is theatrical, not scientific. The reference case thus functions as a device for generating non-falsifiable claims: results track modeling conventions and tariffs, not an invariant ruler. Checklists reinforce the illusion by policing format rather than scale type, leaving HTA as ritual rather than science. Take away the reference case and its denial of RMT axioms, and HTA has little, if anything, to offer a commitment to the evolution of objective knowledge regarding therapy-impact claims.
PROGRAM 2
A NEW START IN MEASUREMENT FOR HEALTH TECHNOLOGY ASSESSMENT
For fifty years, health technology assessment has practiced numerical storytelling by confusing numbers with measures. To function as a science, HTA must accept the axioms of representational measurement theory, a framework initiated by Stevens (1946), who tied arithmetic to scale type, and completed by Krantz, Luce, Suppes, and Tversky (1971) with representation and uniqueness theorems. In parallel, Rasch (1960) supplied the probabilistic bridge for latent traits; Wright (1977) showed how ordered responses can be transformed into a logit ruler with specific objectivity when the model fits. HTA could have adopted these foundations at any time; instead, fixation on QALYs and the valuation of multiattribute health-state descriptions, contrary to the requirement of unidimensionality, guaranteed measurement failure that persists to this day. The remedy is simple and non-negotiable: in HTA there are only two valid measures, linear ratio scales for manifest resource and utilization claims and Rasch logit ratio scales for latent-trait possession.
MODULE 1: THE DENIAL OF FALSIFICATION IN HEALTH TECHNOLOGY ASSESSMENT
Falsification demarcates science: claims must risk failure on a stable ruler. That requires constant units, disconfirmation conditions, and replicability. RMT supplies the prerequisite by ensuring order, additivity, and invariance so arithmetic is lawful. HTA denies falsification because its cornerstone quantities are not measures: utilities are just numbers yet treated as interval and multiplied by time into QALYs; the reference case embeds these non-measures in models and thresholds. Without a validated unit, nothing can disconfirm the story. Denying falsification means denying the standard for normal science and the evolution of objective knowledge.
MODULE 2: THE RASCH MODEL – LATENT TRAITS AND ITEM SELECTION
Latent traits are real only when they admit invariant, testable measurement. Rasch delivers this by specifying a single trait, testing items against it, and mapping responses via a logistic function of person location minus item difficulty to place persons and items on a logit continuum. When items are ordered by difficulty, they must fit the axiomatic RMT requirements of the Rasch model; this gives constant relative differences and invariant comparisons. Item selection to reflect the latent trait is critical: design targets items near 50% endorsement and spans the range of the trait. Misfit flags instrument problems; item fit is a falsification check.
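For the dichotomous case the model is a one-line formula. The Python sketch below (illustrative values only) computes the endorsement probability and shows why items located near a respondent's θ, where endorsement approaches 50%, carry the most information.

```python
import math

# A minimal sketch of the dichotomous Rasch model described above: the
# probability of endorsing an item is a logistic function of person location
# theta minus item difficulty beta, both in logits. Values are illustrative.

def rasch_prob(theta, beta):
    """P(X = 1 | theta, beta) = exp(theta - beta) / (1 + exp(theta - beta))."""
    return 1.0 / (1.0 + math.exp(-(theta - beta)))

item_difficulties = [-2.0, -1.0, 0.0, 1.0, 2.0]   # hypothetical logit spread
person_theta = 0.3

for beta in item_difficulties:
    p = rasch_prob(person_theta, beta)
    info = p * (1.0 - p)   # item information for a dichotomous Rasch item
    print(f"beta={beta:+.1f}  P(endorse)={p:.2f}  information={info:.3f}")

# The item nearest theta (beta = 0.0 here) has endorsement closest to 0.5 and
# carries the most information, which is why item selection spans the range
# but targets roughly 50% endorsement.
```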
MODULE 3: THE RASCH MODEL – THE UNIQUE RASCH LOGIT RATIO SCALE
Building a Rasch interval scale enacts conjecture and refutation: infit, outfit, local independence, differential item functioning (DIF), and invariance tests probe the measurement axioms. Misfit invites falsification; surviving earns the name "measure." Rasch tests existence: an attribute is measurable only when it yields an invariant scale across persons, items, and time: the unique Rasch logit ratio scale. Uniquely, the model provides a dual metric: additive logits as an interval ruler defining possession of a latent trait θ, and multiplicative odds with a true zero, where exp(Δθ) gives an invariant, item-free odds ratio.
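The item-free odds ratio follows directly from the model: the odds of success are exp(θ - β), so the item term cancels when two persons are compared on the same item. The short numerical check below (illustrative values) makes this explicit.

```python
import math

# A minimal numerical check of the dual metric noted above: under the Rasch
# model the odds of success are exp(theta - beta), so the ratio of two
# persons' odds on the SAME item is exp(delta_theta), with the item
# difficulty cancelling -- an item-free, invariant odds ratio.

def odds(theta, beta):
    return math.exp(theta - beta)   # P / (1 - P) under the Rasch model

theta_1, theta_2 = 1.4, 0.5         # hypothetical person locations
delta_theta = theta_1 - theta_2

for beta in (-1.0, 0.0, 2.5):       # any item, easy or hard
    ratio = odds(theta_1, beta) / odds(theta_2, beta)
    print(f"beta={beta:+.1f}  odds ratio={ratio:.3f}")

print(f"exp(delta_theta) = {math.exp(delta_theta):.3f}")   # same value every time
```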
MODULE 4: THE RASCH MODEL – POSSESSION AND FALSIFICATION
Possession, the amount of a single latent trait, is the primary measurement quantity (θ), and logits are the legitimate scale. Rasch places persons and items on a log-odds continuum; when unidimensionality, ordered categories, local independence, and invariance hold, equal logit differences have equal meaning. Estimation yields θ and β with standard errors that widen under poor targeting or extreme scores. Precision, coherence, and targeting govern interpretability. Group inference reports change and difference-in-differences on the logit scale, with odds-ratio translations via exp(Δθ). For subjective responses the Rasch model is the only framework that meets the axioms of RMT; all other instruments are redundant.
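A minimal sketch of person estimation, assuming known item difficulties and a hypothetical response string, is given below; it is not the programs' own software, but it shows where the standard error comes from and why extreme or poorly targeted response strings degrade precision.

```python
import math

# A minimal sketch of person-location estimation under the dichotomous Rasch
# model: maximum likelihood for theta given known item difficulties, via
# Newton-Raphson, with the standard error taken from the test information.
# Item difficulties and responses are hypothetical.

def rasch_prob(theta, beta):
    return 1.0 / (1.0 + math.exp(-(theta - beta)))

def estimate_theta(responses, betas, theta=0.0, iterations=25):
    """responses: list of 0/1 item scores; betas: matching item difficulties."""
    for _ in range(iterations):
        probs = [rasch_prob(theta, b) for b in betas]
        score_residual = sum(x - p for x, p in zip(responses, probs))  # first derivative
        information = sum(p * (1.0 - p) for p in probs)                # minus second derivative
        theta += score_residual / information                          # Newton-Raphson step
    return theta, 1.0 / math.sqrt(information)

betas = [-1.5, -0.5, 0.0, 0.5, 1.5]
theta_hat, se = estimate_theta([1, 1, 1, 0, 0], betas)
print(f"theta = {theta_hat:.2f} logits, SE = {se:.2f}")

# Extreme response strings (all 0s or all 1s) have no finite ML estimate, and
# off-target items inflate the SE: precision and targeting govern
# interpretability, as noted above.
```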
MODULE 5: THE RASCH MODEL – THE EXISTENTIAL CRISIS FOR DISEASE SPECIFIC INSTRUMENTS
HTA faces an existential crisis in its disease-specific outcomes. Where subjective responses are involved, instruments must be built and verified to Rasch standards; summing item scores is not measurement. Without demonstrated unidimensionality, ordered thresholds, local independence, additivity, and invariance, a total is merely a series of numbers, not an interval measure. The Rasch model uniquely operationalizes the axioms of representational measurement for latent traits: when items survive fit tests, persons and items share a single logit ruler with specific objectivity. Yet hundreds of disease-specific questionnaires used to claim therapy impact have never passed this gate. Their totals shift with item mix, category use, and sample composition, so apparent “improvement” often reflects instrument behavior, not change in the underlying trait. In short, they are numerical storytelling.
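The Python sketch below (hypothetical item difficulties) illustrates one reason why: under the Rasch model the expected raw total is an S-shaped function of θ, compressed at the extremes, so equal changes in the trait do not produce equal changes in the summed score.

```python
import math

# A minimal sketch (hypothetical item difficulties) of why summed raw scores
# are not interval measures: the expected raw total is a nonlinear, S-shaped
# function of the latent location theta, compressed at the extremes.

def rasch_prob(theta, beta):
    return 1.0 / (1.0 + math.exp(-(theta - beta)))

betas = [-2.0, -1.0, 0.0, 1.0, 2.0]   # a hypothetical 5-item instrument

def expected_raw_score(theta):
    return sum(rasch_prob(theta, b) for b in betas)

for theta in (-3.0, -1.0, 1.0, 3.0):
    # each step below is the SAME 2-logit change in the trait...
    print(f"theta={theta:+.1f}  expected raw score={expected_raw_score(theta):.2f}")

# ...but the raw-score change it produces differs across the range, so a
# one-point change in the total does not mean the same thing everywhere.
```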
