Here are the notes and slides for a recent presentation on strategies for effectively choosing diagnostic tests.
GOALS OF DIAGNOSTIC TESTING
Ultimately, the goal of any test we run should be obtaining information that allows us to more effectively treat or prevent health problems in our patients. This seems obvious, but it is all too easy to lose sight of this core purpose. We may feel obligated to run tests to confirm a diagnosis even when the level of confidence is already high and the outcome of the test won’t change what the client chooses to do. We may employ diagnostic tests as a preemptive defense against litigation or because of a perceived pressure from the client to do something even when our action likely won’t change the outcome for the patient. In some situations, we may be completely confused by a case and throw a bunch of tests at it hoping for some insight to emerge.
All of these are understandable, and all too common, reasons for using diagnostic tests, but unfortunately such approaches reduce the reliability and utility of the tests themselves. Effective testing requires not only an understanding of the strengths and weaknesses of the tests we use but also a clear understanding of how to employ them and how to integrate the results into our clinical decision making. We need a rational strategy for when and how to test, how to interpret results, and somewhat counterintuitively, when not to test at all.
BEYOND SENSITIVITY AND SPECIFICITY
The most common measures used to describe diagnostic tests are sensitivity and specificity. These are characteristics of the tests themselves, and they indicate how likely, compared with some gold standard, a test is to correctly identify a disease that is present or to correctly identify that a patient does not have the disease. Unfortunately, the meaning of these numbers is often misunderstood. If a test has, for example, 98% sensitivity, this is the proportion of patients with the disease who will correctly test positive. It is NOT an indication that any patient who tests positive has a 98% chance of having the disease. Under certain conditions, the majority of patients testing positive on such a test may actually not have the disease, even with such a high sensitivity.
More clinically useful measures of a test’s reliability are the positive predictive value and the negative predictive value (Fig. 1). These are, respectively, the probability that a patient with a positive test actually has the disease and the probability that a patient with a negative test result does not have the disease. These numbers depend not only on the test itself but also on how common the disease is in the population being tested.
As an example, if a population of feral cats has an FIV prevalence of 2%, then with a perfectly sensitive test (sensitivity = 100%) the 2 infected cats in every 100 tested will test positive. If the test also has a specificity of 98%, about 2 of the remaining 98 uninfected cats will also test positive. The positive predictive value, then, is only about 50%, meaning half of the cats who test positive do NOT have FIV. Even with an excellent test, this is a substantial error rate, especially if we are planning on euthanizing cats diagnosed with FIV!
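The arithmetic in this example generalizes to any combination of prevalence, sensitivity, and specificity. A minimal sketch in Python (the function name is illustrative, not from any particular library):

```python
def predictive_values(prevalence, sensitivity, specificity):
    """Return (PPV, NPV) for a test applied at a given disease prevalence.

    PPV = true positives / all positive results
    NPV = true negatives / all negative results
    """
    tp = prevalence * sensitivity                # diseased, test positive
    fp = (1 - prevalence) * (1 - specificity)    # healthy, test positive
    tn = (1 - prevalence) * specificity          # healthy, test negative
    fn = prevalence * (1 - sensitivity)          # diseased, test negative
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return ppv, npv

# The FIV example: 2% prevalence, 100% sensitivity, 98% specificity
ppv, npv = predictive_values(0.02, 1.00, 0.98)
print(round(ppv, 3))  # ~0.505: about half of positive results are false
```

Note that the same test applied in a shelter population with, say, 20% prevalence would yield a much higher positive predictive value, which is the central point: predictive value is a property of the test *and* the population.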
This example illustrates how important it is that we have some idea how likely a disease is to be present before running a test for it if we want our results to be reliable. Which brings us to a new and somewhat fashionable way to look at diagnostic testing….
BAYESIAN ANALYSIS FOR THE MATHEMATICALLY CHALLENGED
The work of 18th-century mathematician Thomas Bayes is enjoying something of a renaissance as an alternative, in some respects, to the frequentist statistical methods most of us were taught in vet school. The details of the math involved are complex, but the logic of the approach is simple and intuitive. Diagnostic tests should not be viewed as determining whether or not a disease is present. They should be viewed, instead, as one piece of evidence shifting the existing probability of a diagnosis higher or lower.
If, as in the example above, I know that the prevalence of FIV is 2% in this population of cats, I can say the probability of any given cat having FIV is very low. A positive test does not mean a cat has FIV, only that the probability it might have the disease has increased a bit. The test doesn’t make or break the diagnosis; it simply shifts our understanding of the likelihood of the diagnosis.
In a practical sense, then, a Bayesian approach means estimating the probability of a diagnosis based on all of the usual factors we consider (signalment, personal history, prevalence rates, physical exam findings, other test results, etc.). If this probability is high enough or low enough to make or rule out a diagnosis, no additional test is needed. If, however, the probability leaves significant uncertainty, then we should select a test that will meaningfully raise or lower that probability to help us make the diagnosis.
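This updating step can be made concrete with likelihood ratios, which convert a pre-test probability into a post-test probability. A hedged sketch (the numbers in the example are purely illustrative, not drawn from a specific case or test):

```python
def post_test_probability(pre_test_prob, sensitivity, specificity,
                          test_positive=True):
    """Update a pre-test probability of disease given a test result.

    Uses the standard odds form of Bayes' theorem:
      post-test odds = pre-test odds * likelihood ratio
    """
    if test_positive:
        lr = sensitivity / (1 - specificity)     # LR+ (assumes spec < 1)
    else:
        lr = (1 - sensitivity) / specificity     # LR-
    pre_odds = pre_test_prob / (1 - pre_test_prob)
    post_odds = pre_odds * lr
    return post_odds / (1 + post_odds)

# Illustrative case: 30% pre-test probability, test with 90% sensitivity
# and 95% specificity
print(round(post_test_probability(0.30, 0.90, 0.95, test_positive=True), 3))
print(round(post_test_probability(0.30, 0.90, 0.95, test_positive=False), 3))
```

A positive result here lifts the probability from 30% to roughly 89%, while a negative result drops it to roughly 4%; whether either post-test probability is decisive enough to act on is still a clinical judgment, not a property of the test.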
Screening is a special case in which we are testing asymptomatic individuals with the idea of detecting preclinical disease so we can more effectively intervene to reduce symptoms and mortality. Because the prior probability of disease is usually very low by definition in screening, since patients have no symptoms, the positive predictive value of even very good tests is low. It has been recognized in human medicine that screening can often lead to overdiagnosis and overtreatment, which can waste medical resources and ultimately do more harm than good for patients.1 There are, therefore, requirements for screening programs, and these include not only accurate tests but proven interventions that actually improve outcomes for patients diagnosed with the disease and rational plans for confirming and following up both positive and negative test results.
In veterinary medicine, we often employ diagnostic tests in asymptomatic patients “just in case” we might find subclinical disease. Whether or not such testing improves outcomes for patients or leads to significant overdiagnosis is almost never evaluated, so the benefits and risks of screening are often assumed but not truly known. This means that significant caution is warranted in conducting screening and interpreting the results of diagnostic tests in clinically well individuals.
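To see why this caution is warranted, it helps to watch the positive predictive value collapse as prevalence falls toward typical screening levels. A small illustration, assuming a hypothetical test with 95% sensitivity and 95% specificity (numbers chosen for illustration only):

```python
# How PPV falls as prevalence drops, holding test performance fixed
sens, spec = 0.95, 0.95
for prevalence in (0.10, 0.01, 0.001):
    tp = prevalence * sens                 # true positives per animal tested
    fp = (1 - prevalence) * (1 - spec)     # false positives per animal tested
    ppv = tp / (tp + fp)
    print(f"prevalence {prevalence:>5.1%}: PPV = {ppv:.1%}")
```

At 10% prevalence roughly two-thirds of positives are real; at 0.1% prevalence, the kind of figure plausible when screening clinically well animals, fewer than 1 in 50 positives reflects actual disease.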
CARDINAL RULES OF DIAGNOSTIC TESTING
Based on this understanding of the limitations of diagnostic testing, there are a few cardinal rules we can apply to reduce the potential mistakes and harms resulting from our tests:
Cardinal Rule #1
If the result of the test isn’t going to change what you do, don’t run the test.
Cardinal Rule #2
If the prior probability of a diagnosis is very high or very low, don’t run the test.
Cardinal Rule #3
Don’t screen (test asymptomatic individuals) without a plan of action based on solid evidence that the benefits of testing and diagnosis outweigh the risks.
1. McKenzie BA. Overdiagnosis. J Am Vet Med Assoc. 2016;249(8):884-889.