I recently ran across an article in the journal Evidence-Based Complementary and Alternative Medicine (eCAM) titled “Evaluating Complementary Therapies for Canine Osteoarthritis–Part II: A Homeopathic Combination Preparation (Zeel)” (Hielm-Bjorkman A, et al. 2009;6(4):465–471).
According to the authors, “a homeopathic combination product (HCP) for canine osteoarthritic pain was evaluated in a randomized, double-controlled and double-blinded clinical trial…[and] that the HCP (Zeel) was beneficial in alleviating chronic orthopedic pain in dogs, although it was not as effective as Carprofen.”
There are many levels on which any clinical research article should be critically evaluated. The potential biases of the authors and the journal, the quality of the methodology, the statistical analysis of the data, and the degree to which the conclusions follow from the data are all common criteria by which such publications can be judged. R. Barker Bausell in his book Snake Oil Science does an outstanding job illustrating some reasons why not everything that makes its way into a scientific journal is reliable science and why such critical evaluation is necessary.
In this case, the journal makes some effort to follow the principles of evidence-based medicine, but it is guilty of some serious Tooth Fairy Science, in which rigorous methodology is applied to some fundamentally irrational premises. Skimming through some archival issues also indicates a pretty strong preference for publishing positive findings for CAM interventions. None of this automatically invalidates anything published, but it is one factor to consider since the effect of personal as well as financial biases on research outcomes is well established.
As for the authors, I was not able to establish much about their biases beyond what their published work suggests. Two of the authors are professors at the University of Helsinki School of Veterinary Medicine, and the lead author’s research summary suggests a strong attachment to CAM. Her doctoral dissertation was a study on gold implantation, green-lipped mussel extract, and Zeel for use in canine osteoarthritis, and it appears she is publishing this thesis research as a series of articles in eCAM.
The methodology is generally sound, with a couple of exceptions. First, while the product studied is identified as homeopathic and an injectable version of it is listed in the US Homeopathic Pharmacopoeia, even the authors insert the caveat that “this is not a classical homeopathic treatment.” The preparation contains 14 listed ingredients, many of which are present after having been diluted 1:10 only 2–8 times, for “molar concentrations of 10⁻⁵ to 10⁻¹² mol/L.” Such concentrations are low, but still higher than is usual for homeopathic preparations, which cannot conceivably contain any of the original ingredient. It is possible, then, that this product contains some pharmacologically active substances. The ingredients listed (the same as for the injectable product) are:
- Plant-derived ingredients:
  - Arnica montana, radix (mountain arnica)
  - Dulcamara (bittersweet)
  - Rhus toxicodendron (poison oak)
  - Sanguinaria canadensis (blood root)
  - Symphytum officinale (comfrey)
- Mineral ingredients:
  - Sulphur (sulphur)
  - (alpha)-Lipoicum acid (thioctic acid)
  - Coenzyme A (coenzyme A)
  - Nadidum (nicotinamide adenine dinucleotide)
  - Natrum oxalaceticum (sodium oxalacetate)
- Animal-derived ingredients:
  - Cartilago suis (porcine cartilage)
  - Embryo totalis suis (porcine embryo)
  - Funiculus umbilicalis suis (porcine umbilical cord)
  - Placenta suis (porcine placenta)
The subjects were appropriately randomized into treatment, placebo, and positive control groups, with Carprofen as the positive control. The subjects in each group all appear to be comparable at baseline. The placebo group did have higher baseline scores on 5/7 measures of pain, but the authors state that no statistically significant differences were found between groups at this point.
The placebo control was not ideal. The treatment product was visibly different from the Carprofen and the placebo (which were identical to each other). The owners were given extra Carprofen in its original packaging for rescue, so clearly they would be able to identify the treatment product as different. In addition, all subjects also received an inert capsule as part of a separate study, so while the Zeel group received “an ampoule of clear liquid” once daily and “a slightly green (lactose) capsule,” the Carprofen and placebo groups received the green capsule and “a white pill” twice daily. It is not clear what effect, if any, such a discrepancy might have had on the subjective assessments of owners, or of blinded investigators who might have deduced group assignment from comments made by owners.
Most of the assessment measures were subjective, such as owner rating scales or visual analog pain scores and investigator clinical exam assessments. Some force plate analysis was done, though this proved problematic. Two subjects had to have their force plate measurements discarded because they were too lame to allow accurate measurement. Both of these subjects, however, were in the placebo arm, so this would be expected to decrease the perceived efficacy of the treatment.
The force plate measurements were “repeated until sufficient valid results were obtained for both left and right limbs.” It was not stated whether the number of trials needed to achieve this differed between groups, which could have affected the results if some subjects had to run back and forth significantly more than others to get a valid reading, since the repetition might itself affect the reading.
The biggest methodological problem I see in the study is that, in order to calculate the percentage of subjects improved or not improved in each group at the 8-week assessment, “the results of each variable were converted into dichotomous responses of ‘improved’ or ‘not improved.'” Converting scale variables into dichotomous variables can exaggerate differences between groups. If a measurement was unchanged, the subject would be classified as “not improved,” but even a minuscule change from baseline would place the subject into either the “improved” or the “not improved” category. Thus, subjects with dramatic improvements in scale measurements would be weighted the same as subjects with marginal, and likely clinically insignificant, changes in the variable. Without the raw data, of course, it is impossible to tell what effect, if any, this procedure had on the final conclusions. However, the tabulation of the data presented in the article appears to show much greater improvement in terms of the percentage of subjects improved than in terms of the actual median improvement in the variables themselves, suggesting that such an exaggeration did in fact occur.
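To make the point concrete, here is a toy sketch of the dichotomization problem. The numbers are entirely invented for illustration; they are not the trial’s data:

```python
import statistics

# Hypothetical changes in a pain score from baseline (negative = improvement).
# These numbers are invented for illustration; they are NOT the trial's data.
treatment = [-0.1, -0.2, -0.1, -0.1, 0.5, 0.4]  # many tiny improvements
placebo = [-3.0, -2.5, 0.1, 0.2, 0.3, 0.2]      # two large improvements

def pct_improved(changes):
    """Dichotomize: any decrease at all, however small, counts as 'improved'."""
    return 100 * sum(1 for c in changes if c < 0) / len(changes)

# Dichotomized, the treatment group looks twice as good as placebo...
print(round(pct_improved(treatment), 1))  # → 66.7
print(round(pct_improved(placebo), 1))    # → 33.3

# ...yet by median change the treatment group barely improved, while the
# placebo group actually contains the largest individual responders.
print(statistics.median(treatment))          # → -0.1
print(round(statistics.median(placebo), 2))  # → 0.15
```

This is exactly why a table of “percent improved” can look far more impressive than the median changes in the underlying variables.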
The authors also stated that “for dogs that had used extra Carprofen more than three times per week at W8 [4 dogs in the placebo arm] we changed all their variable values at evaluation W8 into the most negative value measured at that time.…to counteract the effect of the NSAID…” This seems a clear fudging of the data, which made the placebo group’s outcome measures appear worse than they actually were. Certainly, it is possible that these dogs needed more Carprofen than the treatment group because the treatment was having a beneficial effect. But it is just as possible that the placebo group took more Carprofen because of differences in disease severity, in owner attitudes or behavior, or some other factor. And what makes the arbitrary threshold of three times a week an appropriate justification for altering the data in this way is unclear. In any case, the effect of this decision is to make the outcomes appear worse for the placebo group, which in turn makes the treatment group’s outcomes appear relatively better.
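A similar toy sketch, again with invented scores rather than the trial’s data, shows how replacing the rescue users’ values with the worst value measured drags the whole placebo group’s summary statistics downward:

```python
import statistics

# Invented week-8 pain scores for a placebo arm (higher = worse).
# Suppose, hypothetically, the first four dogs used rescue Carprofen >3x/week.
placebo_w8 = [3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0, 8.0]
rescue_users = {0, 1, 2, 3}  # indices of the hypothetical rescue users

# The paper's rule: set each rescue user's values to the most negative
# (i.e., worst) value measured in the group at that time point.
worst = max(placebo_w8)
adjusted = [worst if i in rescue_users else score
            for i, score in enumerate(placebo_w8)]

print(statistics.median(placebo_w8))  # → 4.75 before adjustment
print(statistics.median(adjusted))    # → 8.0 after adjustment
```

With half the group pinned to the worst observed value, the placebo arm’s median outcome jumps from 4.75 to 8.0 even though no dog’s actual condition changed.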
Patients given Carprofen clearly showed improvement over baseline at a rate significantly higher than placebo: 67–86% of subjects were categorized as “improved” for the various outcome measures, and the actual values for each measure improved 2–5 times more than in the Zeel group. In the treatment group, significantly more subjects were classed as “improved” compared with placebo in 3 of 6 measures. Again, this is likely inflated by the conversion of scale data to dichotomous data. The Zeel group also showed significantly greater improvement than the placebo group in 4 of 6 specific measures, though for one the P value was 0.049, quite close to the cutoff of 0.05.
The authors also state that use of supplemental or rescue Carprofen occurred in 14% of the Carprofen group, 28% of the Zeel group, and 8% of the placebo group. Though they claim that the only significant difference was between the Carprofen and placebo groups, this is puzzling, both because the Zeel group’s rate of rescue use was dramatically higher than that of the other groups, and because of the earlier statement about manipulating the data for the placebo group to “counteract” the effect of Carprofen use in that group.
No differences in bloodwork values or clinical side effects were seen between groups.
The authors also make the unsupported statement that “it is generally accepted that seasonal differences influence OA, with patients being worse in cold, damp and unstable weather.” A number of studies have found this traditionally assumed relationship to be difficult to verify and likely a minor and insignificant factor in arthritis pain for most patients (1, 2, 3), so it does not qualify as “generally accepted.” Nevertheless, the authors go on to claim that a trend observed of worsening symptoms for the placebo group during the treatment phase of the study and subsequent improvement during the post-treatment follow-up was due to the weather, and that the opposite trend seen in the Zeel and Carprofen groups was due to the effects of the treatment agents. It seems more likely that the placebo group simply differed in significant ways from the other groups, which casts further doubt on the conclusion that the test product was of meaningful benefit.
The authors conclude by putting the usual best possible spin on the weak results, suggesting that combined with in vitro results reported elsewhere they justify further research, and pointing out that NSAIDs, which even they acknowledge are clearly superior for treatment of pain, have side effects, despite the fact that none were seen in this study. As I’ve said before, the resource limitations on research in veterinary medicine require the most efficient use of those resources to maximize benefit, and such studies of implausible interventions are not going to benefit our patients. The authors clearly wish to find something positive in their results, but the study does not justify the commitment of more time, resources, and talent to this methodology when better therapies are already available, and when decades of research on homeopathic preparations have failed to validate them. Such papers provide the aura of scientific legitimacy to such methods, but they are tooth fairy science, not evidence-based medicine, and they are a dead end we would do well to stop travelling down.