II. Lexapro Study MD-32 Was a “Positive” Clinical Trial for Adolescents, but Did Not Show a Meaningful Difference between Lexapro and Placebo
After Forest obtained the negative results of MD-15 in 2004 (in addition to Study 94404 and MD-18), Forest was concerned about conducting further pediatric trials. Forest therefore obtained an agreement from the FDA, before the protocol for MD-32 was even written, that if Forest could obtain a positive result for adolescents in another Lexapro trial, the FDA would approve an adolescent indication. This agreement, however, was specifically contingent on MD-18 being considered a positive study. As one Forest executive put it, “everything hinges on SCT-32.”
To that end, Forest began MD-32 in April 2005 and completed it in May 2007. The trial evaluated 316 adolescents between the ages of 12 and 17, only 259 of whom completed the trial. In stark contrast to every other pediatric clinical trial of Celexa and Lexapro, MD-32 achieved statistical significance on both primary and secondary endpoints, although the observed cases analysis on the primary endpoint, i.e., the analysis limited to patients who completed the study, was negative.
MD-32 has several problems. First, the study was designed to reach statistical significance even if the difference between Lexapro and placebo was clinically insignificant. It is widely acknowledged that “[s]tatistically significant effects are not necessarily clinically meaningful effects.” The distinction exists because statistical significance indicates only that an observed difference is unlikely to be due to chance, whereas clinical significance focuses on real-world effects. Clinical significance is defined as “the smallest difference (i.e., effect size) . . . that patients perceive as beneficial and that would mandate, in the absence of troublesome side effects and cost, a change in the patient’s management.” Thus, it is entirely possible to achieve a statistically significant result, even if the difference is trivial, by overpowering a study, i.e., increasing the sample size. As Dr. Laughren of the FDA acknowledged:
- And we discussed earlier that when you increase the sample size in a clinical trial, what would otherwise be statistically insignificant differences between the placebo arm and the drug arm can suddenly reach a statistically significant P-value, correct?
- There’s no question that . . . an increase in the sample size can in some settings — it doesn’t always, but it can reduce variance, and therefore, you know, increase the chance of getting a statistically significant P-value.
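The point Dr. Laughren concedes can be illustrated with a short sketch. The numbers below are hypothetical (a fixed 2-point difference and a 12-point standard deviation, chosen only for illustration) and are not taken from MD-32; the calculation uses a simple normal approximation rather than the analysis Forest actually ran. It shows how the same small difference moves from "not significant" to "significant" as the number of patients per arm grows.

```python
import math

def two_sided_p(diff, sd, n_per_arm):
    """Two-sided p-value for a difference in means between two equal-sized
    arms, using a normal (z-test) approximation."""
    # The standard error of the difference shrinks as the sample size grows.
    se = sd * math.sqrt(2.0 / n_per_arm)
    z = abs(diff) / se
    # erfc(z / sqrt(2)) equals 2 * (1 - Phi(z)), the two-sided tail area.
    return math.erfc(z / math.sqrt(2.0))

# Hypothetical inputs, NOT MD-32 data: same difference, same spread,
# only the number of patients per arm changes.
HYPOTHETICAL_DIFF = 2.0
HYPOTHETICAL_SD = 12.0

for n in (25, 50, 100, 200, 400):
    p = two_sided_p(HYPOTHETICAL_DIFF, HYPOTHETICAL_SD, n)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"n per arm = {n:3d}  p = {p:.4f}  ({verdict} at 0.05)")
```

Under these assumed inputs the difference itself never changes; only the sample size does, and that alone is enough to carry the p-value below the conventional 0.05 threshold.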
Efficacy in each of the pediatric clinical trials, including MD-32, was measured by comparing the level of depression, as established using a rating scale, at the beginning of the trial with the level of depression at the end of the trial. Then, Forest compared the results in the drug arm with the results in the placebo arm to see if there was any benefit from the drug beyond a placebo effect. In MD-32, both the Lexapro and placebo patients improved, but the difference in improvement between the Lexapro and placebo groups was only 3.4 points on a scale that goes up to 113 points. This means patients taking Lexapro improved 3.4 points more on the depression rating scale than patients taking a sugar pill. This difference of 3.4 points, while statistically significant, is so small that no patient or doctor would be able to tell the difference in real life.
There are also questions about whether MD-32 was properly conducted. When the patients were randomized into the study, the Lexapro group started with a baseline score that was 1.6 points higher than the placebo group’s, a statistically significant difference. This indicates there was selection bias, i.e., that patients were not truly randomized between the Lexapro and placebo groups. On average, patients in the Lexapro group were 1.6 points worse than the placebo patients, which means they had more “room” for improvement.
Forest claims that a difference of 1.6 at baseline is not clinically significant, so it does not affect the study. If that is so, however, then a difference of 3.4 at the end of the study is also not clinically significant. One physician who peer reviewed the study manuscript commented: “It was not clear why the authors consider the baseline difference in the CDRS-R (~2 points) between the two treatment groups as not clinically significant even though it was statistically significant. This is confusing as the authors’ then note that a CDRS-R treatment difference between the groups of ~2pts, which is statistically significant, shows efficacy.”
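To put the two figures side by side, the following sketch uses only numbers stated above (the 1.6-point baseline gap, the 3.4-point endpoint difference, and the 113-point top of the scale). The helper name `share` is hypothetical, and dividing by the top of the scale rather than by its full range is a simplifying assumption.

```python
# Figures taken from the text above.
SCALE_MAX = 113.0            # top of the depression rating scale
BASELINE_GAP = 1.6           # Lexapro group's higher starting score
ENDPOINT_DIFFERENCE = 3.4    # extra improvement on Lexapro vs. placebo

def share(part, whole):
    """Return part as a fraction of whole (hypothetical helper)."""
    return part / whole

print(f"Endpoint difference as share of scale maximum: "
      f"{share(ENDPOINT_DIFFERENCE, SCALE_MAX):.1%}")
print(f"Baseline gap as share of scale maximum: "
      f"{share(BASELINE_GAP, SCALE_MAX):.1%}")
print(f"Baseline gap as share of endpoint difference: "
      f"{share(BASELINE_GAP, ENDPOINT_DIFFERENCE):.1%}")
```

On these figures the endpoint difference is roughly 3% of the scale maximum, and the baseline gap that Forest dismisses is nearly half the size of the difference it relies on to show efficacy.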
Finally, even if one disregards the methodological problems with MD-32, the results hardly provide substantial evidence of efficacy. There are two primary ways to quantify clinical significance. The first is called the Cohen effect size. “While Cohen defined large, medium, and small effects as d=0.8, 0.5, and 0.2, respectively, an FDA rule of thumb is that an effect is deemed large if it is >0.8, small if it is <0.5, and moderate if it falls between those values.” The second is known as the number needed to treat (“NNT”). The NNT reflects the number of people who need to be treated with the drug before one additional person improves beyond what would have occurred on placebo. “[T]he NNT is a meaningful, well-accepted, common-sense measure[.]” Id. On the NNT scale, if the number is less than 2, then the drug is considered highly effective. Id. If the NNT is greater than 4, then the drug is considered less effective, since one would need greater numbers of patients taking the drug before a person fared better than placebo.
In MD-32, the effect size was 0.27 and the NNT was 8.75. These are, by all objective measures, appalling numbers. This prompted one researcher reviewing Study MD-32 “to wonder whether the restrictive entry criteria in conjunction with the small effect size limit the utility of escitalopram in the real world of adolescent MDD. Are these results statistically significant but clinically not meaningful?” And another stated “this is a relatively small ES [effect size]. Given this small ES, there were no data to see if this level of change had any quality of life meaning.” Considering that this is the only statistically positive study for Celexa or Lexapro, was obtained under questionable circumstances, and was limited to adolescents, the results are “small” and unreliable. Standing alone, MD-32 provides scant, if any, evidence of true efficacy.
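The two MD-32 figures can be checked against the thresholds quoted above with a minimal sketch. The function names are hypothetical. Treating Cohen's benchmarks as the lower edges of bins is an assumption made for illustration, and the conversion from NNT to a difference in response rates relies on the standard definition of the NNT as the reciprocal of that difference.

```python
def cohen_label(d):
    """Label an effect size using Cohen's benchmarks as quoted in the text
    (0.2 small, 0.5 medium, 0.8 large). Treating each benchmark as the
    lower edge of a bin is an illustrative assumption."""
    d = abs(d)
    if d >= 0.8:
        return "large"
    if d >= 0.5:
        return "medium"
    if d >= 0.2:
        return "small"
    return "below small"

def fda_rule_label(d):
    """Label an effect size using the FDA rule of thumb quoted in the text:
    large if > 0.8, small if < 0.5, moderate in between."""
    d = abs(d)
    if d > 0.8:
        return "large"
    if d < 0.5:
        return "small"
    return "moderate"

def nnt_to_rate_difference(nnt):
    """Convert an NNT into the implied absolute difference in response
    rates between drug and placebo (NNT = 1 / difference)."""
    return 1.0 / nnt

# Figures reported for MD-32 in the text.
EFFECT_SIZE = 0.27
NNT = 8.75

print("Cohen benchmark:", cohen_label(EFFECT_SIZE))
print("FDA rule of thumb:", fda_rule_label(EFFECT_SIZE))
print(f"Implied difference in response rates: {nnt_to_rate_difference(NNT):.1%}")
```

Both scales place 0.27 in the "small" category, and an NNT of 8.75 implies that out of every 100 adolescents treated, only about 11 would improve who would not have improved on placebo.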