A visitor's comment from one of my previous posts reminded me of an issue I'd thought about before.
In mental health research, symptom scales are often used to measure therapeutic improvement. In depression, the most common scales are the Hamilton Depression Rating Scale (HDRS), the Montgomery-Åsberg Depression Rating Scale (MADRS), or sometimes the Beck Depression Inventory (BDI). The first two examples involve an interviewer assigning a score to a variety of different symptoms or signs. The last example is a scale which is filled out by the patient.
Here are examples of questions from the HDRS, with associated ranges of scoring:
depressed mood (0-4); decreased work & activities (0-4); social withdrawal (0-4); sexual symptoms (0-2); GI symptoms (0-2); weight loss (0-2); weight gain (0-2); appetite increase (0-3); increased eating (0-3); carbohydrate craving (0-3); insomnia (0-6); hypersomnia (0-4); general somatic symptoms (0-2); fatigue (0-4); guilt (0-4); suicidal thoughts/behaviours (0-4); psychological manifestations of anxiety (0-4); somatic manifestations of anxiety (0-4); hypochondriasis (0-4); insight (0-2); motor slowing (0-4); agitation (0-4); diurnal variation (0-2); reverse diurnal variation (0-3); depersonalization (0-4); paranoia (0-3); OCD symptoms (0-2)
One can see from this list that depressive syndromes which have many physical manifestations will obviously score much higher. The highest possible score on the 29-item HDRS is 89. It is likely that physical manifestations of acute depression resolve more quickly, particularly in response to medications. Therefore, the finding that more severe depressions have better response to medication could be simply an artifact of the fact that physical symptoms respond better and more quickly to physical treatments.
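As a quick check of the arithmetic, the upper limits in the list above can simply be added up. This is only a sketch: the name `HDRS_MAX` is my own label, and the values are copied from the ranges listed above. Note that the list has 27 entries rather than 29, so I am assuming that at least one entry (most plausibly insomnia, with its 0-6 range) combines more than one item.

```python
# Upper limit of each symptom range, copied from the list above.
# HDRS_MAX is an illustrative name, not part of any official scoring tool.
HDRS_MAX = {
    "depressed mood": 4,
    "decreased work & activities": 4,
    "social withdrawal": 4,
    "sexual symptoms": 2,
    "GI symptoms": 2,
    "weight loss": 2,
    "weight gain": 2,
    "appetite increase": 3,
    "increased eating": 3,
    "carbohydrate craving": 3,
    "insomnia": 6,
    "hypersomnia": 4,
    "general somatic symptoms": 2,
    "fatigue": 4,
    "guilt": 4,
    "suicidal thoughts/behaviours": 4,
    "psychological manifestations of anxiety": 4,
    "somatic manifestations of anxiety": 4,
    "hypochondriasis": 4,
    "insight": 2,
    "motor slowing": 4,
    "agitation": 4,
    "diurnal variation": 2,
    "reverse diurnal variation": 3,
    "depersonalization": 4,
    "paranoia": 3,
    "OCD symptoms": 2,
}


def max_total(item_max):
    """Highest possible total score: the sum of every item's upper limit."""
    return sum(item_max.values())


print(max_total(HDRS_MAX))  # prints 89
```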
A person who is eating and sleeping poorly, is tired, feels and looks physically ill, who is not working, who is not seeing friends as much, and whose symptoms fluctuate in the day, would already get an HDRS score of up to 30 -- without actually feeling depressed or anxious at all! A person feeling very depressed, struggling through life with little pleasure, meaning, satisfaction, or joy -- but sleeping ok, eating ok, and forcing themselves through daily routines such as work, social relationships, etc. -- might only get a score of 4-6 on this scale.
I acknowledge that the many questions on the HDRS cover a variety of important symptom areas, and improvement in any one of these domains can be very significant.
But -- a big problem of the scale, for me, is that the relative significance of the different symptoms is arbitrarily fixed by the structure of the questionnaire. So, for example, are the 4 points for fatigue of equivalent importance to the 4 points for guilt, or social withdrawal, or depressed mood? Would different individuals rate the relative importance of these symptoms differently? Maybe some people might prefer to sleep better, rather than socialize with greater ease. Also, perhaps some of the symptom questions deserve to be "non-linear," or context-dependent. So, for example, perhaps mild or intermittent depressed mood might deserve a score of only "1". Moderately depressed mood might warrant a score of "5". Severe depressive mood might warrant a score of "20". Or, relentless moderate symptoms over a period of years might warrant a score of "20", while only short-term or episodic moderate symptoms might warrant a score of "5".
It would be interesting to change the weighting of these symptom scores, on an individualized basis.
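To make the idea concrete, here is a rough sketch of what individualized and non-linear scoring might look like. Everything in it is hypothetical: the function names, the example person's scores, and the weights are invented for illustration. Only the 1 / 5 / 20 points for mild, moderate, and severe depressed mood come from the suggestion above.

```python
# Non-linear points for depressed mood, using the illustrative numbers
# suggested above (mild = 1, moderate = 5, severe = 20).
MOOD_POINTS = {"none": 0, "mild": 1, "moderate": 5, "severe": 20}


def mood_points(level):
    """Map a severity label to non-linear points (hypothetical scheme)."""
    return MOOD_POINTS[level]


def weighted_total(scores, weights):
    """Sum item scores, multiplying each by a person-specific weight.

    Items without an explicit weight count at face value (weight 1.0).
    """
    return sum(score * weights.get(item, 1.0) for item, score in scores.items())


# Invented item scores for one person.
scores = {"fatigue": 4, "guilt": 1, "social withdrawal": 2, "insomnia": 3}

# Invented preferences: this person cares most about sleeping better,
# and less about socializing with greater ease.
weights = {"insomnia": 2.0, "social withdrawal": 0.5}

unweighted = weighted_total(scores, {})    # every item at face value
weighted = weighted_total(scores, weights)

print(unweighted, weighted)
```

With these made-up numbers, the same set of symptoms yields a different total once the person's own priorities are taken into account, which is the whole point: the standard scale fixes every weight at 1 for everyone.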
Also, it would be interesting to see the results of depression treatment studies portrayed with all the separate symptom categories broken down (i.e. to see how the treatment changed each item on the HDRS). Many researchers or statisticians would complain that to portray, or draw conclusions about, so many results at once would weaken the statistical significance of each one. Statistically, a so-called "Bonferroni correction" is commonly applied when multiple hypotheses are tested simultaneously: if n hypotheses are tested, the significance threshold for each individual test is divided by n (so with an overall threshold of 0.05 and 29 items, each item would need p < 0.05/29, or about 0.0017). Based on this statistical idea, most researchers prefer to analyze just a single quantity, such as the HDRS score, instead of looking at each component of the score separately.
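The Bonferroni idea is simple enough to sketch in a few lines. The function names and the p-values below are invented purely for illustration; they are not from any study.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold when n_tests hypotheses are tested."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def significant_after_correction(p_values, alpha=0.05):
    """Flag which results remain significant after Bonferroni correction."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return {name: p <= threshold for name, p in p_values.items()}


# Made-up p-values for three HDRS items.
p_values = {"insomnia": 0.001, "depressed mood": 0.03, "fatigue": 0.2}

result = significant_after_correction(p_values)
print(result)
```

In this invented example, the depressed mood result (p = 0.03) would pass an uncorrected 0.05 threshold, but fails once the threshold is divided among three tests. With all 29 items the bar is stricter still, which is why item-by-item reporting is statistically costly.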
But, this analysis dilutes the data from any study, in the same way that the analysis of artworks in a museum would be diluted if each piece were summarized only by its mass or area.
A more complete analysis would portray every category at once. A graphical presentation would be reasonable, perhaps taking the form of a 3-d surface (once again). The x-axis could represent the different symptom areas (or scores on each item on the HDRS); the y-axis could represent time; and the z-axis could represent the severity. With this analysis, we could say that we are not actually making n hypotheses--we are making a single hypothesis, that the multifactorial pattern of symptom results, manifest as a 3-d surface, is changing over time. Each individual patient's symptom changes, in every symptom category, could be represented on the graph. In this way, no data, or analytic possibility, would be lost or diluted. The reader would be able to inspect every part of the data from the study, and perhaps notice interesting relationships which the original researchers had not considered.
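As a rough sketch of how the data behind such a surface might be organized: the function below arranges item scores into a grid with time on one axis and symptom items on the other, with severity as the height. The record layout, the function name, and the two patients' scores are all hypothetical. For simplicity it averages over patients, though a separate grid for each patient could be built the same way.

```python
def mean_severity_grid(records, items, times):
    """Return grid[t][i]: mean score on items[i] at times[t], over patients.

    A cell is None if no patient has a score for that item at that time.
    """
    grid = []
    for t in times:
        row = []
        for item in items:
            values = [
                r["score"]
                for r in records
                if r["time"] == t and r["item"] == item
            ]
            row.append(sum(values) / len(values) if values else None)
        grid.append(row)
    return grid


# Invented scores for two patients at week 0 and week 4.
records = [
    {"patient": "A", "item": "insomnia", "time": 0, "score": 4},
    {"patient": "A", "item": "insomnia", "time": 4, "score": 1},
    {"patient": "A", "item": "depressed mood", "time": 0, "score": 3},
    {"patient": "A", "item": "depressed mood", "time": 4, "score": 3},
    {"patient": "B", "item": "insomnia", "time": 0, "score": 6},
    {"patient": "B", "item": "insomnia", "time": 4, "score": 3},
    {"patient": "B", "item": "depressed mood", "time": 0, "score": 4},
    {"patient": "B", "item": "depressed mood", "time": 4, "score": 2},
]

items = ["insomnia", "depressed mood"]
times = [0, 4]

grid = mean_severity_grid(records, items, times)
for t, row in zip(times, grid):
    print(t, row)
```

A grid like this could then be handed to any 3-d plotting routine (for instance, matplotlib's `plot_surface`, after converting to arrays) to draw the surface itself.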
Some patterns of change with different treatments could present in the following ways, as shown in such a 3-d surface:
1) some symptoms improve dramatically with time, while others are much slower to change, or don't change at all. In depression treatment studies, sleep or appetite might change very quickly with a potent antihistaminic drug...this would immediately lead to pronounced improvement on the overall HDRS score, but might not be associated with any significant improvement in mood, energy, concentration, etc.
2) some symptoms might improve immediately, but deteriorate right back to baseline or worse after a few weeks or months. Benzodiazepine treatment would produce such a pattern, in terms of sleep or anxiety improvement. A medication which is sedating but addictive might cause rapid HDRS improvement, but only a careful look at individual category changes over a long period of time would allow us to see the addiction/tolerance pattern. Some people drink alcohol to treat their anxiety symptoms -- such a behaviour might rapidly improve their HDRS scores! But of course, the scores would return to worse than baseline within a few weeks or months. And the person would probably have new symptoms and problems on top of their original ones. So, we must be cautious about getting too excited about claims of rapid HDRS change!
3) some treatments might cause a global change in most or all symptoms...this would be the goal of most treatment strategies. Such a pattern would imply that the multi-symptom syndrome (in this case, the "major depressive disorder" construct) is in fact valid, with all of its components improving together in response to a single treatment.
4) some combined treatments might work well together...for example, a treatment which helps substantially with energy or concentration (such as a stimulant), together with a treatment which helps with mood, socialization, optimism, or anxiety (such as psychotherapy, or an antidepressant). These treatments on their own might appear to be equivalent if only the total HDRS score is considered (since each would reduce symptom points overall); the synergistic effect would only be apparent by looking at each symptom domain separately.
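The last pattern in the list above is easy to illustrate with a toy example. In the sketch below, two hypothetical treatments produce exactly the same change in total score, yet they act on entirely different items. All of the scores are invented, and the function names are my own.

```python
def total_change(before, after):
    """Change in the summed score (negative means improvement)."""
    return sum(after[item] - before[item] for item in before)


def item_changes(before, after):
    """Change on each item separately (negative means improvement)."""
    return {item: after[item] - before[item] for item in before}


# Invented baseline scores on four HDRS items.
baseline = {
    "fatigue": 4,
    "motor slowing": 3,
    "depressed mood": 4,
    "psychological manifestations of anxiety": 3,
}

# Hypothetical outcome of a treatment that mainly helps energy.
after_energy_treatment = {
    "fatigue": 1,
    "motor slowing": 1,
    "depressed mood": 4,
    "psychological manifestations of anxiety": 3,
}

# Hypothetical outcome of a treatment that mainly helps mood and anxiety.
after_mood_treatment = {
    "fatigue": 4,
    "motor slowing": 3,
    "depressed mood": 2,
    "psychological manifestations of anxiety": 0,
}

print(total_change(baseline, after_energy_treatment))
print(total_change(baseline, after_mood_treatment))
print(item_changes(baseline, after_energy_treatment))
print(item_changes(baseline, after_mood_treatment))
```

Judged by total score alone, the two treatments look interchangeable. Only the item-by-item breakdown shows that they are complementary, and so might be worth combining.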
Finally, I think it is important to look at very broad, simple indicators of quality of life, or of general improvement. The "CGI" scale is one example, although it is awkward and imprecise in design, and most likely prone to bias.
Quality of life scales are important as well, in my opinion, since they look at overall satisfaction with life, rather than merely a collection of symptoms.
In practice, only a discussion with the person receiving the treatment can really assess whether it is worthwhile to continue the treatment or not. In such a discussion, the subjective pros and cons of the treatment can be weighed. Even if the treatment has had a minimal impact on a rating score, it might be subjectively beneficial to the person receiving it. And even if the treatment has produced large rating score changes, it might not be the person's preference to continue. I suppose the role of a prescriber is mainly to facilitate such a dialog, and contradict the patient's wishes only if the treatment is objectively causing harm.