Distinguishing bipolar and major depressive disorders by brain structural morphometry: a pilot study


Participants

A total of 16 patients diagnosed with BD (a mix of BD-I, BD-II, and BD-NOS;
6 males, mean age = 26.3 years, SD = 7.9 years) and 19 patients diagnosed with MDD (8 males, mean age = 30.0 years,
SD = 8.9 years) were recruited from local hospitals in Beijing, China (Institute of
Mental Health, The Third Affiliated Hospital of Beijing Chinese Medicine University,
and Beijing Hui-Long-Guan Hospital). Patients were diagnosed by experienced psychiatrists
(MQ, YTM, ZRW) according to DSM-IV criteria [32]. For comparison, a group of 29 healthy controls (HC; 11 males, mean age = 27.1 years,
SD = 8.4 years) was also recruited from communities in Beijing. The groups were well matched
demographically (handedness, age, gender, educational level and IQ; see Table 1). All participants in this study were right-handed, as measured by the Annett Handedness
Scale [33]. Their IQ scores were estimated using the short form of the Chinese version of the
Wechsler Adult Intelligence Scale-Revised (WAIS-R) [34]. The IQ estimates and handedness were assessed by trained researchers (QZ, ZL). Exclusion
criteria for both clinical groups were psychiatric comorbidities
[18]; a history of psychotic symptoms; a history of neurological disorder; a lifetime
history of substance abuse; an IQ estimate lower than 70; and being unable to
be scanned by MRI. Exclusion criteria for healthy controls were similar to those for the clinical
groups, with the additional requirement of no family history of neuropsychiatric or neurological
disorders. This study received ethical approval from the three local hospitals (Institute
of Mental Health; The Third Affiliated Hospital of Beijing Chinese Medicine University;
Beijing Hui-Long-Guan Hospital) and the Institute of Psychology, Chinese Academy of
Sciences. Written informed consent was obtained from each participant before the study.

Table 1. Demographic and clinical characteristics

Medication and clinical characteristics

Medication histories of the two clinical groups are summarized in Table 1. Antipsychotic doses were converted into chlorpromazine equivalents
(CPZ) [35, 36]. In the MDD group, eight of 19 patients (42.1 %) were medication-free at the time
of assessment. The remaining 11 patients were taking antidepressants, including Citalopram,
Mirtazapine, Venlafaxine, Sertraline, Paroxetine, Fluoxetine, and Duloxetine hydrochloride.
Of the 11 medicated patients, two were also taking antipsychotics (Flupentixol
dihydrochloride, CPZ = 10 mg; and Aripiprazole, CPZ = 67 mg) and three were taking
anti-anxiety medicines (including Clonazepam, Buspirone, and Lorazepam). In the BD
group, five of 16 patients (31.3 %) were medication-free at the time of assessment.
Among the 11 medicated patients, five of them were taking antidepressants (including
Venlafaxine, Sertraline, Fluoxetine, Fluvoxamine, Duloxetine hydrochloride); eight
of them were taking atypical antipsychotics (including Olanzapine, Quetiapine, Risperidone,
Aripiprazole; mean CPZ = 226.2 mg, SD = 327.4 mg); one of them was taking Trihexyphenidyl; five of them were taking mood
stabilizers (Lithium and Valproate); and two of them were taking anti-anxiety medicines
(Clonazepam and Lorazepam).

No group difference was found for duration of illness (p > 0.05) between patients with BD and MDD. In both clinical groups, major depressive
symptoms and manic symptoms were assessed by experienced psychiatrists (MQ, YTM, ZRW)
using the 17-item Hamilton Rating Scale for Depression (HAMD) [37] and the Young Mania Rating Scale (YMRS) [38]. No difference was found between patients with BD and MDD in the 17-item HAMD total
score (t = 0.77, p = 0.45). However, as expected, a lower YMRS total score (t = -2.35, p = 0.03) was observed in the MDD group than in the BD group.

MRI acquisitions and preprocessing

High-resolution T1-weighted images from all participants were acquired on a 3-Tesla
scanner (Siemens 3 T-Trio A Tim, Erlangen, Germany) at the MRI Center of Beijing 306
Hospital. Before image collection, a pre-scan lasting 1 min 10 s was taken
and inspected by clinical radiologists to exclude individuals with structural brain
abnormalities. The scanning parameters of the T1-weighted images were as follows:
slice thickness = 1 mm, TE = 3.01 ms, TR = 2300 ms, 176 slices in the sagittal plane,
field of view (FOV) = 256 mm, voxel size = 1 x 1 x 1 mm³, bandwidth = 240 Hz/pixel, duration = 6 min 56 s. All participants were asked to
close their eyes and remain motionless during data collection.

Cortical reconstruction and subcortical volumetric segmentation were performed using
the FreeSurfer imaging analysis suite (v5.1.0, http://surfer.nmr.mgh.harvard.edu/) [39]. Details of this pipeline are fully described on its webpage. Briefly, each participant’s T1-weighted
image was first registered to Talairach space,
with the skull stripped. Images were then segmented into white/grey matter (WM/GM)
tissues based on local and neighbouring intensities. The cortical surface of each
hemisphere was inflated to an average spherical surface to locate both the grey matter
(pial) surface and the WM/GM boundary. Preprocessed images were visually inspected
(by GF and YD) to verify the quality of the reconstruction and segmentation. Data with topological
defects were to be excluded from the subsequent analyses, but no data had to be excluded
at this point. At the cortical level, cortical thickness was measured as the shortest
distance between the pial surface and the GM/WM boundary at each point across the
cortical mantle. Surface area was measured as the area of a vertex on the pial surface,
calculated as the average of the tessellation areas touching that vertex. In addition,
the cerebral cortex of each participant was automatically parcellated into 70 regions
according to the Desikan-Killiany cortical atlas [40], and the mean cortical thickness and surface area of each region were calculated for the ROI analysis.
Before group-level statistical analyses, individual cortical surface maps were smoothed
with a Gaussian kernel of 25 mm full-width at half-maximum (FWHM), chosen to account for
the sample size. At the subcortical level, volumes of a series of subcortical structures
were extracted using the automated segmentation function in FreeSurfer [41] for the ROI analysis. Total intracranial volume (ICV) of each participant was also
extracted.
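As a small worked illustration of the smoothing kernel size quoted above, the sketch below converts a FWHM into the standard deviation of the corresponding Gaussian and evaluates the Gaussian weight at a given distance. It only illustrates the definition of FWHM; the function names are hypothetical, and it makes no claim about how FreeSurfer implements surface smoothing internally.

```python
import math


def fwhm_to_sigma(fwhm_mm):
    """Standard deviation (mm) of a Gaussian with the given FWHM (mm)."""
    return fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def gaussian_weight(distance_mm, fwhm_mm):
    """Unnormalised Gaussian weight at a given distance from the kernel centre."""
    sigma = fwhm_to_sigma(fwhm_mm)
    return math.exp(-(distance_mm ** 2) / (2.0 * sigma ** 2))


# The 25 mm FWHM kernel corresponds to a sigma of roughly 10.6 mm.
sigma_25 = fwhm_to_sigma(25.0)
# By definition, the weight falls to one half at a distance of FWHM / 2.
half_weight = gaussian_weight(12.5, 25.0)
```

With a kernel this wide, a vertex 12.5 mm from the centre still contributes half the weight of the centre vertex, which is why large kernels trade spatial specificity for sensitivity in small samples.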

Vertex-wise group comparison on cortical measures

Whole-brain analyses of cortical thickness and surface area were performed for each
pair of groups (viz., BD vs MDD, HC vs BD, HC vs MDD) using
general linear models (GLM) in FreeSurfer’s QDEC (Query, Design, Estimate, Contrast)
tool, with ICV, age and IQ estimate as covariates. Because the analysis was exploratory,
the significance threshold was set at p < 0.01 uncorrected (two-tailed). To minimize Type I error, only clusters containing
more than 200 significant vertices were reported [42]. Significant clusters were mapped to the Desikan-Killiany cortical atlas [40] based on gyral and sulcal structure. Group difference maps were constructed
in QDEC based on -log10(p-value).
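The thresholding rule described above can be sketched as follows. This is a minimal illustration, not the QDEC implementation: it assumes per-vertex p-values and a mesh adjacency list are already available, and the function names, the `neighbours` structure, and the toy data are all hypothetical.

```python
import math
from collections import deque


def neg_log10_p(p_values):
    """Convert p-values to the -log10(p) scale used for difference maps."""
    return [-math.log10(p) for p in p_values]


def significant_clusters(p_values, neighbours, alpha=0.01, min_vertices=200):
    """Return connected sets of vertices with p < alpha whose size exceeds min_vertices."""
    significant = {v for v, p in enumerate(p_values) if p < alpha}
    seen, clusters = set(), []
    for start in sorted(significant):
        if start in seen:
            continue
        # Breadth-first search over significant neighbouring vertices.
        cluster, queue = {start}, deque([start])
        seen.add(start)
        while queue:
            vertex = queue.popleft()
            for other in neighbours[vertex]:
                if other in significant and other not in seen:
                    seen.add(other)
                    cluster.add(other)
                    queue.append(other)
        if len(cluster) > min_vertices:
            clusters.append(cluster)
    return clusters


# Toy example: six vertices in a chain, with a size threshold of 2 instead of 200.
toy_neighbours = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
toy_p = [0.001, 0.005, 0.5, 0.002, 0.003, 0.004]
toy_clusters = significant_clusters(toy_p, toy_neighbours, alpha=0.01, min_vertices=2)
```

In the toy example, vertices 0 and 1 form a significant cluster of only two vertices and are discarded, whereas vertices 3 to 5 form a cluster of three and are kept.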

Region-of-Interest (ROI) analyses

ROI analyses were performed between the BD group and the MDD group. At the cortical
level, significant clusters from the BD-MDD whole-brain comparison were defined
as ROIs. At the subcortical level, bilateral thalamus, caudate nucleus, putamen,
hippocampus, amygdala and nucleus accumbens were defined as ROIs. To control for individual
variation in ICV, age and IQ estimate, a ratio was calculated for each ROI [ratio = ROI
mean value/(ICV * age * IQ estimate)]. Two-sample t-tests were performed on the ROI ratios, with false discovery rate (FDR) correction
for multiple comparisons.
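The ROI normalisation and the multiple-comparison correction can be sketched as below. The ratio follows the formula given in the text. The FDR step assumes the Benjamini-Hochberg procedure, which the text does not name explicitly; the function names and example numbers are hypothetical.

```python
def roi_ratio(roi_value, icv, age, iq):
    """ROI ratio as defined in the text: ROI mean value / (ICV * age * IQ estimate)."""
    return roi_value / (icv * age * iq)


def fdr_bh(p_values):
    """Benjamini-Hochberg adjusted p-values (assumed FDR procedure)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical values for one participant and four ROI-level tests.
example_ratio = roi_ratio(8000.0, 1500000.0, 25.0, 100.0)
example_adjusted = fdr_bh([0.01, 0.04, 0.03, 0.005])
```

An ROI would then be reported as differing between groups when its adjusted p-value falls below the chosen FDR level.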

SVM classification

Using the ROI morphological features as input data, an exploratory SVM was applied
to classify patients with BD and patients with MDD. SVM is a supervised multivariate
classification algorithm based on pattern recognition. In brief, it separates the
input data into different classes (i.e., patients with BD and MDD) by identifying
an optimal separating hyperplane (OSH), also called the decision boundary. Initially, the algorithm
is trained on a subset of training data to find a hyperplane that best separates the
input data space according to their known class labels (i.e., patients’ group memberships,
-1 = MDD and 1 = BD). Once the hyperplane has been defined by a set of support vectors, a
subset of test data is classified according to the side of the hyperplane on which
each predicted value falls. Given that the input data of the present
study were already dimensionally reduced ROI features, a non-linear algorithm with a radial basis function kernel
was chosen. The parameters C and gamma, which control, respectively, the trade-off
between allowed training errors and misclassifications and the width of the radial
basis function, were tuned using a 10-fold cross-validation approach. The optimized
parameters that provided the best accuracy were selected for the final model.
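To make the role of the support vectors and of gamma concrete, the sketch below evaluates the standard kernel SVM decision function with a radial basis function kernel. It does not train a model: the support vectors, their signed coefficients, and the bias are hypothetical placeholders standing in for what a fitted classifier (for example from the e1071 package) would supply.

```python
import math


def rbf_kernel(x, y, gamma):
    """Radial basis function kernel: exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def decision_value(x, support_vectors, signed_coefs, bias, gamma):
    """Kernel SVM decision function: sum_i coef_i * K(sv_i, x) + bias."""
    return sum(
        coef * rbf_kernel(sv, x, gamma)
        for sv, coef in zip(support_vectors, signed_coefs)
    ) + bias


def predict_label(x, support_vectors, signed_coefs, bias, gamma):
    """Return 1 (BD) or -1 (MDD) according to the side of the decision boundary."""
    value = decision_value(x, support_vectors, signed_coefs, bias, gamma)
    return 1 if value >= 0 else -1


# Hypothetical model with one support vector per class in a 2-D feature space.
toy_svs = [(0.0, 0.0), (2.0, 2.0)]
toy_coefs = [-1.0, 1.0]  # negative coefficient for MDD, positive for BD
label_near_bd = predict_label((1.9, 2.1), toy_svs, toy_coefs, bias=0.0, gamma=0.5)
label_near_mdd = predict_label((0.1, 0.0), toy_svs, toy_coefs, bias=0.0, gamma=0.5)
```

A larger gamma makes each support vector's influence more local, while C (not shown here, since it acts during training) limits how strongly individual training errors are penalised.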

In the present study, the classifier’s performance was evaluated using the common leave-one-out-each-group
cross-validation approach. This validation procedure provides robust parameter estimates,
particularly for smaller samples [22]. In each trial, one patient per group was left out of the data used to train
the classifier and was then used to determine the detection rate of the trained classifier
(testing). The procedure was repeated until every participant had been used for testing
a classifier. The overall accuracy of the classifier was the average detection rate.
The sensitivity and specificity of the classifier were also quantified. Specifically,
sensitivity was calculated as the number of correctly classified BD patients divided by the sum of
correctly classified BD patients and BD patients misclassified as MDD. Specificity was calculated as the number
of correctly classified MDD patients divided by the sum of correctly classified MDD patients and MDD patients misclassified as
BD. To evaluate the probability of obtaining the overall accuracy by chance, statistical
significance was assessed by means of permutation tests [24]. We randomly assigned a class label to each patient and repeated the same cross-validation
procedure 1000 times. We then counted the number of times that the detection
rate from a permutation test was higher than or equal to the actual value obtained
from the real test. A p-value for the classification was derived by dividing this number by 1000. The classifications
were performed using R version 2.15.3 (R Development Core Team 2013. The R Foundation
for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org), with the packages “bootstrap”, “class”, and “e1071” [43].
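The validation scheme described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the study's R code: a nearest-mean rule on a single feature stands in for the SVM, the smaller group is cycled when group sizes differ (the text does not say how unequal groups were paired), random label assignment is implemented as a shuffle that preserves group sizes, and all names and data are hypothetical.

```python
import random

BD, MDD = 1, -1  # class coding from the text: 1 = BD, -1 = MDD


def leave_one_per_group_folds(labels):
    """Yield (train, test) index lists with one BD and one MDD patient held out."""
    bd = [i for i, y in enumerate(labels) if y == BD]
    mdd = [i for i, y in enumerate(labels) if y == MDD]
    # Assumption: the smaller group is cycled so everyone is tested at least once.
    for k in range(max(len(bd), len(mdd))):
        test = [bd[k % len(bd)], mdd[k % len(mdd)]]
        train = [i for i in range(len(labels)) if i not in test]
        yield train, test


def nearest_mean_predict(train_x, train_y, x):
    """Hypothetical stand-in for the SVM: pick the class whose mean is closer."""
    means = {}
    for label in (BD, MDD):
        values = [v for v, y in zip(train_x, train_y) if y == label]
        means[label] = sum(values) / len(values)
    return min(means, key=lambda label: abs(x - means[label]))


def cross_validate(features, labels):
    """Return (accuracy, sensitivity, specificity) over all held-out predictions."""
    tp = fn = tn = fp = 0
    for train, test in leave_one_per_group_folds(labels):
        train_x = [features[i] for i in train]
        train_y = [labels[i] for i in train]
        for i in test:
            predicted = nearest_mean_predict(train_x, train_y, features[i])
            if labels[i] == BD:
                tp += predicted == BD
                fn += predicted != BD
            else:
                tn += predicted == MDD
                fp += predicted != MDD
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return accuracy, tp / (tp + fn), tn / (tn + fp)


def permutation_p_value(features, labels, n_permutations=1000, seed=0):
    """Share of label permutations whose accuracy is >= the observed accuracy."""
    rng = random.Random(seed)
    observed = cross_validate(features, labels)[0]
    shuffled = list(labels)
    count = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if cross_validate(features, shuffled)[0] >= observed:
            count += 1
    return count / n_permutations


# Toy data: one well-separated feature, four BD and five MDD patients.
toy_features = [2.1, 1.9, 2.3, 2.0, 0.1, -0.2, 0.0, 0.3, 0.2]
toy_labels = [BD] * 4 + [MDD] * 5
toy_accuracy, toy_sensitivity, toy_specificity = cross_validate(toy_features, toy_labels)
toy_p = permutation_p_value(toy_features, toy_labels)
```

Because the toy feature separates the groups perfectly, accuracy, sensitivity, and specificity are all 1.0, and only permutations that reproduce the true labelling can match that accuracy, so the permutation p-value is small.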