Community-based mental health treatments for survivors of torture and militant attacks in Southern Iraq: a randomized control trial


Ethical statement

Institutional review boards at the Johns Hopkins Bloomberg School of Public Health,
and the Ministry of Health in Iraq’s Psychiatric Research Ethics Committee approved
the protocol. Study participants provided oral informed consent, and none received
compensation.

Trial design

This was a parallel, two-site, two-arm (1:1 allocation), single-blinded, wait-list
randomized controlled trial (RCT): interviewers at baseline and follow-up
did not know to which study arm the interviewees belonged. The RCT compared a transdiagnostic
counseling intervention, the Common Elements Treatment Approach (CETA), in one site and
Cognitive Processing Therapy (CPT) in a second site, with separate wait-list controls
(WLCs) in both sites. CETA was provided by 12 non-specialized health workers (eight
males, four females), called community mental health workers (CMHWs), working in and
around the cities of Karbala, Najaf and Hilla, south of Baghdad. CPT was provided
by 17 CMHWs working in and around the cities of Basra and Nassariyah in the far south
of Iraq. CMHWs are non-mental health professionals who are trained to provide mental
health services locally (i.e., the mental health equivalent of community health workers
or CHWs). In this trial, the CMHWs were medics or nurses who worked in rural Ministry
of Health primary health care centers. The CMHWs had been trained in non-specific
counseling methods some years earlier by our partner international non-governmental
organization (Heartland Alliance International) and continued to provide these services
part-time.

Changes to original trial design

The trial was implemented as parallel two-arm studies as planned: two intervention
arms (CPT [26] and CETA [16]) were each compared separately to a WLC arm. The two parallel studies were carried
out in separate areas of Southern Iraq. The original plan for analysis was to lower
the sample size requirement in each area by combining the WLC participants to comprise
the comparison group for both of the intervention arms, based on reports by our partners
in Iraq that populations in the two areas were similar. Over the period of the trial,
it became clear that the two areas were not similar, with the CETA location experiencing
higher levels of ongoing insecurity (primarily bombings). For this reason, although
the three arms were implemented as designed, the controls were not combined in the
analysis reported here: the CETA intervention participants are compared only to the
controls from the same region (Karbala, Najaf), and the CPT intervention participants
are compared to controls from the same Basra/Nassariyah area. Evidence for the differences
between the two study sites is provided in the Results, with implications for the
study from not combining the controls detailed in the Discussion.

Study objectives

We carried out a rapid qualitative study before the trial using procedures described
elsewhere [27], [28] to: (1) identify important problems affecting survivors of systematic violence, as
perceived by survivors living in the study area; and (2) identify important tasks
of men and women. In free list and key informant interviews on local problems, the
most frequently mentioned mental health problems were fear, sadness and depression,
anxiousness, fear of police, tenseness (easily provoked), forgetfulness, losing trust
in others, and inability to sleep [29]. Because many of the responses referred to trauma-related symptoms, we decided that
trauma symptoms would be the primary study outcomes, and dysfunction would be the
secondary study outcome. In free lists and focus groups, we asked men about the important
tasks that men do to support themselves, their family, and community; and asked the
same questions of women about women. This information formed the basis of measures
of dysfunction, while the qualitative data on mental health problems were used to
select and adapt standard mental health instruments for local use as described below.

Study instrument

The study instrument included the symptom section of the Harvard Trauma Questionnaire
(HTQ) to assess trauma symptoms, the Hopkins Symptom Checklist for Depression and
Anxiety (HSCL-25), and a separate section containing 20 frequently mentioned mental
health symptoms from the qualitative study (described above) not already included
in the HTQ or HSCL-25 [30]–[32]. For the HTQ and HSCL-25, participants reported how often they had experienced
each symptom in the prior 2 weeks using an ordinal scale of 0 (never) to 3 (very often—i.e.,
five or more times per week). For example, participants were asked how often in the
last 2 weeks they were ‘feeling depressed’, and the possible responses were: Never
or No – score of 0; Sometimes (1–2 times a week) – score of 1; Often (3–5 times a week)
– score of 2; and, Very Often (more than 5 times per week) – score of 3.

During translation, one HSCL item (feeling hopeless about the future) and two HTQ
items (feeling as if you don’t have a future; hopelessness) were very similar in local
Arabic. We therefore included only one question on hopelessness but used it in both
our trauma symptom and depression symptom scales. The final instrument included 25
HSCL symptoms and 29 HTQ symptoms.

The study instrument included locally developed dysfunction scales for men and women
using a process described in [33]. These scales were derived from data from the qualitative study based on locally-described
roles of men and women. Participants were asked how difficult it was for them to do
each task in the prior 2 weeks on an ordinal scale of 0 (no difficulty) to 4 (unable
to do the task). For example, men were asked how difficult it was for them to communicate
or socialize. Women were asked how difficult it was to raise their children. In the
final instrument, there were 21 items on the male dysfunction scale and 21 items on
the female dysfunction scale.

Prior to the RCT, we tested the study instrument’s reliability and criterion validity
among 149 survivors of systematic violence (80 men, 69 women) using a process described
elsewhere [33], [34]. The re-interviews to assess test-retest reliability were carried out within 3 weeks
of the first interview (the average time was seven days after the first interview).
The time between the interview and re-interview was longer than usual because of disruptions
caused by insecurity and holidays (Ramadan). Cronbach’s alpha scores were all greater
than 0.90, indicating adequate internal reliability [35]. Pearson correlation coefficients for combined inter-rater and test-retest reliability
(repeat by different interviewer) were all greater than 0.79 (range 0.799 to 0.961)
suggesting good inter-rater and test-retest reliability. Criterion validity was explored
by comparing the mean total scale scores of individuals diagnosed with anxiety, depression,
and/or PTS, respectively, by a local psychiatrist to those of individuals judged by a
local psychiatrist not to have any of these problems. The difference in mean total
scale scores between those diagnosed by the psychiatrist with and without a condition
was 14.5 (range of individual scores 0–72), 6 (range 0–51), and 25.5 (range 0–108)
for depression, anxiety and trauma, respectively; all were statistically significant
(p < .05). Among men, the difference was 20, 6, and 28 for depression, anxiety and trauma,
respectively, and all differences were statistically significant (p < .05). Among women,
the median difference was 4.5 (p = .488), 7 (p = .076), and 26.5 (p < .05) for depression,
anxiety and trauma, respectively, indicating that the scale may not adequately discriminate
women with depression from those without. Overall, we concluded that criterion validity
was supported for all scales except for depression among women.
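
As an illustration of the internal reliability statistic reported above, the following is a minimal sketch of Cronbach's alpha computed from complete item responses. The function name and the toy data are hypothetical and are not taken from the study; the sketch assumes no missing items and uses sample variances.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for complete responses.

    rows: one list of item scores per respondent (hypothetical layout),
    all of equal length, with no missing values.
    """
    k = len(rows[0])                      # number of items in the scale
    items = list(zip(*rows))              # transpose: one tuple per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Toy data (not study data): three items that move together perfectly.
toy = [[0, 0, 0], [1, 1, 1], [2, 2, 2], [3, 3, 3]]
alpha = cronbach_alpha(toy)
```

Values close to 1 indicate that the items vary together, which is the sense in which scores above 0.90 were read as adequate internal reliability.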

Based on item analysis, the trauma symptoms on the locally validated HTQ, along with
several additional local trauma symptoms (e.g., feeling that one is being watched),
were used to create a trauma scale for determining eligibility for the study. A trauma
scale score of 36 (the sum of each item in the scale with a maximum possible score
of 105) was the symptom criterion for eligibility. We selected this cut-off score
because it maximized the sensitivity and specificity based on a receiver operating
characteristic curve (ROC) analysis of validity study data. The ROC analysis was based
on diagnosis of PTS by the psychiatrists (dichotomous variable) and the HTQ scale
score (continuous variable). Area under the curve for this analysis was 0.75 (significance = 0.000),
suggesting a level of accuracy that was fair. Lending equal weight to sensitivity
and specificity, 35.5 was the score at which sensitivity and specificity were jointly maximized,
which we rounded up to 36.
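
The cut-off selection described above can be sketched as follows, assuming that "equal weight to sensitivity and specificity" corresponds to maximizing their sum (Youden's J); the source does not name the criterion explicitly. Candidate thresholds are taken as midpoints between adjacent observed scores, which is one way a half-point value such as 35.5 could arise. The function name and toy data are hypothetical.

```python
def best_cutoff(scores, diagnosed):
    """Return (cutoff, sensitivity, specificity) maximizing sens + spec - 1.

    scores: scale scores (continuous variable).
    diagnosed: 1/0 flags for the psychiatrist's diagnosis (dichotomous).
    A score at or above the cutoff counts as screen-positive.
    """
    values = sorted(set(scores))
    # Assumption: candidates are midpoints between adjacent observed scores.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    n_pos = sum(diagnosed)
    n_neg = len(diagnosed) - n_pos
    best = None
    for c in candidates:
        tp = sum(1 for s, d in zip(scores, diagnosed) if d and s >= c)
        tn = sum(1 for s, d in zip(scores, diagnosed) if not d and s < c)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1               # Youden's J statistic
        if best is None or j > best[0]:
            best = (j, c, sens, spec)
    return best[1:]


# Toy data (not study data), constructed so the groups separate cleanly.
toy_scores = [20, 30, 35, 36, 40, 50]
toy_diagnosed = [0, 0, 0, 1, 1, 1]
cutoff, sens, spec = best_cutoff(toy_scores, toy_diagnosed)
```

Because scale scores are whole numbers, a midpoint cut-off is applied in practice by rounding up to the next integer.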

The study instrument was translated into Iraqi Arabic using words and phrases identified
during the qualitative study for symptoms. The study instrument was then back-translated
to English by another translator to check for accuracy of the translation.

Study participants

Participants were survivors of systematic violence referred to the CMHWs by physicians
in the health center where they worked, from local prisoners’ associations, and through
self-referral after learning of services through public service announcements or by
word of mouth. Survivors were defined as persons having experienced or witnessed physical
torture or militant attacks. The CMHW used a screening instrument both to determine
a client’s eligibility for the trial and, if the client was recruited, as the baseline assessment.
This was the same instrument used to measure the severity of symptoms
experienced by participants (the dependent variable of the study). The instrument had a section on
dysfunction, a section on depression and anxiety symptoms, a section on trauma symptoms,
a section on problems of torture survivors (identified during a qualitative study
before the trial), and a section with demographic questions. A score of 36 or higher
on the 29-question trauma section was the cutoff used for study eligibility based
on our finding in the earlier validity study that this cutoff was optimal for discriminating
those individuals diagnosed with PTSD from those without PTSD. A survivor who was
18 years of age or older and who met the symptom criterion was eligible for the trial.

Exclusion criteria included clients identified by the CMHWs as currently being psychotic
and/or those who were a danger to themselves or to others. In these cases, the supervisor
(a psychiatrist) was called immediately to talk to the client for possible referral
to a clinic or hospital.
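
The inclusion and exclusion rules above can be summarized in a short sketch. The function and parameter names are hypothetical; the cut-off of 36, the minimum age of 18, and the 0 to 3 item scoring come from the text, while the way survivor status and the CMHW's exclusion judgment are recorded is an assumption made for illustration.

```python
TRAUMA_CUTOFF = 36   # summed trauma section score required for eligibility
MIN_AGE = 18         # minimum age in years


def is_eligible(age, trauma_item_scores, is_survivor, excluded_by_cmhw):
    """Apply the stated eligibility rules to one screened client.

    trauma_item_scores: item responses scored 0-3.
    is_survivor: experienced or witnessed physical torture or militant attacks.
    excluded_by_cmhw: judged currently psychotic or a danger to self or others.
    """
    if any(s not in (0, 1, 2, 3) for s in trauma_item_scores):
        raise ValueError("item scores must be 0, 1, 2 or 3")
    total = sum(trauma_item_scores)
    return (
        is_survivor
        and age >= MIN_AGE
        and total >= TRAUMA_CUTOFF
        and not excluded_by_cmhw
    )


# Hypothetical client answering "Often" (2) on 29 items: total score 58.
eligible = is_eligible(30, [2] * 29, True, False)
```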

Study setting

The study took place in the areas surrounding the cities of Karbala, Najaf and Hilla
(CETA), and around Basra/Nassariyah (CPT) in Southern Iraq. The treatment was provided
in Ministry of Health primary health care centers unless there was insufficient privacy
or the client found it difficult to travel. In these situations, another mutually
convenient and private place was chosen (e.g., client’s home).

Interventions

Waitlist control

Waitlist control participants received monthly telephone calls from the CMHWs who
enrolled them into the study to assess their safety and whether they needed referral
to psychiatric care (i.e. were a danger to self or others or presented with psychosis).
A safety monitoring form was used to screen for the need for referral [36]. CMHWs were instructed to check in with WLC participants but not provide any treatment.
After completing their control period and second assessment, controls were retired
from the trial and offered CETA or CPT.

Intervention: Common Elements Treatment Approach (CETA)

CETA, a transdiagnostic intervention developed by authors LM and SD, includes the
following possible components: 1) encouraging participation and psychoeducation, 2)
relaxation, 3) behavioral activation, 4) cognitive coping and restructuring, 5) imaginal
exposure, 6) in vivo exposure, 7) safety, and 8) finishing/wrap up [16]. CMHWs were taught all components, as well as how to make decisions about selection,
sequencing, and dosing (i.e. tailoring to the individual participant) based on three
sources of information: 1) results from certain items on the validated study instrument,
2) client observations and statements in the assessment and early sessions, and 3)
discussion with their supervisor, who in turn discussed the information with a CETA
trainer [16]. CETA was designed to include approximately 8–12 weekly individual sessions of 50–60
min in length. Results from a recently completed randomized trial testing CETA with
displaced Burmese on the Thai-Myanmar border showed significant reductions in depression,
posttraumatic stress, dysfunction, anxiety symptoms, and aggression [37].

CETA training and supervision followed the Apprenticeship Model (see [38] for details). Briefly, CMHWs received a 10-day training in CETA and subsequently
participated in small practice groups led by two local supervisors (both psychiatrists)
and completed one pilot CETA case. Throughout the trial, CMHWs participated in weekly
group supervision led by local supervisors. CETA trainers, based in the United States,
conducted weekly Skype calls with local supervisors to review each case and provide
redirection when needed to ensure fidelity. Cultural adaptation of CETA was carried
out collaboratively by the local team and US-based experts prior to and during the
training process [39]. Fidelity was tracked by CMHW self-report of elements delivered, supervisor review
of notes and CMHW reports, and finally by trainer review.

Intervention: Cognitive Processing Therapy (CPT)

Cognitive Processing Therapy (CPT) is an evidence-based cognitive behavioral psychotherapy
originally developed for treatment of PTS or PTS with comorbid depression [40], [41]. CPT combines cognitive restructuring (i.e., techniques aimed at changing extreme
and/or exaggerated beliefs to be more balanced and/or realistic) with emotional processing
of trauma-related content (i.e., techniques to enable clients to remember and experience
the full range of emotions about their trauma). The therapy has been highly effective
at reducing symptoms of PTS, depression, and anxiety in several RCTs and efficacy
studies across a range of trauma-exposed populations, including survivors of sexual assault, child
sexual abuse, domestic violence, and combat [25], [41]–[45]. CPT has been evaluated for use with Bosnian refugees within the United States, the
majority of whom were exposed to torture, with effect sizes equal to those in the
randomized clinical trials [46], [47]. In addition, CPT was highly effective at reducing symptoms of PTS, depression, and
anxiety as well as decreasing dysfunction in an RCT in the Democratic Republic of Congo,
a high conflict setting with low resources [48]. Given these findings, CPT appeared to be another good option to test for survivors
of torture and other systematic violence in Southern Iraq.

The CPT intervention was provided using an apprenticeship model for training and supervision.
The CMHWs received seven days of in-person training with expert US-based CPT trainers
(DLK, KPL) based on a manual that was translated and adapted for the Southern Iraq
context. Ongoing supervision was provided through a multi-tiered supervision structure:
an Iraqi psychiatrist and cognitive psychologist provided direct supervision through
phone or in-person meetings with the CMHWs; a bilingual US-trained physician trained
in CPT (GZ) provided telephone and Skype oversight and supervision to the supervisors;
and this physician communicated with the US-based experts (DLK, KPL) through weekly
calls for additional support and quality assurance. Cultural adaptations, described
elsewhere, were made to the standard CPT treatment so as to accommodate cultural differences,
better meet the needs of clients with lower levels of education, and to be easier
for therapists with less training in mental health interventions to administer [26]. Participants in the intervention group attended individual therapy sessions with
CMHWs. Therapy consisted of 12 sessions, usually 1 week apart.

Outcomes

The primary outcome was trauma symptoms, assessed by a trauma scale score representing
the mean of the scores given to responses on the locally-validated HTQ. The secondary
outcome was dysfunction, assessed by mean item scores for the gender-specific items
on the locally-developed dysfunction scale. Anxiety and depression were assessed using
the mean item score on the locally-validated HSCL-25. None of the local items derived
from the qualitative study are included in the outcome scores for trauma, depression
or anxiety; the local items were used solely to screen clients for eligibility into
the study.

Sample size

Our sample size calculation of N = 150 per arm provides 80 % power to detect a moderate effect size of 0.50 (Cohen’s
d), allowing for an estimated loss of 25 % (based on the authors’ experience with dropout in
similar settings plus the additional dropout expected due to insecurity) and a moderate
design effect of 1.5 (based on the authors’ experience, given the lack of other studies in the
region).
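
One common way to combine the inputs named above is sketched below: a normal-approximation sample size for comparing two means, inflated by the design effect and by expected attrition. This is an illustrative assumption, not the authors' stated formula (the source does not give one), and the function name is hypothetical. A two-sided alpha of 0.05 is assumed.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(d, power=0.80, alpha=0.05, design_effect=1.0, attrition=0.0):
    """Approximate participants per arm to detect a standardized effect d.

    Uses the two-sample normal approximation, then multiplies by the
    design effect and divides by the expected retention (1 - attrition).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)    # two-sided critical value
    z_beta = z.inv_cdf(power)
    n = 2 * (z_alpha + z_beta) ** 2 / d ** 2
    return ceil(n * design_effect / (1 - attrition))


base = n_per_arm(0.50)
inflated = n_per_arm(0.50, design_effect=1.5, attrition=0.25)
```

With the inputs from the text, this approximation gives 63 per arm before inflation and 126 after. The reported N = 150 is larger, and the source does not state the exact formula used.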

Randomization

A randomization list was generated separately for each CMHW by study investigators.
This list included 20 sequential participant identification numbers. The assignment
was generated using a random number generator in Excel, with a 2 to 1 probability
of assignment to the intervention vs. the waitlist. A piece of paper indicating the
treatment assignment (intervention or waitlist) was stapled directly to the back of
the study consent forms that were pre-numbered with the participant identification
number. This paper could only be read if removed from the consent form.
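
The list generation described above can be sketched as follows. The study used Excel's random number generator; Python's `random` module stands in for it here. The identifier format is hypothetical, and the sketch assumes independent draws with a two-thirds probability of intervention, since the source does not say whether any blocking was used.

```python
import random


def allocation_list(cmhw_id, n_ids=20, p_intervention=2 / 3, seed=None):
    """Build one CMHW's list of (participant ID, assignment) pairs.

    Each assignment is drawn independently, so the realized ratio of
    intervention to waitlist varies from list to list.
    """
    rng = random.Random(seed)
    return [
        (
            f"{cmhw_id}-{i:02d}",   # hypothetical sequential ID format
            "intervention" if rng.random() < p_intervention else "waitlist",
        )
        for i in range(1, n_ids + 1)
    ]


# Hypothetical CMHW identifier; a fixed seed makes the list reproducible.
example = allocation_list("C01", seed=1)
```

Keeping a seeded master copy of each list is one way to support the later check of fidelity to the randomization sequence.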

Each potential study participant presenting to the CMHW with a request for mental
health services was interviewed using the study instrument. After identifying a client
as eligible for the study, and after obtaining their informed consent to participate,
the CMHW detached the study assignment paper stapled to the consent form. The study
investigators and supervisors maintained a master list for each CMHW that indicated
the sequence and appropriate treatment status (intervention/WLC) for each participant
to enable checking fidelity to the randomization model.

To avoid a difference between intervention and WLC participants in the time between
baseline and follow-up assessments, we matched controls with an intervention participant
who was enrolled into the study about the same time (within a few days to a week).
When an intervention participant—with an identified control match—finished therapy,
we arranged to interview both as close together in time as possible. The matching
was done after the trial began but before any follow-up interviews were carried out.

Blinding

Baseline assessments were conducted by CMHWs as part of the recruitment process prior
to randomization; the CMHWs were therefore blind to the assignment of study participants
to intervention or WLC at that point. These CMHWs treated those persons they had recruited who were
randomly assigned to treatment. Therefore, to maintain blinding, follow-up interviews
were done by a different CMHW than the one who recruited the participant, so that the
interviewer was unaware of the participant’s assignment. The supervisors and the study participants
were not blind to the treatment condition.

Statistical methods

All analyses were conducted using Stata 12 [49]. Multiple imputation techniques were used to account for missingness at the item
level and the participant level. Missing data, including information about participants
who were lost to follow-up, were imputed using Stata’s chained equations command for
multiple imputation (MI), with Rubin’s rules used for pooling [50], [51]. Missing at random (MAR) was assumed for the imputation model due to the low rate
of missing follow up interviews. There were two clients in the CPT trial with more
than 40 % of the items missing in their baseline assessment of anxiety. There were
also two clients in the CPT trial with more than 40 % of items missing in their baseline
assessment of depression. When looking at individual items, there were no items with
more than 5 % of total responses missing in the baseline data. Nine CPT clients and
three CETA clients had no follow up scores due to not receiving follow up or lost
records. Among those who had follow up scores recorded, there was one client in the
CPT trial who had more than 40 % of the items missing in the function scale and one
client in the CETA trial who had more than 40 % of the items missing in the trauma
scale. When looking at individual items, there were no items with more than 5 % of
total responses missing in the follow up data among those with recorded follow up.

Missing data on demographic variables were imputed based on all other demographic
variables, the counselor id-number and treatment status. We then imputed missing baseline
and follow-up scores using all of the variables in the dataset including treatment
or control status. CETA and CPT participants were imputed separately. Average scores
for all outcome variables were then calculated using 11 imputed datasets. We did not
do any data transformations. All final outcome models were run using the 11 imputed
datasets.
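
The pooling step across imputed datasets can be illustrated with a minimal sketch of Rubin's rules for a single parameter. The function name and the numbers are hypothetical; in the study the pooling was handled by Stata's multiple imputation commands rather than by code like this.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one parameter across m imputed datasets using Rubin's rules.

    estimates: the parameter estimate from each imputed dataset.
    variances: the squared standard error from each imputed dataset.
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    q_bar = mean(estimates)          # pooled point estimate
    within = mean(variances)         # average within-imputation variance
    between = variance(estimates)    # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return q_bar, total


# Toy values (not study results) for three imputed datasets.
estimate, total_var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The same rule would apply with the 11 imputed datasets used here: each model is fitted 11 times and the 11 estimates and variances are combined as above.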

For each outcome measure, we calculated the net difference in mean score from intake
to follow-up between intervention and control participants, along with the effect
size of the intervention. Treatment effects were determined using longitudinal, multilevel
models with CMHW and client as random effects, and a time by group interaction with
robust variance estimation, to test for the net difference in mean score for each
outcome between the baseline and follow up interview. We decided to use the CMHW and
client as random instead of fixed effects based on the results of the Hausman test
with significance set at p < 0.05 [52]. The significance level for treatment effects was p = 0.05, two-tailed, expressed as a 95 % confidence interval. Cohen’s d was used to
calculate the size of the effect over and above the change experienced by the WLC
participants. Cohen’s d was calculated using the difference in differences in outcomes
between groups as the numerator, and the pooled standard deviation at baseline as
the denominator [53]. The following interpretation was used for effect size: 0 = no effect; 0.2 = small
effect; 0.4 = moderate effect; 0.8+ = large effect [54]. All analyses used the full intention-to-treat (ITT) sample.
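
The effect size definition above can be sketched as a difference in differences divided by a pooled baseline standard deviation. The sketch works from raw group scores rather than from the multilevel model estimates used in the trial, and it assumes the pooling is across the intervention and control groups at baseline; function names and data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pooled_sd(a, b):
    """Pooled standard deviation of two groups (sample variances)."""
    na, nb = len(a), len(b)
    return sqrt(((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2))


def did_cohens_d(tx_base, tx_follow, wl_base, wl_follow):
    """Cohen's d from the difference in differences in mean scores.

    Numerator: change in the intervention group minus change in the
    waitlist group. Denominator: pooled SD of the two groups at baseline.
    """
    did = (mean(tx_follow) - mean(tx_base)) - (mean(wl_follow) - mean(wl_base))
    return did / pooled_sd(tx_base, wl_base)


# Toy mean item scores (not study data).
d = did_cohens_d(
    tx_base=[2.0, 2.5, 3.0], tx_follow=[1.0, 1.5, 2.0],
    wl_base=[2.0, 2.5, 3.0], wl_follow=[1.5, 2.0, 2.5],
)
```

For symptom scales a negative value indicates a larger reduction in the intervention group than in the controls, so the interpretation thresholds apply to the magnitude of d.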

Sensitivity analysis

In the primary analysis, the regression models were not adjusted with added covariates
such as age, gender, and educational level. The dependent variable (mean scale score)
was modeled against only two independent variables: intervention status (intervention
or control) and time (time 1 or time 2). We assumed that the randomization process
was sufficient to make the intervention and control groups equal for the main analysis
(the unadjusted model). As a check, we did a sensitivity analysis that adjusted the
regression model with additional independent variables (the adjusted model) such as
age, gender, education status, and working status. If the findings are similar, this provides
more confidence that the randomization process sufficiently equalized the intervention
and control groups. The adjusted model included variables that differed at baseline between
intervention and control or were associated with changes in outcome measures (defined
as p < 0.10). In the final adjusted model, all variables used for adjustment were also centered
at their means.