Opioid substitution and antagonist therapy trials exclude the common addiction patient: a systematic review and analysis of eligibility criteria

This study provides an overview of the limitations of the OSAT literature, using multiple
resources, including results from a well-designed systematic review [41] and an application
of the findings within a clinical sample of opioid-dependent patients [45]. This study also
provides a systematic assessment of the guidelines, highlighting the important limitations
we can work to improve in the future.

Results from this systematic review suggest trials most often include adult patients
meeting the DSM-IV/ICD criteria for opioid dependence with intravenous drug use behavior
and a past history of methadone treatment. Trials most often exclude participants
having a psychiatric or chronic physical comorbidity, current alcohol or substance
use problem, as well as those taking psychotropic medications. When we applied these
criteria to a clinical sample of methadone patients, we found them to be largely restrictive;
in some cases they rendered 70 % of the GENOA sample ineligible. Exclusion of participants
with psychiatric or physical comorbidity, concurrent alcohol/substance use problems, or
psychotropic medication use appeared to carry the largest cost to recruitment: more than
50 % of the GENOA sample would be lost by applying such criteria.
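
The logic of applying trial exclusion criteria to a clinical sample can be sketched in a few lines of Python. This is a minimal illustration only: the flag names and the four toy records below are hypothetical, invented to exercise the function, and are neither the GENOA variables nor study data.

```python
from typing import Dict, Iterable, List

# Hypothetical exclusion flags; illustrative names, not the variable
# names used in the GENOA dataset.
EXCLUSIONS = ["psychiatric_comorbidity", "substance_use", "psychotropic_medication"]


def ineligible_fraction(patients: List[Dict[str, bool]], criteria: Iterable[str]) -> float:
    """Return the fraction of patients excluded by at least one criterion."""
    criteria = list(criteria)
    if not patients:
        raise ValueError("patients must not be empty")
    # A patient is ineligible if any listed exclusion flag is True.
    excluded = sum(1 for p in patients if any(p.get(c, False) for c in criteria))
    return excluded / len(patients)


# Toy records invented purely for illustration (not study data).
toy_sample = [
    {"psychiatric_comorbidity": True, "substance_use": False, "psychotropic_medication": True},
    {"psychiatric_comorbidity": False, "substance_use": True, "psychotropic_medication": False},
    {"psychiatric_comorbidity": False, "substance_use": False, "psychotropic_medication": False},
    {"psychiatric_comorbidity": True, "substance_use": True, "psychotropic_medication": False},
]

# Cost of each criterion on its own, then of all criteria combined.
for criterion in EXCLUSIONS:
    print(criterion, ineligible_fraction(toy_sample, [criterion]))
print("combined", ineligible_fraction(toy_sample, EXCLUSIONS))
```

Reporting the single-criterion and combined fractions separately mirrors the distinction drawn above between the cost of an individual criterion and the cost of a trial's full set of exclusions.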

The majority of international clinical practice guidelines rely on out-of-date systematic
review evidence to inform guidance development and make strong recommendations
based on many of the trials with strict eligibility criteria that we assessed in our review.
Assessment of both the Canadian and American guidelines revealed concerning practices:
both panels cite numerous trials as evidence supporting recommendations for
different opioid substitution treatments without once discussing the impact of trial
eligibility criteria [12, 13]. The guidelines acknowledge neither the restrictive design of
the trials nor the generalizability of the evidence [12–14]. The guidelines go so far as to
rank the quality of the evidence as good, despite the concerning limitations we have raised
for each of the cited studies [12, 13]. These issues were highlighted further when we applied
the combined eligibility criteria reported by trials cited in the guidelines to the GENOA
sample: at most, the findings of these studies could be generalized to 20 of the 394 GENOA
participants.

Additionally, when applying the AGREE II rigor of development and applicability domains,
we found the North American guidelines performed considerably worse in using systematic
search methods to identify research and in reporting the limitations and generalizability
of the evidence. These practices contrast with the transparency of reporting found
in the WHO and UK guidelines. Use of the Grading of Recommendations Assessment, Development
and Evaluation (GRADE) criteria likely contributes to the stark quality differences
between North American and European guidelines [59]. European guidelines provide transparent
appraisal of the evidence used to inform recommendations and even go so far as to urge
caution in applying the evidence to psychiatric or criminal populations [52]. These findings
suggest the need for North American guideline committees to evaluate and adopt the critical
evidence synthesis approaches utilized by the WHO and NICE.

Understanding the evidence and the need for change

Investigators often set restrictive eligibility criteria to ensure the safety of patients
recruited to receive the new intervention being tested and to maximize the chances of
observing a treatment effect under optimal conditions. Prior to phase III trials,
interventions have generally not been tested in a randomized comparative design.
Entry into phase III efficacy trials is governed by restrictive eligibility criteria,
and conduct is often controlled by rigid protocols. This may explain why many of the
trials identified for this review adhered to an explanatory design.

Testing the effect of interventions on highly specific groups may be associated with
unintended harm to patients once the intervention is released for use in the general
population. Our results indicate the majority of opioid substitution therapy trials
exclude participants with major psychiatric disorders. This exclusion criterion is
in no way novel; in fact, many trials exclude patients with psychiatric comorbidities.
What is concerning is the lack of understanding of what may happen to these populations
once the drug is released for wider use. For example, varenicline was tested in a
randomized, double-blind, placebo-controlled trial to assess its efficacy for reducing
smoking [60]. This trial excluded participants with a history of psychiatric comorbidity,
including major depression requiring treatment within the past year, panic disorder,
psychosis, bipolar disorder, anorexia nervosa, or bulimia [60]. Following Food and Drug
Administration (FDA) approval for use in the general population, many patients began to
present with psychiatric symptoms, including erratic behavior and suicide attempts [61].
Many now criticize the pharmaceutical company for marketing this drug to the general
population before knowing the real effect of the intervention in participants
with psychiatric comorbidities, especially since smoking is prevalent among patients
with psychiatric disorders. These side effects might have been detected earlier had
proper implementation trials taken place, or had the intervention been tested
in a more representative sample.

Future directions

There are important reasons why patients with certain comorbidities (physical, psychiatric)
are excluded from trials. Efficacy trials seek to determine whether the intervention
actually works under the appropriate conditions. When the objective of a trial is
to determine the effect and safety of an intervention, an appropriate design is
to test the intervention under optimal conditions. Patients with psychiatric comorbidities,
especially those with addictive disorders, are known to have difficulty complying with
interventions [62], and the inclusion of these participants can dilute the treatment effects
we observe in trials. This suggests that results from stringently designed trials reporting
small treatment effects are unlikely to withstand the inclusion of non-compliant patients.
For instance, results from one trial show a significant hazard ratio (HR) for the
comparison of buprenorphine to naltrexone for improvement of patient retention (HR:
1.56; 95 % confidence interval [CI]: 1.01, 2.41) [19]. This trial’s findings are fragile:
the lower confidence limit sits just above 1, so the result is unlikely to withstand even
a small number of changes to the reported events across treatment arms, which would be the
likely outcome had the trial been performed in a more pragmatic patient population [19].
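
One way to see how close this result lies to the significance threshold is to back-calculate the test statistic from the reported interval. The sketch below assumes the interval is a symmetric Wald interval on the log-hazard scale, which the cited trial may or may not have used; the function name `wald_from_ci` is ours, and the output is an approximation rather than the trial's reported p-value.

```python
import math


def wald_from_ci(estimate: float, lower: float, upper: float, z_crit: float = 1.959964):
    """Approximate the standard error, z statistic, and two-sided p-value
    from a ratio estimate and its 95 % CI, assuming a Wald interval
    that is symmetric on the log scale (an assumption, not a given)."""
    log_est = math.log(estimate)
    # Width of the CI on the log scale equals 2 * z_crit * SE.
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = log_est / se
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Values reported in the text: HR 1.56, 95 % CI 1.01 to 2.41.
se, z, p = wald_from_ci(1.56, 1.01, 2.41)
print(f"SE(log HR) = {se:.3f}, z = {z:.2f}, two-sided p = {p:.3f}")
```

Under this assumption the z statistic is approximately 2 and the p-value falls just below 0.05, which is consistent with the description of the finding as fragile.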

It is important that we address the need to include non-compliant participants in
trials. Patients already have difficulty complying with strict methadone treatment
regimens [62], and this issue is further complicated for trials recruiting addiction
patients because of the high prevalence of concurrent psychiatric comorbidities [45],
which are known to impact intervention adherence [62]. There have been suggestions for
improving compliance among patients with psychiatric disorders; one emphasized in the
literature is the use of telephone and electronic reminders [62], which have been
demonstrated to improve compliance in non-psychiatric patients [63, 64]. Other ways to
improve patient compliance with addiction treatment include focusing on adherence through
educational sessions with family health teams, consistent symptom measurement, and
appropriate treatment tailoring [65]. Excluding these participants from trials limits both
the generalizability and the understanding of the treatment within a population representing
a good portion of addiction patients. Instead, we encourage trialists to design studies with
features that help increase patient compliance, so as to ensure an understanding of an
intervention’s effectiveness in the clinical population.

We recognize there are other important reasons guiding the design of strict protocols.
For instance, the exclusion of pregnant women or patients with acute physical conditions
may be a protective measure in efficacy trials where drug safety is still being determined.
However, the real problem arises when we rely exclusively on evidence from these trials
before we can determine the effectiveness in a “real world” sample of different types
of patients (e.g., patients with comorbidities or on psychotropic medications). The
limitations of the strictly designed protocols that govern efficacy and safety
trials have been largely addressed by the introduction of implementation trials, which
are investigations whose primary aim is to test treatments using flexible protocols
for participant selection and treatment administration. These trials often have lax
eligibility criteria, with the aim of including the participants found in clinical
practice. However, such trials are not common in the field of addiction medicine,
restricting us to the use of stringent efficacy trials to inform
clinical guideline development.

Uptake in the planning of, and commitment to, implementation or “effectiveness” research
will likely stem from increased discussion and acceptance of the need for pragmatic
trial designs. The goal in implementation trial design should be to maximize the safety
of participants included in the trial while balancing this against the applicability of the
findings. Pragmatic trials, sometimes called implementation trials, are evaluated
on a continuum and should not be characterized by a specific set of criteria [66]. We
recommend that future trials in addiction medicine not abandon all “explanatory,”
or more stringent, designs but instead work to evaluate the intervention in a wide
range of participants as a secondary objective. Implementation trials should aim to
include participants with psychiatric comorbidities, poly-substance use disorders,
and chronic physical conditions. Provided there are enough patients within each subgroup,
researchers may be able to evaluate the mediating impact of each comorbidity and
determine with confidence the true impact of physical or psychiatric abnormalities within
addiction patients.

Limitations

Reliance on a treatment sample of methadone patients may potentially impact the generalizability
of this study. Using the GENOA sample of participants provides a unique opportunity
to demonstrate the restrictive impact of eligibility criteria reported in the addiction
literature. However, demonstrating such an effect requires the use of a generalizable
sample of addiction patients. Participants recruited from the CATC comprise a treatment
sample, which may in fact have higher levels of comorbidities (both physical and/or
psychiatric) than patients earlier on in the cycles of addiction. By the time patients
are receiving pharmacological therapy for opioid dependence they are often at a later
stage in their addiction course, placing them at higher risk for exposure to HIV,
hepatitis, infectious disease, opioid-induced hyperalgesia, and poor social/economic
living conditions. In addition, patients may only seek treatment once their physical,
psychiatric, or social functioning is seriously impeded. In fact, the GENOA sample
may lack the population of patients experiencing the range of problems which often
coerce or force individuals into treatment altogether. However, we must also acknowledge
that addiction is a complex disorder, often accompanied by serious physical and psychiatric
comorbidities. Incident misuse of opioids is known to result from serious physical
comorbidities such as pain [67] and from suffering experienced as a consequence of anxiety
or depression [68]. Recognizing there may be discordance between prevalent users currently
seeking treatment and more “incident” cases, we maintain that the clinical profile of
incident users also reflects a high degree of mental and physical abnormality. We emphasize
that the CATC population may be a prevalent sample of treatment-seeking addiction patients
and that the results of these trials are generated to inform the treatment of such
populations. As such, it is still important that we demonstrate the large effect these
criteria may have on weakening the directness of the evidence.

We acknowledge that observational studies are subject to selection bias, whereby patients
agreeing to participate in the study may reflect a different population. To evaluate
such bias, we compared the demographic and clinical characteristics
of participants in the GENOA study to a sample of CATC patients from four economically
and geographically diverse clinics. This sample includes demographic and clinical
characteristics data from all patients actively receiving treatment at these sites.
Please refer to Table 5.

Table 5. Comparison of the demographic and clinical characteristics of GENOA participants to
the general population of CATC patients

The GENOA sample was largely representative of the general CATC population: mean age,
mean methadone dose (mg/day), prevalence of HIV, and marital status were not
statistically significantly different between samples. There were, however, some
differences: the general CATC patients had a higher prevalence of hepatitis C and, on
average, a shorter treatment duration. Additionally, the general CATC sample included a
higher proportion of men than the GENOA population. These results suggest that the GENOA
participants have a longer treatment duration and, as such, are likely susceptible
to having a higher number of physical or psychiatric comorbidities. However, these
results also suggest the CATC population has a higher level of hepatitis than our
sample, which may suggest the GENOA study is subject to a “healthy volunteer
bias,” whereby sicker patients are less likely to engage in the study. This would bias
our own results toward the null and suggests that more participants in the “general”
active treatment population would be excluded than we are reporting.

In Canada, buprenorphine is not covered by the provincial drug insurance plans, and
as such patients receiving buprenorphine reflect either (1) an employed population with
benefits covering therapy, or (2) patients who can afford to pay out of pocket. In light of
the administrative differences between methadone and buprenorphine coverage, we chose
to include only methadone patients in the GENOA investigation, which
could reflect a more marginalized population of drug treatment-seeking patients and
is a potential limitation to the generalizability of the GENOA sample.

Information regarding criminal offences, sexual behavior, and domestic conflict was
collected by self-report from patients agreeing to participate and is thus likely
underestimated due to social desirability bias. We acknowledge social desirability bias
may impact the estimates for some of the demographic data collected in this study. However, we
maintain important variables, such as psychiatric comorbidity, were ascertained using
a validated questionnaire, the MINI. In addition, physical comorbidity (diabetes,
chronic pain) was evaluated using self-report and confirmed via information logged
by attending physicians in the patients’ electronic medical record. Urinalysis was
performed to ascertain poly-substance use. We aimed to include as many objective measurements
as possible, and when unavailable we relied on safeguards such as electronic medical
record confirmation.

Additionally, the definitions, measurements, and cut-offs (if relying on measurement
tools) used to assess physical and psychiatric functioning across clinical trials may
differ considerably from those used in the GENOA study. Thus, there
is potential that the exclusion criteria reported across trials and later applied
to the GENOA sample are being misapplied. Psychiatric comorbidity can vary from obvious
psychotic disorders to any anxiety or depressive disorder. Depending on the thoroughness
of the trial investigators, and indeed of the clinicians administering
the GENOA assessment tools (MINI, BPI, MAP), differing rates of psychiatric problems
will be identified, which could compromise the aims of our study. Due to the serious
limitations in reporting of the definitions and measurements for many of the eligibility
criteria discussed across the literature, we urge caution in interpreting the application
of such criteria to the GENOA sample.

If we had reliable data on the medical and demographic characteristics of opioid
addiction patients, we would be better equipped to demonstrate that the clinical guidelines
are not appropriate for the US and UK populations. Administrative data provided by
Health Maintenance Organizations in the US or the National Health Service in the UK
could serve as sources for population-level data. However, the quality of these data
is questionable due to the high susceptibility to misclassification. A recent study
evaluated the misclassification of psychiatric disorders by comparing
medical records and administrative data and found only moderate agreement for any
mental comorbidity [69]. We acknowledge the problems associated with opioid dependence are
impacted by the health, social, and judicial systems, which can vary across countries.
However, it is improbable that the prevalence of psychiatric and physical comorbidity, or
the prescription of psychotropic medication, varies so much between countries and types of
addiction populations that it would render the larger message of this study insignificant.