Psychometric properties of the incontinence utility index among patients with idiopathic overactive bladder: data from two multicenter, double-blind, randomized, Phase 3, placebo-controlled clinical trials


Sample

OAB cases included in the present analysis were adult patients with idiopathic OAB
and UI who participated in two multicenter, international, Phase 3, randomized, double-blind,
placebo-controlled studies evaluating the safety and efficacy of a single treatment
of BOTOX® (onabotulinumtoxinA, Allergan Inc.). Eligible patients had idiopathic OAB
with UI, were considered to be inadequately managed by anticholinergic therapy (insufficient
efficacy or intolerable side effects), and experienced ≥3 episodes of urinary urgency
incontinence (UUI) in a 3-day patient bladder diary, an average of ≥8 micturitions
per day, and a post-void residual urine volume ≤100 mL. A full description of their
characteristics and study design can be found elsewhere [27, 28].

To test for differential item functioning by etiology, the sample
originally used for item selection of the IUI was also included in this study. This
sample comprised patients with UI due to NDO as a result of spinal cord injury
or multiple sclerosis, recruited during two multicenter, double-blind, randomized,
placebo-controlled, parallel-group studies [29, 30].

Ethics, consent and permissions

Before entering the aforementioned clinical trials, all patients provided
written informed consent, and all studies were conducted in full compliance
with the ethical principles regarding human experimentation of the Declaration of
Helsinki. The New York University School of Medicine IRB (Federal Wide Assurance
number 00004952) reviewed the studies.

Clinical variables and outcomes measures

Basic clinical and socio-demographic data were collected to describe the sample of
the study. The following patient-reported outcomes (PROs) were administered to patients
in these studies:

Treatment Benefit Scale (TBS) [31]: a single-item measure that evaluates patients’ perception of benefit following treatment.
Responses are defined as 1 = greatly improved; 2 = improved; 3 = not changed; 4 = worsened.
The TBS has demonstrated validity and responsiveness in previous clinical trials with
antimuscarinic treatments in patients with OAB [31].

Short Form-12 Health Survey version 2 (SF-12) [14, 32, 33]: A generic HRQoL questionnaire that includes 12 items from the SF-36 Health Survey
[34] and has two component summary scores (physical, PCS, and mental, MCS). Scores from
patients’ responses are normalized to a distribution of 50 ± 10 (using U.S. general
population norms), and a higher score indicates better HRQoL. It has demonstrated
adequate psychometric properties to estimate the health burden of chronic conditions
in general population surveys [33, 35]. A preference-based weighted index can be estimated from this instrument (SF-6D)
following the models proposed by Brazier et al. [14]. The observed utility values range from 0.30 (worst health state)
to 1.0 (best health state) [36].

King’s Health Questionnaire (KHQ) [37, 38]: This self-administered disease-specific HRQoL questionnaire for UI has 21 items
(4-point Likert scale) covering 8 domains: urinary symptom severity, role limitations,
physical functioning, social functioning, emotional problems, personal relationships,
sleep disturbance, and general health. The range of scores for each domain is between
0 and 100, with higher scores indicating greater impact on patients’ HRQoL (worse
perceived health status). A condition-specific preference-based index has been developed
from this instrument by reducing the number of dimensions (from 8 to 5) and
valuing the resulting health state classification framework using direct elicitation
(standard gamble) in a representative sample of patients with UI attending UK hospital
outpatient clinics. Different models were tested to best predict the
valuations for all the possible health states defined by this utility measure (n = 1024) [38]. Mean utility values obtained with the KHQ range from 0.77 to 0.98.

Incontinence Quality of Life Questionnaire (I-QOL) [22, 23, 26]: This self-administered questionnaire comprises 22 items (5-point Likert scale) distributed
across three dimensions: avoidance and limiting behavior
(items 1–4, 10, 11, 13 and 20), psychosocial impact (items 5–7, 9, 15–17, 21 and 22),
and social embarrassment (items 8, 12, 14, 18 and 19). A total scale score is calculated
by summing the scores of all items included in a scale and transforming the sum onto
a 0–100 scale (higher scores reflect better HRQoL) [22].
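The scoring step described above can be sketched as follows. This is a minimal illustration only: the text does not spell out the transformation, so the sketch assumes a linear rescaling of the summed item scores between the lowest and highest possible sums, with items scored 1 to 5. The function name `scale_score` is hypothetical.

```python
def scale_score(item_scores, min_item=1, max_item=5):
    """Rescale summed Likert item scores onto 0-100 (higher = better HRQoL).

    Assumption: linear min-max rescaling of the raw sum; item range 1-5.
    """
    n = len(item_scores)
    if n == 0:
        raise ValueError("at least one item score is required")
    for s in item_scores:
        if not min_item <= s <= max_item:
            raise ValueError(f"item score {s} outside {min_item}-{max_item}")
    raw = sum(item_scores)
    lowest = n * min_item    # lowest possible raw sum
    highest = n * max_item   # highest possible raw sum
    return 100.0 * (raw - lowest) / (highest - lowest)


# Example: 22 items all answered with the middle category (3)
midpoint = scale_score([3] * 22)
```

The same function applies to a subscale by passing only the items that belong to that dimension.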

The abbreviated health state classification system comprises 5 items or attributes
(Fig. 1). Utility scores can be derived from this instrument by applying the IUI algorithm:
IUI utility score = 1.051 × (b1 × b2 × b3 × b4 × b5) − 0.051, where b1 to b5 are the estimated
weights attached to the level (one of 3) reported for each of the 5 attributes. The IUI has
a utility score ranging from 0.036 (worst health state) to 1 (perfect health) [21].
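As an illustration of the multiplicative form of the algorithm, the sketch below evaluates the formula for a set of five attribute weights. The constants 1.051 and 0.051 come from the text; the level weights themselves are not given here, so any weights passed in are placeholders, and the function name `iui_utility` is hypothetical.

```python
from math import prod


def iui_utility(weights):
    """Return 1.051 * (b1 * b2 * b3 * b4 * b5) - 0.051.

    `weights` holds one weight per attribute, i.e. the weight of the level
    the patient reported. Real level weights are not reproduced here.
    """
    if len(weights) != 5:
        raise ValueError("exactly five attribute weights are required")
    return 1.051 * prod(weights) - 0.051


# With every weight equal to 1 the formula returns perfect health (about 1.0).
best = iui_utility([1.0, 1.0, 1.0, 1.0, 1.0])
```

Working backwards from the stated floor of 0.036, the product of the five worst-level weights would have to be about (0.036 + 0.051) / 1.051 ≈ 0.083.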

Fig. 1. Incontinence Utility Index (IUI) attributes and levels

Statistical approach

A number of analyses were undertaken to assess the performance of the IUI in this
sample of OAB patients:

Rasch analysis was used to check for the presence of differential item functioning
(DIF, or measurement bias) of the I-QOL between patients with OAB and the patients with NDO
from whom the IUI was originally developed. Items were calibrated and subjects were
scored using the Partial Credit Model [39]. DIF was tested by segmenting the sample by etiology [40]. The presence of DIF suggests that patients with the same disease severity tend to
respond differently to an item depending on their disease etiology. DIF was considered
significant at p < 0.001, and relevant if the difference between groups exceeded
0.5 logits.
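The two-part DIF criterion can be written as a small decision rule. This sketch assumes the significance threshold reads p < 0.001 and that the 0.5-logit criterion applies to the absolute difference in item difficulty between etiology groups. The function name and the example inputs are hypothetical, and the item estimates themselves would come from the Rasch software.

```python
def flag_dif(p_value, logit_difference, alpha=0.001, min_logits=0.5):
    """Return (significant, relevant) for one item.

    significant: p-value below alpha (assumed threshold 0.001)
    relevant: absolute between-group difference above min_logits
    """
    significant = p_value < alpha
    relevant = abs(logit_difference) > min_logits
    return significant, relevant


# Hypothetical item: p = 0.0005, OAB vs NDO difficulty difference = 0.62 logits
example = flag_dif(0.0005, 0.62)
```

Keeping the two flags separate makes it easy to distinguish items with statistically detectable but small DIF from items whose DIF is large enough to matter.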

In addition, further tests were applied to assess the psychometric performance of
the abbreviated health state classification defined in the IUI and the associated
utility scores in this sample of OAB patients.

Concordance between I-QOL versions

The agreement between the original I-QOL and its abbreviated health state classification
system in OAB patients was analyzed by applying the intraclass correlation coefficient
(ICC) at baseline and at week 12. A two-way mixed-effects model (people effects are
random and measures effects are fixed) was applied with an absolute agreement type
and average measures (reliability of the mean of the instruments). It is generally
recommended that the ICC be at least 0.7 [41], with higher values indicating better concordance.
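For readers who want to see how an absolute-agreement, average-measures ICC is computed, the sketch below derives it from two-way ANOVA mean squares as ICC = (MSR − MSE) / (MSR + (MSC − MSE) / n). This is the standard textbook expression, not output from the software used in the study. The function name `icc_absolute_average` and the toy data are hypothetical.

```python
def icc_absolute_average(data):
    """ICC, two-way model, absolute agreement, average measures.

    `data` is a list of rows (one per patient), each row holding the
    scores from the k instruments being compared.
    """
    n, k = len(data), len(data[0])
    if n < 2 or k < 2:
        raise ValueError("need at least 2 patients and 2 measures")
    grand = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(row[j] for row in data) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)               # between-patient mean square
    ms_cols = ss_cols / (k - 1)               # between-measure mean square
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (ms_cols - ms_error) / n)


# Toy data: the second measure is always exactly 1 point higher.
offset_icc = icc_absolute_average([[1, 2], [2, 3], [3, 4]])
```

Because absolute agreement penalizes systematic offsets between instruments, the toy data above yield an ICC below 1 even though the two measures are perfectly correlated.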

Validity

Criterion validity was assessed by studying the differences in both the I-QOL and
the abbreviated form according to TBS scores at Week 12. Kruskal-Wallis tests (with Bonferroni’s
correction for multiple comparisons) were conducted for this purpose. Statistically
significant differences between TBS levels (p < 0.05) were expected in the scores
of the I-QOL and its derived measures. Next, the relationship of the I-QOL, its
abbreviated health state classification system, and the IUI with other PRO and clinical
variables was evaluated. Spearman rank correlation coefficients (rho) were calculated
between these instruments and the following clinical variables: age, volume voided
per micturition (mL/24 h), daily incontinence episodes, daily urgency episodes, daily
micturition episodes, daily incontinence urgency episodes, number of daily nocturia
episodes and weekly incontinence episodes. Finally, convergent validity of the I-QOL,
the abbreviated health state classification system, and the IUI was studied by testing
their association (rho) with the KHQ global score and domains, the SF-12
(Physical and Mental Component Summaries) and the utility scores derived from the
KHQ and the SF-12 using Spearman rank correlation. A moderate to strong association
(rho ≥ 0.5) was expected between the I-QOL and other disease-specific variables (e.g.,
disease-specific domains of the KHQ or the number of incontinence episodes), while
the association between the I-QOL and variables less directly related to OAB (e.g.,
generic domains of the KHQ and the SF-12 summary components) was hypothesized to be
lower (rho between 0.3 and 0.49).
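The rank correlation used for these hypotheses can be sketched in plain Python: rank both variables (ties receive the average rank) and compute the Pearson correlation of the ranks. The function names and the sample values are hypothetical, and the comparison is made on the absolute value of rho, which is an assumption about how the stated thresholds are applied.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # average of positions i..j, 1-based
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho = Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def meets_strong_hypothesis(rho, threshold=0.5):
    """True if |rho| reaches the moderate-to-strong threshold (assumed 0.5)."""
    return abs(rho) >= threshold


# Hypothetical data: higher I-QOL scores paired with fewer daily episodes
rho = spearman_rho([80, 65, 50, 30], [1, 2, 4, 7])
```

A negative rho is expected in this toy example because better HRQoL scores accompany fewer incontinence episodes.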

Responsiveness

The ability of the abbreviated form of the I-QOL and the IUI to capture clinically
relevant changes in OAB patients was analyzed according to the level of response
to treatment and the TBS. To this end, differences in scores between baseline and
Week 12 visits were assessed with Wilcoxon tests, standardized response means (SRMs)
and effect size statistics. Patients were classified as responders according to clinical
criteria based on the average percentage reduction in daily UI episodes from baseline:
50 %, 75 % or 100 % reduction in daily episodes according to the 3-day bladder
diary at week 12 [42]. SRMs were calculated as the mean change score (score at week 12 minus score at baseline)
divided by the standard deviation of the change score. Effect size statistics were calculated
as the mean change score divided by the standard deviation at baseline [8]. It was hypothesized that higher effect sizes would be found among patients with
a higher response or perceived benefit at week 12. Conventional benchmarks to interpret
effect sizes are as follows: an effect size of 0.2–0.49 is considered small, 0.5–0.79
medium and over 0.8, large [43].
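The two responsiveness statistics differ only in their denominator, which the sketch below makes explicit. The sample standard deviation (n − 1 denominator) is assumed, the labeling function treats values of 0.8 and above as large and leaves values below 0.2 unnamed (the text gives no label for them), and all function names and scores are hypothetical.

```python
from statistics import mean, stdev


def change_scores(baseline, week12):
    """Week 12 score minus baseline score, per patient."""
    return [w - b for b, w in zip(baseline, week12)]


def standardized_response_mean(baseline, week12):
    """Mean change divided by the standard deviation of the change."""
    change = change_scores(baseline, week12)
    return mean(change) / stdev(change)


def effect_size(baseline, week12):
    """Mean change divided by the standard deviation at baseline."""
    return mean(change_scores(baseline, week12)) / stdev(baseline)


def effect_size_label(es):
    """Benchmarks from the text: 0.2-0.49 small, 0.5-0.79 medium, 0.8+ large."""
    size = abs(es)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below small (not named in the text)"


# Hypothetical scores for three patients
baseline = [40, 50, 60]
week12 = [50, 65, 70]
srm = standardized_response_mean(baseline, week12)
es = effect_size(baseline, week12)
```

When change scores vary little between patients, the SRM can be much larger than the effect size for the same data, as in this toy example.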

Agreement between utility measures

Bland-Altman diagrams for agreement and the ICC statistic (with a two-way mixed effects
model and checking an absolute agreement type with average measures) at baseline and
Week 12 were used to study to what extent the utility values from the IUI and those
obtained from the SF-12 [14] and the KHQ [38] could be interchangeable. The limits of agreement in the Bland-Altman figures were
defined at a distance from the mean of 1.96 times the standard deviation of the differences.
Acceptable limits of concordance of ±0.1 points from 0 (maximum concordance in mean
values) are also displayed to represent relevant discrepancies in common utility scales
[44, 45].
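The limits of agreement described above reduce to a short calculation on the paired differences. In this sketch the ±0.1 band is applied as a simple check on whether both limits fall inside it, which is one possible reading of how the band is used. The function names and utility values are hypothetical.

```python
from statistics import mean, stdev


def bland_altman_limits(utilities_a, utilities_b):
    """Return (bias, lower limit, upper limit) for paired utility scores.

    Limits are the mean difference plus or minus 1.96 times the
    standard deviation of the differences.
    """
    diffs = [a - b for a, b in zip(utilities_a, utilities_b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


def limits_within_band(lower, upper, band=0.1):
    """True if both limits of agreement lie within +/- band of zero."""
    return lower >= -band and upper <= band


# Hypothetical paired utilities from two instruments for three patients
iui = [0.80, 0.70, 0.90]
other = [0.78, 0.72, 0.88]
bias, lower, upper = bland_altman_limits(iui, other)
acceptable = limits_within_band(lower, upper)
```

Plotting each difference against the pair mean, with horizontal lines at the bias, the two limits and ±0.1, would reproduce the layout of a Bland-Altman diagram.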

The statistical packages SPSS 21.0 and Winsteps 3.75 were used to conduct the analyses
above.