Patient perceptions of foot disability in Juvenile Idiopathic Arthritis: a comparison of the juvenile arthritis foot disability index and the Oxford ankle foot questionnaire for children


The results varied when median domain scores were compared between the JAFI and the
OxAFQ-C. When the domains compared contained questions relating to similar activities,
no significant difference in domain scores was seen. This was shown for the JAFI
participation domain
and OxAFQ-C school domain. The author had expected to find some agreement as the JAFI
was used in the development of the OxAFQ-C. However, there is limited usefulness in
comparing across the domains between the questionnaires as there are clear differences
in the themes being probed by the statements. For example, the OxAFQ-C has emotional-themed
questions in the emotional domain that consider areas such as being “bothered” by
the way the individual walked or how their foot / ankle looked. Such areas are not
addressed in the JAFI in a comparable domain as the two emotional-themed statements
(being worried or sad about foot problems) are included in the activity limitation
domain alongside 11 activity-based questions. Thus the OxAFQ-C emotional domain was
not analysed against a JAFI domain in this study. The JAFI activity domain and the
OxAFQ-C physical domain were also compared and a significant difference in the median
results was found, suggesting that the questions asked in each domain were investigating
different areas impacting on well-being.

Considering the PROMs from a more general view whereby all the questions are considered
without being confined to domains but in a composite score, the study showed that
there was a lack of agreement between the questionnaires. Although neither of the
questionnaires’ authors suggest the use of a composite score, the OxAFQ-C has been
used as a composite score in one previous study [10]. A similar questionnaire, the
MOXFQ, has also recently been used as a composite score and this has been recognised
as being an acceptable method [7]. The lack of agreement identified between the
composite scores of the JAFI and the
OxAFQ-C in this study suggests that there are sufficient differences in the two scores
such that they are either measuring differing aspects of the perception of well-being
through asking different questions or they are identifying different levels of perception.
The Bland Altman Levels of Agreement calculation identified a mean difference in summed
composite scores of close to four percentage points. The determination of the level
of agreement is based primarily on clinical judgement. On initial consideration, a
mean difference of four percentage points between scores would seem to be clinically
acceptable. However, the Bland Altman plot allows visualisation of how each participant’s
scores varied. The plot shows that many participants (14 out of 35) had scores that were
more than ten percentage points apart, which this author felt would be beyond clinical
acceptability. Although this threshold was chosen arbitrarily, it was informed by the
suggested value of 6-8 percentage points for meaningful change (MDC) in score for the
OxAFQ-C [6]. Several participants had scores with differences close to two standard
deviations beyond the mean difference, which is the level recommended by Bland and
Altman to demonstrate disagreement, and thus overall the scores were judged to give
differing results.
The use of the total composite score has not been investigated in the OxAFQ-C or the
JAFI and although a percentage composite score is recommended for each individual
domain in the OxAFQ-C, combining all three domains into a single score may mask some
important outcomes within the individual domains and “smooth” the overall outcome
of the score. Whether this would account for the difference between the scores identified
in this study is unclear.
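
The agreement calculation discussed above can be illustrated with a minimal sketch. It assumes the conventional Bland and Altman limits of agreement (mean difference plus or minus 1.96 standard deviations of the paired differences) and uses the ten percentage point threshold chosen in this study. The function name `bland_altman` and the five paired scores are hypothetical and are not the study data.

```python
from statistics import mean, stdev


def bland_altman(scores_a, scores_b, threshold=10.0):
    """Summarise agreement between two sets of paired percentage scores.

    Returns the mean difference (bias), the 95% limits of agreement
    (bias +/- 1.96 SD of the differences) and the number of pairs whose
    absolute difference exceeds `threshold` percentage points.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("scores must be paired")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return {
        "bias": bias,
        "lower": bias - 1.96 * sd,
        "upper": bias + 1.96 * sd,
        "n_beyond": sum(1 for d in diffs if abs(d) > threshold),
    }


# Hypothetical composite percentage scores for five participants.
jafi = [40.0, 55.0, 20.0, 70.0, 35.0]
oxafq = [30.0, 60.0, 32.0, 58.0, 35.0]
result = bland_altman(jafi, oxafq)
print(result)
```

In this illustration the mean difference is small while two of the five pairs differ by more than ten points, which mirrors the way a small mean difference can conceal wide individual disagreement.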

The Pearson correlation (r = 0.86) shows that there is a strong association between
the scores such that when the JAFI score increases then the OxAFQ-C also increases
despite the fact that individual percentage summed composite scores are different.
The existence of this association does add strength to the construct validity of each
questionnaire which is typically tested by comparing differing questionnaires that
test a similar theme.
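
The distinction between association and agreement can be shown with a short sketch. The helper `pearson_r` and the scores below are hypothetical: the second set of scores is simply the first shifted by a constant 12 points, so the two sets correlate perfectly yet never agree.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation for two paired lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Hypothetical scores: the second measure always reads 12 points higher.
scores_a = [10.0, 25.0, 40.0, 55.0, 70.0]
scores_b = [s + 12.0 for s in scores_a]

r = pearson_r(scores_a, scores_b)
mean_gap = sum(b - a for a, b in zip(scores_a, scores_b)) / len(scores_a)
print(r, mean_gap)
```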

This study did not aim to identify, and the data collected are not sufficient to
identify, which questionnaire best reflects the impact on well-being of foot problems in JIA.
There have been concerns in past studies that the JAFI was not sufficiently sensitive
to identify mild disease, having a “floor effect” [2], but since then other studies have
shown a strong relationship between disease activity and the JAFI [8], [11]. The JAFI
should probably be favoured for use in the JIA population by virtue of
the questions that are specific to this condition, such as morning stiffness, morning
pain and the presence of joint swelling which should identify when the typical JIA
foot pathologies of synovitis, tendinitis and enthesitis are present. But that is
not to say that the OxAFQ-C is not useful as it is quicker to complete and has a focus
on emotions that may be of value in the teenage population where issues of image can
become increasingly important.

When considering the lack of agreement between the scores, the difference in the descriptors
may also have played a role in the outcome of this study. Both questionnaires used
the descriptors of “never” and “always” to define the end points of the Likert scale,
with “sometimes” as the centre value. But the JAFI used descriptors of “occasionally”
and “frequently” to complete the range whilst the OxAFQ-C used “rarely” and “very
often”. These could have been interpreted differently by the participants and thus
led to the differing scores that have been identified.

A point worth noting regarding the use of the JAFI is that in the activity limitation
domain, the meaning of the Likert descriptors is reversed for 11 of the 13 statements.
This occurs as the statements change their direction of impact. In the impairment
domain the statements are phrased so that the response “always” (score = 4)
relates to a poor response; for example, for “I have morning stiffness in my foot/feet”,
a child with a high impact would likely answer “always”. The phrasing changes in
the activity limitation domain such that the response “always” would indicate a
good response; for example, for the statement “I can always take part in PE”, a child
answering “always” would be giving a good response yet it would still score
4 points. In this study, the scoring was reversed to be consistent with other answers,
with a high score equating to a greater impact for all statements. Dekker et al [11]
also noted this discrepancy and recommended that these questions are adjusted to
give consistency across the score.
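
The score reversal described above can be sketched as follows. The sketch assumes that the five Likert descriptors are scored 0 to 4 (consistent with “always” scoring 4), so that a reversed score is 4 minus the raw score. The item labels and the function `harmonise` are hypothetical and are not taken from the JAFI itself.

```python
MAX_SCORE = 4  # assumed scale: "never" = 0 ... "always" = 4


def harmonise(responses, reversed_items):
    """Return scores in which a high value always means greater impact.

    `responses` maps an item label to its raw Likert score and
    `reversed_items` is the set of positively phrased items whose
    raw score must be flipped.
    """
    harmonised = {}
    for item, score in responses.items():
        if not 0 <= score <= MAX_SCORE:
            raise ValueError(f"score out of range for {item}: {score}")
        harmonised[item] = MAX_SCORE - score if item in reversed_items else score
    return harmonised


# Hypothetical responses: "always" (4) to both a negatively and a
# positively phrased statement.
responses = {"morning_stiffness": 4, "take_part_in_pe": 4, "worried": 1}
reversed_items = {"take_part_in_pe"}

harmonised = harmonise(responses, reversed_items)
print(harmonised)
```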

Although the questionnaires’ authors do not suggest use of the summed composite score, it
would seem to be appropriate to use the composite score when comparing patient groups,
whereas the individual domain scores may be more useful for identifying specific areas
needing treatment, for example, improving participation in school sports, running
ability or addressing footwear concerns. The composite score also allows for a meaningful
change in score to be defined. Morris et al [6] suggested that a difference of half
the standard deviation of the group would represent
a true change beyond the measurement error of the tool. But further work is required
in this population in order to determine the minimal clinically important difference
(MCID) for each of the PROMs. The MCID is useful to determine the impact of treatment
through an improvement in score, but also as a monitoring tool so that any deterioration
in outcome beyond the minimally important difference can be recognised and acted on
quickly.
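
As an illustration only, the half standard deviation rule could be applied to a group of composite scores as sketched below. The use of the sample standard deviation is an assumption, and the function names and scores are hypothetical rather than drawn from this study or from Morris et al.

```python
from statistics import stdev


def half_sd_threshold(group_scores):
    """Half of the group's sample standard deviation."""
    return 0.5 * stdev(group_scores)


def exceeds_threshold(before, after, threshold):
    """True if the change in score, in either direction, reaches the threshold."""
    return abs(after - before) >= threshold


# Hypothetical composite percentage scores for a small group.
group = [20.0, 30.0, 40.0, 50.0, 60.0]
threshold = half_sd_threshold(group)

improved = exceeds_threshold(before=50.0, after=40.0, threshold=threshold)
unchanged = exceeds_threshold(before=50.0, after=45.0, threshold=threshold)
print(round(threshold, 2), improved, unchanged)
```

Because the check uses the absolute change, the same threshold flags deterioration as well as improvement, which matches the monitoring use described above.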

Further work on identifying a threshold of impact for the questionnaires would also
be useful, perhaps to form a range for “normal” – feet with minimal biomechanical
dysfunction and inactive joint disease – through to more severe levels of impact.
This would be useful so that the feet considered most at risk from the inflammatory
process (with or without biomechanical complications) can be identified and followed
closely but also so that any medical professional within the multidisciplinary team
can apply the PROM and determine when treatment is indicated and referral to a foot
specialist is needed.

Disease duration did not correlate with either questionnaire score. It had been expected
that those participants with longer disease duration might show a greater impact of
the disease on their feet through their actual summed composite score. However, no
association was seen and this has also been found in another study [12]. Disease
duration is not an indication of disease activity since, despite a long time since
diagnosis, the disease may be mild or destructive, well controlled or poorly
controlled, and this seems to be more dependent on disease subgroup [13] and access
to care [14] than on duration. Disease duration was thus too crude a measure to
identify correlations between severity and the summed composite scores of impact on
well-being.

This study was subject to some methodological limitations which might be improved
in future studies. As JIA is a condition that can flare and remit and transient pain
might occur with increased physical activity, the two questionnaires needed to be
completed within a short time of each other. For pragmatic reasons, in this study the
questionnaires were completed together but because the process of answering 42 statements
may have been arduous, there was the potential that the final statements would not
be answered accurately. Using software such as SurveyMonkey does allow observation
of the length of time that participants take to complete the questionnaires. For this
study, the average time for completion of all statements was 6.7 min (SD = 1.4 min),
which was not as long as anticipated. For future studies it might be useful to format
the questionnaires in differing orders for each of the participants to reduce the
impact of less care being taken on the final statements. Neither of the questionnaires’
authors reports the need to answer the questions in a set order but it is noted that
the emotional statements are at the end of each questionnaire, after the subject has
focused on the limitations caused by the condition. This might be important to prepare
the subject for the emotional statements and thus rearranging the statements to equalise
the potential for rushing the final statements might be inappropriate. The impact
of both fatigue in answering questions and the order of the questions in questionnaires
is well recognised [15].
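
One way to vary the order without disturbing the sequence of statements within each questionnaire is to alternate which questionnaire is presented first. The sketch below assumes a simple alternation by participant index; the function `presentation_order` is hypothetical and was not used in this study.

```python
QUESTIONNAIRES = ("JAFI", "OxAFQ-C")


def presentation_order(participant_index):
    """Alternate which questionnaire comes first.

    Statement order within each questionnaire is left untouched, so
    the emotional statements stay at the end of each one.
    """
    if participant_index % 2 == 0:
        return QUESTIONNAIRES
    return QUESTIONNAIRES[::-1]


orders = [presentation_order(i) for i in range(4)]
jafi_first = sum(1 for order in orders if order[0] == "JAFI")
print(orders, jafi_first)
```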

Questionnaires such as the JAFI and OxAFQ-C, which require the participants to remember
events occurring only in the last week, are subject to limitations such as telescoping,
whereby the participant remembers more dramatic events as being more recent than they
actually were. They are also subject to selective recall (confirmation bias), when
subjects remember events which confirm the ideas they have already formed. Although
these are examples of limitations affecting this type of questionnaire, they are not
expected to alter the outcome of the study as they should apply to both questionnaires
equally.

There was some small chance that the inclusion criteria for the study were open to
abuse. Although only publicised on the CCAA website, this is an open site and it is
possible that people other than those with JIA may have chosen to enter the study,
or that those with JIA but outside of the inclusion criteria may have entered. Within
this study, no exclusion was made for other conditions that might affect the feet
such as talipes, tarsal coalition or neurological conditions. Having JIA does not
prevent other foot conditions coexisting and therefore, in order to have a sample
representative of the normal JIA population, it was decided not to exclude these conditions
despite the fact that these conditions are likely to affect the score for each questionnaire.

Both questionnaires are designed for children from the age of 5 years to answer
on their own. This study did allow the parents to assist the child and therefore the
parents may have influenced the outcome of the questionnaires; however, it has been
recognised that parents are able to rate the consequences of the disease on their
child [16]. The decision to allow parents to help was taken as it was felt that the
questionnaires
were quite complex for the younger children and it also reflected the situation in
clinic where even older children would ask parents to help determine the most appropriate
answer. The parental influence on both questionnaires would be expected to be equal
and therefore not to impact on the results of this study.