Classification of fibroglandular tissue distribution in the breast based on radiotherapy planning CT


Patient dataset

The study datasets comprised planning CT scans of 23 patients. Datasets were originally
collected for a comparison of prone and supine positioning for breast radiotherapy
[7], [8], which was approved by the Royal Marsden Committee for Clinical Research and the National
Health Service Regional Ethics Committee (London-Surrey Borders REC). Written informed
consent was obtained from all the patients for participation in the study. Patients
had undergone breast conservation surgery, during which time up to six pairs of surgical
clips were placed to define the excision cavity boundaries. The patients received
CT imaging for radiotherapy planning (from cervical vertebra 6 to below the diaphragm).
The fibroglandular tissue (FT) distribution of each patient's breast was visually assessed by an observer (EH)
and assigned to either the non-sparse FT group (Group 1) or the sparse FT group (Group 2). The
grouping was reviewed and agreed by a consultant radiation oncologist specializing
in breast radiotherapy [3].

Radial glandular fraction (RGF)

The radial glandular fraction (RGF), presented by Huang et al. [5], was developed using coronal images acquired with breast CT with patients positioned
prone. The patient datasets used in this study were axial CT images acquired with
patients positioned supine, which is standard practice for patients undergoing
breast radiotherapy treatment. These data were processed to produce a breast orientation
equivalent to that used by Huang et al. [5] using the following steps: (1) the whole breast was segmented from the axial CT
images using clinician outlining, and (2) the segmented breast was transformed, resampled and rotated,
using bilinear interpolation, to obtain the desired
image orientation. Prior to rotation, the whole-breast 3D data were resampled to
produce cubic voxels (1 × 1 × 1 mm³). The resampled breast was rotated about the superior-inferior axis by the acute
angle formed between the anterior-posterior axis and a line passing through the nipple
and perpendicular to the chest wall (see Fig. 2).

Fig. 2. Pre-processing of supine breast data: a Original scan with vector showing orientation of center of breast; b Rotated image with center of breast at 0° to the vertical direction; c Coronal slice through b
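
The geometry behind the resampling and rotation steps can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the axis convention (x left-right, y anterior-posterior), the landmark coordinates, the voxel spacing and all function names are assumptions introduced for the example, and the interpolation of the image itself is left to an image-processing library.

```python
import math

def zoom_factors(spacing_mm, target_mm=1.0):
    """Per-axis scale factors that turn the given voxel spacing into cubic voxels."""
    return tuple(s / target_mm for s in spacing_mm)

def nipple_line_angle_deg(chest_wall_xy, nipple_xy):
    """Signed angle (degrees) from the anterior-posterior (y) axis to the
    chest-wall-to-nipple line in an axial slice. Its magnitude is the acute
    angle when the nipple lies anterior to the chest-wall point (dy > 0)."""
    dx = nipple_xy[0] - chest_wall_xy[0]
    dy = nipple_xy[1] - chest_wall_xy[1]
    return math.degrees(math.atan2(dx, dy))

def rotate_xy(point_xy, center_xy, angle_deg):
    """Rotate a point counter-clockwise about center_xy by angle_deg."""
    t = math.radians(angle_deg)
    x = point_xy[0] - center_xy[0]
    y = point_xy[1] - center_xy[1]
    return (center_xy[0] + x * math.cos(t) - y * math.sin(t),
            center_xy[1] + x * math.sin(t) + y * math.cos(t))

# Hypothetical landmarks (mm) and voxel spacing (mm); not taken from the study.
chest_wall = (0.0, 0.0)
nipple = (30.0, 30.0)
angle = nipple_line_angle_deg(chest_wall, nipple)
aligned = rotate_xy(nipple, chest_wall, angle)   # nipple now lies on the y axis
factors = zoom_factors((0.5, 0.5, 2.0))
```

After the rotation the nipple line coincides with the anterior-posterior axis, which corresponds to the 0° orientation shown in Fig. 2b.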

For an image i, the breast radius, R, was calculated by equating the total area of
breast tissue to the area of a circle (Fig. 3a). RGF_i(r) of image i was the fraction of pixels marked as FT on a circle with relative radial
distance, r, and its center at the image center of mass (Fig. 3b). The relative radial distance, r, is the circle radius divided by the breast radius R. For each image, one hundred
values of r were considered. The whole breast was evenly divided into three regions (Fig. 3c): the posterior breast (region 1), the middle breast (region 2), and the anterior
breast (region 3). The RGF of each breast region was calculated by averaging
RGF_i(r) over five images centered on the middle slice of that region (s_1, s_2, or s_3). A fibroglandular tissue segmentation method recommended by a previous study [3] was used in the current study.

Fig. 3. Description of radial glandular fraction (RGF) measurement: a Coronal CT image (processed, see Fig. 2): white circle encompasses the whole breast, and blue dotted circle is an example
circle with relative radius (r) for which RGF(r) is calculated; b RGF of the slice in a; c Breast is evenly divided into three regions: posterior (region 1), middle (region
2), and anterior (region 3), with middle slices s_1, s_2, and s_3, respectively
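
The RGF measurement can be illustrated with a short sketch on binary masks. It is a simplified reading of the description above, not the original code: the masks are plain nested lists, the circle is sampled at a fixed number of angles with nearest-pixel rounding, the center is taken as the center of mass of the breast mask, and the denominator counts every sampled point that falls inside the image. All of these choices, and the function names, are assumptions.

```python
import math

def breast_radius(breast_mask):
    """Radius of the circle whose area equals the breast area (in pixels)."""
    area = sum(sum(row) for row in breast_mask)
    return math.sqrt(area / math.pi)

def center_of_mass(mask):
    """(x, y) centroid of the non-zero pixels of a binary mask."""
    total = sx = sy = 0
    for y, row in enumerate(mask):
        for x, v in enumerate(row):
            if v:
                total += 1
                sx += x
                sy += y
    return sx / total, sy / total

def rgf_profile(breast_mask, ft_mask, n_radii=100, n_angles=360):
    """Fraction of FT pixels on circles of relative radius r = 1/n, ..., 1."""
    R = breast_radius(breast_mask)
    cx, cy = center_of_mass(breast_mask)
    h, w = len(breast_mask), len(breast_mask[0])
    profile = []
    for k in range(1, n_radii + 1):
        r = k / n_radii
        hits = total = 0
        for a in range(n_angles):
            theta = 2 * math.pi * a / n_angles
            x = int(round(cx + r * R * math.cos(theta)))
            y = int(round(cy + r * R * math.sin(theta)))
            if 0 <= x < w and 0 <= y < h:
                total += 1
                hits += 1 if ft_mask[y][x] else 0
        profile.append(hits / total if total else 0.0)
    return profile

def region_rgf(profiles_by_slice, s, half_width=2):
    """Average the per-slice profiles over the five slices centered on slice s."""
    chosen = profiles_by_slice[s - half_width: s + half_width + 1]
    return [sum(vals) / len(vals) for vals in zip(*chosen)]

# Synthetic example: a disc-shaped breast (radius 20 px) with a central FT disc (radius 10 px).
size, c = 61, 30
breast = [[1 if (x - c) ** 2 + (y - c) ** 2 <= 400 else 0 for x in range(size)] for y in range(size)]
ft = [[1 if (x - c) ** 2 + (y - c) ** 2 <= 100 else 0 for x in range(size)] for y in range(size)]
profile = rgf_profile(breast, ft)
region = region_rgf([profile] * 5, s=2)
```

In this synthetic case the profile is 1 near the center and falls to 0 beyond roughly half the breast radius, which is the kind of centrally concentrated pattern the RGF is meant to expose.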

RGF features

The RGF gives the proportion of fibroglandular tissue in the breast as a function of relative
radius; it may be considered a graphical representation of the FT distribution. For
classification of the spatial distribution of fibroglandular tissue, metrics describing
this distribution are needed. The RGF was characterized using the features listed
in Table 1 and presented in Fig. 4. These features, explained below, were investigated in order to classify breast fibroglandular
tissue distribution of individual patients. These features were: mean value and standard
deviation of the 100 RGF values for the corresponding relative radial distances, slope
of the linear fit of RGF values versus relative radius (r), radial position (r) of
the maximum RGF value, the minimum value of RGF, the maximum value of RGF, the difference
in the maximum and minimum values of RGF, mean of RGF values for which relative radial
distance was less than or equal to 0.5 (mean of inner 50 %), mean of RGF values for
which relative radial distance was greater than 0.5 (mean of outer 50 %), the difference
in the mean of inner and outer 50 %, mean of the highest 10 % of RGF values (mean
of highest 10 %), mean of the lowest 10 % of RGF values (mean of lowest 10 %), and
the difference in the mean of highest and lowest 10 %.

Table 1. Radial glandular fraction (RGF) features evaluated for the classification of the distribution
of fibroglandular tissue

Fig. 4. Example of radial glandular fraction (RGF) features (as listed in Table 1) that were evaluated for the classification of the fibroglandular distribution: (1)
mean value and (2) standard deviation of the 100 RGF values, (3) slope of the linear
fit of RGF values versus relative radius, (4) radial position (r) of the maximum RGF
value, (5) the minimum value of RGF, (6) the maximum value of RGF, (7) the difference
in the maximum and minimum values of RGF, (8) mean of RGF values for which relative
radial distance was less than or equal to 0.5 (mean of inner 50 %), (9) mean of RGF
values for which relative radial distance was greater than 0.5 (mean of outer 50 %),
(10) the difference in the mean of inner and outer 50 %, (11) mean of the highest
10 % of RGF values (mean of highest 10 %), (12) mean of the lowest 10 % of RGF values
(mean of lowest 10 %), and (13) the difference in the mean of highest and lowest 10 %

SVM classifier

The support vector machine (SVM) [9], [10], a widely used classifier, was used to evaluate the classification performance of
the RGF features. The SVM constructs a maximum-margin hyperplane in the high-dimensional
input feature space, linearly separating the data points into two classes while ensuring
the maximum gap between the classes. Though linear, this decision boundary can be
rendered arbitrarily convoluted with respect to the input space via the kernel trick, in which inner-product relations within the SVM optimization function are replaced
by kernel functions, replicating the effect of a feature mapping. In this study, four kernels were chosen for evaluation within the SVM to cover a
representative range of behaviors. The polynomial, radial basis function and sigmoid kernels
map the features into Hilbert spaces with differing characteristics, while the linear
kernel equates to retaining the existing feature space. The radial basis function
kernel maps into an infinite-dimensional Hilbert space, thereby guaranteeing linear
class separability on the training data. The sigmoid kernel derives historically from
work on neural networks, and exhibits an inherent quasi-classification-like aspect
that differentiates it from the other kernels. It should be noted that it is not possible
to say a priori which kernel will perform better.
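
For concreteness, the four kernels can be written as plain functions of two feature vectors. The forms below are common textbook definitions, given as a sketch: the RBF parametrization exp(-||x - y||² / (2σ²)) and the default values of gamma and coef0 are assumptions, since the source specifies only the polynomial order and the sigma value.

```python
import math

def dot(x, y):
    """Inner product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(x, y))

def linear_kernel(x, y):
    # Equivalent to working directly in the original feature space.
    return dot(x, y)

def polynomial_kernel(x, y, degree=3, gamma=1.0, coef0=1.0):
    # gamma and coef0 are assumed values; only the order (3) comes from the text.
    return (gamma * dot(x, y) + coef0) ** degree

def rbf_kernel(x, y, sigma=1.0):
    # One common convention for the width parameter sigma (assumption).
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2 * sigma ** 2))

def sigmoid_kernel(x, y, gamma=1.0, coef0=0.0):
    # tanh form, reminiscent of a neural-network activation; parameters assumed.
    return math.tanh(gamma * dot(x, y) + coef0)

# Two made-up feature vectors for illustration.
u, v = [1.0, 2.0], [3.0, 4.0]
k_lin = linear_kernel(u, v)
k_poly = polynomial_kernel(u, v)
k_rbf_self = rbf_kernel(u, u)
k_sig = sigmoid_kernel(u, v)
```

Replacing every inner product in the SVM optimization with one of these functions is what the kernel trick amounts to in practice.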

Classification performance for a group of features was measured with leave-one-out
cross-validation for all four mapping kernels: linear, polynomial of order 3, radial
basis function (RBF) with a sigma value of 1, and sigmoid. The C parameter of the SVM was set to 1. Leave-one-out cross-validation used a single observation
from the original data as the validation set, and the remaining observations as the
training set. This was repeated such that each observation in the original data was
used once as the validation set.
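
The leave-one-out loop and the accuracy calculation can be sketched independently of the classifier. To keep the example self-contained, a nearest-centroid rule stands in for the SVM; this substitution, the toy data and the function names are assumptions for illustration only and do not reproduce the study's classifier or results.

```python
def leave_one_out_accuracy(X, y, fit_predict):
    """Percentage of observations classified correctly when each one is
    held out in turn and the classifier is trained on the rest."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if fit_predict(train_X, train_y, X[i]) == y[i]:
            correct += 1
    return 100.0 * correct / len(X)

def nearest_centroid(train_X, train_y, x):
    """Stand-in classifier (NOT an SVM): assign x to the closest class mean."""
    centroids = {}
    for label in set(train_y):
        rows = [row for row, lab in zip(train_X, train_y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))

    return min(centroids, key=lambda lab: sq_dist(centroids[lab]))

# Toy one-feature data: label 1 = non-sparse group, label 2 = sparse group (made up).
X = [[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]]
y = [1, 1, 1, 2, 2, 2]
accuracy = leave_one_out_accuracy(X, y, nearest_centroid)
```

With 23 patients, each fold would train on 22 feature vectors and test on the remaining one; any classifier exposing the same train-then-predict interface can be passed in place of the stand-in.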

Analysis

For each patient dataset, the RGF features listed in Table 1 were calculated and evaluated for their ability to classify the FT spatial distribution.
RGF features of the three breast regions were evaluated for the two patient groups.
The differences in the fibroglandular composition (FC) (percentage of FT) of the breast
in the two groups were compared using the Wilcoxon rank sum test. For each feature,
the two group means were calculated and compared using the Wilcoxon rank sum test
to investigate their discriminative power. Four SVM classifiers with different mapping
kernels were used to evaluate the classification performance of the 13 RGF features
together. Classification performance was evaluated for the features from each of the
three individual breast regions and all the regions combined together. Performance
accuracy was calculated as a percentage of true identifications (both sparse and non-sparse)
out of total identifications.
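
The group comparison can be illustrated with a small implementation of the Wilcoxon rank sum test. This sketch uses the normal approximation without tie or continuity corrections, which is an assumption made for brevity; statistical packages typically offer exact or corrected versions that are preferable for small groups. The function names and the example values are illustrative.

```python
import math

def average_ranks(values):
    """1-based ranks of the values, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def rank_sum_test(group1, group2):
    """Return (W, z, two-sided p) using the normal approximation."""
    n1, n2 = len(group1), len(group2)
    ranks = average_ranks(list(group1) + list(group2))
    w = sum(ranks[:n1])                        # rank sum of group 1
    mean_w = n1 * (n1 + n2 + 1) / 2
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2))       # two-sided tail probability
    return w, z, p

# Made-up feature values for two small groups (illustration only).
w, z, p = rank_sum_test([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```

A feature whose values separate the two groups completely, as in this toy case, gives the most extreme rank sum possible for the group sizes, while overlapping distributions move z toward 0 and p toward 1.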