Next generation sequencing of triple negative breast cancer to find predictors for chemotherapy response


Patients and tumor biopsies

Pre-treatment biopsies and peripheral blood samples were obtained from 56 patients
with untreated, primary triple negative breast cancer. All patients had received neoadjuvant
treatment at the Netherlands Cancer Institute between 2004 and 2013 as part of ongoing
clinical trials, or were treated off protocol according to the standard arms of one
of these studies [6] (NCT00448266, NCT01057069). The ethics committee of the Netherlands Cancer Institute
approved the studies and all patients gave informed consent (reference numbers of
ethical approval: PTC06.1725/N06IAA and PTC09.2716/M09TNM). All tumors were either
at least 3 cm in size, or the presence of axillary lymph node metastases had been
established by fine needle aspiration or pre-treatment sentinel node biopsy. Biopsies
were taken using a 14-G core needle under ultrasound guidance. After collection, specimens
were snap-frozen in liquid nitrogen and stored at −80 °C. Each patient had two or
three biopsies taken to ensure that enough tumor material was available for both diagnostic
and research purposes.

All patients started neoadjuvant chemotherapy (NAC) with three courses of 2-weekly administrations of cyclophosphamide
and doxorubicin (ddAC). The selection of the chemotherapy regimen for the fourth to
the sixth courses depended on the specific clinical trial and could consist of continuation
of ddAC in the case of a favorable magnetic resonance imaging (MRI) response (for
criteria see previous publication [7]), randomization to intensive carboplatin-based chemotherapy or a switch to a combination
of capecitabine and docetaxel in the case of an unfavorable MRI response. All patients
subsequently underwent surgery, either breast conserving or mastectomy. The clinical
results of these strategies have not been published because the studies are ongoing
(NCT00448266, NCT01057069).

Pathology

Triple negative status was defined by the absence of estrogen receptor (ER) and progesterone receptor (PR) expression and no amplification
of HER2. ER (Roche Diagnostics, Basel, Switzerland; cat. no. 5278406001) and
PR (Roche Diagnostics; cat. no. 5278392001) negativity was defined as immunohistochemical
(IHC) staining of fewer than 10 % of tumor cell nuclei. None of the samples in this
cohort had between 1 % and 10 % positive tumor cell nuclei, so all samples fulfilled
the ASCO/CAP criteria for triple negativity. HER2 (Roche Diagnostics
cat. no. 5278368001) was considered negative if IHC staining was graded as 0 or
1+. In the case of 2+ or 3+ staining, chromogenic in situ hybridization was performed
to determine HER2 amplification (gene copy number ≥6 per tumor cell). Chemotherapy
response was assessed by microscopic examination of the surgery resection specimen.
The complete absence of any invasive tumor cells in both the breast and the lymph
nodes was considered a pathological complete response (pCR). All other responses were assigned to the no-pCR group.
An experienced breast cancer pathologist (JW) reviewed all pathology slides.

BRCA analysis

The series was well-characterized for BRCA function. Briefly, germ line DNA was isolated
from peripheral blood lymphocytes of affected patients. We used mutation-scanning
methods. The protein truncation test was used for exon 11 of BRCA1 and exons 10 and 11 of BRCA2. The remaining exons were tested using denaturing gradient gel electrophoresis. Aberrant
samples were confirmed by Sanger sequencing [8]. In addition, multiplex ligation-dependent probe amplification (MLPA) was performed
using MLPA kit P087 (BRCA1) to detect large genomic deletions or duplications in the genes.

Hypermethylation of the BRCA1 promoter was determined using methylation-specific MLPA analysis, according to the
manufacturer’s protocol (MLPA assay ME005-custom, MRC-Holland, Amsterdam, The Netherlands).
For normalization and analysis the Coffalyser program was used (MRC-Holland). According
to the manufacturer’s protocol, we used a cutoff of 20 % to call a sample methylated
[9]. Employing this cutoff, methylated samples have very low BRCA1 gene expression [10, 11], which was also true for this series (Additional file 1: Figure S1).

Library preparation

DNA was isolated from frozen biopsies with a minimal tumor percentage of 50 %. Tumor
cell percentage was assessed by an experienced breast pathologist (JW). Isolation
was performed with the DNA mini kit (Qiagen, Venlo, the Netherlands). Matched normal DNA
was obtained from peripheral blood, extracted with DNAzol, and purified with the Qiagen
DNeasy kit. Samples were interrogated with a custom-designed “Cancer mini-genome” consisting
of 1,977 cancer genes [12]. This set comprises genes involved in DNA repair, cell cycle, apoptosis, epigenetic
modification (methylation, acetylation), angiogenesis (vascular endothelial growth
factor (VEGF) pathway), and genes from the epidermal growth factor receptor (EGFR),
mammalian target of rapamycin (mTOR), insulin, TP53, transforming growth factor (TGF)-beta,
Notch, Wnt and hedgehog pathways, known tumor suppressor and proto-oncogenes, and
all additional kinases not present in any of the above mentioned processes. Genes
are listed in Additional file 2: Table S1. Barcoded fragment libraries were generated from 300–600 ng of isolated
DNA from tumor and reference samples as previously described [13]. Pools of libraries were enriched for the 1,977 cancer-related genes using SureSelect
technology (Agilent, Santa Clara, California, US). Enriched libraries were sequenced
to an average coverage of at least 150× on a SOLiD 5500xl instrument (Applied Biosystems,
Waltham, Massachusetts, US) according to the manufacturer’s protocol. Sequencing
statistics are available in Additional file 3: Table S2.

Sequence data analysis

Sequence reads were mapped to the human reference genome hg19 (GRCh37) using
BWA (-c -l 25 -k 2 -n 10) [14], and variant calling was done using a custom pipeline identifying variants with at
least 10× coverage, a 20 % allele frequency, multiple (≥2) occurrences in the
seed (the first 25 bp, the most accurately mapped part of the read), and support from independent
reads (≥3). Validation of our custom variant-calling pipeline is described in Nijman
et al. [15]. Putative somatic variants were identified by subsequently genotyping all variant
positions in the raw datasets of both the tumor and reference sample using samtools
mpileup, to ensure the absence of variant alleles in the reference sample; only positions
showing less than 5 % variant alleles in the reference sample were considered somatic.
Next, somatic variants with a minor allele frequency >0.01 in non-Catalogue of Somatic Mutations in Cancer (COSMIC) populations in dbSNP were filtered out, as were variants seen ≥1× homozygously
or ≥3× heterozygously in reference samples in our in-house database and variants
with an allele frequency <25 % if the coverage was <50×. Finally, manual curation
was performed to remove obvious noise in poorly mapped regions. For the analysis of
association with clinical data and the pathway analysis, we included only putative
driver mutations defined as either 1) mutations with a major effect on protein expression,
defined as a) truncating mutations, indels, splice acceptors/donors, initiator codon
variants or b) non-synonymous mutations predicted to be damaging according to PolyPhen
[16] and/or SIFT [17], or 2) known COSMIC mutations.
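
The filtering cascade described above can be illustrated with a minimal Python sketch. It is not the authors' custom pipeline: the `Variant` record, its field names, and the consequence labels are hypothetical and exist only for illustration. The directions of the comparisons (for example, filtering dbSNP minor allele frequencies above 0.01, and allele frequencies below 25 % at coverage below 50×) are assumed from context, as is the choice to reject positions without any reference coverage.

```python
from dataclasses import dataclass, replace
from typing import Optional

# Hypothetical consequence labels for variants with a major effect on the protein.
MAJOR_EFFECT = {"truncating", "indel", "splice_acceptor", "splice_donor", "initiator_codon"}


@dataclass(frozen=True)
class Variant:
    tumor_depth: int            # reads covering the position in the tumor
    tumor_alt_reads: int        # independent tumor reads supporting the variant
    seed_alt_reads: int         # supporting reads with the variant inside the 25-bp seed
    ref_depth: int              # reads covering the position in the matched normal
    ref_alt_reads: int          # matched-normal reads supporting the variant
    dbsnp_maf: Optional[float]  # minor allele frequency in dbSNP, None if absent
    inhouse_hom: int            # homozygous observations in the in-house database
    inhouse_het: int            # heterozygous observations in the in-house database
    consequence: str            # hypothetical annotation label
    predicted_damaging: bool    # PolyPhen and/or SIFT call the change damaging
    in_cosmic: bool             # variant is a known COSMIC mutation


def allele_fraction(alt_reads: int, depth: int) -> float:
    return alt_reads / depth if depth > 0 else 0.0


def passes_calling(v: Variant) -> bool:
    """Calling thresholds: >=10x, >=20 % allele frequency, >=2 seed hits, >=3 reads."""
    return (v.tumor_depth >= 10
            and allele_fraction(v.tumor_alt_reads, v.tumor_depth) >= 0.20
            and v.seed_alt_reads >= 2
            and v.tumor_alt_reads >= 3)


def is_putative_somatic(v: Variant) -> bool:
    if not passes_calling(v):
        return False
    if v.ref_depth == 0:  # assumption: absence in the reference cannot be confirmed
        return False
    if allele_fraction(v.ref_alt_reads, v.ref_depth) >= 0.05:
        return False      # variant allele present in the matched normal
    if v.dbsnp_maf is not None and v.dbsnp_maf > 0.01:
        return False      # common polymorphism (direction of threshold assumed)
    if v.inhouse_hom >= 1 or v.inhouse_het >= 3:
        return False      # recurrent in in-house reference samples
    if v.tumor_depth < 50 and allele_fraction(v.tumor_alt_reads, v.tumor_depth) < 0.25:
        return False      # low-frequency call at low coverage (directions assumed)
    return True


def is_putative_driver(v: Variant) -> bool:
    damaging = v.consequence == "non_synonymous" and v.predicted_damaging
    return v.consequence in MAJOR_EFFECT or damaging or v.in_cosmic


example = Variant(100, 30, 5, 80, 1, None, 0, 0, "non_synonymous", True, False)
print(is_putative_somatic(example), is_putative_driver(example))
```

Manual curation of poorly mapped regions, which the text describes as the final step, is not modeled here.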

Copy number analysis was done by comparing depth of coverage in reference and tumor
samples using the robust z score according to the method of Iglewicz and Hoaglin [18]. Briefly, the median and median absolute deviation of coverage per exon of all reference
samples, normalized to the total number of reads per sample, were calculated to determine
the coverage distribution. The z scores (i.e., the difference between the coverage of the tumor sample and the median
coverage of the reference pool of normal samples, multiplied by a correction factor
of 0.6745, and divided by the median absolute deviation of the reference pool) were
calculated for all exons of each tumor sample. The z score of each gene was calculated as the average of its per-exon z scores. As we noticed an effect of GC content on the exon-coverage profiles
in some, but not all samples, we used unsupervised hierarchical clustering of the
normalized coverage per exon of all samples. Two clear clusters were formed, but 12
samples (including both tumor and reference samples) could not be placed in either
cluster. Exon coverage of the tumor samples was then compared to the reference samples
within the same cluster. Genes with z scores ≥10 were considered high-level amplifications; a z score ≥3 indicates copy number gain and a z score ≤−3 copy number loss.
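
As an illustration of the robust z score described above, the following Python sketch computes per-exon scores against a reference pool, averages them per gene, and applies the copy number cutoffs. Function names are hypothetical, coverage values are assumed to be already normalized to the total number of reads per sample, and the comparison directions of the cutoffs (at least 10, at least 3, at most −3) are assumed from context. GC-content clustering is not modeled; in practice the reference pool would be restricted to the samples in the tumor sample's cluster.

```python
from statistics import mean, median


def robust_z(tumor_cov: float, reference_covs: list) -> float:
    """Robust z score of one exon: 0.6745 * (x - median) / MAD of the reference pool."""
    med = median(reference_covs)
    mad = median([abs(c - med) for c in reference_covs])
    if mad == 0:
        raise ValueError("median absolute deviation is zero; z score undefined")
    return 0.6745 * (tumor_cov - med) / mad


def gene_z(exon_scores: list) -> float:
    """Gene-level score: the average of the per-exon z scores."""
    return mean(exon_scores)


def classify(z: float) -> str:
    """Copy number call from a gene-level z score (cutoff directions assumed)."""
    if z >= 10:
        return "high-level amplification"
    if z >= 3:
        return "gain"
    if z <= -3:
        return "loss"
    return "neutral"


# Toy example: one gene with two exons, five reference samples per exon.
reference = {"exon1": [100, 102, 98, 101, 99], "exon2": [50, 52, 48, 51, 49]}
tumor = {"exon1": 110, "exon2": 56}
scores = [robust_z(tumor[e], reference[e]) for e in reference]
print(classify(gene_z(scores)))
```

The factor 0.6745 rescales the median absolute deviation so that, for normally distributed coverage, the robust z score is comparable to a conventional z score.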

Sanger sequencing for validation of PIK3CA mutations

Samples were analyzed by Sanger sequencing using a BigDye Terminator Cycle Sequencing
Kit and ABI 3730 capillary sequencer (Life Technologies/Applied Biosystems, Waltham,
Massachusetts, US).

Statistics: association of sequencing data with clinical information

Statistical analyses were performed with R/Bioconductor [19]. We focused on three clinical variables: 1) pCR, 2) relapse and 3) BRCA proficiency.
We encoded each of the clinical variables into a binary variable and then studied
associations at the gene and at the pathway levels: association between mutations
in an individual gene and the given clinical variable was tested using Fisher’s exact
test of the presence/absence of a mutation in a sample and the presence/absence of
the clinical variable. The pathway analysis was based on a list of 29 pathways from
the Kyoto Encyclopedia of Genes and Genomes (KEGG), including at least three genes
targeted in the sequencing experiments. We considered a pathway to be mutated if at
least one of its genes was found mutated. We then performed the Fisher’s exact test
of the presence/absence of a mutation in a given pathway and the presence/absence
of the clinical variable in the various samples. All tests were corrected for multiple
testing using the Benjamini-Hochberg method. In all analyses, associations were considered
significant when the adjusted p value was <0.05.
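
The testing procedure can be sketched as follows. The analyses in this study were run in R/Bioconductor; the Python below is only a self-contained illustration with hypothetical function names, assuming a two-sided Fisher's exact test (the sidedness is not stated in the text) and the standard Benjamini-Hochberg step-up adjustment.

```python
from math import comb


def pathway_mutated(sample_genes: set, pathway_genes: set) -> bool:
    """A pathway counts as mutated if at least one of its genes is mutated."""
    return bool(sample_genes & pathway_genes)


def contingency(mutated: list, clinical: list) -> tuple:
    """2x2 counts (a, b, c, d): rows are mutated yes/no, columns are clinical yes/no."""
    a = sum(m and c for m, c in zip(mutated, clinical))
    b = sum(m and not c for m, c in zip(mutated, clinical))
    c_ = sum((not m) and c for m, c in zip(mutated, clinical))
    d = sum((not m) and (not c) for m, c in zip(mutated, clinical))
    return a, b, c_, d


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Sum the hypergeometric probabilities of all tables no more likely than the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, p)


def benjamini_hochberg(pvalues: list) -> list:
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy data: mutation status of one gene and pCR status in six samples.
mutated = [True, True, True, False, False, False]
pcr = [True, True, True, False, False, False]
print(fisher_exact_two_sided(*contingency(mutated, pcr)))
```

In a full analysis, one p value per gene (or per pathway) would be collected and passed together to `benjamini_hochberg`, and associations with an adjusted p value below 0.05 would be reported.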