Receptor Tyrosine Kinase Gene Expression Data (using RT-PCR and Nanostring) from RNA samples of TCGA GBM Patients

Details of data taken from paper:

"Quantitative assessment and genomic context of intragenic receptor tyrosine kinase deletions in glioblastoma"

Submitted to: Acta Neuropathologica, March 2013

Corresponding Author:
Cameron W. Brennan
MSKCC
New York, UNITED STATES

Authors:
Edward R Kastenhuber, B.S.*
Jason T Huse, M.D., Ph.D.*
Samuel H. Berman, B.S.
Alicia Pedraza
Jianan Zhang
Yoshiyuki Suehara, M.D., Ph.D.
Agnes Viale, Ph.D.
Magali Cavatore
Adriana Heguy, PhD
Nicholas Szerlip, M.D.
Marc Ladanyi, M.D.
Cameron W. Brennan

* These authors contributed equally

Abstract of submitted paper:

Intragenic deletion is the most common form of activating mutation among receptor
tyrosine kinases (RTK) in glioblastoma. However, these events are not detected by conventional
DNA sequencing methods commonly utilized for tumor genotyping. We analyzed RNA from a
set of 192 glioblastoma samples from The Cancer Genome Atlas for the expression of
glioblastoma-associated intragenic RTK deletions, including EGFRvIII, EGFRvII, EGFRvV
(Carboxyl-terminal deletion), and PDGFRA delta 8,9. These mutations were detected in 24, 1.6, 4.7,
and 1.6% of cases, respectively. Overall, 29% (55/189) of glioblastomas expressed at least one
RTK intragenic deletion transcript in this panel. For EGFRvIII, samples were analyzed by both
quantitative real time PCR (QRT-PCR) and single mRNA molecule counting on the Nanostring
nCounter platform. Nanostring proved to be highly sensitive, specific, and linear, with sensitivity
comparable to or exceeding that of RNA seq. We evaluated the prognostic significance and
molecular correlates of RTK rearrangements. EGFRvIII was only detectable in tumors with
focal amplification of the gene. Contrary to prior reports, we found that EGFRvIII expression was
not prognostic of poor outcome and that neither recurrent copy number alterations nor changes
in gene expression differentiate EGFRvIII-positive tumors from tumors with amplification of
wildtype EGFR. The wide range of expression of mutant alleles and co-expression of multiple
EGFR variants suggests that quantitative RNA-based clinical assays will be important for testing
intragenic deletions as therapeutic targets and/or candidate biomarkers. To this end, we
demonstrate the performance of the Nanostring assay in RNA derived from routinely collected
formalin-fixed paraffin embedded (FFPE) tissue.



Accompanying data files:


"RT-PCR" directory:

(Quantitative Reverse Transcriptase PCR)

Level 1 data:

File name:  "TCGA_GBM_EGFRvIII_RTPCR_dataTable_Level1.xlsx" 

(following from submitted paper's materials and methods section)

"From the TCGA sample set, 275 cases with available RNA were interrogated for relative
expression of wild type EGFR and EGFRvIII by RT-PCR. 400 ng of total RNA was reverse-transcribed
using the Thermoscript RT-PCR system (Invitrogen) at 52 °C for 1 hour. 20 ng of
resultant cDNA was used in a Q-PCR reaction using a 7500 Real-Time PCR System (Applied
Biosystems) and custom designed TaqMan gene expression Assays (EGFRvIII forward primer:
5'-CGGGCTCTGGAGGAAAAG-3'; EGFRvIII reverse primer: 5'-AGGCCCTTCGCACTTCTTAC-3';
EGFRvIII internal primer: 5'-GTGACAGATCACGGCTCGTG-3'; total EGFR: pre-designed
TaqMan ABI Gene expression Assays Hs01076076_m1). Primers were chosen based on their
ability to span the most 3' exon-exon junction. Amplification was carried out for 40 cycles (95 °C for
15 sec, 60 °C for 1 min). To calculate the efficiency of the PCR reaction, and to assess the
sensitivity of each assay, we also performed a 6-point standard curve (5, 1.7, 0.56, 0.19, 0.062,
and 0.021 ng)."

Level 2 data:

File name: "TCGA_GBM_EGFRvIII_RTPCR_dataTable_Level2.txt"

(following from submitted paper's materials and methods section)

"Triplicates CT values were averaged, amounts of target were interpolated from
the standard curves and normalized to TBP (TATA box binding protein pre-designed TaqMan
ABI Gene expression Assays Hs00427620_m1). Efficiency of each reaction was determined
from the standard curve of a serially diluted sample using the equation: Efficiency = 10^(-1/slope) - 1,
where slope is fitted to CT vs. log10(concentration). Relative quantities of TBP, EGFR and
EGFRvIII were calculated from each CT[i] based on the reaction efficiencies and minimum CTs
from the standard dilution curves (CTmax) according to the formula:

Quantity = (1 + Efficiency)^(CTmax - CT).

All reactions were performed in triplicate. Samples were
rejected if multiple TBP replicates failed to cross threshold in less than 36 cycles or if the
median absolute deviation of quantified TBP across replicates was greater than 25% (5 of 275
samples). The relative quantities of EGFR and EGFRvIII were normalized with respect to TBP."
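
The efficiency and relative-quantity formulas quoted above can be sketched in Python as follows. This is a minimal illustration and not the authors' analysis code: the function names are hypothetical, the CT values are synthetic (an idealized curve with perfect doubling per cycle), a single efficiency is reused for both assays for brevity, and CTmax is simply passed in as the reference CT taken from the standard dilution curve.

```python
import math


def fit_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def pcr_efficiency(amounts_ng, ct_values):
    """Efficiency = 10^(-1/slope) - 1, with slope fitted to CT vs. log10(amount)."""
    slope = fit_slope([math.log10(a) for a in amounts_ng], ct_values)
    return 10 ** (-1.0 / slope) - 1.0


def relative_quantity(ct, ct_max, efficiency):
    """Quantity = (1 + Efficiency)^(CTmax - CT)."""
    return (1.0 + efficiency) ** (ct_max - ct)


# Standard-curve input amounts (ng) as listed in the text.
amounts = [5, 1.7, 0.56, 0.19, 0.062, 0.021]
# Synthetic CTs (assumption): one extra cycle per two-fold dilution.
cts = [25.0 - math.log2(a / 5) for a in amounts]
eff = pcr_efficiency(amounts, cts)

# Hypothetical averaged CTs for one sample; ct_max=30.0 is also made up.
tbp_q = relative_quantity(24.0, 30.0, eff)
egfr_q = relative_quantity(22.0, 30.0, eff)
egfr_rel = egfr_q / tbp_q  # EGFR quantity normalized to TBP
```

In practice each assay (TBP, EGFR, EGFRvIII) would have its own efficiency and reference CT derived from its own standard curve.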



"Nanostring" directory:

Level 1 data:


Directory "TCGA_GBM_Nanostring_rawFiles_Level1" and its contents

Raw output files from Nanostring instrument


File name: "TCGA_GBM_Nanostring_annotationTable.txt"

Associated metadata used to perform the Nanostring experiments


File name: "TCGA_GBM_Nanostring_dataTable_Level1.txt"

(following from submitted paper's materials and methods section)

"The nCounter Analysis System (Nanostring Technologies, Seattle, WA) allows for
multiplexed digital mRNA profiling without amplification or generation of cDNA [13]. Briefly, a
pair of approximately 50 bp probes complementary to each target mRNA are hybridized. The
reporter probe is tagged by a target-specific code of four fluorescent reporters at seven
positions along a phage DNA backbone. The capture probe is used for immobilization on a
slide and once oriented in an electric field, bound reporters are counted and annotated. A
custom Nanostring probe set was designed as detailed in Table S1. Total RNA (150-300ng)
was hybridized with the codeset probes and loaded into the nCounter prep station. The
samples were quantified using the nCounter Digital Analyzer.
The Nanostring platform includes negative control probes (not complementary to any
endogenous mRNA) to assess background noise associated with the fluorescent barcode
optical recognition system. To ensure that all samples were within the optimal range of probe
density for image analysis, we confirmed that there was no systemic increase in negative
control counts as a function of total number of counts recorded per sample."

[13] Geiss GK, Bumgarner RE, Birditt B et al. (2008) Direct multiplexed measurement of
gene expression with color-coded probe pairs. Nat Biotechnol 26: 317-325.
doi: 10.1038/nbt1385


File name: "Supplementary_Table_S1.pdf"

Details of the custom Nanostring probe set described above.



Level 2 data:


File name: "TCGA_GBM_Nanostring_dataTable_Level2.txt"

(following from submitted paper's materials and methods section)

"Raw probe counts were normalized to a panel of 8 control genes (B2M, B4GALT1, CLTC, E2F4, GAPDH,
POLR2A, SDHA, and TBP) by taking the ratios of each gene's counts per sample to the
average across all samples and scaling by the median of these ratios in each sample. This
normalization factor was also applied to the negative control probes counts. A detection
threshold was defined for each sample as 5 times the mean of the negative control probe
normalized counts. Of 192 samples run, three cases (TCGA.1386, TCGA.0827 and
TCGA.0021) were excluded from analysis as outliers with low expression of the 8 control genes
(possibly representing underloading or poor hybridization).

C-terminal deletion mutation was inferred by the occurrence of relative underexpression
(undercounting) of the exon 28 probe vs. the exon 19 probe. The normal (wild-type) linear
relationship of counts between these two probes was determined by a linear model fit to the
central 90% of the data. Then this model was applied to the entire dataset to identify cases with
outlier C-terminal underexpression. These cases fell into two groups: intermediate expression
of the truncation mutant (less than 60% of expected c-terminal counts), or high expression (less
than 10%)."
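
The normalization, detection threshold, and C-terminal deletion call described above can be sketched roughly as follows. This is an illustrative reading of the quoted methods, not the authors' code. Several points are assumptions: all function names are hypothetical, raw counts are divided by the per-sample factor, the "central 90%" is taken to mean the samples whose exon 28 / exon 19 ratio lies between the 5th and 95th percentiles, and all counts are made-up toy values.

```python
from statistics import mean, median


def normalization_factors(control_counts):
    """control_counts maps control gene -> list of raw counts, one per sample."""
    # Ratio of each control gene's count to that gene's average across samples.
    ratios = [[c / mean(counts) for c in counts] for counts in control_counts.values()]
    # Per-sample factor: median of the control-gene ratios in that sample.
    return [median(per_sample) for per_sample in zip(*ratios)]


def normalize(counts, factors):
    """Divide raw per-sample counts by the per-sample factor (assumed direction)."""
    return [c / f for c, f in zip(counts, factors)]


def detection_thresholds(negative_counts, factors, multiplier=5.0):
    """Per-sample threshold: multiplier x mean of normalized negative-control counts."""
    normed = [normalize(counts, factors) for counts in negative_counts.values()]
    return [multiplier * mean(per_sample) for per_sample in zip(*normed)]


def central_fit(x, y, keep=0.90):
    """Least-squares line through samples whose y/x ratio is in the central `keep` fraction."""
    order = sorted(range(len(x)), key=lambda i: y[i] / x[i])
    drop = round(len(x) * (1.0 - keep) / 2.0)
    kept = order[drop:len(x) - drop]
    xs, ys = [x[i] for i in kept], [y[i] for i in kept]
    mx, my = mean(xs), mean(ys)
    slope = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / sum((a - mx) ** 2 for a in xs)
    return slope, my - slope * mx


def classify_cterm(exon19, exon28, slope, intercept):
    """Compare observed exon 28 counts with the counts expected from exon 19."""
    fraction = exon28 / (slope * exon19 + intercept)
    if fraction < 0.10:
        return "high"          # less than 10% of expected C-terminal counts
    if fraction < 0.60:
        return "intermediate"  # less than 60% of expected C-terminal counts
    return "none"


# Toy data (made up): two control genes, two negative probes, three samples.
controls = {"GAPDH": [100, 200, 300], "TBP": [50, 100, 150]}
negatives = {"NEG_A": [2, 4, 6], "NEG_B": [4, 8, 12]}
factors = normalization_factors(controls)
thresholds = detection_thresholds(negatives, factors)

# Toy exon 19 / exon 28 counts: 20 samples on one line, one strong outlier.
exon19 = [100.0 * k for k in range(1, 21)]
exon28 = [0.8 * v for v in exon19]
exon28[9] = 0.05 * exon19[9]
slope, intercept = central_fit(exon19, exon28)
calls = [classify_cterm(a, b, slope, intercept) for a, b in zip(exon19, exon28)]
```

Fitting on the trimmed set keeps strongly truncated cases from pulling the wild-type line toward themselves before they are scored against it.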



"Kastenhuber2013" directory:


Level 3 data:

File name: "Kastenhuber2013_dataTable_Summary.txt"

Summarization of results based on specific methods for determining cutoffs.


File name: "Kastenhuber2013_Exome.Tumor_EGFRvIII_coverage.txt"

(following from submitted paper's materials and methods section)

"RNA and DNA Sequence Analysis
RNA and DNA sequencing data (BAM files mapped to hg19) were obtained from TCGA
through CGHub. RNA sequencing was analyzed to tabulate EGFR and PDGFRA exon
junctions as described [4]. Briefly, counts were made of all EGFR and PDGFRA reads spanning
exon-exon junctions and all paired exonic reads with gaps spanning one or more introns. Only
reads with perfect alignment scores (CIGAR score) were considered. To account for 3' bias in
RNA sequence representation, mutant junction counts were compared with counts of normal
junctions at the 3' exon. For example, EGFRvIII expression was defined by counting reads with
E1-E8 junctions and comparing to the count of reads with wild type E7-E8 junctions. EGFRvII
was defined by E13-E16 vs. wild type E15-E16. PDGFRA delta 8,9 was defined by E7-E10 vs.
wildtype E9-E10. A junction was counted only if seen in more than one read. Exome DNA
sequence data for 291 tumors was analyzed to determine read coverage within the EGFR gene
in two regions: exons 2-7 (the EGFRvIII deleted region) and exons 8-22 (spanning the
transmembrane and kinase domain regions). The normal ratio of counts between regions was
determined by linear regression fit of the middle 90% of ratios. This model was applied to
normalize the ratios and allow accurate estimation of relative copy number of exons 2-7 vs.
exons 8-22."
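
As a rough illustration of the junction comparison described above, the sketch below pairs each mutant junction with the wild-type junction ending at the same 3' exon and applies the more-than-one-read rule. The junction pairs come from the text. Everything else is an assumption made for illustration: the key format, the function name, the toy read counts, and in particular the choice to express the comparison as mutant / (mutant + wild type), since the text does not state the exact formula.

```python
# Mutant junction and the wild-type junction sharing the same 3' exon (from the text).
VARIANTS = {
    "EGFRvIII": ("EGFR:E1-E8", "EGFR:E7-E8"),
    "EGFRvII": ("EGFR:E13-E16", "EGFR:E15-E16"),
    "PDGFRA delta 8,9": ("PDGFRA:E7-E10", "PDGFRA:E9-E10"),
}


def mutant_fraction(junction_reads, variant, min_reads=2):
    """Fraction of junction reads at the shared 3' exon that carry the mutant junction.

    junction_reads maps a junction key to its read count (hypothetical format).
    Returns None when neither junction has enough supporting reads.
    """
    mutant_key, wild_type_key = VARIANTS[variant]
    mutant = junction_reads.get(mutant_key, 0)
    wild_type = junction_reads.get(wild_type_key, 0)
    # A junction is counted only if it is seen in more than one read.
    if mutant < min_reads:
        mutant = 0
    if wild_type < min_reads:
        wild_type = 0
    total = mutant + wild_type
    return mutant / total if total else None


# Toy read counts for one sample (made up).
reads = {
    "EGFR:E1-E8": 30,
    "EGFR:E7-E8": 90,
    "EGFR:E13-E16": 1,   # single read: not counted
    "EGFR:E15-E16": 50,
}
results = {name: mutant_fraction(reads, name) for name in VARIANTS}
```

Comparing against the wild-type junction at the same 3' exon, rather than against total gene counts, is what controls for the 3' bias mentioned in the text.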

[4] Brennan CW, Verhaak R, McKenna A et al. (2013) The Somatic Genomic Landscape of
GBM. Cell (submitted)

