MAGE-TAB Version	1.1									
Investigation Title	Characterization of the Transcriptome in Wilms Tumor TARGET Cohort									
Experimental Design	disease state design	transcript identification design	is expressed design							
Experimental Design Term Source REF	EFO	EFO	EFO							
Experimental Factor Name	MLLT1 variant									
Experimental Factor Type	genetic variation									
Experimental Factor Term Source REF	EFO									
Person Last Name	NCI Office of Cancer Genomics (OCG)	NCI Center for Biomedical Informatics and Information Technology (CBIIT)	Gadd	Perlman						
Person First Name			Samantha	Elizabeth						
Person Mid Initials			L	J						
Person Email	ocg@mail.nih.gov	ncicbiit@mail.nih.gov	sgadd@luriechildrens.org	eperlman@luriechildrens.org						
Person Phone	+1 301 451 8027	+1 888 478 4423	773-755-6392	312-227-3967						
Person Fax	+1 301 480 4368									
Person Address	31 Center Dr, Rm 10A07, Bethesda, MD 20892	9609 Medical Center Dr, Rockville, MD 20850	2430 N Halsted St, Room C366, Chicago, IL 60614	225 E Chicago Avenue, Chicago, IL 60611						
Person Affiliation	National Cancer Institute	National Cancer Institute	Lurie Children's Hospital of Chicago Research Center	Ann & Robert H. Lurie Children's Hospital of Chicago						
Person Roles	funder	data coder;curator	submitter;data analyst	investigator						
Person Roles Term Source REF	EFO	EFO;EFO	EFO	EFO						
Quality Control Type										
Quality Control Term Source REF										
Replicate Type										
Replicate Term Source REF										
Normalization Type										
Normalization Term Source REF										
Date of Experiment										
Public Release Date										
PubMed ID										
Publication DOI										
Publication Author List										
Publication Title										
Publication Status										
Publication Status Term Source REF										
Experiment Description	The NCI TARGET initiative seeks to identify therapeutic targets for high-risk pediatric tumors through genomic sequencing supported by copy number, gene expression, and epigenetic analyses. The aim of this study was to use the Affymetrix HG-U133_Plus_2 and HG-U133A arrays to profile the transcriptome of Wilms tumor, both to characterize the gene expression patterns of these tumors and to support subsequent integrative analyses.									
Protocol Name	nationwidechildrens.org:Protocol:RNA-Extraction:01	luriechildrens.org:Protocol:RNA-Extraction-TRIzol:01	luriechildrens.org:Protocol:GeneExpressionArray-Labeling-Affymetrix-IVTExpress:01	luriechildrens.org:Protocol:GeneExpressionArray-Hybridization-Affymetrix-U133Plus2:01	luriechildrens.org:Protocol:GeneExpressionArray-Hybridization-Affymetrix-U133A:01	luriechildrens.org:Protocol:GeneExpressionArray-Scanning-Affymetrix-U133Plus2:01	luriechildrens.org:Protocol:GeneExpressionArray-Scanning-Affymetrix-U133A:01	luriechildrens.org:Protocol:GeneExpressionArray-DataNormalization-RMA:01	luriechildrens.org:Protocol:GeneExpressionArray-CollapseDataset:01	luriechildrens.org:Protocol:GeneExpressionArray-DGE-SAM:01
Protocol Type	nucleic acid extraction protocol	nucleic acid extraction protocol	nucleic acid labeling protocol	nucleic acid hybridization to array protocol	nucleic acid hybridization to array protocol	array scanning protocol	array scanning protocol	normalization data transformation protocol	data transformation protocol	data transformation protocol
Protocol Term Source REF	EFO	EFO	EFO	EFO	EFO	EFO	EFO	EFO	EFO	EFO
Protocol Description	"RNA was extracted from tumor samples at Nationwide Children's BioPathology Center (BPC) by using the standard BPC protocol. RNA quality was assessed by a bioanalyzer and RNA samples were required to have a RIN > 7. Total RNA was provided to Lurie Children's Hospital Research Center at a concentration of 150 ng/ul (2 ug total) in sets of 16 samples. One WT sample for which sufficient column-purified RNA was available was selected to serve as a control sample (PAJMLZ). Each set of 16 samples received from the BPC included the WT control sample, which was therefore repeated throughout all steps of this procedure in order to ensure consistency among all steps."	"RNA was extracted from tumor samples at Lurie Children's Hospital Research Center by TRIzol extraction and purified by precipitation and the RNeasy Mini Kit. RNA quality was assessed by optical density and gel electrophoresis and RNA samples were required to have an OD 260/280 of > 1.8."	"250 ng of total RNA was labeled by using the Affymetrix GeneChip 3' IVT Express Kit at Lurie Children's Hospital Research Center. All procedures, including 1st strand reverse transcription, 2nd strand synthesis, in vitro transcription of aRNA, aRNA purification, quantitation, and fragmentation, were performed according to the manufacturer's protocol."	"Nucleic acid hybridization to the HG-U133_Plus_2 array was performed at Lurie Children's Hospital Research Center by using the Affymetrix GeneChip Hybridization, Wash and Stain Kit per the manufacturer's instructions."	"Nucleic acid hybridization to the HG-U133A array was performed at Lurie Children's Hospital Research Center by using the Affymetrix GeneChip Hybridization, Wash and Stain Kit per the manufacturer's instructions."	"The HG-U133_Plus_2 arrays were scanned at Lurie Children's Hospital Research Center by using the Gene-Chip Operating Software (GCOS). Each .dat file was visually inspected for large scratches and/or misalignment of the grid."	"The HG-U133A arrays were scanned at Lurie Children's Hospital Research Center by using the Gene-Chip Operating Software (GCOS). Each .dat file was visually inspected for large scratches and/or misalignment of the grid."	"Gene-Chip Operating Software (GCOS) was used to generate .chp files, which represent the consolidation of all individual probes within a probeset, from .cel files. From .chp files, GCOS was used to generate .rpt files, which show probe intensity values and QC values. All samples were inspected for the following parameters. Background < 45 (actual range: 28.19–43.18). Noise (Raw Q) < 1.35 (actual range: 0.670–1.30). Scaling Factor < 65% (actual range: 11.487–52.965). % Present call > 35% (actual range: 38.4–57.7). 3'/5' GAPDH < 3.92 (actual range: 0.95–3.48). Samples with parameters outside of these limits were rerun starting at the step of RNA labeling. All .cel files were imported into the Broad Institute’s GenePattern server and Robust Multichip Average (RMA) normalization was performed using the ExpressionFileCreator module. Data were exported as a single .txt file containing probeset information for each individual tumor within a single spreadsheet. Several analytic quality control steps were performed. Principal component analysis (PCA) was performed to ensure that none of the samples were outliers. Pair-wise correlation coefficient analysis was performed using the data from the WT control sample that was included in each individual batch of samples. The normalized averages of the expression levels from each WT control run showed a correlation coefficient > 98%, indicating a high level of consistency. Six probesets corresponding to five genes were identified that closely correlated with gender (four male genes [RPS4Y1, DDX3Y, SMCY, and EIF1AY] and one female gene [XIST]). All samples were classified as male or female according to the expression patterns of these genes and the results were checked against the known gender of the patient. No discrepancies were detected."	"For analyses, 9 of the 10 replicates for PAJMLZ were removed from the RMA gene expression file. A collapsed data file was created by using the Broad Institute's GenePattern CollapseDataset module with the default parameters and the maximum probe collapse method."	"Significance Analysis of Microarrays (SAM) was used to compare gene expression in 51 tumors: favorable histology WT (FHWT) sequenced at CGI with the MLLT1 variant (5) vs the remainder of FHWT sequenced at CGI that do not have the MLLT1 variant (46). Gene expression data are not available for 1 FHWT with the MLLT1 variant. SAM was run using the Level 2 gene expression data. First, probesets that had absent 'A' calls for 95% (48) or more of the samples were filtered out, resulting in the retention of 39913 probesets for analysis. The data were log-transformed prior to running SAM. Two-class unpaired analysis was run using 200 permutations; probesets with q < 0.05 were retained."
Protocol Parameters										
Protocol Hardware										
Protocol Software										
Protocol Contact										
SDRF File	TARGET_WT_GeneExpressionArray_20160831.sdrf.txt									
Term Source Name	NCBITaxon	NCIt	MO	EFO	OBI					
Term Source File	http://www.ncbi.nlm.nih.gov/Taxonomy/taxonomyhome.html	http://ncit.nci.nih.gov/	http://mged.sourceforge.net/ontologies/MGEDontology.php	http://www.ebi.ac.uk/efo	http://purl.obolibrary.org/obo/obi					
Term Source Version										