MAGE-TAB Version	1.1										
Investigation Title	Characterization of copy number status and allelic events in Wilms Tumor TARGET cohort										
Experimental Design	genotyping design	disease state design									
Experimental Design Term Source REF	EFO	EFO									
Experimental Factor Name											
Experimental Factor Type											
Experimental Factor Term Source REF											
Person Last Name	NCI Office of Cancer Genomics (OCG)	NCI Center for Biomedical Informatics and Information Technology (CBIIT)	Gadd	Perlman							
Person First Name			Samantha	Elizabeth							
Person Mid Initials			L	J							
Person Email	ocg@mail.nih.gov	ncicbiit@mail.nih.gov	sgadd@luriechildrens.org	eperlman@luriechildrens.org							
Person Phone	+1 301 451 8027	+1 888 478 4423	773-755-6392	312-227-3967							
Person Fax	+1 301 480 4368										
Person Address	31 Center Dr, Rm 10A07, Bethesda, MD 20892	9609 Medical Center Dr, Rockville, MD 20850	2430 N Halsted St, Room C366, Chicago, IL 60614	225 E Chicago Avenue, Chicago, IL 60611							
Person Affiliation	National Cancer Institute	National Cancer Institute	Lurie Children's Hospital of Chicago Research Center	Ann & Robert H. Lurie Children's Hospital of Chicago							
Person Roles	funder	data coder;curator	submitter;data analyst	investigator							
Person Roles Term Source REF	EFO	EFO;EFO	EFO	EFO							
Quality Control Type											
Quality Control Term Source REF											
Replicate Type											
Replicate Term Source REF											
Normalization Type											
Normalization Term Source REF											
Date of Experiment											
Public Release Date											
PubMed ID											
Publication DOI											
Publication Author List											
Publication Title											
Publication Status											
Publication Status Term Source REF											
Experiment Description	The NCI TARGET initiative seeks to identify therapeutic targets for high-risk pediatric tumors through genomic sequencing supported by copy number, gene expression, and epigenetic analyses. The aim of this study was to use the Affymetrix 6.0 SNP chip to determine copy number status and allelic events in Wilms Tumor, both to characterize the genome of these tumors and to support subsequent integrative analyses.										
Protocol Name	nationwidechildrens.org:Protocol:DNA-Extraction:01	stjude.org:Protocol:CopyNumberArray-Labeling-Affymetrix-SNP6:01	completegenomics.com:Protocol:WGS-LibraryPrep:01	stjude.org:Protocol:CopyNumberArray-Hybridization-Affymetrix-SNP6:01	stjude.org:Protocol:CopyNumberArray-Scanning-Affymetrix-SNP6:01	completegenomics.com:Protocol:WGS-Sequence-CGI-CGI:01	completegenomics.com:Protocol:WGS-BaseCall-CGI:01	nci.nih.gov:CBIIT.Meerzaman.Protocol:CopyNumberArray-DataNormalization-Birdseed:01	completegenomics.com:Protocol:WGS-ReadAlign:01	nci.nih.gov:CBIIT.Meerzaman.Protocol:CopyNumberArray-CnvSegment-DNAcopy:01	nci.nih.gov:CBIIT.Meerzaman.Protocol:WGS-CnvSegment-DNAcopy:01
Protocol Type	nucleic acid extraction protocol	nucleic acid labeling protocol		nucleic acid hybridization to array protocol 	array scanning protocol			normalization data transformation protocol		data transformation protocol	
Protocol Term Source REF	EFO	EFO		EFO	EFO			EFO		EFO	
Protocol Description	"DNA was extracted from normal kidney, tumor, or blood samples at Nationwide Children's BioPathology Center (BPC) by using the standard BPC protocol. PicoGreen analysis was performed to verify gDNA concentration. Spectrophotometry was performed to verify DNA purity, and gel electrophoresis was performed to verify DNA quality. Tumor and corresponding normal specimens (blood and/or normal kidney) were supplied to St. Jude Children's Research Hospital on 96-well plates, allowing for the inclusion of two controls."	"Nucleic acid labeling was performed according to the manufacturer's protocol for the Affymetrix 6.0 SNP array at St. Jude Children's Research Hospital."	"Library preparation was performed at CGI according to the standard CGI protocol."	"Nucleic acid hybridization was performed according to the manufacturer's protocol for the Affymetrix 6.0 SNP array at St. Jude Children's Research Hospital."	"Array scanning was performed according to the manufacturer's protocol for the Affymetrix 6.0 SNP array at St. Jude Children's Research Hospital."	"Whole genome sequencing was performed at CGI according to the standard CGI protocol."	"Base calling and quality scoring were performed by CGI according to the standard CGI protocol."	"Data were provided by St. Jude Children's Research Hospital in the Affymetrix CEL file format, and the CEL files were processed using Affymetrix Genotyping Console (GTC) 4.0 software to generate corresponding Birdseed .chp and .txt files by using the Birdseed v2 algorithm with the default parameters. Several quality control parameters were used. Contrast QC: The average contrast QC was 1.83 for all samples, which is above the minimum of 1.7 recommended by Affymetrix. Fewer than 10% of samples had a contrast QC <0.4, and those samples were deemed acceptable based on their heterozygosity values and Birdseed call rates. 
DNA gender check: Samples were classified into genders using Affymetrix Genotyping Console software; no inconsistencies were noted. Only 0.03% of all samples could not be classified according to gender (“unknown”); all of these samples were tumor samples in which the gender of the corresponding normal sample was called correctly. Sample Call Rate: Affymetrix GTC 4.0 software was used to check the calling rate of constitutional DNA samples, and all samples had calling rates above the 95.5% cut-off (range, 94.1–99.5%; mean, 97.9%). Furthermore, the calling rate of tumor samples ranged from 93.4–97.3% (mean, 97.4%). DNA Autosomal Heterozygosity rate: The percentage of heterozygous SNPs among all measured SNPs was determined per sample using Affymetrix GTC 4.0 software. The heterozygosity rates of normal samples ranged from 24–32%, which is within normal limits. This rate, which is expected to be lower for tumors than for normals, ranged from 15–32% in our tumor samples. Normalization: The reference normalization procedure used for our data relies on an algorithm developed at St. Jude that uses a diploid chromosome for each sample to guide data normalization, as described by Pounds et al. (Bioinformatics. 2009 Feb 1;25(3):315-21). In the first step, the CEL files and Birdseed .txt files are read into dChip, and model-based expression index (MBEI) analysis is performed to generate probe-level summarization values for each individual probe. This results in a file containing two columns for each individual sample: (1) the summarized probe value and (2) the genotype call. This file containing un-normalized data is exported from dChip as a text file and imported into R for reference normalization according to Pounds et al. This algorithm requires two input files: (1) the dChip output file described above and (2) a text file defining each SNP on the Affymetrix 6.0 chip according to chromosome and location. 
The reference chromosome for each sample was selected by using Nexus 6.0 software. The reference normalization algorithm provides an output text file containing two columns for each sample: (1) the normalized probe value and (2) the genotype call."	"CGI standard protocol FASTQ -> BAM sequence alignment"	"Circular binary segmentation (CBS) was then applied to the output files in order to obtain segmented copy number information. This was performed in R using the DNAcopy Bioconductor package. First, the log2 ratio of each tumor sample's signal values to the signal values of the corresponding normal sample was calculated. After detecting outliers and smoothing the log ratio signal data, CBS was applied to segment the data into regions of estimated equal copy number. CBS was performed using default parameters including nperm=10,000, alpha=0.01, undo.splits=sdundo, and undo.SD=1. This algorithm resulted in a segmented file for each tumor sample relative to the corresponding normal sample."	"Copy number data were calculated for the tumors based on the CGI relative coverage. Genome-wide coverage was smoothed in 100-kb windows, corrected for GC content, and normalized using composite baseline coverage from multiple healthy samples. CNV levels were called using a Hidden Markov Model (HMM). The relativeCvg is defined as avgNormalizedCvg divided by the estimate of diploid median normalized adjusted coverage. Circular binary segmentation (CBS) was then applied to the relative coverage files in order to obtain segmented copy number information. This was performed in R using the DNAcopy Bioconductor package. The log2 value was used as the input for each tumor sample. CBS was applied to segment the data into regions of estimated equal copy number. CBS was performed using default parameters including nperm=10,000, alpha=0.01, undo.splits=sdundo, and undo.SD=1. 
This algorithm resulted in a segmented file for each tumor sample relative to the corresponding normal sample.   "
Protocol Parameters											
Protocol Hardware											
Protocol Software											
Protocol Contact											
SDRF File	TARGET_WT_CopyNumberArray_20160831.sdrf.txt										
Term Source Name	NCBITaxon	NCIt	MO	EFO	OBI						
Term Source File	http://www.ncbi.nlm.nih.gov/Taxonomy/taxonomyhome.html	http://ncit.nci.nih.gov/	http://mged.sourceforge.net/ontologies/MGEDontology.php	http://www.ebi.ac.uk/efo	http://purl.obolibrary.org/obo/obi						
Term Source Version											