MAGE-TAB Version	1.1
Investigation Title	TARGET: Ambiguous Lineage Acute Leukemia (ALAL) Phase III WGS
Experimental Design	disease state design
Experimental Design Term Source REF	EFO
Experimental Factor Name
Experimental Factor Type
Experimental Factor Term Source REF
Person Last Name	NCI Office of Cancer Genomics (OCG)	NCI Center for Biomedical Informatics and Information Technology (CBIIT)	Hunger	Mullighan	Loh	Ma	Novik
Person First Name			Stephen	Charles	Mignon	Yussanne	Karen
Person Mid Initials			P			P	L
Person Email	ocg@mail.nih.gov	ncicbiit@mail.nih.gov	hungers@chop.edu	charles.mullighan@stjude.org	lohm@peds.ucsf.edu	yma@bcgsc.ca	knovik@bcgsc.ca
Person Phone	+1 301 451 8027	+1 888 478 4423		+1 901 595 3387	+1 415 476 3831	+1 604 707 5800 Ext 6082	+1 604 707 8000 Ext 7983
Person Fax	+1 301 480 4368			+1 901 595 5947		+1 604 876 3561	+1 604 675 8178
Person Address	31 Center Dr, Rm 10A07, Bethesda, MD 20892	9609 Medical Center Dr, Rockville, MD 20850	3401 Civic Center Blvd Philadelphia, PA 19104	262 Danny Thomas Place, Mail Stop 342, Memphis TN 38105	Box 0106, UCSF	Suite 100-570 West 7th Ave, Vancouver, BC Canada V5Z 4S6	675 West 10th Ave Vancouver, BC Canada V5Z 1L3
Person Affiliation	National Cancer Institute	National Cancer Institute	Children's Hospital of Philadelphia	St Jude Children's Research Hospital	UCSF Benioff Children's Hospital	BC Cancer Agency Canada's Michael Smith Genome Sciences Centre	BC Cancer Agency Canada's Michael Smith Genome Sciences Centre
Person Roles	funder;investigator	data coder;curator	investigator	investigator	investigator	investigator;data analyst;submitter	investigator
Person Roles Term Source REF	EFO;EFO	EFO;EFO	EFO	EFO	EFO	EFO;EFO;EFO	EFO
Quality Control Type
Quality Control Term Source REF
Replicate Type
Replicate Term Source REF
Normalization Type
Normalization Term Source REF
Date of Experiment
Public Release Date
PubMed ID
Publication DOI
Publication Author List
Publication Title
Publication Status
Publication Status Term Source REF
Experiment Description	"There are 175 fully characterized patient cases with relapsed precursor B-cell ALL (all tumor/normal pairs, 85 with relapse sample as well) that will make up Phase II of the TARGET ALL dataset, each with gene expression, tumor and paired normal copy number analyses, and comprehensive next-generation sequencing to include whole genome sequencing, mRNA-seq and miRNA-seq. Subsets of these cases will also have methylation and/or whole exome sequencing data available as well. There are additionally a large number of cases with partial molecular characterization making this a large and informative genomic dataset. All cases can be sorted according to data type via the Case Matrix on the TARGET Data Matrix. Please visit the TARGET website listed above for additional information on this and other TARGET genomics projects. Please see the TARGET Publication Guidelines at the OCG website for updated details on sharing of any TARGET substudy data."
Protocol Name	stjude.org:Protocol:DNA-Extraction-Qiagen-QIAamp:01	bcgsc.ca:Protocol:WGS-LibraryPrep-Illumina:01	bcgsc.ca:Protocol:WGS-Sequence-Illumina-HiSeq2500:01	bcgsc.ca:Protocol:WGS-BaseCall-Illumina:01	bcgsc.ca:Protocol:WGS-ReadAlign-BWA-Picard:01	bcgsc.ca:Protocol:WGS-StructVariant-DELLY:01	bcgsc.ca:Protocol:WGS-StructVariant-ABySS:02	bcgsc.ca:Protocol:WGS-VariantCall-Strelka:01	bcgsc.ca:Protocol:WGS-CombineSomaticSnvs:01	bcgsc.ca:Protocol:WGS-Mpileup-Vcf2Tab:01	bcgsc.ca:Protocol:WGS-StructVariant-GenomeValidator:01	bcgsc.ca:Protocol:WGS-Strelka-Vcf2Tab-Snv:01	bcgsc.ca:Protocol:WGS-VariantCall-Mpileup:01	bcgsc.ca:Protocol:WGS-Strelka-Vcf2Tab-Indel:01	bcgsc.ca:Protocol:WGS-VariantCall-Mpileup-MutationSeq:01
Protocol Type	nucleic acid extraction protocol	nucleic acid library construction protocol	nucleic acid sequencing protocol	data transformation protocol	data transformation protocol	data transformation protocol	data transformation protocol	data transformation protocol	data transformation protocol	data transformation protocol	data transformation protocol	data transformation protocol	data transformation protocol	data transformation protocol	data transformation protocol
Protocol Term Source REF	EFO	EFO	EFO	EFO	EFO	EFO	EFO	EFO	EFO	EFO	EFO	EFO	EFO	EFO	EFO
Protocol Description	"Genomic DNA was prepared using the Qiagen QIAamp DNA Mini Kit (Qiagen, Valencia, CA, USA). Please see https://ocg.cancer.gov/programs/target/target-methods for full extraction protocol details."	"Genomic DNA for construction of whole genome shotgun sequencing (WGSS) libraries was prepared from the same biopsy material using the Qiagen AllPrep DNA/RNA Mini Kit (Qiagen, Valencia, CA, USA). DNA quality was assessed by spectrophotometry (260/280 and 260/230) and gel electrophoresis before library construction. Depending on the availability of DNA, between 2 and 10ug was used in WGSS library construction. Briefly, DNA was sheared for 10 min using a Sonic Dismembrator 550 with a power setting of '7' in pulses of 30 seconds interspersed with 30 seconds of cooling (Cup Horn, Fisher Scientific, Ottawa, Ontario, Canada), and analyzed on 8% PAGE gels. The 200-300bp DNA fraction was excised and eluted from the gel slice overnight at 4 degrees Celsius in 300 ul of elution buffer (5:1, LoTE buffer (3 mM Tris-HCl, pH 7.5, 0.2 mM EDTA)-7.5 M ammonium acetate), and was purified using a Spin-X Filter Tube (Fisher Scientific), and by ethanol precipitation. WGSS libraries were prepared using a modified paired-end protocol supplied by Illumina Inc. (Illumina, Hayward, USA). This involved DNA end-repair and formation of 3' A overhangs using Klenow fragment (3' to 5' exo minus) and ligation to Illumina PE adapters (with 5' overhangs). Adapter-ligated products were purified on Qiaquick spin columns (Qiagen, Valencia, CA, USA) and PCR-amplified using Phusion DNA polymerase (NEB, Ipswich, MA, USA) and 10 cycles with the PE primer 1.0 and 2.0 (Illumina). PCR products of the desired size range were purified from adapter ligation artifacts using 8% PAGE gels.
DNA quality was assessed and quantified using an Agilent DNA 1000 series II assay (Agilent, Santa Clara CA, USA) and Nanodrop 7500 spectrophotometer (Nanodrop, Wilmington, DE, USA) and DNA was subsequently diluted to 10nM. The final concentration was confirmed using a Quant-iT dsDNA HS assay kit and Qubit fluorometer (Invitrogen, Carlsbad, CA, USA)"			"Illumina paired-end whole genome sequencing reads were aligned to the hg19 reference using BWA version 0.5.7. This reference contains chromosomes 1-22, X, Y, MT, 20 unlocalized scaffolds and 39 unplaced scaffolds. Multiple lanes of sequences were merged and duplicated reads were marked with Picard Tools."	"DELLY: structural variant discovery by integrated paired-end and split-read analysis. PMID:22962449"	"Structural variant detection was performed using ABySS (v1.3.4) (Simpson et al, PMID:19251739). Genome (WGS) libraries were assembled in single end mode using k-mer values of k56 and k76. The contigs and reads were then reassembled at k96 in single end mode and then finally at k96 in paired end mode. The meta-assemblies were then used as input to the trans-ABySS analysis pipeline (Robertson et al PMID: 20935650). Large scale rearrangements and gene fusions from RNA-seq libraries were identified from contigs that had high confidence GMAP (v2015-06-12) alignments to two distinct genomic regions. Evidence for the alignments was provided from aligning reads back to the contigs and from aligning reads to genomic coordinates. Events were then filtered based on the number and types of supporting reads. Large scale rearrangements and gene fusions from WGS libraries were identified in a similar way, using BWA (v0.7.12r1039) alignments. Insertions and deletions were identified by gapped alignment of contigs to the human reference using BWA for WGS. Confidence in the event was calculated from the alignment of reads back to the event breakpoint in the contigs.
The events were then screened against dbSNP and other variation databases to identify putative novel events."	"To analyze compartment-specific SNVs, samples were analyzed pairwise with the default settings of Strelka v0.4.7 (Saunders et al., 2012). Primary tumor samples and relapse/met were compared against the germline sample. In the absence of a germline sample, the relapse/met samples were compared against the primary tumor sample."			"An in-house tool, Genome Validator, was used to determine compartment-specific events. The structural variant calls for each patient from matched genome and RNA-seq samples were concatenated together and screened for each patient against matching tumour and germline alignments. This resulted in compartment-specific structural variant events and putative somatic calls. The events were further filtered against a compendium of recurrent structural variants to remove recurrent false positives."		"SNVs were analyzed with SAMtools mpileup v.0.1.17 (Li et al., 2009) either on single or paired libraries. Each chromosome was analyzed separately using the -C50-DSBuf parameters. The resulting vcf files were merged and filtered to remove low quality SNVs by using samtools varFilter (with default parameters) as well as to remove SNVs with a QUAL score of less than 20 (vcf column 6). Finally, SNVs were annotated with gene annotations from ensembl v66 using snpEff (Cingolani et al., 2012b) and the dbSNP v137 db membership assigned using snpSift (Cingolani et al., 2012a)."		"SNVs were analyzed pairwise with SAMtools mpileup v.0.1.17 (Li et al., 2009). Each chromosome was analyzed separately using the -C50-DSBuf parameters. Before merging the resulting vcf files, they were filtered to remove all indels and low quality SNVs by using samtools varFilter (with default parameters) as well as to remove SNVs with a QUAL score of less than 20 (vcf column 6).
The SNVs in the resulting vcf files were further filtered and scored using mutationSeq v1.0.2 and annotated with gene annotations from ensembl v66 using snpEff (Cingolani et al., 2012b) and the dbSNP v137 and cosmic 64 db membership using snpSift (Cingolani et al., 2012a)."
Protocol Parameters				Software Versions											
Protocol Hardware			Illumina HiSeq 2500												
Protocol Software				Illumina RTA											
Protocol Contact
SDRF File	TARGET_ALL_WGS_Phase3_20181025.sdrf.txt
Term Source Name	NCBITaxon	NCIt	MO	EFO	OBI
Term Source File	http://www.ncbi.nlm.nih.gov/taxonomy	http://ncit.nci.nih.gov/	http://mged.sourceforge.net/ontologies/MGEDontology.php	http://www.ebi.ac.uk/efo	http://purl.obolibrary.org/obo/obi
Term Source Version
Comment[SRA_STUDY]	SRP011999
Comment[BioProject]	PRJNA89529
Comment[dbGaP Study]	phs000464
