allele-specific.Rmd
CAMDAC can be used to detect allele-specific methylation (ASM) by phasing CpGs to heterozygous SNPs and deconvolving bulk methylation rates per allele.
This tutorial steps through the ASM analysis pipeline.
Results from this pipeline are found in the results directory under ‘PATIENT/AlleleSpecific’ and ‘PATIENT/Methylation’. See output file headings below for files and their content.
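To make the idea of phasing CpGs to heterozygous SNPs concrete, here is a minimal toy sketch in base R. It is not CAMDAC's implementation: the read table, column names and values are all invented for illustration, and it ignores bisulfite conversion effects at the SNP as well as tumor purity and copy number deconvolution.

```r
# Toy data: each row is one read covering both a heterozygous SNP and a CpG.
# All values are made up for illustration.
reads <- data.frame(
  snp_base   = c("A", "A", "A", "G", "G", "G", "G", "A"),
  methylated = c(TRUE, TRUE, FALSE, FALSE, FALSE, TRUE, FALSE, TRUE)
)
ref <- "A"
alt <- "G"

# Phase each read to an allele by the base it carries at the SNP.
reads$allele <- ifelse(reads$snp_base == ref, "ref",
                       ifelse(reads$snp_base == alt, "alt", NA))

# Per-allele methylation rate = methylated reads / total reads on that allele.
asm_rates <- tapply(reads$methylated, reads$allele, mean)
print(asm_rates)
```

In this toy case the two alleles show different methylation rates at the same CpG, which is the kind of signal an ASM analysis looks for.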
The asm_pipeline() function runs the CAMDAC-ASM analysis by generating the allele-specific copy number solution and heterozygous SNP loci, followed by deconvolution and differential ASM analysis:
b_tumor <- system.file("testdata", "tumor.bam", package = "CAMDAC")
b_normal <- system.file("testdata", "normal.bam", package = "CAMDAC")
regions <- system.file("testdata", "test_wgbs_segments.bed", package = "CAMDAC") # speed up tests
tumor <- CamSample(id = "T", sex = "XY", bam = b_tumor)
normal <- CamSample(id = "N", sex = "XY", bam = b_normal)
config <- CamConfig(
outdir = "./results", ref = "./pipeline_files", bsseq = "wgbs", lib = "pe", cores = 10,
min_cov = 1, # For test data
regions = regions
)
asm_pipeline(
tumor = tumor,
germline = normal,
infiltrates = normal,
origin = normal,
config = config
)
To run the ASM pipeline with externally supplied SNP loci and copy number calls (rather than deriving them from the BAM files), CAMDAC requires that:
- Each CamSample object has SNP loci
- The tumor CamSample object has an allele-specific CNA solution
- All CamSample objects have BAM files available for phasing
CAMDAC-ASM requires a file of heterozygous SNP loci against which CpGs will be phased. This is a tab-delimited file with a header containing four fields:
| Field | Description |
|---|---|
| chrom | Chromosome name |
| pos | Position of the SNP locus |
| ref | The reference allele (A/C/T/G) |
| alt | The alternate SNP allele (A/C/T/G) |
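As a rough sketch of how such a file could be prepared, the base R snippet below builds a small table with the four documented fields, checks it and writes it as tab-delimited text. The SNP values and the helper `validate_snps()` are hypothetical, and the chromosome naming ("chr1" rather than "1") is an assumption that should match your reference build.

```r
# Hypothetical SNP loci; replace with your own heterozygous calls.
snps <- data.frame(
  chrom = c("chr1", "chr1", "chr2"),
  pos   = c(10583L, 13273L, 45895L),
  ref   = c("G", "G", "A"),
  alt   = c("A", "C", "G"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: check the four documented fields before writing.
validate_snps <- function(df) {
  bases <- c("A", "C", "T", "G")
  stopifnot(
    identical(names(df), c("chrom", "pos", "ref", "alt")),
    is.numeric(df$pos), all(df$pos > 0),
    all(df$ref %in% bases),
    all(df$alt %in% bases),
    all(df$ref != df$alt)
  )
  invisible(df)
}

# Write a tab-delimited file with a header row and no quoting or row names.
snps_file <- tempfile(fileext = ".tsv")
write.table(validate_snps(snps), snps_file,
            sep = "\t", quote = FALSE, row.names = FALSE)

# Read back with explicit column types, so that an allele column made up
# only of "T" is not silently converted to logical TRUE.
snps_check <- read.delim(
  snps_file,
  colClasses = c("character", "integer", "character", "character")
)
```

The resulting `snps_file` path could then be used wherever the tutorial uses the bundled test SNP file.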
First, attach your SNP loci file to the tumor and normal objects with attach_output(), before running asm_pipeline():
# Setup CAMDAC samples
tumor <- CamSample(id = "tumor", sex = "XY", bam = b_tumor)
normal <- CamSample(id = "normal", sex = "XY", bam = b_normal)
config <- CamConfig(
outdir = "./results", ref = "./pipeline_files", bsseq = "wgbs", lib = "pe", cores = 10,
min_cov = 1, # For test data
regions = regions
) # For rapid testing
# Add SNPs
asm_snps_file <- system.file("testdata", "test_het_snps.tsv", package = "CAMDAC")
attach_output(tumor, config, "asm_snps", asm_snps_file)
attach_output(normal, config, "asm_snps", asm_snps_file)
Next, CAMDAC requires the allele-specific copy number solution from the tumor, attached as follows:
cna_file <- system.file("testdata", "test_cna.tsv", package = "CAMDAC")
attach_output(tumor, config, "cna", cna_file)
Finally, run the allele-specific methylation pipeline:
asm_pipeline(
tumor = tumor,
infiltrates = normal,
origin = normal,
config = config
)
If you have already run the CAMDAC pipeline in tumor-normal mode, then the germline object’s SNP files will be used by default. The simplest run from BAM to ASM is shown below using matched normals for infiltrates and DMPs:
b_tumor <- system.file("testdata", "tumor.bam", package = "CAMDAC")
b_normal <- system.file("testdata", "normal.bam", package = "CAMDAC")
regions <- system.file("testdata", "test_wgbs_segments.bed", package = "CAMDAC") # speed up tests
tumor <- CamSample(id = "T", sex = "XY", bam = b_tumor)
normal <- CamSample(id = "N", sex = "XY", bam = b_normal)
config <- CamConfig(
outdir = "./test_results", bsseq = "wgbs", lib = "pe",
build = "hg38", n_cores = 10,
regions = regions,
min_cov = 1, # For test data
cna_caller = "ascat" # Battenberg is recommended; ASCAT is used here for rapid testing.
)
# Run the main CAMDAC pipeline to generate SNP files for ASM
# Deconvolution skipped here for simplicity.
pipeline(tumor, germline = normal, infiltrates = NULL, origin = NULL, config)
# Run ASM pipeline
asm_pipeline(
tumor = tumor,
germline = normal,
infiltrates = normal,
origin = normal,
config = config
)
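Once a run has finished, the outputs can be browsed from R. The sketch below lists whatever files exist under the two result subdirectories named at the top of this tutorial. The helper `list_asm_outputs()` is hypothetical and not part of CAMDAC, and the `patient` argument is assumed to match the patient folder name actually written to disk.

```r
# Hypothetical helper: list files under the 'AlleleSpecific' and 'Methylation'
# subdirectories of a patient's results folder. Returns an empty character
# vector for any subdirectory that does not exist yet.
list_asm_outputs <- function(outdir, patient) {
  subdirs <- file.path(outdir, patient, c("AlleleSpecific", "Methylation"))
  names(subdirs) <- c("AlleleSpecific", "Methylation")
  lapply(subdirs, function(d) {
    if (dir.exists(d)) list.files(d, recursive = TRUE) else character(0)
  })
}

# "PATIENT" is a placeholder; substitute the folder name used in your run.
outputs <- list_asm_outputs("./test_results", patient = "PATIENT")
str(outputs)
```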
For more detail on the main pipeline, see vignette("pipeline").