run_methylation_data_processing
Filter bulk tumour and normal methylation data, compute the highest density interval (HDI) of the methylation rate,
and plot the raw methylation information.
run_methylation_data_processing(
  patient_id,
  sample_id,
  normal_infiltrates_proxy_id,
  normal_origin_proxy_id,
  path,
  min_normal = 10,
  min_tumour = 3,
  n_cores,
  reference_panel_normal_infiltrates = NULL,
  reference_panel_normal_origin = NULL
)
patient_id: Character variable containing the patient ID.
sample_id: Character variable with the (control or tumour) sample ID.
normal_infiltrates_proxy_id: Character variable with the sample ID of the tissue-matched normal acting as a proxy for the tumour-infiltrating normal cells. Ideally, this is a patient- and tissue-matched tumour-adjacent normal sample.
normal_origin_proxy_id: Character variable with the sample ID of the normal to be used as a proxy for the tumour cell of origin in differential methylation analyses.
path: Character variable with the path to the desired working directory. This is where the output will be stored.
min_normal: Numeric value giving the minimum read count required for a CpG in the normal sample to be included. Default is 10.
min_tumour: Numeric value giving the minimum read count required for a CpG in the tumour sample to be included. Default is 3.
n_cores: Numeric value giving the number of cores to use for parallel processing.
reference_panel_normal_infiltrates: Default is NULL. Character string with the complete path to a reference methylation profile for the tumour normal infiltrates, saved as a .fst file.
reference_panel_normal_origin: Default is NULL. Character string with the complete path to a reference methylation profile for the tumour cell of origin, saved as a .fst file.
If a patient-matched proxy for the normal infiltrates and/or the normal cell of origin is not available, a reference panel may be constructed from different individuals and used as a substitute.
The reference samples should be at the very least sex-matched.
The reference should be saved as a .fst file with the following columns:
CHR start end M_n UM_n m_n cov_n
where each row is a CpG or CCpGG with coordinates CHR:start-end.
The start and end columns correspond to the 5'-C and 3'-G coordinates, respectively.
M_n is the number of reads supporting the methylated allele.
UM_n is the number of reads supporting the unmethylated allele.
m_n is the normal methylation rate, M_n / (M_n + UM_n).
cov_n is the total number of methylation-informative reads at the CpG, M_n + UM_n.
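As an illustration of the column layout described here, the following minimal sketch builds a small reference panel from two individuals and derives m_n and cov_n from the read counts. The counts, coordinates, chromosome naming, and output file name are made up, and pooling individuals by summing their per-CpG read counts is an assumption of this sketch, not a method stated in this documentation. Writing the file uses fst::write_fst and is skipped if the fst package is not installed.

```r
# Hypothetical per-individual read counts for the same two CpGs.
# All numbers and coordinates are invented for illustration.
ind1 <- data.frame(CHR = c("1", "1"),
                   start = c(10469L, 10471L),  # 5'-C coordinate
                   end = c(10470L, 10472L),    # 3'-G coordinate
                   M_n = c(8L, 2L),            # methylated reads
                   UM_n = c(2L, 8L))           # unmethylated reads
ind2 <- data.frame(CHR = c("1", "1"),
                   start = c(10469L, 10471L),
                   end = c(10470L, 10472L),
                   M_n = c(6L, 0L),
                   UM_n = c(4L, 10L))

# Assumption of this sketch: pool individuals by summing counts per CpG.
panel <- aggregate(cbind(M_n, UM_n) ~ CHR + start + end,
                   data = rbind(ind1, ind2), FUN = sum)

# Derived columns, following the definitions given for the reference file.
panel$cov_n <- panel$M_n + panel$UM_n
panel$m_n <- panel$M_n / panel$cov_n

# Put the columns in the documented order.
panel <- panel[, c("CHR", "start", "end", "M_n", "UM_n", "m_n", "cov_n")]

# Save as .fst if the fst package is available (file name is hypothetical).
if (requireNamespace("fst", quietly = TRUE)) {
  fst::write_fst(panel, file.path(tempdir(), "reference_panel_normal.fst"))
}
```

The resulting file path could then be passed as reference_panel_normal_infiltrates or reference_panel_normal_origin.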
GRanges object in .RData file
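A call might look like the sketch below. Every ID and the output directory are hypothetical placeholders, and the layout of the input data expected under path is not described on this page, so the call is only attempted when the function is available in the session. Using the same sample as proxy for both the normal infiltrates and the cell of origin is likewise just an assumption made for the example.

```r
# Hypothetical argument values; replace with real IDs and paths.
args <- list(
  patient_id = "P1",
  sample_id = "T1",
  normal_infiltrates_proxy_id = "N1",
  normal_origin_proxy_id = "N1",
  path = file.path(tempdir(), "methylation_output"),
  min_normal = 10,   # documented default
  min_tumour = 3,    # documented default
  n_cores = 2,
  reference_panel_normal_infiltrates = NULL,  # or path to a .fst panel
  reference_panel_normal_origin = NULL        # or path to a .fst panel
)

# Run only if the package providing the function has been loaded.
if (exists("run_methylation_data_processing", mode = "function")) {
  result <- do.call(run_methylation_data_processing, args)
}
```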