Filter bulk tumour and normal methylation data, get the methylation rate highest density interval (HDI) and plot raw methylation info

run_methylation_data_processing

run_methylation_data_processing(
  patient_id,
  sample_id,
  normal_infiltrates_proxy_id,
  normal_origin_proxy_id,
  path,
  min_normal = 10,
  min_tumour = 3,
  n_cores,
  reference_panel_normal_infiltrates = NULL,
  reference_panel_normal_origin = NULL
)

Arguments

patient_id

Character variable containing the patient ID

sample_id

Character variable with the (control or tumour) sample ID

normal_infiltrates_proxy_id

Character variable with the sample ID of the tissue-matched normal acting as a proxy for the tumour-infiltrating normal cells. Ideally, this is a patient- and tissue-matched tumour-adjacent normal sample.

normal_origin_proxy_id

Character variable with the sample ID of the normal to be used as a proxy for the tumour cell of origin in differential methylation analyses.

path

Character path variable pointing to the desired working directory. This is where the output will be stored.

min_normal

Numerical value corresponding to the minimum counts threshold a CpG must reach in the normal sample to be included. Default is 10.

min_tumour

Numerical value corresponding to the minimum counts threshold a CpG must reach in the tumour sample to be included. Default is 3.

n_cores

Numerical value corresponding to the number of cores to use for parallel processing

reference_panel_normal_infiltrates

Default is NULL. Character string with the complete path to a reference methylation profile for the tumour normal infiltrates as a .fst file.

reference_panel_normal_origin

Default is NULL. Character string with the complete path to a reference methylation profile for the tumour cell of origin as a .fst file.

If a patient-matched proxy for the normal infiltrates and/or the normal cell of origin is not available, a reference panel may be constructed from different individuals and used as a substitute.

The reference samples should be at the very least sex-matched.

The reference should be saved as a .fst file with the following columns: CHR, start, end, M_n, UM_n, m_n, cov_n

where each row is a CpG or CCpGG with coordinates CHR:start-end. The start and end columns correspond to the 5'-C and 3'-G coordinates, respectively.

M_n is the number of reads supporting the methylated allele.

UM_n is the number of reads supporting the unmethylated allele.

m_n is the normal methylation rate (M_n / (M_n + UM_n)).

cov_n is the total number of methylation-informative reads at the CpG (M_n + UM_n).
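
As an illustration of the reference format described above, the following is a minimal sketch of how such a table might be assembled and saved. The coordinates and read counts are invented, the chromosome naming ("1" rather than "chr1") is an assumption, and the output file name is hypothetical. The file is only written if the fst package is installed.

```r
# Hypothetical per-CpG read counts for a reference panel.
# All values below are made up for illustration.
panel <- data.frame(
  CHR   = c("1", "1", "2"),             # chromosome naming is an assumption
  start = c(10469L, 10471L, 20500L),    # 5'-C coordinate
  end   = c(10470L, 10472L, 20501L),    # 3'-G coordinate
  M_n   = c(18L, 2L, 0L),               # reads supporting the methylated allele
  UM_n  = c(2L, 18L, 25L),              # reads supporting the unmethylated allele
  stringsAsFactors = FALSE
)

# Derive the two remaining columns from the counts.
panel$cov_n <- panel$M_n + panel$UM_n   # total methylation-informative reads
panel$m_n   <- panel$M_n / panel$cov_n  # normal methylation rate

# Put the columns in the order listed in the format description.
panel <- panel[, c("CHR", "start", "end", "M_n", "UM_n", "m_n", "cov_n")]

# Write the panel as a .fst file (hypothetical file name in a temp directory).
if (requireNamespace("fst", quietly = TRUE)) {
  panel_path <- file.path(tempdir(), "reference_panel_normal_origin.fst")
  fst::write_fst(panel, panel_path)
}
```

The resulting path could then be supplied as reference_panel_normal_origin or reference_panel_normal_infiltrates. Sites with zero coverage would give an undefined m_n, so they are best dropped before saving.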

Value

GRanges object in .RData file
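
Examples

A minimal sketch of a call using the arguments documented above. All sample IDs and the output directory are hypothetical placeholders, and using the same normal sample for both proxies is an assumption made for illustration. The reference panel arguments are left at their NULL defaults. The call is guarded so that it only runs when the function is available in the session.

```r
# All IDs and the output directory below are hypothetical placeholders.
args <- list(
  patient_id = "P1",
  sample_id = "T1",
  normal_infiltrates_proxy_id = "N1",   # assumed tumour-adjacent normal
  normal_origin_proxy_id = "N1",        # same normal reused as origin proxy
  path = file.path(tempdir(), "methylation_output"),
  min_normal = 10,                      # documented default
  min_tumour = 3,                       # documented default
  n_cores = 2
)

# Only call the function if the package providing it has been attached.
if (exists("run_methylation_data_processing", mode = "function")) {
  do.call(run_methylation_data_processing, args)
}
```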