Long-read landscapes of DNA methylation in liver pathophysiology

Existing techniques for studying hepatitis B virus (HBV) methylation, Chloe explained, conflict with those used to study host methylation, making for a confusing picture. For this reason, the team aimed to determine whether nanopore sequencing could identify both the methylation of target mammalian genes and HBV methylation in infected hosts.

Initially, whole-genome sequencing was performed on MinION, using just 1 µg of hepatocyte DNA. As a first assessment, the nanopore methylation calls were compared to EPIC array data, showing high correlation between the two datasets. More specifically, Chloe examined transcription factor binding sites, particularly those of HNF4A and FOXA2 – master transcription factors responsible for switching on most of the genes responsible for cellular differentiation. In these areas the team expected to find no evidence of methylation, and this was borne out in the data for both techniques.
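A cross-platform comparison of this kind can be sketched as a Pearson correlation over the CpG sites covered by both datasets. This is a minimal illustration, not the team's pipeline: the CpG identifiers and methylation values below are made up, and the dictionary-based input format is an assumption.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one input is constant")
    return cov / sqrt(var_x * var_y)

def compare_platforms(nanopore, array):
    """Correlate methylation fractions at CpG sites present in both datasets.

    Both arguments are dicts mapping a (hypothetical) CpG identifier to a
    methylation fraction between 0 and 1. Returns (r, number of shared sites).
    """
    shared = sorted(set(nanopore) & set(array))
    r = pearson([nanopore[c] for c in shared], [array[c] for c in shared])
    return r, len(shared)

# Made-up illustrative values, not data from the talk.
nanopore_calls = {"cpg_a": 0.05, "cpg_b": 0.90, "cpg_c": 0.50, "cpg_d": 0.10}
array_betas = {"cpg_a": 0.08, "cpg_b": 0.85, "cpg_c": 0.55, "cpg_e": 0.70}

r, n_shared = compare_platforms(nanopore_calls, array_betas)
print(f"{n_shared} shared sites, r = {r:.3f}")
```

Restricting the comparison to shared sites matters because an array interrogates a fixed set of CpGs, whereas sequencing coverage varies from site to site.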

However, although whole-genome sequencing data gives a lot of information with which to investigate methylation patterns, that quantity of detail is not always necessary to achieve the end goal. From prior work, Chloe and team had already identified a subset of genes that are differentially methylated in hepatocytes and are known to be relevant to hepatocyte identity. These genes make up only a very small percentage of the genome, however, which means the number of on-target reads from a whole-genome approach is very low. In addition, Chloe explained, the HBV genome is circular, so without a linearisation step the whole-genome approach yielded just one read of the HBV genome.

Taken together, this meant the team needed a method of PCR-free targeted sequencing. They decided to use the Cas9 sequencing protocol, using 4-6 guide RNAs per gene, with a 10-gene panel. Trialling this yielded between 60X and 350X coverage per gene, with the bigger genes at the lower end of the coverage scale. This represented a huge increase in the percentage of on-target reads, from 0.00001% in the whole-genome experiment to 6.6% using Cas9.
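To put those two percentages in perspective, the fold enrichment is simply their ratio, which works out to roughly 660,000-fold. The short sketch below shows the arithmetic; the function names are illustrative, and only the two percentages come from the talk.

```python
def on_target_percent(on_target_reads, total_reads):
    """Percentage of reads that map to the targeted regions."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * on_target_reads / total_reads

def fold_enrichment(targeted_percent, baseline_percent):
    """Ratio of on-target percentages between two experiments."""
    if baseline_percent <= 0:
        raise ValueError("baseline_percent must be positive")
    return targeted_percent / baseline_percent

# Percentages quoted in the talk: whole-genome run versus Cas9 enrichment.
fold = fold_enrichment(targeted_percent=6.6, baseline_percent=0.00001)
print(f"approximately {fold:,.0f}-fold more on-target reads")
```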

From here, Chloe went on to look at the methylation profile of the targeted genes, using the master transcription factor HNF4A as an example. This gene is known to have multiple promoters with alternate methylation statuses, so looking at each individual molecule gave a picture of the individual cell it came from. Using the nanopore data, differentially methylated loci could be examined, in particular a subset of CpGs that differentiate mature hepatocytes from retro-differentiated cells. The ability to do this gives significant insight into whether the cells sampled are a heterogeneous mixture with different methylation statuses.

But, Chloe asked, what about HBV? Host methylation can be clearly differentiated, but can the same method be applied to the circular HBV genome? HBV has many genotypes and its sequence can vary by up to 8%, meaning a well-conserved site needed to be identified to allow for linearisation and sequencing. Following this process gave thousands of on-target reads and, more importantly, delivered full-length reads of the HBV genome. To the team’s knowledge, this feat had not been achieved previously, so they were very happy with the results.

The final remaining step was to look at methylation patterns in the HBV genome, and the team observed that different genotypes in patient data lead to different methylation profiles. In one example, a clear peak was observed in the pre S1-pre S2 region, correlating well with observations in expression.

By visualising individual molecules using native sequencing, methylation could be observed on particular strands but not others, identifying heterogeneity within the HBV infection population. By using k-means clustering, Chloe was able to identify four distinct clusters within the same patient, providing an interesting route for investigating whether and how modified bases in a virus contribute to transcription of particular genes.
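The clustering step can be illustrated with a small k-means sketch over per-read methylation vectors, where each read is a list of 0/1 calls at the same ordered set of CpG sites. This is an assumption-laden toy example rather than the analysis from the talk: the reads are made up, k is set to 2 instead of the four clusters reported, and a deterministic farthest-first initialisation is used here purely to keep the example reproducible.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_vector(vectors):
    """Column-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def farthest_first(reads, k):
    """Pick k starting centroids: the first read, then repeatedly the read
    farthest from all centroids chosen so far (an illustrative choice)."""
    centroids = [list(reads[0])]
    while len(centroids) < k:
        farthest = max(
            reads,
            key=lambda r: min(squared_distance(r, c) for c in centroids),
        )
        centroids.append(list(farthest))
    return centroids

def kmeans(reads, k, max_iterations=100):
    """Plain k-means; returns (cluster label per read, final centroids)."""
    centroids = farthest_first(reads, k)
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each read joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(read, centroids[c]))
            for read in reads
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the run has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [r for r, lab in zip(reads, labels) if lab == c]
            if members:  # an emptied cluster keeps its previous centroid
                centroids[c] = mean_vector(members)
    return labels, centroids

# Made-up reads: 1 = methylated, 0 = unmethylated, at five CpG sites.
reads = [
    [1, 1, 1, 0, 0],
    [1, 1, 1, 1, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 0, 1],
    [1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1],
]
labels, centroids = kmeans(reads, k=2)
print(labels)
```

Each centroid ends up as the per-site methylation frequency of its cluster, which is one way to read off the "epigenotype" that a group of molecules shares.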

In the final part of the talk, Chloe explained that some patients yield a very low quantity of starting material, or poor-quality material from a biopsy, meaning a method of further enrichment is needed where possible. To achieve this, the team used adaptive sampling – on-device selection of target molecules and rejection of off-target molecules. Using the algorithm UNCALLED (Kovaka et al., 2020), the efficiency of the Cas9-based assay was increased approximately 10-fold, demonstrating that using adaptive sampling in conjunction with Cas9 enrichment can substantially improve the throughput of on-target reads.

To conclude, Chloe reiterated that their work clearly demonstrated the ability to profile methylation from both the host and HBV simultaneously, as well as to cluster epigenotypes.