Building Genomic Data Processing and Machine Learning Workflows Using Apache Spark

Epinomics is advancing epigenetic research to drive personalized medicine through epigenomic data analysis. Their goal is to provide the community with an analysis resource that promotes high-quality data and replicable, interpretable results. They work with academic and commercial users to ingest and analyze genomic sequencing data and metadata. From the sequenced genome they extract epigenetic features called "chromatin accessibility", which indicate the epigenetic changes responsible for differential gene expression and disease development.

Epinomics has built an Apache Spark-based pipeline that retrieves chromatin accessibility data from the epigenome, uses GraphX to find overlapping regions in the accessibility atlas, and then clusters the data and runs machine learning algorithms. This session will provide a primer on epigenomics, details about Epinomics' Spark-based data pipeline with a focus on parallel bioinformatic analysis, and an account of how they use machine learning models to build the epigenomic landscape and accelerate the field of personalized immunotherapy.
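
The description does not include code, so the following is only a rough sketch of the idea behind the overlap step, not Epinomics' implementation and not the GraphX API. It assumes that accessibility peaks are half-open intervals `(chrom, start, end)`, that two peaks on the same chromosome are linked when they overlap, and that each connected group of linked peaks collapses into one merged region. Plain Python with a small union-find stands in for a distributed connected-components step; the function name `merge_overlapping_peaks` and the sample coordinates are hypothetical.

```python
from collections import defaultdict


def _find(parent, i):
    """Return the representative of i's component, compressing the path."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def merge_overlapping_peaks(peaks):
    """Group overlapping peaks and return one merged region per group.

    peaks: list of (chrom, start, end) half-open intervals (an assumed format).
    Peaks are vertices, overlaps are edges, and each connected component
    becomes a single region spanning all of its member peaks.
    """
    parent = list(range(len(peaks)))
    # Visit peaks in genomic order so each one only needs to be compared
    # against the furthest end seen so far in the current component.
    order = sorted(range(len(peaks)), key=lambda i: (peaks[i][0], peaks[i][1]))
    prev, reach = None, None
    for i in order:
        chrom, start, end = peaks[i]
        if prev is not None and peaks[prev][0] == chrom and start < reach:
            # Overlaps the current component: add an edge to it.
            parent[_find(parent, i)] = _find(parent, prev)
            reach = max(reach, end)
        else:
            # New chromosome or a gap: start a new component.
            reach = end
        prev = i

    groups = defaultdict(list)
    for i, peak in enumerate(peaks):
        groups[_find(parent, i)].append(peak)
    return sorted(
        (members[0][0], min(p[1] for p in members), max(p[2] for p in members))
        for members in groups.values()
    )


# Hypothetical peaks pooled from several samples.
sample_peaks = [
    ("chr1", 100, 200),
    ("chr1", 150, 300),
    ("chr1", 400, 500),
    ("chr2", 120, 180),
    ("chr1", 290, 310),
]
merged_regions = merge_overlapping_peaks(sample_peaks)
```

With these sample peaks, the three chained `chr1` peaks collapse into a single region from 100 to 310, while the isolated `chr1` and `chr2` peaks stay separate. In a Spark setting the same vertex and edge formulation could be handed to a graph library's connected-components routine instead of the local union-find used here.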

Session hashtag: #SFr11

About: Databricks provides a unified data analytics platform, powered by Apache Spark™, that accelerates innovation by unifying data science, engineering and business.

Related recommendations

How to Sequence a Human Genome in 7 'Easy' Steps!

Course Curriculum for 'Genomics: Processing, Analysis & Interpretation of Genomic Data'...

Creating a Home for Genomic Data in the Electronic Health Record

From Genomic Data to Genomic Knowledge | Sandy Starr | TEDxRoyal Holloway

Genomic Data Analysis Webinar

Processing 70Tb Of Genomics Data With ADAM And Toil

Using Large Scale Genomic Databases to Improve Disease Variant Interpretation

AWS re:Invent 2024 - Accelerate & automate secure data transfers at scale with AWS DataSync (STG...

CppCon 2014: Mauricio Carneiro 'Gamgee: A C++14 library for genomics data processing and analys...

Genomic data - improving discovery and access management

GDC DNA-Seq Data Processing – September 28, 2020 GDC Monthly Webinar

Scalable genomic data processing and interoperable systems with ADAM/Spark - Andy and Xavier

Making sense of genomic data

Building genomic-scale cloud pipelines

Omics Logic Genomics - Learn about Analysis of Genomic Data

Genomic data analysis for beginners - a playlist introduction

Data Overload! Making Sense of Genome Sequencing with Bioinformatics

MPG Primer: Data processing & analysis of genetic variation using next-gen sequencing (2012)

Genomic Data Analysis for Beginners #genomics #bioinformatics

Genomic Data Science Introduction

Nationwide Children’s Hospital Leverages AWS Technology for Processing and Securing Clinical Genomic...

Analyzing Clinical and Genomic Oncological Data with {genieBPC} and {gnomeR}

Bioinformatics Pipelines for Beginners