Intro to Big Data AppHub: Demo of Kafka to HDFS and HDFS to HDFS templates


Abstract:
To make critical business decisions in real time, many businesses today rely on a variety of data, which arrives in large volumes. Variety and volume together make big data applications complex operations. Big data applications require businesses to combine transactional data with structured, semi-structured, and unstructured data for deep and holistic insights.

Time is also of the essence: to derive the most valuable insights and drive key decisions, large amounts of data have to be continuously ingested into Hadoop data lakes as well as other destinations. As a result, data ingestion is the first challenge businesses must overcome before embarking on data analysis.

With its Application Templates for ingestion, DataTorrent allows users to:
- Ingest vast amounts of data with the enterprise-grade operability and performance guarantees provided by the underlying Apache Apex framework: fault tolerance, linear scalability, high throughput, low latency, and end-to-end exactly-once processing.
- Quickly launch template applications to ingest raw data, then iteratively add business logic and processing steps such as parse, dedupe, filter, transform, and enrich to the ingestion pipelines.
- Visualize metrics on throughput, latency, and application data in real time throughout execution.

Template descriptions:
Kafka to HDFS: The Kafka to HDFS Application Template continuously reads messages from configured Apache Kafka topic(s) and writes each message as a line in Hadoop HDFS file(s).
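The "one message, one line" behavior can be illustrated with a minimal Python sketch. This is not the template's implementation (the template runs on Apache Apex): the function name `write_messages_as_lines` is hypothetical, an in-memory list stands in for messages consumed from a Kafka topic, and a local file stands in for an HDFS file.

```python
import os
import tempfile
from typing import Iterable


def write_messages_as_lines(messages: Iterable[bytes], out_path: str) -> int:
    """Append each message to out_path as one newline-terminated line.

    Hypothetical helper for illustration only. Returns the number of
    messages written.
    """
    count = 0
    with open(out_path, "ab") as out:
        for msg in messages:
            # Drop any trailing line break so each message ends with exactly one "\n".
            out.write(msg.rstrip(b"\r\n") + b"\n")
            count += 1
    return count


if __name__ == "__main__":
    # Assumed sample data standing in for messages read from a Kafka topic.
    sample = [b"order-1,42.50", b"order-2,17.00\n", b"order-3,8.25"]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "output.txt")
        written = write_messages_as_lines(sample, path)
        with open(path, "rb") as f:
            print(written, f.read().splitlines())
```

The sketch opens the file in append mode because the source is continuous: new messages keep arriving and are added to the end of the output. It does not attempt the fault tolerance or exactly-once guarantees described in the abstract.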

HDFS Line Copy: The HDFS to HDFS Line Copy Application Template continuously ingests files as lines from one Hadoop cluster to another, retaining one-to-one file traceability.
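One way to picture one-to-one file traceability is a copy in which every source file produces exactly one destination file at the same relative path. The Python sketch below illustrates that idea under stated assumptions: local directories stand in for the two Hadoop clusters, and `copy_tree_by_lines` is a hypothetical helper, not part of the template.

```python
import tempfile
from pathlib import Path


def copy_tree_by_lines(src_root: Path, dst_root: Path) -> dict:
    """Copy every file under src_root to the same relative path under dst_root,
    reading and writing one line at a time.

    Hypothetical helper for illustration only. Returns a mapping of
    source path -> destination path, both as strings.
    """
    mapping = {}
    for src in sorted(p for p in src_root.rglob("*") if p.is_file()):
        dst = dst_root / src.relative_to(src_root)
        dst.parent.mkdir(parents=True, exist_ok=True)
        with src.open("rb") as fin, dst.open("wb") as fout:
            for line in fin:  # iterating a binary file yields one line at a time
                fout.write(line)
        mapping[str(src)] = str(dst)
    return mapping


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src, dst = Path(tmp) / "src", Path(tmp) / "dst"
        (src / "logs").mkdir(parents=True)
        (src / "logs" / "app.log").write_bytes(b"first\nsecond\n")
        for s, d in copy_tree_by_lines(src, dst).items():
            print(s, "->", d)
```

The returned mapping is what makes each output traceable: any destination file can be matched back to the single source file it came from.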

Presenters:
Ashwin Putta, Product Manager at DataTorrent

Sanjay Pujare, Engineer at DataTorrent

Dr. Munagala V. Ramanath ("Ram"), Committer for Apache Apex and Software Engineer at DataTorrent
