AWS re:Invent 2018: [REPEAT 1] A Deep Dive into What's New with Amazon EMR (ANT340-R1)

Amazon EMR is one of the largest Spark and Hadoop service providers in the world, enabling customers to run ETL, machine learning, real-time processing, data science, and low-latency SQL at petabyte scale. In this session, we introduce design patterns such as using Amazon S3 instead of HDFS, taking advantage of both long- and short-lived clusters, and using notebooks, along with other architectural best practices. We discuss lowering cost with Auto Scaling and Spot Instances, and security best practices for encryption and fine-grained access control. We showcase key improvements made to the service in 2018: improvements to the Amazon EMR API, best practices for using Spot Instances on their own and with Auto Scaling, better Amazon S3 performance on Amazon EMR, and security features for authorization and authentication. We pair each of these with a demo or customer use case to illustrate the benefits. If you are an existing Amazon EMR user, you will walk away with a thorough understanding of the improvements made in 2018 and how they benefit you. If you are a new Amazon EMR user, you will get an understanding of common use cases and how other customers are using Amazon EMR.

Related videos

Comments

Very nice deep dive session, guys! Sometimes, it's just PowerPoint slides with some architecture diagrams, but this one shows real world scenarios, code, etc.

CarlosICC

Can we connect EMR Notebooks to Glue clusters?

pashaonline

AWS re:Invent 2018: [REPEAT 1] Building Microservices with Containers (CON308-R1)

AWS re:Invent 2018: [REPEAT 1] Executing a Large-Scale Migration to AWS (ENT205-R1)

AWS re:Invent 2018: [REPEAT 1] Migrating to AWS Fargate (CON311-R1)

AWS re:Invent 2018: [REPEAT 1] Become an IAM Policy Master in 60 Minutes or Less (SEC316-R1)

AWS re:Invent 2018: [REPEAT 1] Hands-on in the AWS Java Ecosystem (DEV325-R1)

AWS re:Invent 2018: [REPEAT 1] Managing Modern Infrastructure in Enterprises (ENT227-R1)

AWS re:Invent 2018: [REPEAT 1] Building Massively Parallel Event-Driven Architectures (SRV373-R1)

AWS re:Invent 2018: [REPEAT 1] Releasing Mission-Critical Software at Amazon (DEV209-R1)

AWS re:Invent 2018: [REPEAT 1] AWS, I Choose You: Pokemon's Battle against the Bots (SEC402-R1)

AWS re:Invent 2018: [REPEAT 1] Inside AWS: Technology Choices for Modern Applications (SRV305-R1)

AWS re:Invent 2018: [REPEAT 1] A Serverless Journey: AWS Lambda Under the Hood (SRV409-R1)

AWS re:Invent 2018: [REPEAT 1] Amazon EC2 Foundations (CMP208-R1)

AWS re:Invent 2018: [REPEAT 1] Moving to DevOps the Amazon Way (DEV210-R1)

AWS re:Invent 2018: [REPEAT 1] Databases on AWS: The Right Tool for the Right Job (DAT205-R1)

AWS re:Invent 2018: [REPEAT 1] Scaling Up to Your First 10 Million Users (ARC205-R1)

AWS re:Invent 2018: [REPEAT 1] Continuous Integration Best Practices (DEV319-R1)

AWS re:Invent 2018: [REPEAT 1] From One to Many: Evolving VPC Design (ARC309-R1)

AWS re:Invent 2018: [REPEAT 1] Mastering Kubernetes on AWS (CON301-R1)

AWS re:Invent 2018: [REPEAT 1] AIOps: Steps Towards Autonomous Operations (DEV301-R1)

AWS re:Invent 2018: [REPEAT 1] Using AWS Lambda as a Security Team (SEC322-R1)

AWS re:Invent 2018: [REPEAT 1] Security & Compliance for Modern Serverless Applications (SRV319-...

AWS re:Invent 2018: [REPEAT 1] Serverless Stream Processing Pipeline Best Practices (SRV316-R1)

AWS re:Invent 2018: [REPEAT 1] Enterprise DevOps: Patterns of Efficiency (ENT311-R1)

AWS re:Invent 2018: [REPEAT 1] Run Production Workloads on Spot, Save up to 90% (CMP306-R1)