What is Apache Hive? : Understanding Hive



In this video, you will get a quick overview of Apache Hive, one of the most popular data warehouse components in the big data landscape. It is mainly used to provide a SQL-like interface on top of the Hadoop file system (HDFS).
Hive was originally developed by Facebook and is now maintained as Apache Hive by the Apache Software Foundation. It is also used and developed by large companies such as Netflix and Amazon.

Why was Hive developed?
=====================
The Hadoop ecosystem is not just scalable but also cost-effective when it comes to processing large volumes of data. It is also a fairly new framework that packs a lot of punch. However, organizations with traditional data warehouses are built around SQL, and their users and developers rely on SQL queries to extract data.

This makes getting used to the Hadoop ecosystem an uphill task, and that is exactly why Hive was developed.

Hive provides a SQL-like interface, so that users can write SQL-like queries in HQL (Hive Query Language) to extract data from Hadoop. Hive converts these SQL-like queries into MapReduce jobs, and that is how it talks to the Hadoop ecosystem and the HDFS file system.
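
To make the conversion more concrete, here is a minimal Python sketch of how a grouped count could be split into a map phase, a shuffle, and a reduce phase. It is a conceptual illustration only, not Hive's actual planner or the code Hive generates; the `page_views` table, its `country` column, and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical HiveQL query that this sketch mimics:
#   SELECT country, COUNT(*) FROM page_views GROUP BY country;
ROWS = [
    {"user": "a", "country": "IN"},
    {"user": "b", "country": "US"},
    {"user": "c", "country": "IN"},
]


def map_phase(rows):
    """Emit one (key, 1) pair per row, keyed by the GROUP BY column."""
    for row in rows:
        yield row["country"], 1


def shuffle(pairs):
    """Group emitted values by key, like the step between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the values for each key, which corresponds to COUNT(*)."""
    return {key: sum(values) for key, values in grouped.items()}


def count_by_country(rows):
    """Run the three phases in order over an in-memory list of rows."""
    return reduce_phase(shuffle(map_phase(rows)))


if __name__ == "__main__":
    print(count_by_country(ROWS))
```

On a real cluster the map and reduce steps run in parallel across many machines, and starting those jobs adds overhead to every query.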

How and when can Hive be used?
===========================
- Hive can be used for OLAP (online analytical processing)
- It is scalable, flexible, and fast for batch analysis of large datasets
- It is a great platform for SQL users to write SQL-like queries against the large datasets that reside on HDFS

Here is what Hive cannot be used for:
==============================
- It is not a relational database
- It cannot be used for OLTP (online transaction processing)
- It cannot be used for real-time updates or queries
- It cannot be used where low-latency data retrieval is expected, because Hive takes time to convert Hive scripts into MapReduce jobs

Some of the finest features of Hive
============================
- It supports different file formats such as SequenceFile, text file, Avro, ORC and RCFile
- Metadata is stored in an RDBMS such as the Derby database
- Hive supports several compression codecs, such as Snappy and gzip, and can query the compressed data
- Users can write SQL-like queries that Hive converts into MapReduce, Tez or Spark jobs to run against Hadoop datasets
- Users can plug MapReduce scripts into Hive queries using UDFs (user-defined functions)
- Specialized joins are available that help improve query performance

If you don't understand any of the above terms, that is fine. We will look into these features in detail in our upcoming videos.
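
As a small preview of the file format and compression features listed above, the following Python sketch builds a HiveQL `CREATE TABLE` statement as a string. The table name, the columns, and the `build_create_table` helper are hypothetical. `STORED AS ORC` and the `orc.compress` table property are standard HiveQL, but check the Hive documentation for your version before relying on them.

```python
def build_create_table(table, columns, file_format="ORC", compression="SNAPPY"):
    """Return a HiveQL CREATE TABLE statement as a string (hypothetical helper).

    columns is a list of (name, hive_type) pairs. The compression property is
    only emitted for ORC, because `orc.compress` is an ORC-specific setting.
    """
    if not columns:
        raise ValueError("at least one column is required")
    cols = ",\n  ".join(f"{name} {hive_type}" for name, hive_type in columns)
    ddl = f"CREATE TABLE {table} (\n  {cols}\n)\nSTORED AS {file_format}"
    if file_format.upper() == "ORC" and compression:
        ddl += f'\nTBLPROPERTIES ("orc.compress"="{compression}")'
    return ddl + ";"


if __name__ == "__main__":
    # Hypothetical table used only for illustration.
    print(build_create_table("page_views", [("user_id", "STRING"), ("country", "STRING")]))
```

The helper only produces text; the statement would still have to be submitted to Hive through a client of your choice.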


Comments

Yo! Thanks for the video, really insightful and concrete. 5:23 minutes of my life well spent.

ricardomarino

I didn't like the comparison between Hive and RDBMS. Hive is for processing data and an RDBMS is for storing data. You could say Hive+HDFS to avoid confusion. Anyway, thanks for the introduction!

Ayoub-adventures

Sir, could you please answer this? What approach should we take to load thousands of small 1 KB files using Hive? Do we load them one by one, or should we merge them together and load them at once, and how do we do this?

deepikakumari

what does HDFS stand for?
explanation please...

iskandarsyah

A commercial RDBMS machine can have tens of terabytes of RAM alone. An RDBMS can manage much larger datasets than 10 terabytes.

pradeep
