Install Apache Spark 2.X - Quick Setup

This video covers how to install Apache Spark 2.0 using the prebuilt package.

INSTALL SPARK 2.0: (using Prebuilt Packages)
------------------
Prereq: JDK is installed

Step 1: Download Apache Spark 2.0 and extract

Step 2: Add to Path (SPARK_HOME, PATH)

ln -s spark-2.0.1-bin-hadoop2.7 spark
nano ~/.bashrc
export SPARK_HOME=/home/yourname/spark
export PATH=$PATH:$SPARK_HOME/bin

save and exit

. ~/.bashrc

Step 3: Verify Spark is working
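A quick way to verify, assuming the symlink and PATH entries from Step 2 (a sketch; spark-shell startup output varies by version):

```shell
# Confirm the environment variables resolve
echo $SPARK_HOME        # e.g. /home/yourname/spark
which spark-shell       # should point inside $SPARK_HOME/bin

# Launch the interactive shell and run a tiny job
spark-shell
# scala> sc.parallelize(1 to 100).sum   // res0: Double = 5050.0
# scala> :quit
```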
-------------------------------------------------------

Resources:

Comments

Life saver! Perfect video. Very clear 110/100

amelhadfi

You are a genius.. so simple and so easy.. thanks man

sushilkumarsingh

Great talk!
Don’t we need to set up HDFS to share the repository between the Spark master and all the workers?
Can you share the tutorial for this?

Thx

djibb.

Hey Melvin, really appreciate your video, so I checked to see if your channel has any tutorials on Spark, and voila, a whole series, so I subbed. Anyway, I just wanted to clear something up with you: will I have to do any "building Spark with SBT"? I'm asking since a lot of blog posts/forums mention this. Thank you

gdk

Hello sir, I tried the exact same steps, but when I start spark-shell it gives the error "spark-shell unsupported major.minor version 52.0". I have 4 GB RAM with 2 CPU cores, and this is the Cloudera QuickStart VM with "Hadoop 2.6.0-cdh5.8.0"; the Java version is 1.7.0_67. My laptop RAM is 8 GB, so I can't give more than 4 GB to my VM. Is there any way to install Spark 2 on it? I am already working with Spark 1.x on it, but now I need Spark 2.

kashifihsan
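On the "unsupported major.minor version 52.0" error above: class-file major versions map to Java releases as release = major − 44, so 52 means the Spark 2.x jars need Java 8, while the VM's 1.7.0_67 JRE is too old; upgrading the JDK inside the VM (not the RAM) is the fix. A minimal sketch of the mapping:

```shell
# Class-file major version -> Java release: release = major - 44
# e.g. 52 -> Java 8, which is why a 1.7 JRE rejects Spark 2.x jars
major=52
echo "needs Java $((major - 44))"   # prints: needs Java 8
```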

In "spark-2.0.1-bin-hadoop2.7",
is Spark built on top of Hadoop?
Can I use Mahout with it?

saisiddhantpanda

Do I need to have already installed Hadoop to follow this installation?

ManolisFragiadoulakis

Well, I followed the same steps to install Spark, but I ran into this error: "Failed to initialize compiler: object java.lang.Object in compiler mirror not found". BTW, I am on Java 1.9. Is that an issue? Do we have to install Scala before working with Spark?

Thanks

c.yaswanth

Dear Sir,

Spark does not support Java 1.7, so how can I install Spark on Hadoop 2.6?

chandankumar-xsfy

Thanks for the video. BTW, you shouldn't need to sudo to edit your own .bashrc as you own it.

chopiesthechook

A couple of comments, at least from doing this on Fedora:
1) If you install as the root user and place Spark in a path like /usr/local/sbin, the error below happens, and you have to chown -R to a non-root user, or chmod -R 777 on metastore_db:
Booting Derby (version The Apache Software Foundation - Apache Derby - 10.12.1.1 - (1704137)) instance
[..]
ERROR PoolWatchThread: Error in trying to obtain a connection. Retrying in 7000ms
java.sql.SQLException: A read-only user or a user in a read-only database is not permitted to disable read-only mode on a connection.

2) You can create a .sh file in /etc/profile.d/ with the two export commands to make them work for all users.

RobbieTheK
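The system-wide variant suggested above might look like this (a sketch; /home/yourname/spark is the path used in the video and should match your install location):

```shell
# /etc/profile.d/spark.sh -- sourced by login shells for all users
export SPARK_HOME=/home/yourname/spark
export PATH=$PATH:$SPARK_HOME/bin
```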