Hadoop Certification - CCA - Introduction of pyspark (python on spark)

Comments

df = sqlContext.load... didn't work for pyspark version 1.6.0

Instead, the following worked:
df = sqlContext.read.jdbc(url=jdbcurl, table="departments")
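
For context, a minimal sketch of how jdbcurl might be defined around that call (the host, database name, and credentials below are assumptions, not values from the video):

# hypothetical connection string; adjust host, database and credentials for your setup
jdbcurl = "jdbc:mysql://quickstart.cloudera:3306/retail_db?user=retail_dba&password=cloudera"
df = sqlContext.read.jdbc(url=jdbcurl, table="departments")
df.show()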

behindthescenes-parenting

Hi Durga... I came across one doubt while going through this video. With HiveContext, how will Spark resolve the departments table, since it could exist in multiple Hive databases? Will it refer to the default database? Thanks in advance.
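
A minimal sketch of how to be explicit about the database (retail_db is a hypothetical name): an unqualified table name resolves against the current database, which is default unless you switch it.

from pyspark.sql import HiveContext

sqlContext = HiveContext(sc)
sqlContext.sql("USE retail_db")                                # switch the current database
depts = sqlContext.sql("SELECT * FROM retail_db.departments")  # or qualify the table name
depts.show()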

naveendasharatha

Quick note on mysql jdbc driver for remote DB:

Encountered an error of 'java.sql.SQLException: No suitable driver'
while I tried with:
pyspark --driver-class-path

So I changed it slightly to the following, which worked:

pyspark --conf --driver-class-path --jars
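
A hedged sketch of a full launch command (the --conf value in the comment is elided, so only the classpath flags are shown; the jar path is a hypothetical location of the MySQL connector):

# pass the connector jar to both the driver classpath and the executors
pyspark --driver-class-path /usr/share/java/mysql-connector-java.jar \
        --jars /usr/share/java/mysql-connector-java.jar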

apdevaraj

Hi Durga,
When I try the example to run pyspark against an external database using JDBC, the line below throws an error:
df = sqlContext.load(source="jdbc", url=jdbcurl, dbtable="departments")

I get the error 'SQLContext' object has no attribute 'load'. I am running against pyspark 1.2.1.

But when I run the same against pyspark 1.5.0 it works OK.
Can you please advise how to run JDBC against pyspark 1.2.1?
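
For pyspark 1.5.x, a hedged alternative sketch using the DataFrameReader form (it assumes jdbcurl is already defined as the JDBC connection string):

# generic loader form; avoids the older sqlContext.load(...) call
df = sqlContext.read.format("jdbc").options(url=jdbcurl, dbtable="departments").load()
df.show()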

vikaskanchan

Hi Durga,

Instead of using a JDBC query in pyspark to validate the data in the external DB, I can also use sqoop eval, right? Please clarify.
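
A hedged sketch of the sqoop eval alternative (host, database name, and credentials are assumptions): eval runs the query directly against the external database and prints the result.

sqoop eval \
  --connect jdbc:mysql://quickstart.cloudera:3306/retail_db \
  --username retail_dba \
  --password cloudera \
  --query "select * from departments"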

shaikmuhammadabdulrazak

Hi, while trying to print the dept records I am getting the following error:
An error occurred while trying to connect to the Java server

From the internet I found a few solutions and tried to apply them as:
>>> from py4j.java_gateway import JavaGateway
>>> gateway = JavaGateway()

Still I am getting the same error. Any help will be appreciated.

bhavanipatil

Hello Sir! How can I set up the Hadoop tools in the browser? My Cloudera QuickStart browser is totally empty by default (there is no Hadoop/YARN Resource Manager option). Thanks for your help.

garamigergely

Hello Durga Sir,
Thank you for your wonderful and helpful videos. I have installed 1.2 step by step as you mentioned, but I faced one issue where the Spark context was not available when I launched the Spark shell. Below is the exception that I got.

must be set!

To fix this I set the below parameter to false


After changing this property I am able to access sc in both Scala and pyspark. Can I continue my preparation with this property set to false, or will it hamper my preparation for the certification? I spent considerable time but could not fix the issue and hence just set it to false. Please advise.

Regards,
Pavan

pavananantharama

How do I connect to a database using Spark 1.2.0? I am not getting any syntax as explained in the above video. Please help me on this.

sureshpathak

Hello Durga, I am facing the following error:

sqlContext = HiveContext(sc)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'sc' is not defined

Can you help?
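
A minimal sketch of creating the context by hand when the shell did not provide sc (the app name is arbitrary):

from pyspark import SparkContext, SparkConf
from pyspark.sql import HiveContext

# build sc manually before constructing the HiveContext
sc = SparkContext(conf=SparkConf().setAppName("pyspark shell"))
sqlContext = HiveContext(sc)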

yadavrahul

Hi Durga,
Can you also demonstrate how to install the IPython notebook on the Cloudera VM?
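
A hedged sketch, assuming IPython/Jupyter is already installed on the VM: these variables make pyspark start inside a notebook instead of the plain shell (Spark 1.4+; older 1.x releases used IPYTHON=1 and IPYTHON_OPTS="notebook").

export PYSPARK_DRIVER_PYTHON=ipython
export PYSPARK_DRIVER_PYTHON_OPTS="notebook"
pyspark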

sidhartharay

When I am trying to start pyspark I am facing this problem:
Remoting: Remoting shut down
16/07/22 15:45:20 INFO Remoting shut down.

Kkanthsri

For any technical discussions or doubts, please use our forum - discuss.itversity.com
For practicing on state of the art big data cluster, please sign up on - labs.itversity.com
The lab is under free preview until 12/31/2016; after that, subscription
charges are $14.99 per 31 days, $34.99 per 93 days and $54.99 per 185 days.

itversity

Hi Durga,

Can you make a video explaining optimization in Spark SQL using Python?

RahulJain

java.lang.RuntimeException: native snappy library not available: this version of libhadoop was built without snappy support.
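
A hedged workaround sketch for this kind of error (the HDFS paths are hypothetical): when libhadoop is built without Snappy support, writing with a pure-Java codec such as Gzip avoids the native library.

# read some data and write it back compressed with Gzip instead of Snappy
orders = sc.textFile("/user/cloudera/orders")
orders.saveAsTextFile("/user/cloudera/orders_gz",
    compressionCodecClass="org.apache.hadoop.io.compress.GzipCodec")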

ashishgambhir

Hi Durga,

I am trying to run the code below:

from pyspark.sql import HiveContext
sqlContext = HiveContext(sc)
depts = sqlContext.sql("select * from departments")

u'Table not found: departments; line 1 pos 14'

Can you please help?
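
A short diagnostic sketch (retail_db is a hypothetical database name): the table may live in a non-default Hive database, or the HiveContext may not be picking up hive-site.xml, so first list what Spark can actually see and then qualify the table explicitly.

sqlContext.sql("SHOW DATABASES").show()
sqlContext.sql("SHOW TABLES").show()
depts = sqlContext.sql("SELECT * FROM retail_db.departments")  # qualified table name
depts.show()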

praznatejaballa