03-18-2020
01:48 AM
I changed the port number from 8888 to 9083 and the connection now works, but when I try to show the query result with df.show() I get:

df.show()
Traceback (most recent call last):
  File "<ipython-input-12-1a6ce2362cd4>", line 1, in <module>
    df.show()
  File "C:\spark-2.3.2-bin-hadoop2.7\spark-2.3.2-bin-hadoop2.7\python\pyspark\sql\dataframe.py", line 350, in show
    print(self._jdf.showString(n, 20, vertical))
  File "C:\spark-2.3.2-bin-hadoop2.7\spark-2.3.2-bin-hadoop2.7\python\lib\py4j-0.10.7-src.zip\py4j\java_gateway.py", line 1257, in __call__
    answer, self.gateway_client, self.target_id, self.name)
  File "C:\spark-2.3.2-bin-hadoop2.7\spark-2.3.2-bin-hadoop2.7\python\pyspark\sql\utils.py", line 79, in deco
    raise IllegalArgumentException(s.split(': ', 1)[1], stackTrace)
IllegalArgumentException: 'java.net.UnknownHostException: quickstart.cloudera'

Can you help me with this? @HadoopHelp
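The UnknownHostException means the Windows client cannot resolve the name quickstart.cloudera (the metastore hands back HDFS locations by hostname, not IP). A minimal check of whether the name resolves, assuming the cluster IP 10.1.1.70 from the code in the other post; check_host is a hypothetical helper, not part of any Spark API:

```python
import socket

def check_host(hostname):
    """Return the resolved IPv4 address, or None if the name cannot be resolved."""
    try:
        return socket.gethostbyname(hostname)
    except socket.gaierror:
        return None

# The host from the traceback; if this prints None, map it in the
# Windows hosts file (C:\Windows\System32\drivers\etc\hosts), e.g.
#   10.1.1.70  quickstart.cloudera
# (the IP is an assumption taken from the earlier post).
print(check_host("quickstart.cloudera"))
```

After adding the hosts-file entry, df.show() should be able to reach the data node by name.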
03-18-2020
12:11 AM
I am using Spark 2.3.2 and I am trying to read tables from a database. I established the Spark connection, but I am unable to read the database tables from Cloudera HUE, and I cannot query them in PySpark either.
Here is my code:
import findspark
findspark.init('C:\spark-2.3.2-bin-hadoop2.7\spark-2.3.2-bin-hadoop2.7')

import pyspark
from pyspark.sql import SparkSession

spark = SparkSession.builder.config("hive.metastore.uris", "thrift://10.1.1.70:8888").enableHiveSupport().getOrCreate()
#spark.catalog.listTables("tp_policy_operation")

import pandas as pd
sc = spark.sparkContext
sc

from pyspark import SparkContext
from pyspark.sql import SQLContext

sql_sc = SQLContext(sc)
SparkContext.setSystemProperty("hive.metastore.uris", "thrift://10.1.1.70:8888")
spark.sql("SELECT * FROM tp_policy_operation")
###### The error I am getting

Traceback (most recent call last):
  File "<ipython-input-4-8f0aa5852b01>", line 16, in <module>
    spark.sql("SELECT * FROM tp_policy_operation")
  File "C:\spark-2.3.2-bin-hadoop2.7\spark-2.3.2-bin-hadoop2.7\python\pyspark\sql\session.py", line 710, in sql
    return DataFrame(self._jsparkSession.sql(sqlQuery), self._wrapped)
  File "C:\spark-2.3.2-bin-hadoop2.7\spark-2.3.2-bin-hadoop2.7\python\lib\py4j-0.10.7-src.zip\py4j\java_gateway.py", line 1257, in __call__
    answer, self.gateway_client, self.target_id, self.name)
  File "C:\spark-2.3.2-bin-hadoop2.7\spark-2.3.2-bin-hadoop2.7\python\pyspark\sql\utils.py", line 69, in deco
    raise AnalysisException(s.split(': ', 1)[1], stackTrace)
AnalysisException: 'org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.thrift.transport.TTransportException;'
Kindly help me resolve this issue, or suggest the changes needed in the code above.
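One likely cause of the TTransportException is the port: 8888 is typically Hue's web UI, while the Hive metastore thrift service listens on 9083 by default. A minimal sketch of the corrected metastore URI, assuming the default metastore port; the host IP and table name are taken from the code above, and the session-building calls are shown in comments because they require a reachable cluster:

```python
# Sketch: point hive.metastore.uris at the metastore's thrift port,
# not at Hue's web port 8888.
METASTORE_HOST = "10.1.1.70"   # IP from the code above
METASTORE_PORT = 9083          # Hive metastore default (an assumption)

metastore_uri = f"thrift://{METASTORE_HOST}:{METASTORE_PORT}"

# With pyspark available, the session would then be built once, before
# any other SparkSession exists, so the setting actually takes effect:
#   spark = (SparkSession.builder
#            .config("hive.metastore.uris", metastore_uri)
#            .enableHiveSupport()
#            .getOrCreate())
#   spark.sql("SELECT * FROM tp_policy_operation").show()
print(metastore_uri)
```

Note that getOrCreate() reuses an already-running session, so config values set after a session exists are silently ignored; restarting the Python kernel before rebuilding the session avoids that trap.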
Labels:
- Apache Spark
- Cloudera Data Explorer