Member since: 08-23-2018
36 Posts
1 Kudos Received
0 Solutions
12-09-2019
05:11 PM
Thanks for the answer.
12-09-2019
03:02 AM
https://docs.cloudera.com/documentation/enterprise/6/6.1/topics/admin_ha_hiveserver2.html#concept_u4b_c5d_wv

My cluster is Cloudera CDH 6.1 Express Edition, and I am trying to set up HiveServer2 HA. I added a new HiveServer2 instance and set the address in the HiveServer2 Load Balancer property, but the load balancer server did not open the proxy port. Doesn't a managed cluster start the proxy server itself? Do I have to configure the proxy myself? Does "managed cluster" mean Enterprise Edition? If so, can I use an L4 load balancer instead of HAProxy?
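For reference, this is the kind of HAProxy configuration I would expect to need; the hostnames and ports below are placeholders, not my actual values:

# Minimal HAProxy sketch for HiveServer2 (placeholder hosts/ports).
# Clients connect to port 10001 on the proxy host; HAProxy balances
# TCP connections across the two HiveServer2 instances on port 10000.
listen hiveserver2
    bind *:10001
    mode tcp
    option tcplog
    balance source
    server hiveserver2_1 hs2-host1.example.com:10000 check
    server hiveserver2_2 hs2-host2.example.com:10000 check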
Labels:
- Apache Hive
11-06-2019
10:13 PM
Thanks. However, I have already read them. I am already connecting to Hive from Zeppelin using JDBC. What I want is to query Hive tables with SparkSQL, and I am wondering whether the metastore will have problems if I use SparkSQL on a cluster that runs Hive on Spark. For example:

%spark
val df = spark.read.format("csv").option("header", "true")
  .option("inferSchema", "true").load("/somefile.csv")
df.createOrReplaceTempView("csvTable")

%spark.sql
select *
from csvTable lt
join hiveTable rt
on lt.col = rt.col
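The same join can also be expressed directly in Scala; a sketch, assuming hiveTable already exists in the Hive Metastore:

%spark
import org.apache.spark.sql.functions.col
// DataFrame form of the join above; spark.table() resolves both the
// temp view "csvTable" and the Metastore table "hiveTable".
val joined = spark.table("csvTable").as("lt")
  .join(spark.table("hiveTable").as("rt"), col("lt.col") === col("rt.col"))
joined.show()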
11-05-2019
10:32 PM
I am using a CDH 6.1.1 cluster that is configured to use Spark as the execution engine for Hive. Is there anything wrong with using SparkSQL on this cluster? Is it OK to create Hive tables and change data using SparkSQL? Since SparkSQL uses the Hive Metastore, I suspect there may be a conflict between SparkSQL and Hive on Spark. In addition, please point me to documentation on how to integrate Cloudera CDH Hive with Apache Zeppelin's Spark interpreter. Thank you.
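For context, this is roughly what I mean by "using SparkSQL"; a minimal sketch of a standalone Spark job, assuming hive-site.xml is on the Spark classpath (the table name below is illustrative):

import org.apache.spark.sql.SparkSession

// enableHiveSupport() makes spark.sql() read and write through the
// same Hive Metastore catalog that Hive on Spark uses.
val spark = SparkSession.builder()
  .appName("sparksql-on-hive-metastore")
  .enableHiveSupport()
  .getOrCreate()

// Both statements go through the shared Metastore.
spark.sql("CREATE TABLE IF NOT EXISTS demo (id INT, name STRING)")
spark.sql("SELECT count(*) FROM demo").show()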
Labels:
- Apache Hive
- Apache Spark
- Apache Zeppelin
08-04-2019
08:54 PM
1 Kudo
Thank you. Yes, it is not a problem with the UDF itself. I have two HiveServer2 hosts, but I had registered the UDF on only one of them. That is probably the reason.
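For anyone hitting the same issue: a temporary function exists only in the session that created it, so each HiveServer2 instance needs its own registration. A sketch of the permanent-function alternative, with a placeholder HDFS path for the jar:

-- Registers the function in the Metastore so every HiveServer2
-- instance can resolve it; the jar URI below is a placeholder.
CREATE FUNCTION udf_map_tojson AS 'bigdata.hive.udf.MapToJsonString'
USING JAR 'hdfs:///path/to/hive-udf-20190701.jar';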
06-30-2019
08:50 PM
I use Hive on Spark. I wrote a UDF; the jar file name is 'hive-udf-20190701.jar'.

I set the Hive configuration (Hive Service Advanced Configuration Snippet (Safety Valve) for hive-site.xml):

hive.reloadable.aux.jars.path = /usr/local/bigdata/hive-udf

I uploaded the jar file to the HiveServer2 filesystem directory: /usr/local/bigdata/hive-udf-20190701.jar

I created the function in Hue:

reload;
drop temporary function udf_map_tojson;
create temporary function udf_map_tojson as 'bigdata.hive.udf.MapToJsonString';

Then I tested the UDF:

select udf_map_tojson(str_to_map("k1:v1,k2:v2"));

But an exception is raised:

Error while processing statement: FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.spark.SparkTask. Spark job failed due to: Job aborted due to stage failure: Aborting TaskSet 3.0 because task 0 (partition 0) cannot run anywhere due to node and executor blacklist. Most recent failure: Lost task 0.0 in stage 3.0 (TID 3, worker09.example.com, executor 1): UnknownReason Blacklisting behavior can be configured via spark.blacklist.*.

What am I doing wrong?
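For completeness, a hypothetical sketch of what such a UDF class could look like; the actual class inside hive-udf-20190701.jar is not shown here, so the implementation below is an assumption (no escaping of quotes, for brevity):

package bigdata.hive.udf

import scala.collection.JavaConverters._
import org.apache.hadoop.hive.ql.exec.UDF

// Converts a Hive map<string,string> into a JSON-style string.
// Hive's UDF bridge resolves the evaluate() method by reflection.
class MapToJsonString extends UDF {
  def evaluate(input: java.util.Map[String, String]): String = {
    if (input == null) return null
    input.asScala
      .map { case (k, v) => "\"" + k + "\":\"" + v + "\"" }
      .mkString("{", ",", "}")
  }
}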
Labels:
- Apache Hive
- Apache Spark
01-28-2019
05:33 PM
Thanks. I'll try it the way you told me.
01-16-2019
11:16 PM
I want to create a table with the complex type removed from the Avro data, keeping the same schema otherwise. This is because Impala does not skip complex types. The platform is CDH 6.0.1.
For example:
Employee(raw data)
- name : string
- age : int
- additional-info : map<string, string>
Employee(Hive table 1)
- name : string
- age : int
- additional-info : map<string, string>
Employee_For_Impala(Hive table 2)
- name : string
- age : int
Pipeline:
KafkaProducer(Avro Bytes) - Kafka - Flume - HDFS - Hive(Impala)
Flume: KafkaSource - Channel - Sink(AvroEventSerializer$Builder)
I tried changing the sink (serializer.schemaURL, removing the complex-type field), but it failed.
I am now trying to use Morphlines, but that is failing too.
Is there a better way?
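One workaround I am considering, sketched below: once the full Avro-backed table is queryable from Hive, materialize an Impala-friendly copy without the map column (table names follow the example above; STORED AS PARQUET is my assumption):

-- Sketch: copy the Avro table minus the complex-typed column
-- so Impala can query the result.
CREATE TABLE employee_for_impala STORED AS PARQUET AS
SELECT name, age
FROM employee;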
Labels:
- Apache Flume
- Apache Kafka
- HDFS