Member since: 10-04-2016
Posts: 243
Kudos Received: 281
Solutions: 43

My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 1172 | 01-16-2018 03:38 PM |
 | 6139 | 11-13-2017 05:45 PM |
 | 3032 | 11-13-2017 12:30 AM |
 | 1518 | 10-27-2017 03:58 AM |
 | 28427 | 10-19-2017 03:17 AM |
02-09-2017
06:39 PM
@ed day If you do not specify a file system scheme in the jar path, Hive assumes the local file system. So this:

hive.aux.jars.path=/home/ed/Downloads/serde/json-serde-1.3.7-jar-with-dependencies.jar

gets translated to:

hive.aux.jars.path=file:///home/ed/Downloads/serde/json-serde-1.3.7-jar-with-dependencies.jar

In your case, try explicitly adding file:// before the jar path. If your jar is in HDFS, then use:

hive.aux.jars.path=hdfs://master.royble.co.uk/jars/json-serde-1.3.7-jar-with-dependencies.jar

P.S. Please verify that the Hadoop user you are using to execute these has read privileges on your local jar path.
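If restarting HiveServer2 to pick up hive.aux.jars.path is not convenient, a session-level ADD JAR is usually enough for testing. A minimal sketch, assuming the OpenX SerDe class bundled in that jar; the table name and columns are hypothetical:

-- Register the jar for the current session only
ADD JAR /home/ed/Downloads/serde/json-serde-1.3.7-jar-with-dependencies.jar;

-- Use the SerDe shipped in that jar (table and columns are illustrative)
CREATE TABLE json_events (id INT, payload STRING)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe';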
02-09-2017
02:09 PM
One of my Talend packages is failing when it tries to close the Hive connection.
Here is the log snapshot:
[FATAL]: alpha.talendPackage - tHiveClose_1 Error while cleaning up the server resources
Exception in component tHiveClose_1
java.sql.SQLException: Error while cleaning up the server resources
at org.apache.hive.jdbc.HiveConnection.close(HiveConnection.java:729)
at alpha.talendPackage.tHiveClose_1Process(TalendPackage.java:3274)
at alpha.talendPackage$1tRunJob_1Thread.run(TalendPackage.java:2983)
at routines.system.ThreadPoolWorker.runIt(TalendThreadPool.java:159)
at routines.system.ThreadPoolWorker.runWork(TalendThreadPool.java:150)
at routines.system.ThreadPoolWorker.access$0(TalendThreadPool.java:145)
at routines.system.ThreadPoolWorker$1.run(TalendThreadPool.java:122)
at java.lang.Thread.run(Unknown Source)
Caused by: org.apache.thrift.transport.TTransportException: org.apache.http.NoHttpResponseException: abc.com.net:10001 failed to respond
at org.apache.thrift.transport.THttpClient.flushUsingHttpClient(THttpClient.java:297)
at org.apache.thrift.transport.THttpClient.flush(THttpClient.java:313)
at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:65)
at org.apache.hive.service.cli.thrift.TCLIService$Client.send_CloseSession(TCLIService.java:173)
at org.apache.hive.service.cli.thrift.TCLIService$Client.CloseSession(TCLIService.java:165)
at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.apache.hive.jdbc.HiveConnection$SynchronizedHandler.invoke(HiveConnection.java:1388)
at com.sun.proxy.$Proxy9.CloseSession(Unknown Source)
at org.apache.hive.jdbc.HiveConnection.close(HiveConnection.java:727)
... 7 more
I verified the connectivity between the Talend server and the Hive server (abc.com.net:10001), and also verified connectivity on the cluster via Knox. What really puzzles me is that it only fails for this particular Talend job while the rest of the jobs work absolutely fine. Thanks in advance.
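Since port 10001 behind Knox suggests HiveServer2 in HTTP transport mode, one way to reproduce the close outside Talend is a quick Beeline session. A hedged sketch, where httpPath=cliservice is an assumption; a Knox topology may use a different path (and typically ssl=true):

# Open a session over the HTTP transport, run a trivial query, and let
# Beeline close the session; if the close hangs here too, the problem is
# on the server/Knox side rather than in the Talend job.
beeline -u "jdbc:hive2://abc.com.net:10001/default;transportMode=http;httpPath=cliservice" -e "SELECT 1"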
Labels:
- Apache Hive
- Apache Tez
02-07-2017
02:04 PM
@William Gonzalez - Thank you for the details. For now I will just stick with the exam objectives, but I am definitely interested in doing things from scratch to get an in-depth understanding.
02-07-2017
01:14 PM
For HDPCD, we could use the Hortonworks Sandbox and work through the exam objectives to prepare ourselves. For HDPCA, can we use the same Sandbox and try to implement the tasks in the exam objectives? Since I am not planning on taking the official Hortonworks training course, I would like to know:
1. What basic setup (Sandbox, or a VM with only a Linux flavor and no Hadoop) do I need to start with?
2. How should I go about preparing for HDPCA, especially what do I need to start with (other than referring to the exam objectives)?
Labels:
- Apache Hadoop
- Training
01-12-2017
10:00 PM
2 Kudos
It is confirmed that nested objects are not supported in JSON via the Upload Table function. Here is an excerpt from the official documentation:

The following JSON format is supported:

[ { "col1Name" : "value-1-1", "col2Name" : "value-1-2" },
  { "col1Name" : "value-2-1", "col2Name" : "value-2-2" } ]

The file should contain a valid JSON array containing any number of JSON objects. Each JSON object should contain column names as properties and column values as property values. The names, number, and order of the columns in the table are decided from the first object of the JSON file. The names and datatypes of columns can be edited during the preview step. If some JSON objects have extra properties, they are ignored; if they are missing some of the properties, null values are assumed. Note that the extension of the file cannot be ".json".
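As a workaround for nested documents, you can bypass Upload Table and define the table yourself over a JSON SerDe with complex types. A minimal sketch, assuming the OpenX JSON SerDe is available on the aux-jars path; the document shape, table name, and location are hypothetical:

-- Hypothetical input line: {"id": 1, "tags": ["a", "b"], "owner": {"name": "x"}}
CREATE EXTERNAL TABLE nested_events (
  id    INT,
  tags  ARRAY<STRING>,          -- a JSON array maps to a Hive ARRAY
  owner STRUCT<name:STRING>     -- a nested object maps to a Hive STRUCT
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION '/tmp/nested_json';    -- illustrative HDFS directory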
01-10-2017
04:24 PM
@Pardeep @vsithannan - I would appreciate any input from you guys. I have tried to find the answer on multiple forums to no avail, so I am finally posting it here.
01-10-2017
04:22 PM
1 Kudo
When I try to upload a simple JSON file using Upload Table in Ambari > Hive View, it works. When I try to upload a nested JSON file (containing one or more arrays), I get "E090 Row data cannot have an array. [IllegalArgumentException]". I am beginning to wonder whether Upload Table supports loading complex nested JSON at all. I have attached the complex.txt file that I am trying to load; please rename it to .json if you want to replicate the issue. Thank you.
Labels:
- Apache Ambari
- Apache Hive
01-10-2017
07:18 AM
"perform a join on two or more datasets" - implies that there are more than 2 data sets involved and thus you may ave to write a solution which can comprise only a Map Join or only a Reduce Join or a combination of both. In essence, if the data sets are too large and could result in memory issues, then bloom filter is the route to take. So from a conceptual perspective, it is good to know Bloom Filter even if it is not specifically mentioned in Exam Objectives.
01-10-2017
07:13 AM
2 Kudos
@Ramesh Raja In the exam you may or may not be required to remove the header; it is better to know how to do it and feel more comfortable. To remove the header in Hive, use tblproperties:

CREATE TABLE test (
  name  STRING,
  email STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
TBLPROPERTIES ("skip.header.line.count"="1");
-- now load the data into the table (see the sketch below)

To remove the header in Pig, filter out the row whose first field holds the header value:

A = LOAD 'data.csv' USING PigStorage(',');
B = FILTER A BY $0 != 'name';  -- assumes the header row starts with the column name 'name'
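And the load step referenced in the Hive comment above, as a minimal sketch (the local path is illustrative):

-- Load the CSV from the local file system; the first line is skipped
-- because of skip.header.line.count.
LOAD DATA LOCAL INPATH '/tmp/data.csv' INTO TABLE test;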