Member since: 04-13-2016
Posts: 422
Kudos Received: 150
Solutions: 55
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1935 | 05-23-2018 05:29 AM |
| | 4970 | 05-08-2018 03:06 AM |
| | 1685 | 02-09-2018 02:22 AM |
| | 2716 | 01-24-2018 08:37 PM |
| | 6174 | 01-24-2018 05:43 PM |
03-22-2017
10:10 PM
@Kent Brodie In the Hive View settings section, can you change the parameter below and try again? Hive Session Parameters: `transportMode=binary;httpPath=cliservice;hive.server2.proxy.user=${username}`
Hope this helps you.
03-16-2017
01:52 AM
@Jay SenSharma We have blocked the Hive CLI for security reasons. Do you have something similar for Beeline?
03-14-2017
11:15 PM
Hi, I copied data from one cluster to another, then took the DDL from the existing cluster and ran the same DDL on the cluster holding the newly copied data. When I try to query the data, I get the error message below. Any help is highly appreciated. Thanks in advance.

Error Message: Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:17, Vertex vertex_1487041727386_3755_1_00 [Map 1] killed/failed due to:OWN_TASK_FAILURE]
ERROR : DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0
Error: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, vertexId=vertex_1487041727386_3755_1_00, diagnostics=[Task failed, taskId=task_1487041727386_3755_1_00_000016, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: java.io.IOException: java.io.IOException: ORC does not support type conversion from VARCHAR to STRING
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:181)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:172)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:168)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: java.io.IOException: java.io.IOException: ORC does not support type conversion from VARCHAR to STRING
at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:196)
at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.<init>(TezGroupedSplitsInputFormat.java:135)
at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat.getRecordReader(TezGroupedSplitsInputFormat.java:101)
at org.apache.tez.mapreduce.lib.MRReaderMapred.setupOldRecordReader(MRReaderMapred.java:149)
at org.apache.tez.mapreduce.lib.MRReaderMapred.setSplit(MRReaderMapred.java:80)
at org.apache.tez.mapreduce.input.MRInput.initFromEventInternal(MRInput.java:650)
at org.apache.tez.mapreduce.input.MRInput.initFromEvent(MRInput.java:621)
at org.apache.tez.mapreduce.input.MRInputLegacy.checkAndAwaitRecordReaderInitialization(MRInputLegacy.java:145)
at org.apache.tez.mapreduce.input.MRInputLegacy.init(MRInputLegacy.java:109)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getMRInput(MapRecordProcessor.java:408)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:128)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:149)
... 14 more
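The root cause is the last line of each trace: "ORC does not support type conversion from VARCHAR to STRING", i.e. the recreated DDL declares a column as STRING while the copied ORC files were written with VARCHAR. A minimal sketch of the usual workaround is to re-align the Hive column type with the type actually stored in the ORC files; the table and column names below are placeholders, not from the original post:

```sql
-- Hypothetical example: make the Hive column type match the ORC file schema.
-- The ORC files were written with VARCHAR(64), but the recreated DDL declared
-- the column as STRING, so change it back to VARCHAR.
ALTER TABLE copied_table CHANGE COLUMN customer_name customer_name VARCHAR(64);
```

Alternatively, recreate the table with the exact DDL (including VARCHAR lengths) from the source cluster before pointing it at the copied data.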
Labels:
- Apache Hive
- Apache Tez
03-11-2017
08:46 PM
@Rammohan Reddy If you are executing queries on a small dataset, Oracle will be much faster. If you are executing queries on a large dataset, Hive will be faster, though that again depends on the compute power available.
03-11-2017
02:55 PM
@Rammohan Reddy Which one will be faster depends on your use case. The links below compare the main differences between NoSQL and SQL. Hope they help you:
- http://www.thegeekstuff.com/2014/01/sql-vs-nosql-db/?utm_source=tuicool
- https://www.dezyre.com/article/nosql-vs-sql-4-reasons-why-nosql-is-better-for-big-data-applications/86
- https://docs.microsoft.com/en-us/azure/documentdb/documentdb-nosql-vs-sql
03-11-2017
02:40 PM
2 Kudos
@Kibrom Gebrehiwot Yes, it's possible. Use the same Kerberos details (KDC, admin principal, etc.) that you used for your HDP cluster when Kerberizing the HDF cluster.

The host-based principals will not be a problem, since each machine has a unique name; you only have to be careful when configuring the service principals. If you use service principals that include the cluster name (provided the two clusters have different names), even that won't be a problem. Example: {service}-{clustername}@{realmname}, i.e. hdfs-hadoopprod@Hortonworks.com.

But make sure your KDC is installed and configured on a reliable machine: if that machine goes down, both clusters will be affected. See the Kerberos Setup link.

By configuring both clusters against a single KDC, there is no need to separately set up trust between the two clusters in order to transfer data (DistCp, etc.). Hope this helps you.
02-17-2017
04:31 PM
1 Kudo
@Baruch AMOUSSOU DJANGBAN You can view them in the Ambari Server logs: /var/log/ambari-server/ambari-server.log.
02-13-2017
04:45 PM
3 Kudos
@suresh krish
Right now the tez.am.view-acls parameter in the Tez service is, I guess, empty. To give view access to everyone, change it to '*'. If you want only a few people (admins) to view all Tez jobs while everyone else sees only their own jobs, use the following format:

"AM view ACLs. This setting enables the specified users or groups to view the status of the AM and all DAGs that run within the AM. Format: a comma-separated list of users, a whitespace, and then a comma-separated list of groups. For example, "lhale,msmith administrators,users""

Link: https://tez.apache.org/tez_acls.html

Hope this helps you.
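For illustration, the corresponding property in tez-site.xml (set via the Tez config in Ambari) might look like this, reusing the example value from the Tez docs quoted above:

```xml
<!-- Example only: allow users lhale and msmith, plus the groups
     administrators and users, to view all DAGs in the AM. -->
<property>
  <name>tez.am.view-acls</name>
  <value>lhale,msmith administrators,users</value>
</property>
```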
02-11-2017
03:17 PM
@NLAMA There might be many reasons for the slow response:
1. NameNode metadata size.
2. NameNode disk RPM when there is huge metadata.
3. Hardware configuration (RAM, disk mounts, etc.).
4. JournalNode placement; ZKFC also plays a key role in failover.
5. Cache management.

I suggest working with Hortonworks Support to find the exact root cause and improve your performance.
02-11-2017
03:09 PM
1 Kudo
Hi @kishore sanchina There is no single command to get long-running jobs. In Ambari 2.4, the SmartSense service (1.3) provides a Zeppelin-based Activity Explorer dashboard where you can see all long-running jobs, the jobs that used the most memory, etc.

Link: https://docs.hortonworks.com/HDPDocuments/SS1/SmartSense-1.3.0/bk_user-guide/content/activity_explorer.html

Option 2: Before that was available, I wrote a bash script (a lengthy process): it gathers all job information from the ResourceManager URL via ResourceManager REST API calls and stores it in a CSV file. I then load that CSV into HDFS and create a Hive external table on top of it, run an insert to move the required columns into a final table, and run simple Hive queries to list all long-running jobs. Hope this helps you.
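A minimal sketch of the Hive side of Option 2, assuming the CSV dumped from the ResourceManager REST API has already been copied to HDFS; the column names, table name, and HDFS path below are hypothetical placeholders:

```sql
-- Hypothetical external table over the CSV exported from the RM REST API.
CREATE EXTERNAL TABLE IF NOT EXISTS rm_apps (
  application_id  STRING,
  user_name       STRING,
  start_time_ms   BIGINT,
  finish_time_ms  BIGINT,
  state           STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/tmp/rm_apps';

-- List jobs that ran for more than one hour, longest first.
SELECT application_id,
       user_name,
       (finish_time_ms - start_time_ms) / 1000 AS elapsed_seconds
FROM   rm_apps
WHERE  finish_time_ms - start_time_ms > 60 * 60 * 1000
ORDER  BY elapsed_seconds DESC;
```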