Member since: 02-09-2015
Posts: 95
Kudos Received: 8
Solutions: 9
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1974 | 08-23-2021 04:07 PM |
| | 668 | 06-30-2021 07:34 AM |
| | 695 | 06-30-2021 07:26 AM |
| | 8959 | 05-17-2019 10:27 PM |
| | 2016 | 04-08-2019 01:00 PM |
08-31-2021
01:58 PM
Hi, Apache Spark will initiate the connection to your database on that port only via JDBC, so you can open a firewall rule where the sources are your cluster nodes' IPs and the destination is your database server IP on the port you specified. Best Regards
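For illustration, here is a minimal PySpark sketch of such a JDBC read. The hostname, port, database, table, and credentials are placeholders, and the driver class shown is the common PostgreSQL one; substitute whatever matches your database. Every executor that reads a JDBC partition opens a TCP connection to that host and port, which is why the firewall rule needs all worker-node IPs as sources.

```python
from pyspark.sql import SparkSession

# Build a Spark session; in a real cluster the JDBC driver jar must be on the
# driver and executor classpaths (e.g. via --jars or spark.jars).
spark = SparkSession.builder.appName("jdbc-read-example").getOrCreate()

# Placeholder connection details: executors connect to
# db-server.example.com:5432 directly, not through the driver node.
df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:postgresql://db-server.example.com:5432/mydb")
    .option("driver", "org.postgresql.Driver")
    .option("dbtable", "public.my_table")
    .option("user", "my_user")
    .option("password", "my_password")
    .load()
)

df.show(5)
```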
... View more
08-31-2021
01:52 PM
Hi, do you have Apache Ranger installed? If yes, check that the right policies are added under the YARN service and that the Ranger UserSync service is configured and syncing AD users and groups. Best Regards
... View more
08-31-2021
01:31 PM
Hi, can you post the error please? Also, could you please clarify the below: is Kerberos enabled on your cluster? Did you enable the HDFS extension for Druid? What data type are you trying to read from HDFS? Best Regards
... View more
08-31-2021
01:15 PM
1 Kudo
Hi, with Hadoop 3 there is an intra-DataNode disk balancer in addition to the balancer across DataNodes, which can help you distribute and balance the data in your cluster. The recommended setup is certainly to have all DataNodes with the same number and size of disks, but it is possible to have different configurations per DataNode; you will just need to rebalance quite often, which consumes compute and network resources. Another thing to consider when you have disks of different sizes is the "DataNode volume choosing policy", which defaults to round robin; you should consider switching it to the available-space policy instead. I also suggest you read this article from Cloudera: https://blog.cloudera.com/how-to-use-the-new-hdfs-intra-datanode-disk-balancer-in-apache-hadoop/ Best Regards
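As a rough sketch of the intra-DataNode balancing workflow described in that article, the small Python wrapper below drives the `hdfs diskbalancer` CLI. It assumes shell access on a host with the hdfs client configured; the hostname and plan file path are placeholders (use the plan location printed by the plan step in your environment).

```python
import subprocess

# Hypothetical DataNode hostname -- replace with one of your own nodes.
datanode = "datanode01.example.com"

# Step 1: generate a balancing plan for that DataNode. The plan is written as
# a JSON file and its location is printed by the command.
subprocess.run(["hdfs", "diskbalancer", "-plan", datanode], check=True)

# Step 2: execute the generated plan. The path below is a placeholder; use the
# plan file reported by the previous command.
plan_file = "/system/diskbalancer/<timestamp>/%s.plan.json" % datanode
subprocess.run(["hdfs", "diskbalancer", "-execute", plan_file], check=True)

# Step 3: check the progress of the data moves on that DataNode.
subprocess.run(["hdfs", "diskbalancer", "-query", datanode], check=True)
```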
... View more
08-23-2021
04:07 PM
Hi, can you use beeline, run the command below, and then recreate the table: set parquet.column.index.access=false; This should make Hive map the data in your files by column name instead of by the column positions from your CREATE TABLE statement. Hope this works for you. Best Regards
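If you would rather script the recreation than type it in beeline, here is a rough sketch using the PyHive client (assumed to be installed, with HiveServer2 reachable). The host, credentials, table name, columns, and location are placeholder examples only.

```python
from pyhive import hive

# Connect to HiveServer2 -- host/port/username are placeholders. In a
# Kerberized cluster you may also need auth='KERBEROS' and
# kerberos_service_name='hive'.
conn = hive.Connection(host="hiveserver2.example.com", port=10000,
                       username="myuser", database="default")
cur = conn.cursor()

# Session-level setting: map Parquet columns by name instead of by position.
cur.execute("SET parquet.column.index.access=false")

# Recreate the table definition on top of the existing Parquet files.
# Table name, columns, and location are hypothetical examples.
cur.execute("DROP TABLE IF EXISTS my_parquet_table")
cur.execute("""
    CREATE EXTERNAL TABLE my_parquet_table (
        id BIGINT,
        name STRING
    )
    STORED AS PARQUET
    LOCATION '/data/my_parquet_table'
""")
```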
... View more
06-30-2021
11:43 PM
You can replace the Sentry part of your script with the Apache Ranger REST API to create/update/delete Ranger policies; there is an example here: Ranger RestAPIs for Creating, Updating, Deleting, and Searching Policies in Big SQL - Hadoop Dev (ibm.com)
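For reference, a minimal Python sketch of creating a policy through Ranger's public REST API is below. The Ranger Admin URL, credentials, service name, and policy details are placeholders, and the exact payload fields may differ slightly between Ranger versions.

```python
import requests

# Placeholders -- point these at your Ranger Admin instance and service.
RANGER_URL = "https://ranger-admin.example.com:6182"
AUTH = ("admin", "admin_password")

# Example Hive policy granting SELECT on a database/table to a group.
policy = {
    "service": "cm_hive",              # name of your Ranger Hive service
    "name": "example_select_policy",
    "isEnabled": True,
    "resources": {
        "database": {"values": ["mydb"], "isExcludes": False, "isRecursive": False},
        "table":    {"values": ["mytable"], "isExcludes": False, "isRecursive": False},
        "column":   {"values": ["*"], "isExcludes": False, "isRecursive": False},
    },
    "policyItems": [
        {
            "groups": ["analysts"],
            "accesses": [{"type": "select", "isAllowed": True}],
            "delegateAdmin": False,
        }
    ],
}

# Create the policy via the v2 public API; use PUT .../policy/<id> to update
# and DELETE .../policy/<id> to remove it.
resp = requests.post(
    f"{RANGER_URL}/service/public/v2/api/policy",
    json=policy,
    auth=AUTH,
    verify=False,  # configure proper CA verification in real use
)
resp.raise_for_status()
print(resp.json())
```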
... View more
06-30-2021
07:34 AM
Make sure that you are using an Oracle JDBC driver version that is compatible with the Oracle database version you are connecting to.
... View more
06-30-2021
07:26 AM
You can check Kafka MirrorMaker here: Set up MirrorMaker in Cloudera Manager. Also, if the 2 clusters are secured via Kerberos and reside in 2 different realms, you need to make sure there is trust between these 2 Kerberos realms.
... View more
06-30-2021
07:19 AM
I assume you are using the Capacity Scheduler, not the Fair Scheduler; that is why queues won't take available resources from other queues. You can read more about that here: Comparison of Fair Scheduler with Capacity Scheduler | CDP Public Cloud (cloudera.com).
... View more
06-07-2021
11:09 PM
Following are the configurations for connecting Apache Ranger with LDAP/LDAPS. There's an important tool that will help identify some settings in your AD: AD Explorer - Windows Sysinternals | Microsoft Docs.
This configuration will sync LDAP users and link them with their LDAP groups every 12 hours, so later from Apache Ranger, you can give permission based on LDAP groups as well.
For connecting using LDAPS, ensure you have the proper certificates added in the same server that contains the Ranger's UserSync service.
| Configuration Name | Configuration Value | Comment |
|---|---|---|
| ranger.usersync.source.impl.class | org.apache.ranger.ldapusersync.process.LdapUserGroupBuilder | |
| ranger.usersync.sleeptimeinmillisbetweensynccycle | 12 hour | |
| ranger.usersync.ldap.url | ldaps://myldapserver.example.com | ldaps or ldap based on your LDAP security |
| ranger.usersync.ldap.binddn | myuser@example.com | |
| ranger.usersync.ldap.ldapbindpassword | mypassword | |
| ranger.usersync.ldap.searchBase | OU=hadoop,DC=example,DC=com | You can browse your AD and check which OU you want Ranger to sync |
| ranger.usersync.ldap.user.searchbase | OU=hadoop2,DC=example,DC=com;OU=hadoop,DC=example,DC=com | You can browse your AD and check which OU you want Ranger to sync; you can also add 2 OUs and separate them with ; |
| ranger.usersync.ldap.user.objectclass | user | Double-check this against your AD |
| ranger.usersync.ldap.user.searchfilter | (memberOf=CN=HADOOP_ACCESS,DC=example,DC=com) | If you want to filter specific users to be synced into Ranger rather than your entire AD |
| ranger.usersync.ldap.user.nameattribute | sAMAccountName | Double-check this against your AD |
| ranger.usersync.ldap.user.groupnameattribute | memberOf | Double-check this against your AD |
| ranger.usersync.user.searchenabled | true | |
| ranger.usersync.group.searchbase | OU=hadoop,DC=example,DC=com | You can browse your AD and check which OU you want Ranger to sync |
| ranger.usersync.group.objectclass | group | Double-check this against your AD |
| ranger.usersync.group.searchfilter | (cn=hadoop_*) | If you want to sync specific groups, not all AD groups |
| ranger.usersync.group.nameattribute | cn | Double-check this against your AD |
| ranger.usersync.group.memberattributename | member | Double-check this against your AD |
| ranger.usersync.group.search.first.enabled | true | |
| ranger.usersync.truststore.file | /path/to/truststore-file | |
| ranger.usersync.truststore.password | TRUST_STORE_PASSWORD | |
Here is a helpful link on how to construct complex LDAP search queries. Search Filter Syntax - Win32 apps | Microsoft Docs
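Before plugging these values into Ranger, it can save time to verify the bind account, search bases, and search filters directly against AD. Below is a rough Python sketch using the ldap3 library (assumed to be installed, with the LDAPS certificate trusted by the client); the server, bind account, bases, and filters are the placeholder values from the table above.

```python
from ldap3 import Server, Connection, ALL, SUBTREE

# Placeholder values matching the table above.
server = Server("ldaps://myldapserver.example.com", get_info=ALL)
conn = Connection(server, user="myuser@example.com",
                  password="mypassword", auto_bind=True)

# Try the same user search Ranger UserSync would perform.
conn.search(
    search_base="OU=hadoop,DC=example,DC=com",
    search_filter="(&(objectClass=user)(memberOf=CN=HADOOP_ACCESS,DC=example,DC=com))",
    search_scope=SUBTREE,
    attributes=["sAMAccountName", "memberOf"],
)
for entry in conn.entries:
    print(entry.sAMAccountName, entry.memberOf)

# And the group search.
conn.search(
    search_base="OU=hadoop,DC=example,DC=com",
    search_filter="(&(objectClass=group)(cn=hadoop_*))",
    search_scope=SUBTREE,
    attributes=["cn", "member"],
)
for entry in conn.entries:
    print(entry.cn)
```

If either search returns nothing here, it will not return anything for Ranger UserSync either, so adjust the base or filter before restarting the service.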
Disclaimer from Cloudera: This article is contributed by an external user. Steps/ Content may not be technically verified by Cloudera and may not be applicable for all use cases and specifically to a particular distribution. Follow with caution and own risk. If needed, raise a support case to get the confirmation.
... View more
05-26-2021
04:57 AM
Hi,
Below are the configurations for connecting Apache Ranger with LDAP/LDAPS. There's an important tool that will help identify some settings in your AD: AD Explorer - Windows Sysinternals | Microsoft Docs.
This configuration will sync LDAP users and link them with their LDAP groups every 12 hours, so later from Apache Ranger you can give permissions based on LDAP groups as well.
For connecting using LDAPS, make sure you have the proper certificates added on the same server that contains the Ranger's UserSync service.

| Configuration Name | Configuration Value | Comment |
|---|---|---|
| ranger.usersync.source.impl.class | org.apache.ranger.ldapusersync.process.LdapUserGroupBuilder | |
| ranger.usersync.sleeptimeinmillisbetweensynccycle | 12 hour | |
| ranger.usersync.ldap.url | ldaps://myldapserver.example.com | ldaps or ldap based on your LDAP security |
| ranger.usersync.ldap.binddn | myuser@example.com | |
| ranger.usersync.ldap.ldapbindpassword | mypassword | |
| ranger.usersync.ldap.searchBase | OU=hadoop,DC=example,DC=com | You can browse your AD and check which OU you want Ranger to sync |
| ranger.usersync.ldap.user.searchbase | OU=hadoop2,DC=example,DC=com;OU=hadoop,DC=example,DC=com | You can browse your AD and check which OU you want Ranger to sync; you can also add 2 OUs and separate them with ; |
| ranger.usersync.ldap.user.objectclass | user | Double-check this against your AD |
| ranger.usersync.ldap.user.searchfilter | (memberOf=CN=HADOOP_ACCESS,DC=example,DC=com) | If you want to filter specific users to be synced into Ranger rather than your entire AD |
| ranger.usersync.ldap.user.nameattribute | sAMAccountName | Double-check this against your AD |
| ranger.usersync.ldap.user.groupnameattribute | memberOf | Double-check this against your AD |
| ranger.usersync.user.searchenabled | true | |
| ranger.usersync.group.searchbase | OU=hadoop,DC=example,DC=com | You can browse your AD and check which OU you want Ranger to sync |
| ranger.usersync.group.objectclass | group | Double-check this against your AD |
| ranger.usersync.group.searchfilter | (cn=hadoop_*) | If you want to sync specific groups, not all AD groups |
| ranger.usersync.group.nameattribute | cn | Double-check this against your AD |
| ranger.usersync.group.memberattributename | member | Double-check this against your AD |
| ranger.usersync.group.search.first.enabled | true | |
| ranger.usersync.truststore.file | /path/to/truststore-file | |
| ranger.usersync.truststore.password | TRUST_STORE_PASSWORD | |

Here is a helpful link on how to construct complex LDAP search queries: Search Filter Syntax - Win32 apps | Microsoft Docs
Best Regards,
... View more
03-30-2021
02:21 AM
Hi @Ninads, I am also using CDP 7.1.4 and having the same error when Spark connects to HBase. Did you manage to identify the issue? Best Regards,
... View more
02-26-2021
07:31 AM
Hi, can you check the MySQL driver version compatibility with your MySQL server version? See MySQL :: MySQL Connector/J 8.0 Developer Guide :: 2 Connector/J Versions, and the MySQL and Java Versions They Require. Given this particular error from your logs:
com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: Could not create connection to database server.
java.lang.RuntimeException: com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: Could not create connection to database server.
you might need to use a different version of the MySQL driver that is compatible with your MySQL server. Best Regards,
... View more
02-26-2021
07:13 AM
Hi, can you share the Ranger logs? They should contain the exact error messages. Best Regards,
... View more
02-26-2021
07:08 AM
2 Kudos
Hi, you can check your NiFi resources, specifically the Java heap size set in the "bootstrap.conf" file, and increase that. Please check this for NiFi performance best practices: HDF/CFM NIFI Best practices for setting up a high ... - Cloudera Community. Best Regards,
... View more
02-26-2021
07:02 AM
1 Kudo
Hi, as you previously had a version of Hive on the same machine and the error here refers to the Hive metastore, it is probably due to old config from the old Hive installation in "/etc/hive/conf". Best Regards,
... View more
05-26-2019
03:27 PM
Hi @gabriele ran, have you managed to make Spark read the JAAS file while using Oozie?
... View more
05-17-2019
10:27 PM
Also, for more documentation about how we found the solution: in this Tez JIRA ticket https://issues.apache.org/jira/browse/TEZ-3894 it is mentioned that Tez gets its intermediate file permissions from "fs.permissions.umask-mode". In our dev environment it was set to 022 but to 077 in prod, and it was the same for you as well, so that is how we figured this out. It was also tricky because file.out.index was created with the correct permissions but file.out was not, which made the map output unreadable by the yarn user.
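To make the effect of that setting concrete, here is a tiny Python illustration of how the umask changes the resulting file mode, assuming a nominal creation mode of 0660 for the intermediate file (the exact requested mode is an assumption for illustration):

```python
# Effective permission = requested mode & ~umask
for umask in (0o022, 0o077):
    mode = 0o660 & ~umask
    print(f"umask {umask:03o} -> file mode {mode:03o}")

# umask 022 -> file mode 640 (rw-r-----): readable by other members of the
#              hadoop group, such as the yarn user running the ShuffleHandler
# umask 077 -> file mode 600 (rw-------): only the owner (hive) can read it,
#              so the shuffle fetch of file.out fails
```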
... View more
05-17-2019
10:12 PM
Glad to work with you and your team to get this issue fixed.
... View more
05-17-2019
09:49 AM
Yeah, sure, I will happily work with you to get this fixed.
... View more
05-16-2019
06:49 PM
Hi, I am running HDP 3.1 (3.1.0.0-78) with 10 DataNodes, and the Hive execution engine is Tez. When I run a query I get this error:
ERROR : FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex re-running, vertexName=Map 1, vertexId=vertex_1557754551780_1091_2_00 Vertex re-running, vertexName=Map 1, vertexId=vertex_1557754551780_1091_2_00 Vertex re-running, vertexName=Map 1, vertexId=vertex_1557754551780_1091_2_00 Vertex failed, vertexName=Map 1, vertexId=vertex_1557754551780_1091_2_00, diagnostics=[Vertex vertex_1557754551780_1091_2_00 [Map 1] killed/failed due to:OWN_TASK_FAILURE, Vertex vertex_1557754551780_1091_2_00 [Map 1] failed as task task_1557754551780_1091_2_00_000001 failed after vertex succeeded.] DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0
INFO : Completed executing command(queryId=hive_20190516161715_09090e6d-e513-4fcc-9c96-0b48e9b43822); Time taken: 17.935 seconds
Error: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex re-running, vertexName=Map 1, vertexId=vertex_1557754551780_1091_2_00 Vertex re-running, vertexName=Map 1, vertexId=vertex_1557754551780_1091_2_00 Vertex re-running, vertexName=Map 1, vertexId=vertex_1557754551780_1091_2_00 Vertex failed, vertexName=Map 1, vertexId=vertex_1557754551780_1091_2_00, diagnostics=[Vertex vertex_1557754551780_1091_2_00 [Map 1] killed/failed due to:OWN_TASK_FAILURE, Vertex vertex_1557754551780_1091_2_00 [Map 1] failed as task task_1557754551780_1091_2_00_000001 failed after vertex succeeded.] DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0 (state=08S01,code=2)
When I traced the logs (for example, the application id is application_1557754551780_1091), I checked the path where the map output is written (/var/lib/hadoop/yarn/local/usercache/hive/appcache/application_1557754551780_1091/output/attempt_1557754551780_1091_2_00_000000_0_10003), and the files below are created with these permissions:
-rw-------. 1 hive hadoop 28 May 16 16:17 file.out
-rw-r-----. 1 hive hadoop 32 May 16 16:17 file.out.index
Also, in the NodeManager logs I found this error:
2019-05-16 16:19:05,801 INFO mapred.ShuffleHandler (ShuffleHandler.java:sendMapOutput(1268)) - /var/lib/hadoop/yarn/local/usercache/hive/appcache/application_1557754551780_1091/output/attempt_1557754551780_1091_2_00_000000_0_10003/file.out not found
2019-05-16 16:19:05,818 INFO mapred.ShuffleHandler (ShuffleHandler.java:sendMapOutput(1268)) - /var/lib/hadoop/yarn/local/usercache/hive/appcache/application_1557754551780_1091/output/attempt_1557754551780_1091_2_00_000000_0_10003/file.out not found
2019-05-16 16:19:05,821 INFO mapred.ShuffleHandler (ShuffleHandler.java:sendMapOutput(1268)) - /var/lib/hadoop/yarn/local/usercache/hive/appcache/application_1557754551780_1091/output/attempt_1557754551780_1091_2_00_000000_0_10003/file.out not found
2019-05-16 16:19:05,822 INFO mapred.ShuffleHandler (ShuffleHandler.java:sendMapOutput(1268)) - /var/lib/hadoop/yarn/local/usercache/hive/appcache/application_1557754551780_1091/output/attempt_1557754551780_1091_2_00_000000_0_10003/file.out not found
2019-05-16 16:19:05,824 INFO mapred.ShuffleHandler (ShuffleHandler.java:sendMapOutput(1268)) - /var/lib/hadoop/yarn/local/usercache/hive/appcache/application_1557754551780_1091/output/attempt_1557754551780_1091_2_00_000000_0_10003/file.out not found
2019-05-16 16:19:05,826 INFO mapred.ShuffleHandler (ShuffleHandler.java:sendMapOutput(1268)) - /var/lib/hadoop/yarn/local/usercache/hive/appcache/application_1557754551780_1091/output/attempt_1557754551780_1091_2_00_000000_0_10003/file.out not found
This means file.out is not readable by the yarn user, which causes the whole task to fail. I also checked the parent directory permissions and the umask for all users (0022), which means the files inside the output directory should be readable by other users in the same group:
drwx--x---. 3 hive hadoop 16 May 16 16:16 filecache
drwxr-s---. 3 hive hadoop 60 May 16 16:16 output
I reran the whole scenario on a different cluster (HDP version 3.0.1.0-187), and there file.out has the same permissions as file.out.index and the queries run fine without any problems. I also switched to the yarn user and used vi to confirm that the yarn user is able to read the content of file.out, and it was:
-rw-r-----. 1 hive hadoop 28 May 16 16:17 file.out
-rw-r-----. 1 hive hadoop 32 May 16 16:17 file.out.index
When I shut down all the NodeManagers except one, all the queries run fine, even though file.out is still created with the same permissions; I guess that is because everything is running on the same node.
N.B.: we upgraded from HDP 2.6.2 to HDP 3.1.0.0-78.
... View more
05-15-2019
04:07 PM
Hi guys, I am having the same problem. When I run a query (select count(*) from table_name) on a small table it runs successfully, but when the table is big I get this error. I checked the YARN logs and it seems the problem occurs during data shuffling, so I traced it to the node that received the task and found this error in /var/log/hadoop-yarn/yarn/hadoop-yarn-nodemanager-myhost.com.log:
/var/lib/hadoop/yarn/local/usercache/hive/appcache/application_1557491114054_0010/output/attempt_1557491114054_0010_1_03_000000_1_10002/file.out not found
Although for other attempts of the same application this file exists normally. In the YARN application log, after exiting the beeline session, this error appears:
2019-05-14 16:19:58,442 [WARN] [Fetcher_B {Map_1} #0] |shuffle.Fetcher|: copyInputs failed for tasks [InputAttemptIdentifier [inputIdentifier=0, attemptNumber=0, pathComponent=attempt_1557754551780_0155_5_00_000000_0_10003, spillType=0, spillId=-1]]
2019-05-14 16:19:58,442 [INFO] [Fetcher_B {Map_1} #0] |impl.ShuffleManager|: Map_1: Fetch failed for src: InputAttemptIdentifier [inputIdentifier=0, attemptNumber=0, pathComponent=attempt_1557754551780_0155_5_00_000000_0_10003, spillType=0, spillId=-1]InputIdentifier: InputAttemptIdentifier [inputIdentifier=0, attemptNumber=0, pathComponent=attempt_1557754551780_0155_5_00_000000_0_10003, spillType=0, spillId=-1], connectFailed: false
2019-05-14 16:19:58,443 [INFO] [Fetcher_B {Map_1} #1] |HttpConnection.url|: for url=http://myhost_name:13562/mapOutput?job=job_1557754551780_0155&dag=5&reduce=0&map=attempt_1557754551780_0155_5_00_000000_0_10003 sent hash and receievd reply 0 ms
2019-05-14 16:19:58,443 [INFO] [Fetcher_B {Map_1} #1] |shuffle.Fetcher|: Failed to read data to memory for InputAttemptIdentifier [inputIdentifier=0, attemptNumber=0, pathComponent=attempt_1557754551780_0155_5_00_000000_0_10003, spillType=0, spillId=-1]. len=28, decomp=14. ExceptionMessage=Not a valid ifile header
2019-05-14 16:19:58,443 [WARN] [Fetcher_B {Map_1} #1] |shuffle.Fetcher|: Failed to shuffle output of InputAttemptIdentifier [inputIdentifier=0, attemptNumber=0, pathComponent=attempt_1557754551780_0155_5_00_000000_0_10003, spillType=0, spillId=-1] from myhost_name
java.io.IOException: Not a valid ifile header
at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.verifyHeaderMagic(IFile.java:859)
at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.isCompressedFlagEnabled(IFile.java:866)
at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.readToMemory(IFile.java:616)
at org.apache.tez.runtime.library.common.shuffle.ShuffleUtils.shuffleToMemory(ShuffleUtils.java:121)
at org.apache.tez.runtime.library.common.shuffle.Fetcher.fetchInputs(Fetcher.java:950)
at org.apache.tez.runtime.library.common.shuffle.Fetcher.doHttpFetch(Fetcher.java:599)
at org.apache.tez.runtime.library.common.shuffle.Fetcher.doHttpFetch(Fetcher.java:486)
at org.apache.tez.runtime.library.common.shuffle.Fetcher.callInternal(Fetcher.java:284)
at org.apache.tez.runtime.library.common.shuffle.Fetcher.callInternal(Fetcher.java:76)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
I am using HDP 3.1, so any suggestions on what the error might be? Thanks
... View more
05-15-2019
02:59 PM
Hi, I am having the same problem after upgrading from HDP 2.6.2 to HDP 3.1, although I have a lot of resources in the cluster. When I run a query (select count(*) from table), if the table is small (3k records) it runs successfully, but if the table is larger (50k records) I get the same vertex failure error. In the YARN application log for the failed query I see the error below:
2019-05-14 11:58:14,823 [INFO] [TezChild] |tez.ReduceRecordProcessor|: Starting Output: out_Reducer 2
2019-05-14 11:58:14,828 [INFO] [TezChild] |compress.CodecPool|: Got brand-new decompressor [.snappy]
2019-05-14 11:58:18,466 [INFO] [TaskHeartbeatThread] |task.TaskReporter|: Routing events from heartbeat response to task, currentTaskAttemptId=attempt_1557754551780_0137_1_01_000000_0, eventCount=1 fromEventId=1 nextFromEventId=2
2019-05-14 11:58:18,488 [INFO] [Fetcher_B {Map_1} #1] |HttpConnection.url|: for url=http://myhost_name.com:13562/mapOutput?job=job_1557754551780_0137&dag=1&reduce=0&map=attempt_1557754551780_0137_1_00_000000_0_10002 sent hash and receievd reply 0 ms
2019-05-14 11:58:18,491 [INFO] [Fetcher_B {Map_1} #1] |shuffle.Fetcher|: Failed to read data to memory for InputAttemptIdentifier [inputIdentifier=0, attemptNumber=0, pathComponent=attempt_1557754551780_0137_1_00_000000_0_10002, spillType=0, spillId=-1]. len=28, decomp=14. ExceptionMessage=Not a valid ifile header
2019-05-14 11:58:18,492 [WARN] [Fetcher_B {Map_1} #1] |shuffle.Fetcher|: Failed to shuffle output of InputAttemptIdentifier [inputIdentifier=0, attemptNumber=0, pathComponent=attempt_1557754551780_0137_1_00_000000_0_10002, spillType=0, spillId=-1] from myhost_name.com
java.io.IOException: Not a valid ifile header
at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.verifyHeaderMagic(IFile.java:859)
at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.isCompressedFlagEnabled(IFile.java:866)
at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.readToMemory(IFile.java:616)
at org.apache.tez.runtime.library.common.shuffle.ShuffleUtils.shuffleToMemory(ShuffleUtils.java:121)
at org.apache.tez.runtime.library.common.shuffle.Fetcher.fetchInputs(Fetcher.java:950)
at org.apache.tez.runtime.library.common.shuffle.Fetcher.doHttpFetch(Fetcher.java:599)
at org.apache.tez.runtime.library.common.shuffle.Fetcher.doHttpFetch(Fetcher.java:486)
at org.apache.tez.runtime.library.common.shuffle.Fetcher.callInternal(Fetcher.java:284)
at org.apache.tez.runtime.library.common.shuffle.Fetcher.callInternal(Fetcher.java:76)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Both queries were working fine before the upgrade. The only change I made after the upgrade was increasing the heap size of the DataNodes. I also followed @Geoffrey Shelton Okot's configuration but still get the same error. Thanks
... View more
05-14-2019
01:06 PM
Hi @D G, have you fixed this problem?
... View more
05-13-2019
05:34 PM
Hi, we recently upgraded from HDP 2.6.2 to HDP 3.1. I am trying to run a Hive query in beeline, (select count(*) from big_table), where big_table is a table containing millions of records, and I get the error below:
ERROR : FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, vertexId=vertex_1557754551780_0008_4_00, diagnostics=[Task failed, taskId=task_1557754551780_0008_4_00_000000, diagnostics=[TaskAttempt 0 failed, info=[attempt_1557754551780_0008_4_00_000000_0 being failed for too many output errors. failureFraction=1.0, MAX_ALLOWED_OUTPUT_FAILURES_FRACTION=0.1, uniquefailedOutputReports=1, MAX_ALLOWED_OUTPUT_FAILURES=10, MAX_ALLOWED_TIME_FOR_TASK_READ_ERROR_SEC=300, readErrorTimespan=0], TaskAttempt 1 failed, info=[attempt_1557754551780_0008_4_00_000000_1 being failed for too many output errors. failureFraction=1.0, MAX_ALLOWED_OUTPUT_FAILURES_FRACTION=0.1, uniquefailedOutputReports=1, MAX_ALLOWED_OUTPUT_FAILURES=10, MAX_ALLOWED_TIME_FOR_TASK_READ_ERROR_SEC=300, readErrorTimespan=0], TaskAttempt 2 failed, info=[attempt_1557754551780_0008_4_00_000000_2 being failed for too many output errors. failureFraction=1.0, MAX_ALLOWED_OUTPUT_FAILURES_FRACTION=0.1, uniquefailedOutputReports=1, MAX_ALLOWED_OUTPUT_FAILURES=10, MAX_ALLOWED_TIME_FOR_TASK_READ_ERROR_SEC=300, readErrorTimespan=0], TaskAttempt 3 failed, info=[attempt_1557754551780_0008_4_00_000000_3 being failed for too many output errors. failureFraction=1.0, MAX_ALLOWED_OUTPUT_FAILURES_FRACTION=0.1, uniquefailedOutputReports=1, MAX_ALLOWED_OUTPUT_FAILURES=10, MAX_ALLOWED_TIME_FOR_TASK_READ_ERROR_SEC=300, readErrorTimespan=0]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:0, Vertex vertex_1557754551780_0008_4_00 [Map 1] killed/failed due to:OWN_TASK_FAILURE]Vertex killed, vertexName=Reducer 2, vertexId=vertex_1557754551780_0008_4_01, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:1, Vertex vertex_1557754551780_0008_4_01 [Reducer 2] killed/failed due to:OTHER_VERTEX_FAILURE]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:1
INFO : Completed executing command(queryId=hive_20190513174303_d7607f92-baaa-4fb2-825a-1af9a0287910); Time taken: 15.3 seconds
Error: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, vertexId=vertex_1557754551780_0008_4_00, diagnostics=[Task failed, taskId=task_1557754551780_0008_4_00_000000, diagnostics=[TaskAttempt 0 failed, info=[attempt_1557754551780_0008_4_00_000000_0 being failed for too many output errors. failureFraction=1.0, MAX_ALLOWED_OUTPUT_FAILURES_FRACTION=0.1, uniquefailedOutputReports=1, MAX_ALLOWED_OUTPUT_FAILURES=10, MAX_ALLOWED_TIME_FOR_TASK_READ_ERROR_SEC=300, readErrorTimespan=0], TaskAttempt 1 failed, info=[attempt_1557754551780_0008_4_00_000000_1 being failed for too many output errors. failureFraction=1.0, MAX_ALLOWED_OUTPUT_FAILURES_FRACTION=0.1, uniquefailedOutputReports=1, MAX_ALLOWED_OUTPUT_FAILURES=10, MAX_ALLOWED_TIME_FOR_TASK_READ_ERROR_SEC=300, readErrorTimespan=0], TaskAttempt 2 failed, info=[attempt_1557754551780_0008_4_00_000000_2 being failed for too many output errors. failureFraction=1.0, MAX_ALLOWED_OUTPUT_FAILURES_FRACTION=0.1, uniquefailedOutputReports=1, MAX_ALLOWED_OUTPUT_FAILURES=10, MAX_ALLOWED_TIME_FOR_TASK_READ_ERROR_SEC=300, readErrorTimespan=0], TaskAttempt 3 failed, info=[attempt_1557754551780_0008_4_00_000000_3 being failed for too many output errors. failureFraction=1.0, MAX_ALLOWED_OUTPUT_FAILURES_FRACTION=0.1, uniquefailedOutputReports=1, MAX_ALLOWED_OUTPUT_FAILURES=10, MAX_ALLOWED_TIME_FOR_TASK_READ_ERROR_SEC=300, readErrorTimespan=0]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:0, Vertex vertex_1557754551780_0008_4_00 [Map 1] killed/failed due to:OWN_TASK_FAILURE]Vertex killed, vertexName=Reducer 2, vertexId=vertex_1557754551780_0008_4_01, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:1, Vertex vertex_1557754551780_0008_4_01 [Reducer 2] killed/failed due to:OTHER_VERTEX_FAILURE]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:1 (state=08S01,code=2)
But when I execute (select count(*) from small_table), where the table contains only about 3000 records, it runs fine. Before the upgrade both queries were running fine, so do I have to do any tuning for Hive 3.1?
N.B.: Both tables are external tables. Thanks
... View more
05-13-2019
07:20 AM
Hi @Geoffrey Shelton Okot, thanks for your help. The document mentions that it is not required to have an ambari-agent running on the Ambari Server machine, but actually it is required: when we added the agent to the Ambari Server machine the error was gone. Thanks again for the help.
... View more
05-11-2019
08:02 AM
Hi @Rajesh Sampath @subhash parise, I am also having the same error while trying to read a Hive external table. Can you please tell me how you fixed it?
... View more
05-11-2019
07:56 AM
Hi @Pavel Stejskal @dbompart, I am facing the same problem, and I am not querying a Hive managed table; it is just an external table in Hive. I am able to read the metadata but not the data. Can you please tell me how you fixed it?
... View more
05-10-2019
07:07 AM
Hi, while upgrading from HDP 2.6.2 to HDP 3.1 we faced some issues that required manual steps: starting 2 DataNodes was giving an OOM exception, so we had to increase the Java heap size manually and start them manually as well. Now we are at the "Finalize Upgrade Pre-Check" and everything looks fine in the cluster, so we need to bypass this step in the upgrade, as it is giving us the error below:
Upgrade did not succeed on 2 hosts
Your options:
Pause Upgrade, delete the unhealthy hosts and return to the Upgrade Wizard to Proceed.
Perform a Downgrade, which will revert all hosts to the previous stack version.
thanks
... View more
05-08-2019
06:57 PM
Hi, I am upgrading to HDP 3.1; we have Ambari 2.7.3 and HDP 2.6.2 with Kerberos enabled. Ambari Server is installed, but I don't have an ambari-agent on that server, so it doesn't appear in (api/v1/clusters/clustername/hosts?fields=Hosts/ip,Hosts/host_name). In the step (Regenerate Missing Keytabs), creating the principals is failing with an error (Host not found). I also checked the ambari-server log and found the exception below:
Task #3611 failed to complete execution due to thrown exception: org.apache.ambari.server.HostNotFoundException: Host not found, hostname=myhost.mydomain.com
org.apache.ambari.server.HostNotFoundException: Host not found, hostname=myhost.mydomain.com
at org.apache.ambari.server.state.cluster.ClustersImpl.getHost(ClustersImpl.java:456)
at org.apache.ambari.server.state.ConfigHelper.getEffectiveDesiredTags(ConfigHelper.java:189)
at org.apache.ambari.server.state.ConfigHelper.getEffectiveDesiredTags(ConfigHelper.java:173)
at org.apache.ambari.server.controller.AmbariManagementControllerImpl.findConfigurationTagsWithOverrides(AmbariManagementControllerImpl.java:2371)
at sun.reflect.GeneratedMethodAccessor424.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.google.inject.internal.DelegatingInvocationHandler.invoke(DelegatingInvocationHandler.java:50)
at com.sun.proxy.$Proxy131.findConfigurationTagsWithOverrides(Unknown Source)
at org.apache.ambari.server.state.ConfigHelper.calculateExistingConfigurations(ConfigHelper.java:2158)
at org.apache.ambari.server.controller.KerberosHelperImpl.calculateConfigurations(KerberosHelperImpl.java:1725)
at org.apache.ambari.server.controller.KerberosHelperImpl.getActiveIdentities(KerberosHelperImpl.java:1800)
at org.apache.ambari.server.serveraction.kerberos.KerberosServerAction.calculateServiceIdentities(KerberosServerAction.java:511)
at org.apache.ambari.server.serveraction.kerberos.KerberosServerAction.processIdentities(KerberosServerAction.java:455)
at org.apache.ambari.server.serveraction.kerberos.CreatePrincipalsServerAction.execute(CreatePrincipalsServerAction.java:92)
at org.apache.ambari.server.serveraction.ServerActionExecutor$Worker.execute(ServerActionExecutor.java:550)
at org.apache.ambari.server.serveraction.ServerActionExecutor$Worker.run(ServerActionExecutor.java:466)
at java.lang.Thread.run(Thread.java:745)
I removed the short name from /etc/hosts, and the DNS server is pointing to the correct IPs, so when I try nslookup from any of the DataNodes it is able to resolve the Ambari Server hostname. Any suggestions please?
... View more