Member since
05-09-2017
107
Posts
7
Kudos Received
6
Solutions
My Accepted Solutions
| Title | Views | Posted |
| --- | --- | --- |
|  | 3155 | 03-19-2020 01:30 PM |
|  | 16243 | 11-27-2019 08:22 AM |
|  | 8780 | 07-05-2019 08:21 AM |
|  | 15378 | 09-25-2018 12:09 PM |
|  | 5818 | 08-10-2018 07:46 AM |
10-20-2017
01:39 PM
I am facing a similar issue, and after looking at the job history server I see that the last mapper has failed. The logs from that map task contain only this:
2017-10-13 12:15:24,376 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Received completed container container_e25_1505390873369_5614_01_000002
2017-10-13 12:15:24,376 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: PendingReds:0 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:2 AssignedReds:0 CompletedMaps:27 CompletedReds:0 ContAlloc:33 ContRel:4 HostLocal:0 RackLocal:0
2017-10-13 12:15:24,376 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1505390873369_5614_m_000000_0: Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
2017-10-13 12:18:17,290 INFO [IPC Server handler 4 on 55492] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1505390873369_5614_m_000010_0 is : 1.0
2017-10-13 12:18:17,385 INFO [IPC Server handler 3 on 55492] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1505390873369_5614_m_000010_0 is : 1.0
2017-10-13 12:18:17,388 INFO [IPC Server handler 11 on 55492] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Done acknowledgement from attempt_1505390873369_5614_m_000010_0
2017-10-13 12:18:17,388 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1505390873369_5614_m_000010_0 TaskAttempt Transitioned from RUNNING to SUCCESS_FINISHING_CONTAINER
2017-10-13 12:18:17,388 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: Task succeeded with attempt attempt_1505390873369_5614_m_000010_0
2017-10-13 12:18:17,389 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1505390873369_5614_m_000010 Task Transitioned from RUNNING to SUCCEEDED
2017-10-13 12:18:17,389 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Num completed Tasks: 28
2017-10-13 12:18:17,667 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before Scheduling: PendingReds:0 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:2 AssignedReds:0 CompletedMaps:28 CompletedReds:0 ContAlloc:33 ContRel:4 HostLocal:0 RackLocal:0
2017-10-13 12:19:23,003 INFO [Ping Checker] org.apache.hadoop.yarn.util.AbstractLivelinessMonitor: Expired:attempt_1505390873369_5614_m_000010_0 Timed out after 60 secs
2017-10-13 12:19:23,003 WARN [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Task attempt attempt_1505390873369_5614_m_000010_0 is done from TaskUmbilicalProtocol's point of view. However, it stays in finishing state for too long
2017-10-13 12:19:23,003 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1505390873369_5614_m_000010_0 TaskAttempt Transitioned from SUCCESS_FINISHING_CONTAINER to SUCCESS_CONTAINER_CLEANUP
2017-10-13 12:19:23,003 INFO [ContainerLauncher #9] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Processing the event EventType: CONTAINER_REMOTE_CLEANUP for container container_e25_1505390873369_5614_01_000012 taskAttempt attempt_1505390873369_5614_m_000010_0
2017-10-13 12:19:23,003 INFO [ContainerLauncher #9] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: KILLING attempt_1505390873369_5614_m_000010_0
2017-10-13 12:19:23,010 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1505390873369_5614_m_000010_0 TaskAttempt Transitioned from SUCCESS_CONTAINER_CLEANUP to SUCCEEDED
2017-10-13 12:19:23,775 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Received completed container container_e25_1505390873369_5614_01_000012
2017-10-13 12:19:23,775 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: PendingReds:0 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:1 AssignedReds:0 CompletedMaps:28 CompletedReds:0 ContAlloc:33 ContRel:4 HostLocal:0 RackLocal:0
2017-10-13 12:19:23,775 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1505390873369_5614_m_000010_0: Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
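One detail worth noting about the log above: the task attempt reaches SUCCEEDED before the ApplicationMaster kills the idle container, so exit code 143 here is usually benign. By the Unix convention, exit codes above 128 encode 128 plus a signal number, and 143 corresponds to SIGTERM (the AM's kill request). A quick sketch of that decoding (a hypothetical helper, not part of Hadoop):

```python
import signal

def decode_exit_code(code: int) -> str:
    """Decode a YARN container exit code using the Unix 128+signal convention."""
    if code > 128:
        sig = signal.Signals(code - 128)
        return f"killed by signal {sig.name} ({code - 128})"
    return f"exited normally with status {code}"

print(decode_exit_code(143))  # SIGTERM: the ApplicationMaster's kill request
print(decode_exit_code(137))  # SIGKILL: often the NodeManager enforcing a memory limit
```

By contrast, an exit code of 137 (SIGKILL) typically points at a real problem, such as the NodeManager killing a container that exceeded its memory allocation.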
10-05-2017
12:31 PM
@saranvisa What are the implications of increasing mapreduce.reduce.memory.mb and mapreduce.reduce.java.opts to a higher value in the cluster itself? One of them would be that jobs that do not need this additional memory will still get it, which is wasteful. Other jobs running at the same time may also be impacted. Anything else?
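One common way to avoid those implications is to leave the cluster-wide defaults alone and raise memory only for the jobs that need it, via per-job `-D` generic options. A minimal sketch of building such overrides (the helper name and example values are illustrative, not from any Hadoop API):

```python
def per_job_overrides(props: dict) -> list:
    """Build per-job -D overrides so memory is raised only for this job,
    leaving the cluster-wide defaults (and other jobs) untouched."""
    return [f"-D{name}={value}" for name, value in props.items()]

# e.g. passed as generic options to `hadoop jar myjob.jar MyDriver ...`
args = per_job_overrides({
    "mapreduce.reduce.memory.mb": 8192,         # reducer container size
    "mapreduce.reduce.java.opts": "-Xmx6553m",  # heap roughly 80% of the container
})
print(" ".join(args))
```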
09-26-2017
06:42 AM
@saranvisa The last reducer of my MapReduce job fails with the error below:
2017-09-20 16:23:23,732 FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.regex.Matcher.<init>(Matcher.java:224)
at java.util.regex.Pattern.matcher(Pattern.java:1088)
at java.lang.String.replaceAll(String.java:2162)
at com.sas.ci.acs.extract.CXAService$myReduce.parseEvent(CXAService.java:1612)
at com.sas.ci.acs.extract.CXAService$myReduce.reduce(CXAService.java:919)
at com.sas.ci.acs.extract.CXAService$myReduce.reduce(CXAService.java:237)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:444)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
2017-09-20 16:23:23,834 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping ReduceTask metrics system...
2017-09-20 16:23:23,834 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: ReduceTask metrics system stopped.
2017-09-20 16:23:23,834 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: ReduceTask metrics system shutdown complete.

Current settings:
mapreduce.map.java.opts: -Djava.net.preferIPv4Stack=true -Xmx3865051136
mapreduce.reduce.java.opts: -Djava.net.preferIPv4Stack=true -Xmx6144067296

1) Do you recommend increasing the following properties to the values below?
"mapreduce.map.java.opts", "-Xmx4g"
"mapreduce.reduce.java.opts", "-Xmx8g"

2) These are my current map and reduce memory settings. Do I also need to bump up my reduce memory to 10240m?
mapreduce.map.memory.mb: 8192
mapreduce.reduce.memory.mb: 8192
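For context, a common rule of thumb (not an official formula) is to keep the JVM heap (-Xmx) at roughly 80% of the container size (mapreduce.*.memory.mb), leaving headroom for non-heap memory. A quick check of the numbers in the post above, under that assumption:

```python
def suggested_heap_mb(container_mb: int, ratio: float = 0.8) -> int:
    """Rule-of-thumb heap for a YARN container: roughly 80% of container memory."""
    return int(container_mb * ratio)

# The reducer -Xmx above is given in bytes; convert to MiB.
current_reduce_heap_mb = 6144067296 // (1024 * 1024)

print(current_reduce_heap_mb)    # current heap, in MiB, inside an 8192 MiB container
print(suggested_heap_mb(8192))   # what the 80% rule would allow in that container
print(suggested_heap_mb(10240))  # a 10240 MiB container leaves room for -Xmx8g
```

By this rule, -Xmx8g (8192 MiB) would indeed want a larger container such as 10240 MiB, since 8192 MiB of heap would consume the entire current 8192 MiB reducer container.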
07-13-2017
01:32 PM
@csguna Kernel: 2.6.32-573.22.1.el6.x86_64, OS: Red Hat 6.7
07-10-2017
06:47 AM
This issue is resolved after adding the hostname flag and restarting the cluster. Thank you, guys.
07-09-2017
05:24 PM
Full log file:
I0707 09:15:32.058609 5861 logging-support.cc:294] Old log file deleted during log rotation: /var/log/statestore/statestored.cba24uu.impala.log.ERROR.20170621-100605.21240
I0707 09:15:34.784071 6030 statestore.cc:696] Unable to send topic update message to subscriber catalog-server@cba24uu.abc.cdb.com:26000, received error: Unexpected registration ID: 744638b525bdc432:7fe0e50a41d1a684, was expecting 6f42d3d3ce4b50ec:b68d9b0341657791
I0707 09:15:36.729167 6596 statestore.cc:381] Registering: catalog-server@cba24uu.abc.cdb.com:26000
I0707 09:15:36.730576 6596 statestore.cc:404] Subscriber 'catalog-server@cba24uu.abc.cdb.com:26000' registered (registration id: dd4a22df064b0c6f:2942c05b6aa152a3)
I0707 09:15:36.730842 6029 client-cache.h:260] client 0x4a05000 unexpected exception: TTransportException: Transport not open, type=N6apache6thrift9transport19TTransportExceptionE
I0707 09:15:36.730855 6029 client-cache.cc:81] ReopenClient(): re-creating client for cba24uu.abc.cdb.com:23020
I0707 09:15:36.730857 6042 client-cache.h:260] client 0x4a05140 unexpected exception: TTransportException: Transport not open, type=N6apache6thrift9transport19TTransportExceptionE
I0707 09:15:36.730866 6042 client-cache.cc:81] ReopenClient(): re-creating client for cba24uu.abc.cdb.com:23020
I0707 09:15:36.767303 6029 statestore.cc:696] Unable to send topic update message to subscriber catalog-server@cba24uu.abc.cdb.com:26000, received error: Unexpected registration ID: dd4a22df064b0c6f:2942c05b6aa152a3, was expecting 6f42d3d3ce4b50ec:b68d9b0341657791
I0707 09:15:38.785959 6031 statestore.cc:696] Unable to send topic update message to subscriber catalog-server@cba24uu.abc.cdb.com:26000, received error
07-09-2017
05:20 PM
Ok, thanks for confirming. Once we made that change and restarted Impala we saw another issue. We added --hostname=cba24uu.abc.cdb.com to the two settings below and restarted Impala:
- Catalog Server Command Line Argument Advanced Configuration Snippet (Safety Valve)
- Statestore Command Line Argument Advanced Configuration Snippet (Safety Valve)

Error after the change (statestore logs):
I0707 09:15:34.784071 6030 statestore.cc:696] Unable to send topic update message to subscriber catalog-server@cba24uu.abc.cdb.com:26000, received error: Unexpected registration ID: 744638b525bdc432:7fe0e50a41d1a684, was expecting 6f42d3d3ce4b50ec:b68d9b0341657791
07-09-2017
04:52 PM
After looking at https://www.cloudera.com/documentation/enterprise/release-notes/topics/cm_rn_known_issues.html#concept_xnw_cb4_j1b, which says "To work around this issue, upgrade to one of the following versions of Cloudera Manager before upgrading CDH: 5.10.2 or 5.8.6": it does not mention 5.11.1, so does this issue also surface when using CM 5.11.1?
07-08-2017
05:00 AM
We have identified this issue with the Impala certificate and are now looking into it.
1. Check the cert:
openssl s_client -connect $hostname:$port -CAfile /abc/hadoop/cloudera-certs/impala-SAN.pem
2. Run hostname -f (this must return the FQDN).
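The failure mode being checked here is a mismatch between the host's FQDN (from hostname -f) and the DNS names in the certificate's Subject Alternative Name extension. A minimal sketch of that comparison (a hypothetical helper with made-up SAN entries; real wildcard matching has more rules than this, e.g. restrictions from RFC 6125):

```python
def fqdn_matches_san(fqdn: str, san_entries: list) -> bool:
    """Check whether an FQDN matches any DNS SAN entry, including
    simple single-label wildcards like *.abc.cdb.com."""
    fqdn = fqdn.lower().rstrip(".")
    for entry in san_entries:
        entry = entry.lower()
        if entry.startswith("*."):
            # a wildcard covers exactly one left-most label
            suffix = entry[1:]                 # e.g. ".abc.cdb.com"
            head, _, rest = fqdn.partition(".")
            if rest and "." + rest == suffix:
                return True
        elif fqdn == entry:
            return True
    return False

# Example with made-up SAN entries:
print(fqdn_matches_san("cba24uu.abc.cdb.com", ["*.abc.cdb.com"]))  # matches the wildcard
print(fqdn_matches_san("cba24uu", ["*.abc.cdb.com"]))              # short hostname does not match
```

This is why `hostname -f` must return the FQDN: if the daemon registers itself with the short hostname, TLS hostname verification against the SAN list fails.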
07-07-2017
06:59 AM
We have upgraded to 5.11.1 and now we are not able to run any Impala queries.

Error:
Query: show databases
ERROR: AnalysisException: This Impala daemon is not ready to accept user requests. Status: Waiting for catalog update from the StateStore.

Statestore logs:
I0706 12:54:32.296458 28189 authentication.cc:427] Successfully authenticated principal impala/cba24uu.abc.cdb.com@ABC.CDB.COM on an internal connection
I0706 12:54:32.296932 28401 statestore.cc:381] Registering: catalog-server@cba24uu:26000
I0706 12:54:32.297024 28401 statestore.cc:404] Subscriber 'catalog-server@cba24uu:26000' registered (registration id: 16404957b6105e9d:7340f75c059dbe95)
I0706 12:54:32.310817 28156 status.cc:114] Couldn't open transport for cba24uu:23020 (authorize: cannot authorize peer)
    @ 0x8394e9 (unknown)
    @ 0xdac876 (unknown)
    @ 0xdacb92 (unknown)
    @ 0xa505ab (unknown)
    @ 0xa50b83 (unknown)
    @ 0xb36d62 (unknown)
    @ 0xb39c4e (unknown)
    @ 0xb400b6 (unknown)
    @ 0xbdcd09 (unknown)
    @ 0xbdd6e4 (unknown)
    @ 0xe2717a (unknown)
    @ 0x2b5ed7b36aa1 start_thread
    @ 0x2b5ed7e3493d clone
I0706 12:54:32.310847 28156 thrift-client.cc:67] Unable to connect to cba24uu:23020
I0706 12:54:32.310878 28156 statestore.cc:696] Unable to send heartbeat message to subscriber catalog-server@dig24au:26000, received error: Couldn't open transport for cba24uu:23020 (authorize: cannot authorize peer)
I0706 12:54:32.316840 28144 status.cc:114] Couldn't open transport for cba24uu:23020 (authorize: cannot authorize peer)

If I try to telnet to the host and port, it works.

Catalog logs:
I0707 09:37:11.706931 17577 thrift-server.cc:391] Command '/var/run/cloudera-scm-agent/process/2951-impala-CATALOGSERVER/altscript.sh sec-0-ssl_private_key_password_cmd' executed successfully, .PEM password retrieved
I0707 09:37:11.713904 17577 thrift-server.cc:449] ThriftServer 'StatestoreSubscriber' started on port: 23020s
I0707 09:37:11.714009 17577 statestore-subscriber.cc:203] Registering with statestore
I0707 09:37:11.801826 17577 statestore-subscriber.cc:169] Subscriber registration ID: 664bb584455ec4bf:a5fd7f54e1e7009f
I0707 09:37:11.801847 17577 statestore-subscriber.cc:207] statestore registration successful
I0707 09:37:11.803041 17577 catalogd-main.cc:91] Enabling SSL for CatalogService
I0707 09:37:11.830278 18039 thrift-util.cc:111] TAcceptQueueServer: Caught TException: No more data to read.
I0707 09:37:11.830605 17997 HdfsTable.java:1105] Fetched partition metadata from the Metastore: mssql_polybase.sample_data
I0707 09:37:11.833709 18039 thrift-util.cc:111] TAcceptQueueServer: Caught TException: No more data to read.
I0707 09:37:12.144228 17577 thrift-server.cc:391] Command '/var/run/cloudera-scm-agent/process/2951-impala-CATALOGSERVER/altscript.sh sec-0-ssl_private_key_password_cmd' executed successfully, .PEM password retrieved
I0707 09:37:12.151124 17577 thrift-server.cc:449] ThriftServer 'CatalogService' started on port: 26000s
I0707 09:37:12.151144 17577 catalogd-main.cc:96] CatalogService started on port: 26000
I0707 09:37:12.232126 17997 TableLoader.java:97] Loaded metadata for: mssql_polybase.sample_data
I0707 09:37:12.846177 18039 thrift-util.cc:111] TAcceptQueueServer: Caught TException: No more data to read.
I0707 09:37:13.858829 18039 thrift-util.cc:111] TAcceptQueueServer: Caught TException: No more data to read.
I0707 09:37:14.869678 18039 thrift-util.cc:111] TAcceptQueueServer: Caught TException: No more data to read.
Labels:
- Apache Impala