About bsaini

bsaini · ‎10-14-2015

Ambari (2.1.1) is configured to run with 12 GB of RAM in a small 4-node cluster. It runs fine for a period of time, then freezes and hangs. Here are some of the errors seen in the log files.

ambari-server.log

12 Oct 2015 13:04:56,303 ERROR [qtp-client-18920] MetricsPropertyProvider:183 - Error getting timeline metrics. Can not connect to collector, socket error.
12 Oct 2015 13:19:16,643 ERROR [qtp-client-18897] MetricsPropertyProvider:183 - Error getting timeline metrics. Can not connect to collector, socket error.
12 Oct 2015 16:02:46,153 WARN [qtp-client-19308] nio:726 - handle failed
13 Oct 2015 02:19:55,555 WARN [Timer-0] ThreadPoolAsynchronousRunner:608 - com.mchange.v2.async.ThreadPoolAsynchronousRunner$DeadlockDetector@4214238c -- APPARENT DEADLOCK!!! Creating emergency threads for unassigned pending tasks!

ambari-server.out

Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "alert-event-bus-3175"
Exception in thread "alert-event-bus-3179"
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "alert-event-bus-3179"
Exception in thread "alert-event-bus-3178"
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "alert-event-bus-3178"
Exception in thread "alert-event-bus-3180"
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "alert-event-bus-3180"
Exception in thread "alert-event-bus-3181"
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "alert-event-bus-3181"
Exception in thread "alert-event-bus-3182"
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "alert-event-bus-3182"
Exception in thread "alert-event-bus-3183"
Exception in thread "alert-event-bus-3184"
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "alert-event-bus-3183"
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "alert-event-bus-3184"

The thread dump shows threads in the following states (count - state):

3 java.lang.Thread.State: BLOCKED (on object monitor)
18 java.lang.Thread.State: RUNNABLE
9 java.lang.Thread.State: TIMED_WAITING (on object monitor)
15 java.lang.Thread.State: TIMED_WAITING (parking)
4 java.lang.Thread.State: WAITING (on object monitor)
10 java.lang.Thread.State: WAITING (parking)

An Impetus consultant working on the effort noticed that there were too many open connections to the Postgres DB. Any ideas are appreciated.
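For anyone who wants to reproduce the per-state counts from a thread dump, here is a minimal Python sketch. It assumes the dump is a plain-text jstack-style file in which each thread has a "java.lang.Thread.State: ..." line; the function name and the sample text are made up for illustration and are not taken from the actual cluster.

```python
import re
from collections import Counter

# Matches the state line printed under each thread header in a jstack-style dump.
STATE_RE = re.compile(r"java\.lang\.Thread\.State: (.+)")


def count_thread_states(dump_text):
    """Return a Counter mapping each thread state string to its number of occurrences."""
    return Counter(match.group(1).strip() for match in STATE_RE.finditer(dump_text))


# Invented sample text, only to show the expected input shape.
sample_dump = """
"alert-event-bus-1" prio=10
   java.lang.Thread.State: RUNNABLE
"qtp-client-1" prio=10
   java.lang.Thread.State: WAITING (parking)
"qtp-client-2" prio=10
   java.lang.Thread.State: RUNNABLE
"""

for state, count in count_thread_states(sample_dump).most_common():
    print(count, state)
```

Comparing the counts from a few dumps taken some minutes apart may show which state is growing before the hang.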

bsaini · ‎10-14-2015

Here is what I see in the `PARTITIONS` table in metastore (special (404,1444064420,0,'created_yr=__HIVE_DEFAULT_PARTITION__/created_mo=__HIVE_DEFAULT_PARTITION__/equip_init_f1=?/equip_nbr_l1=__HIVE_DEFAULT_PARTITION__',452,48,NULL)

bsaini · ‎10-14-2015

A Hive CTAS query that was accidentally using the wrong SerDe was killed partway through. A few partitions had already been created, but the partition looks corrupted. Notice the non-ASCII character in the partition name.

/apps/hive/warehouse/mydb.db/mytbl
/apps/hive/warehouse/mydb.db/mytbl/created_yr=__HIVE_DEFAULT_PARTITION__
/apps/hive/warehouse/mydb.db/mytbl/created_yr=__HIVE_DEFAULT_PARTITION__/created_mo=__HIVE_DEFAULT_PARTITION__
/apps/hive/warehouse/mydb.db/mytbl/created_yr=__HIVE_DEFAULT_PARTITION__/created_mo=__HIVE_DEFAULT_PARTITION__/equip_init_f1=ϧ
/apps/hive/warehouse/mydb.db/mytbl/created_yr=__HIVE_DEFAULT_PARTITION__/created_mo=__HIVE_DEFAULT_PARTITION__/equip_init_f1=ϧ/equip_nbr_l1=__HIVE_DEFAULT_PARTITION__
/apps/hive/warehouse/mydb.db/mytbl/created_yr=__HIVE_DEFAULT_PARTITION__/created_mo=__HIVE_DEFAULT_PARTITION__/equip_init_f1=ϧ/equip_nbr_l1=__HIVE_DEFAULT_PARTITION__/000004_0
/apps/hive/warehouse/mydb.db/mytbl/created_yr=__HIVE_DEFAULT_PARTITION__/created_mo=__HIVE_DEFAULT_PARTITION__/equip_init_f1=?
/apps/hive/warehouse/mydb.db/mytbl/created_yr=__HIVE_DEFAULT_PARTITION__/created_mo=__HIVE_DEFAULT_PARTITION__/equip_init_f1=?/equip_nbr_l1=__HIVE_DEFAULT_PARTITION__
/apps/hive/warehouse/mydb.db/mytbl/created_yr=__HIVE_DEFAULT_PARTITION__/created_mo=__HIVE_DEFAULT_PARTITION__/equip_init_f1=?/equip_nbr_l1=__HIVE_DEFAULT_PARTITION__/000083_0

When a DROP TABLE statement is run, the following exception appears in metastore.log:

2015-10-13 17:55:50,660 ERROR [pool-3-thread-35]: metastore.RetryingHMSHandler (RetryingHMSHandler.java:invoke(151)) - Error happens in method drop_table_with_environment_context: MetaException(message:Timeout when executing method: drop_table_with_environment_context)
    at org.apache.hadoop.hive.metastore.Deadline.newMetaException(Deadline.java:187)
    at org.apache.hadoop.hive.metastore.Deadline.check(Deadline.java:177)
    at org.apache.hadoop.hive.metastore.Deadline.checkTimeout(Deadline.java:160)
    at org.apache.hadoop.hive.metastore.ObjectStore.convertToParts(ObjectStore.java:1820)
    at org.apache.hadoop.hive.metastore.ObjectStore.convertToParts(ObjectStore.java:1807)
    at org.apache.hadoop.hive.metastore.ObjectStore.access$200(ObjectStore.java:160)
    at org.apache.hadoop.hive.metastore.ObjectStore$2.getJdoResult(ObjectStore.java:1734)
    at org.apache.hadoop.hive.metastore.ObjectStore$2.getJdoResult(ObjectStore.java:1725)
    at org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:2391)
    at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsInternal(ObjectStore.java:1725)
    at org.apache.hadoop.hive.metastore.ObjectStore.getPartitions(ObjectStore.java:1719)
    at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:114)
    at com.sun.proxy.$Proxy0.getPartitions(Unknown Source)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.dropPartitionsAndGetLocations(HiveMetaStore.java:1693)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table_core(HiveMetaStore.java:1532)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table_with_environment_context(HiveMetaStore.java:1737)
    at sun.reflect.GeneratedMethodAccessor25.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
    at com.sun.proxy.$Proxy5.drop_table_with_environment_context(Unknown Source)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$drop_table_with_environment_context.getResult(ThriftHiveMetastore.java:9256)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$drop_table_with_environment_context.getResult(ThriftHiveMetastore.java:9240)
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
    at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
    at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hive.metastore.DeadlineException: Timeout when executing method: drop_table_with_environment_context
    at org.apache.hadoop.hive.metastore.Deadline.check(Deadline.java:174)
    ... 35 more
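To spot which partition directories carry unexpected characters, a small Python sketch like the one below could be run over a saved directory listing. It is only an illustration: the function name is made up, and it assumes the listing has already been captured as text with one path per line. It flags path segments that contain non-ASCII or non-printable characters; a literal "?" printed by a listing tool is not flagged, since it is itself printable ASCII.

```python
def suspicious_segments(path):
    """Return the path segments that contain non-ASCII or non-printable characters."""
    return [
        segment
        for segment in path.split("/")
        if segment and not (segment.isascii() and segment.isprintable())
    ]


# Two of the paths from the listing in this post.
paths = [
    "/apps/hive/warehouse/mydb.db/mytbl/created_yr=__HIVE_DEFAULT_PARTITION__",
    "/apps/hive/warehouse/mydb.db/mytbl/created_yr=__HIVE_DEFAULT_PARTITION__/"
    "created_mo=__HIVE_DEFAULT_PARTITION__/equip_init_f1=ϧ",
]

for path in paths:
    flagged = suspicious_segments(path)
    if flagged:
        print(path, "->", flagged)
```

Note that `str.isascii()` requires Python 3.7 or later.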

bsaini · ‎10-09-2015

Hey Deepesh, thanks for the answer. I am wondering why MR would get resources and not Tez?

bsaini · ‎10-09-2015

A prospect is experiencing an issue with Hive CLI. I know Hive CLI is not a long-term solution and Beeline is preferred, but if Beeline, Hive View and other front-end tools are working, what could cause Hive CLI to not start?

$ hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/2.3.0.0-2557/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.3.0.0-2557/spark/lib/spark-assembly-1.3.1.2.3.0.0-2557-hadoop2.7.1.2.3.0.0-2557.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
WARNING: Use "yarn jar" to launch YARN applications.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/2.3.0.0-2557/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.3.0.0-2557/spark/lib/spark-assembly-1.3.1.2.3.0.0-2557-hadoop2.7.1.2.3.0.0-2557.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]

Logging initialized using configuration in file:/etc/hive/2.3.0.0-2557/0/hive-log4j.properties

bsaini · ‎10-07-2015

Yes. The HDFS gateway

bsaini · ‎10-07-2015

A customer is experiencing slowness when copying large files through NFS. Questions:
1) Is NFS recommended for large files?
2) What performance tuning options are available for the NFS server?

bsaini · ‎10-07-2015

I think either this function is not supported or I am missing something very basic, but here is the issue:

1) Uploaded a gzipped CSV file to HDFS. No issues.
2) Created an external table using the CSV SerDe, pointing LOCATION to the file from step 1. Once the table is created I am able to run queries without any problems.
3) Running a CTAS query with the exact same table layout but in ORC format causes the error below.

Please help!

------- Error -------
Caused by: java.io.IOException: incorrect header check
    at org.apache.hadoop.io.compress.zlib.ZlibDecompressor.inflateBytesDirect(Native Method)
    at org.apache.hadoop.io.compress.zlib.ZlibDecompressor.decompress(ZlibDecompressor.java:228)
    at org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:91)
    at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85)
    at java.io.InputStream.read(InputStream.java:101)
    at org.apache.hadoop.util.LineReader.fillBuffer(LineReader.java:180)
    at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:216)
    at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
    at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:246)
    at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:47)
    at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:350)
    ... 22 more
]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:0, Vertex vertex_1443886863664_0003_1_00 [Map 1] killed/failed due to:null]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0
    at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:170)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1653)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1412)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1195)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1054)
    at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:154)
    ... 11 more
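One thing that may be worth ruling out (an assumption on my part, not something confirmed in this thread) is whether every file under the table LOCATION really is gzip data. "incorrect header check" is the message zlib reports when the bytes it is asked to inflate do not start with a valid header. The Python sketch below checks a local copy of a file for the two gzip magic bytes (0x1f 0x8b); the function names are illustrative.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip member


def has_gzip_magic(data):
    """Return True if the given bytes start with the gzip magic number."""
    return data[:2] == GZIP_MAGIC


def file_has_gzip_magic(path):
    """Check the first two bytes of a local file for the gzip magic number."""
    with open(path, "rb") as handle:
        return has_gzip_magic(handle.read(2))


# Small in-memory demonstration with made-up CSV content.
plain = b"id,name\n1,a\n"
print(has_gzip_magic(gzip.compress(plain)))
print(has_gzip_magic(plain))
```

If a file with a .gz extension fails this check, the decompressor would be chosen from the extension while the content is something else, which would be one possible way to end up with the error above.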

bsaini · ‎10-06-2015

+1. Connection refused translates to: unable to reach the service because the port is wrong or nothing is listening on it. See more details here - https://wiki.apache.org/hadoop/ConnectionRefused
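A quick way to tell whether anything is listening on a given host and port is to attempt a plain TCP connection from the client machine. The Python sketch below does that with the standard library; the function name is illustrative, and any host or port passed to it would come from your own cluster configuration.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False
```

A False result does not say why the connection failed, so it is worth checking the service's own logs and its configured port as well.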

bsaini · ‎10-06-2015

Hi Mike - I am assuming you have already tried restarting the Ambari Agent? I have seen a stale agent process cause a similar issue as well.

Last Visited	‎04-06-2018 07:42 PM

Member Since	‎09-24-2015 03:23 PM
Posts	178
Kudos received	103

Cloudera Community

Re: Which is better to create Hadoop accounts in L...

Re: Last step of Ambari HDP installation fails for...

Re: How to create falcon entity dependencies?

Re: Where is the output of an Oozie workflow store...

Re: Hi I am new to falcon , can anyone help me wit...

Ambari freezes after running fine for a period of ...

Re: Unable to drop Hive table due to corrupt parti...

Unable to drop Hive table due to corrupt partition...

Re: Hive CLI unresponsive

Hive CLI unresponsive

Re: How to improve the performance of NFS Server?

How to improve the performance of NFS Server?

Unable to run CTAS query using external table with...

Re: Intermittent Teradata ConnectorException error

Re: Why is ambari showing incorrect number of data...