Member since: 09-25-2015 | Posts: 109 | Kudos Received: 36 | Solutions: 8
My Accepted Solutions
Views | Posted
---|---
2842 | 04-03-2018 09:08 PM
3975 | 03-14-2018 04:01 PM
11109 | 03-14-2018 03:22 PM
3169 | 10-30-2017 04:29 PM
1595 | 10-17-2017 04:49 PM
10-04-2017
01:03 PM
1 Kudo
Hi @Ravindranath Oruganti, you are running the Spark driver in yarn-client mode, i.e. on the machine where you initiated the spark-submit command. You must also kill the driver process on that machine; a quick sketch is below.
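A minimal sketch of the cleanup, assuming the driver was launched with spark-submit on the current host; <pid> and the application id below are placeholders.
ps -ef | grep SparkSubmit   # locate the driver JVM's PID
kill <pid>                  # stop the driver process
# If the YARN application is still registered, kill it as well:
yarn application -kill application_1234567890123_0001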
08-18-2017
07:34 PM
What do you have the following property set to: yarn.scheduler.minimum-allocation-mb? YARN rounds each container request up to this minimum, so when you set the map memory to 2 GB you might still end up getting a 4 GB container while the JVM opts stay at 2048 MB. A quick check is sketched below.
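A quick way to inspect the current value on a cluster node, assuming the standard /etc/hadoop/conf layout:
grep -A1 "yarn.scheduler.minimum-allocation-mb" /etc/hadoop/conf/yarn-site.xml
# Example: with minimum-allocation-mb = 4096, a 2048 MB map request
# is granted a 4096 MB container while -Xmx stays at 2048 MB.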
03-18-2017
11:33 PM
2 Kudos
For users running a Hive INSERT query with dynamic partitioning on a column with many partitions (> 10), you may notice that the query generates too many small files per partition:
INSERT OVERWRITE TABLE dB.Test partition(column5)
select
column1
,column2
,column3
,column4
,column5
from
Test2;
For example, if your table has 2,000 partitions and your query generates 1,009 reducers (hive.exec.reducers.max), you might end up with roughly 2 million small files. To understand how Tez determines the number of reducers, refer to: https://community.hortonworks.com/articles/22419/hive-on-tez-performance-tuning-determining-reducer.html This can lead to issues with:
1. HDFS NameNode performance. Refer: https://community.hortonworks.com/articles/15104/small-files-in-hadoop.html
2. The File Merge operation failing with java.lang.OutOfMemoryError: GC overhead limit exceeded:
at java.lang.StringCoding$StringDecoder.decode(StringCoding.java:149)
at java.lang.StringCoding.decode(StringCoding.java:193)
at java.lang.String.<init>(String.java:414)
at com.google.protobuf.LiteralByteString.toString(LiteralByteString.java:148)
at com.google.protobuf.ByteString.toStringUtf8(ByteString.java:572)
at org.apache.hadoop.security.proto.SecurityProtos$TokenProto.getService(SecurityProtos.java:274)
at org.apache.hadoop.hdfs.protocolPB.PBHelper.convert(PBHelper.java:848)
at org.apache.hadoop.hdfs.protocolPB.PBHelper.convert(PBHelper.java:833)
at org.apache.hadoop.hdfs.protocolPB.PBHelper.convert(PBHelper.java:1285)
at org.apache.hadoop.hdfs.protocolPB.PBHelper.convert(PBHelper.java:1435)
at org.apache.hadoop.hdfs.protocolPB.PBHelper.convert(PBHelper.java:1546)
at org.apache.hadoop.hdfs.protocolPB.PBHelper.convert(PBHelper.java:1555)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:621)
at sun.reflect.GeneratedMethodAccessor28.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:278)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:194)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:176)
at com.sun.proxy.$Proxy13.getListing(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:2136)
at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.hasNextNoFilter(DistributedFileSystem.java:1100)
at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.hasNext(DistributedFileSystem.java:1075)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:304)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:265)
at org.apache.hadoop.hive.shims.Hadoop23Shims$1.listStatus(Hadoop23Shims.java:148)
at org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:217)
at org.apache.hadoop.mapred.lib.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:75)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:309)
at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.processPaths(CombineHiveInputFormat.java:596)
at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getCombineSplits(CombineHiveInputFormat.java:473)
at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:571)
DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0
To avoid this issue, set the following property: set hive.optimize.sort.dynamic.partition=true; When enabled, the dynamic partitioning column is globally sorted. This way we can keep only one record writer open for each partition value in the reducer, thereby reducing the memory pressure on reducers. A session sketch follows.
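A minimal sketch of re-running the insert from above with the setting enabled for the session; the table and column names are the placeholders used earlier in this post.
hive -e "
SET hive.optimize.sort.dynamic.partition=true;
INSERT OVERWRITE TABLE dB.Test PARTITION (column5)
SELECT column1, column2, column3, column4, column5
FROM Test2;
"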
01-18-2017
04:00 PM
@Aditya Telidevara Are the LZO libraries installed on all nodes? Execute the following command on every node in your cluster (RHEL/CentOS/Oracle Linux):
yum install lzo lzo-devel hadooplzo hadooplzo-native
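An optional follow-up check, assuming RPM-based nodes and the standard HDP config path: verify the packages landed and that the LZO codec class is registered.
rpm -q lzo lzo-devel hadooplzo hadooplzo-native
grep -A1 "io.compression.codecs" /etc/hadoop/conf/core-site.xml
# The codec list should include com.hadoop.compression.lzo.LzoCodec.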
01-06-2017
12:04 AM
@Artem Ervits Could you add an example of modifying permissions for a View via the REST API interface?
12-22-2016
09:32 PM
1 Kudo
Property: tez.am.view-acls
Location: tez-site.xml
Navigation: Ambari > Tez > Advanced tez-site > tez-site.xml
Default value: empty string, meaning the AM view is only displayed to the user who submitted the job.
Description: The property tez.am.view-acls controls which users or groups can view Tez AM jobs, i.e. the status of the AM and all DAGs that run within it. An empty value means only the user who submitted a job can view its status.
Format: a comma-separated list of users, a white space, and then a comma-separated list of groups. For example: "lhale,msmith administrators,users". If you want to allow all users visibility of all Tez jobs, set tez.am.view-acls=*
Services to restart to apply the configuration: Tez, Hive, Oozie.
This article assumes the following: 1. The environment is an Ambari-managed cluster. 2. The Tez View is enabled.
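As an illustration only, the property could also be set from the command line, assuming your Ambari server ships the stock configs.sh helper; the host, cluster name, and credentials below are placeholders.
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p admin \
  set ambari.host.example mycluster tez-site \
  "tez.am.view-acls" "lhale,msmith administrators,users"
# Restart Tez, Hive, and Oozie afterwards for the change to take effect.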
12-22-2016
08:53 PM
1 Kudo
Steps to change the Oozie server timezone:
1. For Ambari users, log in and navigate to Custom oozie-site: Ambari > Oozie > Configs > Custom oozie-site.
2. Add the property oozie.processing.timezone=GMT-0500. The default value is UTC. Valid values are UTC and GMT(+/-)####; for example, GMT+0530 would be the India timezone. All dates parsed and generated by the Oozie Coordinator/Bundle will be in the specified timezone. The default of UTC should not be changed under normal circumstances; if it is changed, note that GMT(+/-)#### timezones do not observe DST changes.
3. Save and restart the Oozie service.
To view jobs in the EST timezone on the Oozie Web Console:
1. Open the Oozie Web Console at http://<oozieUrl>:11000/oozie/
2. Navigate to the "Settings" tab.
3. Select Timezone: EST from the dropdown menu.
4. Navigate to the "Jobs" tab and refresh to see the jobs in the EST timezone.
In the Oozie coordinator properties, use the time in EST and append "-0500" to it, e.g. start="2016-12-22T15:46-0500" end="2016-12-22T18:00-0500". A submission sketch follows.
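An illustrative sketch of a coordinator submission using these offsets; the Oozie host is a placeholder, and a real job.properties also needs entries such as oozie.coord.application.path and nameNode.
cat > job.properties <<'EOF'
start=2016-12-22T15:46-0500
end=2016-12-22T18:00-0500
EOF
oozie job -oozie http://oozie.host.example:11000/oozie \
  -config job.properties -run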
10-20-2016
02:28 PM
I faced the following error: a shell script that runs for over 24 hours fails to launch Hive scripts after 24 hours with the error below. Workaround: increase the value of the following properties in hive-site.xml to match how long the Oozie shell script needs to keep running, and restart the Hive Metastore:
hive.cluster.delegation.token.renew-interval (default: 86400000, i.e. 24 hours)
hive.cluster.delegation.token.max-lifetime (default: 604800000, i.e. 7 days)
YARN application log:
Stdoutput 16/10/16 17:12:00 [main]: WARN hive.metastore: Failed to connect to the MetaStore Server...
Stdoutput org.apache.thrift.transport.TTransportException: Peer indicated failure: DIGEST-MD5: IO error acquiring password
Hive Metastore error:
ERROR [pool-5-thread-198]: transport.TSaslTransport (TSaslTransport.java:open(315)) - SASL negotiation failure
javax.security.sasl.SaslException: DIGEST-MD5: IO error acquiring password [Caused by org.apache.hadoop.security.token.SecretManager$InvalidToken: token expired or does not exist: owner=user, renewer=oozie, realUser=oozie/oozie.host.name@EXAMPLE.COM, issueDate=1476560270232, maxDate=1477165070232, sequenceNumber=51, masterKeyId=714]
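A sketch of bumping the renew interval on an Ambari-managed cluster, assuming the stock configs.sh helper is available; the host, cluster name, credentials, and the 48-hour value are placeholders to adjust to your script's runtime.
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p admin \
  set ambari.host.example mycluster hive-site \
  "hive.cluster.delegation.token.renew-interval" "172800000"
# 172800000 ms = 48 hours; restart the Hive Metastore afterwards.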
10-17-2016
08:12 PM
Additional Notes: If you are using Hive View, you have to adjust additional configurations via Ambari.
10-04-2016
04:09 PM
@Matt Burgess Two things resolved the issue:
1. Start the URL with the jdbc:hive2 prefix: jdbc:hive2://host.name.net:10000/;principal=hive/_HOST@EXAMPLE.COM
2. Add the following property to the hive-site.xml that is passed under the HiveConnectionPool "Hive Configuration Resources" property:
<property>
  <name>hadoop.security.authentication</name>
  <value>kerberos</value>
</property>
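An optional sanity check outside NiFi, assuming a valid keytab and principal (the keytab path and user below are placeholders): confirm the Kerberized JDBC URL works with beeline before wiring it into the HiveConnectionPool.
kinit -kt /etc/security/keytabs/myuser.keytab myuser@EXAMPLE.COM
beeline -u "jdbc:hive2://host.name.net:10000/;principal=hive/_HOST@EXAMPLE.COM" \
  -e "SELECT 1;"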