Member since: 10-04-2016
Posts: 243
Kudos Received: 281
Solutions: 43

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 1205 | 01-16-2018 03:38 PM |
|  | 6227 | 11-13-2017 05:45 PM |
|  | 3095 | 11-13-2017 12:30 AM |
|  | 1549 | 10-27-2017 03:58 AM |
|  | 28536 | 10-19-2017 03:17 AM |
09-29-2017
03:03 AM
@Sindhu - Would you be able to help with this? TIA
09-29-2017
03:02 AM
1 Kudo
My use case is such that there are concurrent jobs which end up firing INSERT INTO queries against the same table. As a result, my NameNode logs are getting filled with HDFS StateChange WARN messages. I do not want to enable ACID because it does not support external tables. Can we enable only the concurrency support by setting the following properties, while still keeping ACID disabled?

hive.support.concurrency = true
hive.txn.manager = org.apache.hadoop.hive.ql.lockmgr.DbTxnManager
hive.enforce.bucketing = true
hive.exec.dynamic.partition.mode = nonstrict

The default/current values are:

hive.support.concurrency = false
hive.txn.manager = org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager
hive.enforce.bucketing = true
hive.exec.dynamic.partition.mode = nonstrict
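For context, a minimal sketch of how I could trial these settings for a single session only, without touching the cluster-wide defaults (the table names below are hypothetical placeholders):

# Hypothetical one-off test: override the concurrency/lock-manager settings
# for this invocation only; the cluster-wide defaults stay untouched.
hive --hiveconf hive.support.concurrency=true \
     --hiveconf hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager \
     -e "INSERT INTO TABLE my_db.my_table SELECT * FROM my_db.staging_table;"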
Labels: Apache Hive
09-18-2017
06:52 PM
3 Kudos
Scenario: The Spark log4j properties (Ambari > Spark > Configs) are not configured to log to a file. When running a job in yarn-client mode, the driver logs are spilled to the console. For long-running jobs it can be difficult to capture the driver logs for various reasons: the user may lose the connection to the terminal, may have closed the terminal, and so on. The driver log is a useful artifact when investigating a job failure, so in such scenarios it is better to have the Spark driver log to a file instead of the console. Here are the steps:

1. Place a driver_log4j.properties file in a convenient location (say /tmp) on the machine from which you will submit the job in yarn-client mode.

Contents of driver_log4j.properties:

#Set everything to be logged to the file
log4j.rootCategory=INFO,FILE
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
log4j.appender.FILE=org.apache.log4j.RollingFileAppender
log4j.appender.FILE.File=/tmp/SparkDriver.log
log4j.appender.FILE.ImmediateFlush=true
log4j.appender.FILE.Threshold=debug
log4j.appender.FILE.Append=true
log4j.appender.FILE.MaxFileSize=500MB
log4j.appender.FILE.MaxBackupIndex=10
log4j.appender.FILE.layout=org.apache.log4j.PatternLayout
log4j.appender.FILE.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
#Settings to quiet third party logs that are too verbose
log4j.logger.org.eclipse.jetty=WARN
log4j.logger.org.eclipse.jetty.util.component.AbstractLifeCycle=ERROR
log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO
log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO

Change the value of log4j.appender.FILE.File as needed.

2. Add the following to the spark-submit command so that it picks up the above log4j properties and makes the driver log to a file:

--driver-java-options "-Dlog4j.configuration=file:/tmp/driver_log4j.properties"

Example:

spark-submit --driver-java-options "-Dlog4j.configuration=file:/tmp/driver_log4j.properties" \
--class org.apache.spark.examples.JavaSparkPi --master yarn-client --num-executors 3 \
--driver-memory 512m --executor-memory 512m --executor-cores 1 spark-examples*.jar 10

3. Now, once you submit this new command, the Spark driver will log to the location specified by log4j.appender.FILE.File in driver_log4j.properties, in this case /tmp/SparkDriver.log.

Note: The executor logs can always be fetched from the Spark History Server UI, whether you run the job in yarn-client or yarn-cluster mode:
a. Go to the Spark History Server UI
b. Click on the App ID
c. Navigate to the Executors tab
d. The Executors page lists the links to the stdout and stderr logs
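As a quick sanity check (not part of the steps above; the application ID below is a hypothetical placeholder), the driver log can be followed while the job runs, and executor logs can also be pulled from the command line once the application finishes, provided YARN log aggregation is enabled:

# Follow the driver log written by the RollingFileAppender configured above
tail -f /tmp/SparkDriver.log

# Fetch the aggregated executor (container) logs after the job completes;
# replace the application ID with the one printed by spark-submit
yarn logs -applicationId application_1505000000000_0001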
09-15-2017
09:35 PM
3 Kudos
The error message explains that your subquery does not have the field ssn. When you perform a UNION ALL operation, the interpreter expects the same schema from all of the queries (the same number, types, and names of columns). Thus, even though you have cast NULL to match ssn's datatype (bigint), you must also give it an alias, as shown below:

SELECT * FROM (SELECT driverid, name, ssn FROM drivers WHERE driverid < 15
UNION ALL
SELECT driverid, name, CAST(NULL AS BIGINT) AS ssn FROM drivers WHERE driverid BETWEEN 18 AND 21) T
09-15-2017
02:37 AM
3 Kudos
The parent article discusses how to access the ams-hbase instance using the Phoenix client when it is a standalone environment. This article is an extension of that one: it lists the steps to access the ams-hbase instance using the Phoenix client when ZooKeeper is installed on the cluster. As described in the parent article, log in to the AMS Collector host machine:

1. Check /etc/ams-hbase/conf/hbase-site.xml
2. From the above file, pick the hbase.zookeeper.quorum, hbase.zookeeper.property.clientPort and zookeeper.znode.parent values
3. cd /usr/lib/ambari-metrics-collector/bin
4. Invoke the client with the connection URL as shown below, substituting the values:

./sqlline.py <hbase.zookeeper.quorum>:<hbase.zookeeper.property.clientPort>:<zookeeper.znode.parent>

Example:

./sqlline.py myzkqrm.com:61181:/ams-hbase-secure
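To illustrate steps 1-4 end to end (the quorum, port and znode values below are just the ones from the example above and will differ per cluster), something like the following could be run on the AMS Collector host:

# Steps 1-2: pull the relevant properties out of the AMS HBase config
grep -A1 -E 'hbase.zookeeper.quorum|hbase.zookeeper.property.clientPort|zookeeper.znode.parent' \
  /etc/ams-hbase/conf/hbase-site.xml

# Steps 3-4: invoke the Phoenix client with the values found above
cd /usr/lib/ambari-metrics-collector/bin
./sqlline.py myzkqrm.com:61181:/ams-hbase-secure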
09-14-2017
04:04 AM
4 Kudos
Often the NameNode log files grow in size and contain too many kinds of messages. One of the most commonly faced scenarios is that there have been multiple state changes in HDFS, and investigating them becomes a pain when there are many occurrences scattered through huge log files. Luckily, a few configuration changes are enough to ensure that state change log statements get logged to a separate file. To isolate StateChange log messages into another file, add the following to hdfs-log4j and restart the NameNodes. You can make this change from Ambari: Ambari > HDFS service > Configs tab > Advanced tab > Advanced hdfs-log4j section > hdfs-log4j template.

# StateChange log
log4j.logger.org.apache.hadoop.hdfs.StateChange=INFO,SCL
log4j.additivity.org.apache.hadoop.hdfs.StateChange=false
log4j.appender.SCL=org.apache.log4j.RollingFileAppender
log4j.appender.SCL.File=${hadoop.log.dir}/hdfs-state-change.log
log4j.appender.SCL.MaxFileSize=256MB
log4j.appender.SCL.MaxBackupIndex=20
log4j.appender.SCL.layout=org.apache.log4j.PatternLayout
log4j.appender.SCL.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} (%F:%M(%L)) - %m%n

In this way, HDFS StateChange log messages will be written to ${hadoop.log.dir}/hdfs-state-change.log.
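After the NameNode restart, a quick way to confirm the new appender is working (this assumes hadoop.log.dir resolves to /var/log/hadoop/hdfs, the usual HDP default, which may differ on your cluster):

# State change messages should now land in the dedicated file...
tail -f /var/log/hadoop/hdfs/hdfs-state-change.log

# ...and should stop accumulating in the main NameNode log
grep -c 'StateChange' /var/log/hadoop/hdfs/hadoop-hdfs-namenode-*.log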
09-13-2017
08:17 PM
3 Kudos
After Kerberos had been enabled, I was not able to open Hive View from Ambari. I would get the following error message:

Issue detected
Service 'userhome' check failed: Usernames not matched: name=root != expected=ambari-server-<clusterName>

Service 'userhome' check failed:
java.io.IOException: Usernames not matched: name=root != expected=ambari-server-<clusterName>
at sun.reflect.GeneratedConstructorAccessor248.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)

Root cause

The Ambari server is running as root, so it tries to authenticate with a proxy user of 'root', whereas the ambari.service.keytab expects a principal of ambari-server-<clusterName>@REALM. Hence the mismatch.

Solution

1. Edit the view settings

Go to the Edit View page in Ambari: Manage Ambari > Views > Hive > Hive View, or simply:
http://<ambariHost:port>/views/ADMIN_VIEW/<ambari.version>/INSTANCE/#/views/HIVE/versions/<view.version>/instances/AUTO_HIVE_INSTANCE/edit

Substitute the values of <ambariHost:port>, <ambari.version> and <view.version> as needed, for example:
http://my.ambari.com:8080/views/ADMIN_VIEW/2.5.2.0/INSTANCE/#/views/HIVE/versions/1.5.0/instances/AUTO_HIVE_INSTANCE/edit

Under the Settings section, update the value of WebHDFS Authentication to auth=KERBEROS;proxyuser=ambari-server-<clusterName> and save the changes.

2. Update configs

Navigate to the Hive and YARN Configs in the Ambari UI, change as shown below, and restart the respective services. <AMBARI_SERVER_PRINCIPAL_USER> should be replaced by ambari-server-<clusterName>.

A) Custom webhcat-site

webhcat.proxyuser.<AMBARI_SERVER_PRINCIPAL_USER>.groups=*
webhcat.proxyuser.<AMBARI_SERVER_PRINCIPAL_USER>.hosts=*

B) Custom yarn-site

yarn.timeline-service.http-authentication.<AMBARI_SERVER_PRINCIPAL_USER>.groups=*
yarn.timeline-service.http-authentication.<AMBARI_SERVER_PRINCIPAL_USER>.hosts=*
yarn.timeline-service.http-authentication.<AMBARI_SERVER_PRINCIPAL_USER>.users=*
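A quick way to double-check the principal that the Ambari keytab actually holds, and therefore the exact ambari-server-<clusterName> name to use above (the keytab path below is the usual default and may differ on your installation):

# Lists the principals stored in the Ambari server keytab; the expected output
# contains ambari-server-<clusterName>@REALM
klist -kt /etc/security/keytabs/ambari.server.keytab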
09-06-2017
09:24 PM
@Jonathan Hurley thanks for the inputs! Solved it the hard way with help from Amar, but you are spot on with your guidance here!
09-06-2017
01:52 AM
1 Kudo
Upgrading from Ambari 2.5.0.3 to Ambari 2.5.2.0 fails with the following stack trace.

Ambari logs:

05 Sep 2017 16:58:54,928 ERROR [main] AlertDefinitionFactory:199 - Unable to deserialize the alert definition source during coercion
com.google.gson.JsonSyntaxException: Expecting number, got: STRING
at com.google.gson.internal.bind.TypeAdapters$11.read(TypeAdapters.java:304)
at com.google.gson.internal.bind.TypeAdapters$11.read(TypeAdapters.java:293)
at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$1.read(ReflectiveTypeAdapterFactory.java:93)
at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$Adapter.read(ReflectiveTypeAdapterFactory.java:172)
at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$1.read(ReflectiveTypeAdapterFactory.java:93)
at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$Adapter.read(ReflectiveTypeAdapterFactory.java:172)
at com.google.gson.Gson.fromJson(Gson.java:795)
at com.google.gson.Gson.fromJson(Gson.java:859)
at com.google.gson.Gson$2.deserialize(Gson.java:131)
at org.apache.ambari.server.state.alert.AlertDefinitionFactory$AlertDefinitionSourceAdapter.deserialize(AlertDefinitionFactory.java:373)
at org.apache.ambari.server.state.alert.AlertDefinitionFactory$AlertDefinitionSourceAdapter.deserialize(AlertDefinitionFactory.java:313)
at com.google.gson.TreeTypeAdapter.read(TreeTypeAdapter.java:58)
at com.google.gson.Gson.fromJson(Gson.java:795)
at com.google.gson.Gson.fromJson(Gson.java:761)
at com.google.gson.Gson.fromJson(Gson.java:710)
at com.google.gson.Gson.fromJson(Gson.java:682)
at org.apache.ambari.server.state.alert.AlertDefinitionFactory.coerce(AlertDefinitionFactory.java:196)
at org.apache.ambari.server.api.services.AmbariMetaInfo.reconcileAlertDefinitions(AmbariMetaInfo.java:1164)
at org.apache.ambari.server.upgrade.UpdateAlertScriptPaths.executeDMLUpdates(UpdateAlertScriptPaths.java:46)
at org.apache.ambari.server.upgrade.AbstractUpgradeCatalog.upgradeData(AbstractUpgradeCatalog.java:940)
at org.apache.ambari.server.upgrade.SchemaUpgradeHelper.executeDMLUpdates(SchemaUpgradeHelper.java:240)
at org.apache.ambari.server.upgrade.SchemaUpgradeHelper.main(SchemaUpgradeHelper.java:433)
05 Sep 2017 16:58:54,929 DEBUG [main] AmbariMetaInfo:1172 - The alert named yarn_app_timeline_server_webui has been modified from the stack definition and will not be merged
05 Sep 2017 16:58:54,929 ERROR [main] SchemaUpgradeHelper:242 - Upgrade failed.
java.lang.NullPointerException
at org.apache.ambari.server.api.services.AmbariMetaInfo.reconcileAlertDefinitions(AmbariMetaInfo.java:1177)
at org.apache.ambari.server.upgrade.UpdateAlertScriptPaths.executeDMLUpdates(UpdateAlertScriptPaths.java:46)
at org.apache.ambari.server.upgrade.AbstractUpgradeCatalog.upgradeData(AbstractUpgradeCatalog.java:940)
at org.apache.ambari.server.upgrade.SchemaUpgradeHelper.executeDMLUpdates(SchemaUpgradeHelper.java:240)
at org.apache.ambari.server.upgrade.SchemaUpgradeHelper.main(SchemaUpgradeHelper.java:433)
05 Sep 2017 16:58:54,929 ERROR [main] SchemaUpgradeHelper:446 - Exception occurred during upgrade, failed
org.apache.ambari.server.AmbariException
at org.apache.ambari.server.upgrade.SchemaUpgradeHelper.executeDMLUpdates(SchemaUpgradeHelper.java:243)
at org.apache.ambari.server.upgrade.SchemaUpgradeHelper.main(SchemaUpgradeHelper.java:433)
Caused by: java.lang.NullPointerException
at org.apache.ambari.server.api.services.AmbariMetaInfo.reconcileAlertDefinitions(AmbariMetaInfo.java:1177)
at org.apache.ambari.server.upgrade.UpdateAlertScriptPaths.executeDMLUpdates(UpdateAlertScriptPaths.java:46)
at org.apache.ambari.server.upgrade.AbstractUpgradeCatalog.upgradeData(AbstractUpgradeCatalog.java:940)
at org.apache.ambari.server.upgrade.SchemaUpgradeHelper.executeDMLUpdates(SchemaUpgradeHelper.java:240)

On investigation, I found that the columns of the alert_definition table in the Ambari database were ordered differently in 2.5.2.0 and 2.5.0.3. I made changes to the table schema and corrected the column order to match the schema in 2.5.2.0; however, I still get the same error. Appreciate any hints.
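For reference, the offending alert definition named in the DEBUG line above can also be inspected directly in the Ambari database. This sketch assumes the default embedded PostgreSQL backend (database and user both named ambari) and the column names as I recall them from the Ambari DDL, so please verify both against your environment:

# Dump the JSON source of the alert that the upgrade chokes on; the
# "Expecting number, got: STRING" error points at a field inside this JSON
psql -U ambari -d ambari -c \
  "SELECT definition_name, source_type, alert_source FROM alert_definition WHERE definition_name = 'yarn_app_timeline_server_webui';"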
Labels: Apache Ambari
08-31-2017
03:07 PM
1 Kudo
@Sakina MIrza Do you mean that you want to know how to write a custom partitioner in a MapReduce program? You can follow this link: https://hadooptutorial.wikispaces.com/Custom+partitioner Also, kindly post your questions with at least some description of what you are looking for; this will ensure you get the right answers.