12-26-2016
06:15 PM
PROBLEM STATEMENT: After finishing the upgrade of all Hadoop components, I issued su - hdfs -c "hdfs dfsadmin -finalizeUpgrade" on the active NameNode. The NameNode UI showed that all was good, but after I failed over and the standby became active, I again got the message that I should finalize my upgrade.
ERROR: 2015-10-27 18:16:04,954 - Task. Type: EXECUTE, Script: scripts/namenode.py - Function: prepare_non_rolling_upgrade
2015-10-27 18:16:05,044 - Preparing the NameNodes for a NonRolling (aka Express) Upgrade.
2015-10-27 18:16:05,045 - checked_call['ambari-sudo.sh su hdfs -l -s /bin/bash -c 'curl -s '"'"'http://ambari.apache.org:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem'"'"' 1>/tmp/tmpKcaPrw 2>/tmp/tmpPjCQIr''] {'quiet': False}
2015-10-27 18:16:05,068 - checked_call returned (0, '')
2015-10-27 18:16:05,069 - checked_call['ambari-sudo.sh su hdfs -l -s /bin/bash -c 'curl -s '"'"'http://ambari.apache.org:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem'"'"' 1>/tmp/tmpPIEyeS 2>/tmp/tmpOOXCrc''] {'quiet': False}
2015-10-27 18:16:05,092 - checked_call returned (0, '')
2015-10-27 18:16:05,093 - NameNode High Availability is enabled and this is the Active NameNode.
2015-10-27 18:16:05,093 - Enter SafeMode if not already in it.
2015-10-27 18:16:05,094 - Checkpoint the current namespace.
2015-10-27 18:16:05,094 - Backup the NameNode name directory's CURRENT folder.
2015-10-27 18:16:05,100 - Execute[('cp', '-ar', '/hadoop/hdfs/namenode/current', '/tmp/upgrades/2.2/namenode_ida8c06540_date162715_1/')] {'sudo': True}
2015-10-27 18:16:05,113 - Attempt to Finalize if there are any in-progress upgrades. This will return 255 if no upgrades are in progress.
2015-10-27 18:16:05,113 - checked_call['/usr/hdp/current/hadoop-hdfs-namenode/bin/hdfs dfsadmin -rollingUpgrade finalize'] {'logoutput': True, 'user': 'hdfs'}
FINALIZE rolling upgrade ...
There is no rolling upgrade in progress or rolling upgrade has already been finalized.
2015-10-27 18:16:08,213 - checked_call returned (0, 'FINALIZE rolling upgrade ...\nThere is no rolling upgrade in progress or rolling upgrade has already been finalized.')
ROOT CAUSE: This is a bug in Ambari - https://hortonworks.jira.com/browse/BUG-46853 [Hortonworks-internal bug URL, for reference]
RESOLUTION: The above bug is fixed in Ambari 2.2.0. Upgrading to Ambari 2.2.0 resolves the issue.
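As a quick check while the fix is pending (a minimal sketch; the FSNamesystem bean name comes from the JMX query in the log above, and the `tag.HAState` field is an assumption about its layout), the HA role of each NameNode can be read from the same JMX endpoint the Ambari script curls:

```python
import json

def ha_state(jmx_text):
    """Return the HA state ("active"/"standby") reported by the
    FSNamesystem JMX bean, or None if the bean is missing."""
    data = json.loads(jmx_text)
    for bean in data.get("beans", []):
        if bean.get("name") == "Hadoop:service=NameNode,name=FSNamesystem":
            return bean.get("tag.HAState")
    return None

# Payload shaped like the /jmx?qry=Hadoop:service=NameNode,name=FSNamesystem
# response in the log; a real check would fetch it from each NameNode's
# HTTP port (50070 above) and confirm which node is active before finalizing.
sample = ('{"beans":[{"name":"Hadoop:service=NameNode,name=FSNamesystem",'
          '"tag.HAState":"active"}]}')
print(ha_state(sample))
```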
12-26-2016
05:54 PM
SYMPTOM: While installing Ranger Admin with Ambari 2.2.1, the installation failed with the following exception: Ranger Admin installation failed - xa_audit_db_postgres.sql DB schema import failed! ERROR: ###
2016-05-10 10:25:47,952 - Execute['python /usr/hdp/current/ranger-admin/db_setup.py'] {'logoutput': True, 'environment': {'RANGER_ADMIN_HOME': '/usr/hdp/current/ranger-admin', 'JAVA_HOME': '/usr/lib/jvm/jre-1.7.0-openjdk.x86_64'}, 'user': 'ranger'}
2016-05-10 10:25:48,081 [I] DB FLAVOR :POSTGRES
2016-05-10 10:25:48,081 [I] --------- Verifying Ranger DB connection ---------
2016-05-10 10:25:48,081 [I] Checking connection
2016-05-10 10:25:48,332 [I] connection success
2016-05-10 10:25:48,332 [I] --------- Verifying Ranger DB tables ---------
2016-05-10 10:25:48,332 [I] Verifying table x_portal_user in database ranger
2016-05-10 10:25:48,583 [I] Table x_portal_user already exists in database ranger
2016-05-10 10:25:48,583 [I] --------- Verifying upgrade history table ---------
2016-05-10 10:25:48,583 [I] Verifying table x_db_version_h in database ranger
2016-05-10 10:25:48,834 [I] Table x_db_version_h already exists in database ranger
2016-05-10 10:25:48,834 [I] --------- Applying Ranger DB patches ---------
2016-05-10 10:25:48,834 [I] No patches to apply!
2016-05-10 10:25:48,835 [I] --------- Starting Audit Operation ---------
2016-05-10 10:25:48,835 [I] --------- Check admin user connection ---------
2016-05-10 10:25:48,835 [I] Checking connection
2016-05-10 10:25:49,080 [I] connection success
2016-05-10 10:25:49,081 [I] --------- Check audit user connection ---------
2016-05-10 10:25:49,081 [I] Checking connection
2016-05-10 10:25:49,327 [I] connection success
2016-05-10 10:25:49,327 [I] --------- Check table ---------
2016-05-10 10:25:49,327 [I] Verifying table xa_access_audit in database audit
2016-05-10 10:25:49,575 [I] Table xa_access_audit does not exist in database audit
2016-05-10 10:25:49,576 [I] Importing db schema to database audit from file: xa_audit_db_postgres.sql
SQLException : SQL state: 3F000 org.postgresql.util.PSQLException: ERROR: no schema has been selected to create in ErrorCode: 0
SQLException : SQL state: 3F000 org.postgresql.util.PSQLException: ERROR: no schema has been selected to create in ErrorCode: 0
2016-05-10 10:25:49,813 [E] xa_audit_db_postgres.sql DB schema import failed!
###
ROOT CAUSE: The customer had set the parameter below to "No", since the database and database users had already been created and user connections were properly configured through pg_hba.conf: Setup Database and Database User = No
RESOLUTION: Setting the value below to "Yes" and re-running the installation resolved the issue: Setup Database and Database User = Yes
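For background on the error itself: PostgreSQL raises SQL state 3F000 ("no schema has been selected to create in") when an unqualified CREATE TABLE finds no existing schema in the role's search_path. Letting Ambari set up the database avoids this because it creates the database, user, and schema together. A sketch of the resolution rule (the `$user` substitution and `("$user", "public")` default mirror PostgreSQL's documented search_path behavior; the pre-created `audit` database here evidently had no matching schema):

```python
def creation_schema(search_path, existing_schemas, user):
    """Return the schema an unqualified CREATE TABLE would target:
    the first search_path entry that exists ("$user" is replaced by
    the role name). None models SQL state 3F000."""
    for entry in search_path:
        name = user if entry == "$user" else entry
        if name in existing_schemas:
            return name
    return None

# Manually created audit DB with no usable schema -> the 3F000 case:
print(creation_schema(("$user", "public"), set(), "rangerlogger"))
# With a "public" schema present, the import would have succeeded:
print(creation_schema(("$user", "public"), {"public"}, "rangerlogger"))
```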
12-25-2016
08:54 PM
@Robert Levas Can you add "ambari-server restart" before "Storing the KDC Administrator's Credentials"? Correct me if I am wrong.
12-25-2016
05:59 PM
SYMPTOM: After upgrading Ambari from 1.7.0 to 2.2.1.1, there are many Hive-related alerts. Example: ExecuteTimeoutException: Execution of 'ambari-sudo.sh su ambari-qa -l -s /bin/bash -c 'export PATH='"'"'/usr/sbin:/sbin:/usr/lib/ambari-server/*:/sbin:/bin:/usr/sbin:/usr/bin:/var/lib/ambari-agent:/bin/:/usr/bin/:/usr/sbin/:/usr/lib/hive/bin'"'"' ; export HIVE_CONF_DIR='"'"'/etc/hive/conf.server'"'"' ; hive --hiveconf hive.metastore.uris=thrift://host1:9083 --hiveconf hive.metastore.client.connect.retry.delay=1 --hiveconf hive.metastore.failure.retries=1 --hiveconf hive.metastore.connect.retries=1 --hiveconf hive.metastore.client.socket.timeout=14 --hiveconf hive.execution.engine=mr -e '"'"'show databases;'"'"''' was killed due timeout after 60 seconds
)
2016-05-11 03:25:04,779 [CRITICAL] [HIVE] [hive_server_process] (HiveServer2 Process) Connection failed on host host1:10000 (Traceback (most recent call last):
File "/var/lib/ambari-agent/cache/common-services/HIVE/0.12.0.2.0/package/alerts/alert_hive_thrift_port.py", line 200, in execute
check_command_timeout=int(check_command_timeout))
File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/hive_check.py", line 68, in check_thrift_port_sasl
timeout=check_command_timeout
File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 154, in __init__
self.env.run()
File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 158, in run
self.run_action(resource, action)
File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 121, in run_action
provider_action()
File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 238, in action_run
tries=self.resource.tries, try_sleep=self.resource.try_sleep)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 70, in inner
result = function(command, **kwargs)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 92, in checked_call
tries=tries, try_sleep=try_sleep)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 140, in _call_wrapper
result = _call(command, **kwargs_copy)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 285, in _call
raise ExecuteTimeoutException(err_msg)
ExecuteTimeoutException: Execution of 'ambari-sudo.sh su ambari-qa -l -s /bin/bash -c 'export PATH='"'"'/usr/sbin:/sbin:/usr/lib/ambari-server/*:/sbin:/bin:/usr/sbin:/usr/bin:/var/lib/ambari-agent:/bin/:/usr/bin/:/usr/lib/hive/bin/:/usr/sbin/'"'"' ; ! beeline -u '"'"'jdbc:hive2://host1:10000/;transportMode=binary'"'"' -e '"'"''"'"' 2>&1| awk '"'"'{print}'"'"'|grep -i -e '"'"'Connection refused'"'"' -e '"'"'Invalid URL'"'"''' was killed due timeout after 60 seconds
)
2016-05-11 03:34:01,826 [OK] [HIVE] [hive_metastore_process] (Hive Metastore Process) Metastore OK - Hive command took 4.830s
2016-05-11 03:34:01,826 [OK] [HIVE] [hive_server_process] (HiveServer2 Process) TCP OK - 1.549s response on port 10000
ROOT CAUSE: The Hive connection was taking a long time to respond. This is suspected to be a bug - https://hortonworks.jira.com/browse/BUG-47724
RESOLUTION: The workaround is to modify the value of "check.command.timeout" in the Hive Metastore alert definition. Please check this link for detailed steps - https://community.hortonworks.com/articles/33564/how-to-modify-ambari-alert-using-postput-action.html From -
"value" : "60.0"
To -
"value" : "120.0"
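The linked article makes this change through Ambari's REST API: GET the alert definition, edit the parameter, and PUT it back. A minimal sketch of the JSON edit (the nested AlertDefinition/source/parameters layout is an assumption about the payload shape; verify against what your Ambari version returns before PUTting):

```python
import copy
import json

def bump_timeout(defn, new_value="120.0", param="check.command.timeout"):
    """Return a copy of an alert-definition payload with the named
    parameter's "value" replaced (e.g. 60.0 -> 120.0)."""
    out = copy.deepcopy(defn)
    for p in out["AlertDefinition"]["source"]["parameters"]:
        if p.get("name") == param:
            p["value"] = new_value
    return out

# Payload shaped like a GET of the alert definition; the edited copy
# would be sent back with an HTTP PUT to the alert_definitions endpoint.
defn = {"AlertDefinition": {"source": {"parameters": [
    {"name": "check.command.timeout", "value": "60.0"}]}}}
print(json.dumps(bump_timeout(defn)))
```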
12-25-2016
04:28 PM
Problem Statement: The customer has incorporated HDFS ACLs to control authorisation on directories and files. They have also set fs.permissions.umask-mode = 007 in Ambari under the advanced settings (hdfs-site.xml). The ACLs work correctly when making directories with the hadoop fs -mkdir command. However, when making a directory through the Hue File Browser, the permissions are not set according to the umask property. With hadoop fs -mkdir, folders are created with group mask:rwx and files with group mask:rw-. Through the Hue File Browser, folders get group mask:r-x and files group:r-x. There is a discrepancy between the mask bits set on folders and files by the hadoop fs -mkdir command versus Hue's make-directory and file commands. Why does this discrepancy exist, and what does the customer need to do to make Hue follow the same umask and ACL permissions that the hadoop fs commands follow? Hue does not respect dfs.umaskmode / fs.permissions.umask-mode when creating files or folders.
ROOT CAUSE: The WebHDFS API does not read the fs.permissions.umask-mode property; instead it uses whatever value is explicitly passed by Hue, or the NameNode default. This is a bug - https://hortonworks.jira.com/browse/BUG-38607
RESOLUTION: Upgrading to HDP 2.3 resolved the issue.
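The permission gap is plain umask arithmetic (a sketch; treating 0755 as the fixed permission sent on the WebHDFS side is an assumption inferred from the observed group r-x bits): hadoop fs -mkdir applies the client-side umask to the default mode, while the WebHDFS call behind Hue passes an explicit permission and ignores the umask entirely.

```python
def apply_umask(base, umask):
    """Permissions after subtracting a umask (e.g. 0o777 & ~0o007)."""
    return base & ~umask

umask = 0o007  # fs.permissions.umask-mode = 007
print(oct(apply_umask(0o777, umask)))  # dirs via hadoop fs -mkdir: group rwx
print(oct(apply_umask(0o666, umask)))  # files via hadoop fs: group rw-
# Hue/WebHDFS instead sends a fixed permission (e.g. 0o755), which
# produces the observed group r-x regardless of the configured umask.
```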
12-25-2016
11:37 AM
SYMPTOM: The cluster was upgraded to HDP 2.3. After the upgrade, Oozie has configuration issues.
The user has a workflow defined to create job files in the directory /tmp/hadoop-${user.name}/job_details,
but instead the directory is created under / and a permission-denied error is thrown. ERROR: Sample workflow: <workflow-app xmlns='uri:oozie:workflow:0.5' name='scisit_all_oozie_workflow'>
<global>
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<job-xml>${runtime}/runtime_params.xml</job-xml>
<job-xml>scisit_all_tables_config.xml</job-xml>
<job-xml>ColumnTransformationRules.xml</job-xml>
<job-xml>HeadersAndTrailers.xml</job-xml>
<configuration>
<property>
<name>oozie.use.system.libpath</name>
<value>true</value>
</property>
<property>
<name>oozie.action.sharelib.for.java</name>
<value>hive</value>
</property>
<property>
<name>mapreduce.map.maxattempts</name>
<value>1</value>
</property>
<property>
<name>mapreduce.reduce.maxattempts</name>
<value>1</value>
</property>
<property>
<name>mapred.job.queue.name</name>
<value>${queueName}</value>
</property>
<property>
<name>mapreduce.input.fileinputformat.split.maxsize</name>
<value>134217728</value>
</property>
<property>
<name>mapreduce.map.output.compress</name>
<value>true</value>
</property>
<property>
<name>mapreduce.map.output.compress.codec</name>
<value>org.apache.hadoop.io.compress.SnappyCodec</value>
</property>
<property>
<name>mapreduce.output.fileoutputformat.compress</name>
<value>true</value>
</property>
<property>
<name>mapreduce.output.fileoutputformat.compress.codec</name>
<value>org.apache.hadoop.io.compress.SnappyCodec</value>
</property>
<property>
<name>edmhdpif.hive.warehouse</name>
<value>${hiveWarehouseDataDir}</value>
</property>
<property>
<name>edmhdpif.individual.tableprefix</name>
<value>scisit_all_</value>
</property>
<property>
<name>edmhdpif.cdccolumns</name>
<value>${cdcColumns}</value>
</property>
<property>
<name>edmhdpif.rowcounts.database</name>
<value>${falcon_rowcounts_database}</value>
</property>
<property>
<name>edmhdpif.rowcounts.table</name>
<value>${falcon_rowcounts_table}</value>
</property>
<property>
<name>edmhdpif.rowcounts.partition</name>
<value>${falcon_rowcounts_partitions_java}
</value>
</property>
<property>
<name>edmhdpif.rerun.table</name>
<value>${wf:conf('edmhdpif.rerun.table')}</value>
</property>
<property>
<name>edmhdpif.fixwidth</name>
<value>${fixWidth}</value>
</property>
<property>
<name>edmhdpif.delimiter.framework</name>
<value>${frmDelimiter}</value>
</property>
<property>
<name>edmhdpif.delimiter.data</name>
<value>${dataDelimiter}</value>
</property>
<property>
<name>edmhdpif.hive.outputformat</name>
<value>${fileType}</value>
</property>
</configuration>
</global>
<start to="decision-containervalidator" />
<decision name="decision-containervalidator">
<switch>
<case to="containervalidatorjava">${containerValidatorType=="java"}</case>
<case to="containervalidatorpig">${containerValidatorType=="pig"}</case>
<case to="containervalidatorhive">${containerValidatorType=="hive"}</case>
<default to="rowid" />
</switch>
</decision>
<action name="containervalidatorjava">
<java>
<configuration>
<property>
<name>edmhdpif.input.database</name>
<value>${falcon_input_database}</value>
</property>
<property>
<name>edmhdpif.input.table</name>
<value>${falcon_input_table}</value>
</property>
<property>
<name>edmhdpif.input.partition</name>
<value>${falcon_input_partition_filter_java}</value>
</property>
<property>
<name>edmhdpif.containervalidator.args</name>
<value>${containerValidatorArgs}</value>
</property>
<property>
<name>edmhdpif.output.path</name>
<value>${wf:conf('hadoop.tmp.dir')}/${falcon_containervalidation_table}/${falcon_containervalidation_dated_partition_value_fvds}
</value>
</property>
</configuration>
<main-class>${containerValidatorCodeFile}</main-class>
</java>
<ok to="hive-add-partitions-after-containervalidator" />
<error to="fail" />
</action>
<action name="containervalidatorpig">
<pig>
<configuration>
<property>
<name>edmhdpif.input.database</name>
<value>${falcon_input_database}</value>
</property>
<property>
<name>edmhdpif.input.table</name>
<value>${falcon_input_table}</value>
</property>
<property>
<name>edmhdpif.input.partition</name>
<value>${falcon_input_partition_filter_java}</value>
</property>
<property>
<name>edmhdpif.containervalidator.args</name>
<value>${containerValidatorArgs}</value>
</property>
<property>
<name>edmhdpif.output.path</name>
<value>${wf:conf('hadoop.tmp.dir')}/${falcon_containervalidation_table}/${falcon_containervalidation_dated_partition_value_fvds}
</value>
</property>
</configuration>
<script>${containerValidatorCodeFile}</script>
</pig>
<ok to="hive-add-partitions-after-containervalidator" />
<error to="fail" />
</action>
<action name="containervalidatorhive">
<hive xmlns="uri:oozie:hive-action:0.5">
<job-xml>${wf:appPath()}/conf/hive-site.xml</job-xml>
<job-xml>${wf:appPath()}/conf/tez-site.xml</job-xml>
<configuration>
<property>
<name>edmhdpif.input.database</name>
<value>${falcon_input_database}</value>
</property>
<property>
<name>edmhdpif.input.table</name>
<value>${falcon_input_table}</value>
</property>
<property>
<name>edmhdpif.input.partition</name>
<value>${falcon_input_partition_filter_java}</value>
</property>
<property>
<name>edmhdpif.containervalidator.args</name>
<value>${containerValidatorArgs}</value>
</property>
<property>
<name>edmhdpif.output.path</name>
<value>${wf:conf('hadoop.tmp.dir')}/${falcon_containervalidation_table}/${falcon_containervalidation_dated_partition_value_fvds}
</value>
</property>
</configuration>
<script>${containerValidatorCodeFile}</script>
</hive>
<ok to="hive-add-partitions-after-containervalidator" />
<error to="fail" />
</action>
<action name="hive-add-partitions-after-containervalidator">
<hive xmlns="uri:oozie:hive-action:0.5">
<job-xml>${wf:appPath()}/conf/hive-site.xml</job-xml>
<job-xml>${wf:appPath()}/conf/tez-site.xml</job-xml>
<script>${wf:appPath()}/scisit_all_add_partitions_after_containervalidation.hql
</script>
<param>param_dated_partition_value=${falcon_rowid_dated_partition_value_rds}
</param>
</hive>
<ok to="rowid" />
<error to="fail" />
</action>
<action name="rowid">
<java>
<configuration>
<property>
<name>edmhdpif.input.database</name>
<value>${falcon_input_database}</value>
</property>
<property>
<name>edmhdpif.input.table</name>
<value>${falcon_input_table}</value>
</property>
<property>
<name>edmhdpif.input.partition</name>
<value>${falcon_input_partition_filter_java}</value>
</property>
<property>
<name>edmhdpif.rowid.database</name>
<value>${falcon_rowid_database}</value>
</property>
<property>
<name>edmhdpif.rowid.table</name>
<value>${falcon_rowid_table}</value>
</property>
<property>
<name>edmhdpif.rowid.partition</name>
<value>${falcon_rowid_partitions_java}</value>
</property>
<property>
<name>edmhdpif.rowhistory.database</name>
<value>${falcon_rowhistory_database}</value>
</property>
<property>
<name>edmhdpif.rowhistory.table</name>
<value>${falcon_rowhistory_table}</value>
</property>
<property>
<name>edmhdpif.rowhistory.partition</name>
<value>${falcon_rowhistory_partitions_java}</value>
</property>
<property>
<name>edmhdpif.output.path</name>
<value>${wf:conf('hadoop.tmp.dir')}/${falcon_input_table}/${falcon_rowid_dated_partition_value_rds}
</value>
</property>
<property>
<name>edmhdpif.containervalidator.type</name>
<value>${containerValidatorType}</value>
</property>
</configuration>
<main-class>com.scb.edmhdpif.rowid.RowId</main-class>
</java>
<ok to="hive-add-partitions-after-rowid" />
<error to="fail" />
</action>
<action name="hive-add-partitions-after-rowid">
<hive xmlns="uri:oozie:hive-action:0.5">
<job-xml>${wf:appPath()}/conf/hive-site.xml</job-xml>
<job-xml>${wf:appPath()}/conf/tez-site.xml</job-xml>
<script>${wf:appPath()}/scisit_all_add_partitions_after_rowid.hql
</script>
<param>param_dated_partition_value=${falcon_rowid_dated_partition_value_rds}
</param>
</hive>
<ok to="decision-datatransform" />
<error to="fail" />
</action>
<decision name="decision-datatransform">
<switch>
<case to="datatransform">${dataTransform=="REQUIRED"}</case>
<default to="decision-typevalidator" />
</switch>
</decision>
<action name="datatransform">
<java>
<configuration>
<property>
<name>edmhdpif.input.database</name>
<value>${falcon_rowid_database}</value>
</property>
<property>
<name>edmhdpif.input.table</name>
<value>${falcon_rowid_table}</value>
</property>
<property>
<name>edmhdpif.input.partition</name>
<value>${falcon_rowid_partitions_java}
</value>
</property>
<property>
<name>edmhdpif.datatransform.valid.database</name>
<value>${falcon_datatransformvalid_database}</value>
</property>
<property>
<name>edmhdpif.datatransform.valid.table</name>
<value>${falcon_datatransformvalid_table}</value>
</property>
<property>
<name>edmhdpif.datatransform.valid.partition</name>
<value>${falcon_datatransformvalid_partitions_java}
</value>
</property>
<property>
<name>edmhdpif.datatransform.invalid.database</name>
<value>${falcon_datatransforminvalid_database}</value>
</property>
<property>
<name>edmhdpif.datatransform.invalid.table</name>
<value>${falcon_datatransforminvalid_table}</value>
</property>
<property>
<name>edmhdpif.datatransform.invalid.partition</name>
<value>${falcon_datatransforminvalid_partitions_java}
</value>
</property>
<property>
<name>edmhdpif.output.path</name>
<value>${wf:conf('hadoop.tmp.dir')}/${falcon_rowid_table}/${falcon_rowid_dated_partition_value_rds}
</value>
</property>
<property>
<name>oozie.action.sharelib.for.java</name>
<value>hive,libserver</value>
</property>
</configuration>
<main-class>com.scb.edmhdpif.datatransform.DataTransform</main-class>
</java>
<ok to="hive-add-partitions-after-datatransform" />
<error to="fail" />
</action>
<action name="hive-add-partitions-after-datatransform">
<hive xmlns="uri:oozie:hive-action:0.5">
<job-xml>${wf:appPath()}/conf/hive-site.xml</job-xml>
<job-xml>${wf:appPath()}/conf/tez-site.xml</job-xml>
<script>${wf:appPath()}/scisit_all_add_partitions_after_datatransform.hql
</script>
<param>param_dated_partition_value=${falcon_rowid_dated_partition_value_rds}
</param>
</hive>
<ok to="decision-typevalidator" />
<error to="fail" />
</action>
<decision name="decision-typevalidator">
<switch>
<case to="typevalidatorjava">${typeValidatorType=="java"}</case>
<case to="typevalidatorpig">${typeValidatorType=="pig"}</case>
<case to="typevalidatorhive">${typeValidatorType=="hive"}</case>
<default to="decision-sri" />
</switch>
</decision>
<action name="typevalidatorjava">
<java>
<configuration>
<property>
<name>edmhdpif.input.database</name>
<value>${falcon_datatransformvalid_database}</value>
</property>
<property>
<name>edmhdpif.input.table</name>
<value>${falcon_datatransformvalid_table}</value>
</property>
<property>
<name>edmhdpif.input.partition</name>
<value>${falcon_datatransformvalid_partitions_java}</value>
</property>
<property>
<name>edmhdpif.typevalidator.validtypes.database</name>
<value>${falcon_verify_database}</value>
</property>
<property>
<name>edmhdpif.typevalidator.validtypes.table</name>
<value>${falcon_verify_table}</value>
</property>
<property>
<name>edmhdpif.typevalidator.validtypes.partition</name>
<value>${falcon_verify_partitions_java}</value>
</property>
<property>
<name>edmhdpif.typevalidator.invalidtypes.database</name>
<value>${falcon_invalid_database}</value>
</property>
<property>
<name>edmhdpif.typevalidator.invalidtypes.table</name>
<value>${falcon_invalid_table}</value>
</property>
<property>
<name>edmhdpif.typevalidator.invalidtypes.partition</name>
<value>${falcon_invalid_partitions_java}</value>
</property>
<property>
<name>edmhdpif.typevalidator.warntypes.database</name>
<value>${falcon_warn_database}</value>
</property>
<property>
<name>edmhdpif.typevalidator.warntypes.table</name>
<value>${falcon_warn_table}</value>
</property>
<property>
<name>edmhdpif.typevalidator.warntypes.partition</name>
<value>${falcon_warn_partitions_java}</value>
</property>
<property>
<name>edmhdpif.output.path</name>
<value>${wf:conf('hadoop.tmp.dir')}/${falcon_rowid_table}/${falcon_rowid_dated_partition_value_rds}
</value>
</property>
<property>
<name>edmhdpif.typevalidator.onetable</name>
<value>${wf:conf('SRIStep')}</value>
</property>
<property>
<name>edmhdpif.typevalidator.args</name>
<value>${typeValidatorArgs}</value>
</property>
</configuration>
<main-class>${typeValidatorCodeFile}</main-class>
</java>
<ok to="hive-add-partitions-after-typevalidator" />
<error to="fail" />
</action>
<action name="typevalidatorhive">
<hive xmlns="uri:oozie:hive-action:0.5">
<job-xml>${wf:appPath()}/conf/hive-site.xml</job-xml>
<job-xml>${wf:appPath()}/conf/tez-site.xml</job-xml>
<configuration>
<property>
<name>edmhdpif.input.database</name>
<value>${falcon_datatransformvalid_database}</value>
</property>
<property>
<name>edmhdpif.input.table</name>
<value>${falcon_datatransformvalid_table}</value>
</property>
<property>
<name>edmhdpif.input.partition</name>
<value>${falcon_datatransformvalid_partitions_java}</value>
</property>
<property>
<name>edmhdpif.typevalidator.validtypes.database</name>
<value>${falcon_verify_database}</value>
</property>
<property>
<name>edmhdpif.typevalidator.validtypes.table</name>
<value>${falcon_verify_table}</value>
</property>
<property>
<name>edmhdpif.typevalidator.validtypes.partition</name>
<value>${falcon_verify_partitions_java}</value>
</property>
<property>
<name>edmhdpif.typevalidator.invalidtypes.database</name>
<value>${falcon_invalid_database}</value>
</property>
<property>
<name>edmhdpif.typevalidator.invalidtypes.table</name>
<value>${falcon_invalid_table}</value>
</property>
<property>
<name>edmhdpif.typevalidator.invalidtypes.partition</name>
<value>${falcon_invalid_partitions_java}</value>
</property>
<property>
<name>edmhdpif.typevalidator.warntypes.database</name>
<value>${falcon_warn_database}</value>
</property>
<property>
<name>edmhdpif.typevalidator.warntypes.table</name>
<value>${falcon_warn_table}</value>
</property>
<property>
<name>edmhdpif.typevalidator.warntypes.partition</name>
<value>${falcon_warn_partitions_java}</value>
</property>
<property>
<name>edmhdpif.output.path</name>
<value>${wf:conf('hadoop.tmp.dir')}/${falcon_rowid_table}/${falcon_rowid_dated_partition_value_rds}
</value>
</property>
<property>
<name>edmhdpif.typevalidator.onetable</name>
<value>${wf:conf('SRIStep')}</value>
</property>
<property>
<name>edmhdpif.typevalidator.args</name>
<value>${typeValidatorArgs}</value>
</property>
</configuration>
<script>${typeValidatorCodeFile}</script>
</hive>
<ok to="hive-add-partitions-after-typevalidator" />
<error to="fail" />
</action>
<action name="typevalidatorpig">
<pig>
<configuration>
<property>
<name>edmhdpif.input.database</name>
<value>${falcon_datatransformvalid_database}</value>
</property>
<property>
<name>edmhdpif.input.table</name>
<value>${falcon_datatransformvalid_table}</value>
</property>
<property>
<name>edmhdpif.input.partition</name>
<value>${falcon_datatransformvalid_partitions_java}</value>
</property>
<property>
<name>edmhdpif.typevalidator.validtypes.database</name>
<value>${falcon_verify_database}</value>
</property>
<property>
<name>edmhdpif.typevalidator.validtypes.table</name>
<value>${falcon_verify_table}</value>
</property>
<property>
<name>edmhdpif.typevalidator.validtypes.partition</name>
<value>${falcon_verify_partitions_java}</value>
</property>
<property>
<name>edmhdpif.typevalidator.invalidtypes.database</name>
<value>${falcon_invalid_database}</value>
</property>
<property>
<name>edmhdpif.typevalidator.invalidtypes.table</name>
<value>${falcon_invalid_table}</value>
</property>
<property>
<name>edmhdpif.typevalidator.invalidtypes.partition</name>
<value>${falcon_invalid_partitions_java}</value>
</property>
<property>
<name>edmhdpif.typevalidator.warntypes.database</name>
<value>${falcon_warn_database}</value>
</property>
<property>
<name>edmhdpif.typevalidator.warntypes.table</name>
<value>${falcon_warn_table}</value>
</property>
<property>
<name>edmhdpif.typevalidator.warntypes.partition</name>
<value>${falcon_warn_partitions_java}</value>
</property>
<property>
<name>edmhdpif.output.path</name>
<value>${wf:conf('hadoop.tmp.dir')}/${falcon_rowid_table}/${falcon_rowid_dated_partition_value_rds}
</value>
</property>
<property>
<name>edmhdpif.typevalidator.onetable</name>
<value>${wf:conf('SRIStep')}</value>
</property>
<property>
<name>edmhdpif.typevalidator.args</name>
<value>${typeValidatorArgs}</value>
</property>
</configuration>
<script>${typeValidatorCodeFile}</script>
</pig>
<ok to="hive-add-partitions-after-typevalidator" />
<error to="fail" />
</action>
<action name="hive-add-partitions-after-typevalidator">
<hive xmlns="uri:oozie:hive-action:0.5">
<job-xml>${wf:appPath()}/conf/hive-site.xml</job-xml>
<job-xml>${wf:appPath()}/conf/tez-site.xml</job-xml>
<script>${wf:appPath()}/scisit_all_add_partitions_after_typevalidation.hql
</script>
<param>param_dated_partition_value=${falcon_rowid_dated_partition_value_rds}
</param>
</hive>
<ok to="decision-sri" />
<error to="fail" />
</action>
<decision name="decision-sri">
<switch>
<case to="sri">${wf:conf('SRIStep')}</case>
<default to="end" />
</switch>
</decision>
<action name="sri">
<java>
<configuration>
<property>
<name>edmhdpif.input.database</name>
<value>${falcon_verify_database}</value>
</property>
<property>
<name>edmhdpif.input.table</name>
<value>${falcon_verify_table}</value>
</property>
<property>
<name>edmhdpif.input.partition</name>
<value>${falcon_verify_partitions_java}</value>
</property>
<property>
<name>edmhdpif.input.partition.previous</name>
<value>${falcon_verifyprevious_partitions_java}</value>
</property>
<property>
<name>edmhdpif.output.path</name>
<value>${wf:conf('hadoop.tmp.dir')}/${falcon_verify_table}/${falcon_rowid_dated_partition_value_rds}
</value>
</property>
<property>
<name>edmhdpif.open.database</name>
<value>sit_sri_open</value>
</property>
<property>
<name>edmhdpif.open.partition</name>
<value>'ods=${falcon_rowid_dated_partition_value_rds}'
</value>
</property>
<property>
<name>edmhdpif.open.partition.previous</name>
<value>'ods=${falcon_verifyprevious_dated_partition_value_vds}'
</value>
</property>
<property>
<name>edmhdpif.nonopen.database</name>
<value>sit_sri_nonopen</value>
</property>
<property>
<name>edmhdpif.nonopen.partition</name>
<value>'nds=${falcon_rowid_dated_partition_value_rds}'
</value>
</property>
<property>
<name>edmhdpif.duplicatedrows.database</name>
<value>${falcon_duplicates_database}</value>
</property>
<property>
<name>edmhdpif.duplicatedrows.table</name>
<value>${falcon_duplicates_table}</value>
</property>
<property>
<name>edmhdpif.duplicatedrows.partition</name>
<value>${falcon_duplicates_partitions_java}</value>
</property>
</configuration>
<main-class>com.scb.edmhdpif.sri.SRI</main-class>
</java>
<ok to="hive-add-partitions-after-sri" />
<error to="fail" />
</action>
<action name="hive-add-partitions-after-sri">
<hive xmlns="uri:oozie:hive-action:0.5">
<job-xml>${wf:appPath()}/conf/hive-site.xml</job-xml>
<job-xml>${wf:appPath()}/conf/tez-site.xml</job-xml>
<script>${wf:appPath()}/scisit_all_add_partitions_after_sri.hql
</script>
<param>param_dated_partition_value=${falcon_rowid_dated_partition_value_rds}
</param>
</hive>
<ok to="decision-postprocessing" />
<error to="fail" />
</action>
<decision name="decision-postprocessing">
<switch>
<case to="postprocessing">${wf:conf('postProcessingType')=="ebbs"
}
</case>
<default to="end" />
</switch>
</decision>
<action name="postprocessing">
<java>
<main-class>${postProcessingCodeFile}</main-class>
</java>
<ok to="hive-add-partitions-after-postprocessing" />
<error to="fail" />
</action>
<action name="hive-add-partitions-after-postprocessing">
<hive xmlns="uri:oozie:hive-action:0.5">
<script>${wf:appPath()}/scisit_all_add_partitions_after_postprocessing.hql
</script>
<param>param_dated_partition_value=${wf:conf('edmhdpif.sri.nextworkingdate')}
</param>
</hive>
<ok to="end" />
<error to="fail" />
</action>
<kill name="fail">
<message>Java failed, error
message[${wf:errorMessage(wf:lastErrorNode())}]
</message>
</kill>
<end name="end" />
</workflow-app>
Diagnostics:
Job setup failed : org.apache.hadoop.security.AccessControlException: Permission denied: user=sitsciapp, access=WRITE, inode="/scisit_all_verifytypes/2016_03_07/_temporary/1":hdfs:hdfs:drwxr-xr-x
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:292)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:213)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1771)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1755)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:1738)
at org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.mkdirs(FSDirMkdirOp.java:71)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3896)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:984)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:622)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2137)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2133)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2131)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:3010)
at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:2978)
at org.apache.hadoop.hdfs.DistributedFileSystem$21.doCall(DistributedFileSystem.java:1047)
at org.apache.hadoop.hdfs.DistributedFileSystem$21.doCall(DistributedFileSystem.java:1043)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(DistributedFileSystem.java:1043)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:1036)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1877)
at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.setupJob(FileOutputCommitter.java:305)
at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobSetup(CommitterEventHandler.java:254)
at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.run(CommitterEventHandler.java:234)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
ROOT CAUSE: Found that /etc/oozie/conf/action-conf/hive.xml was empty (zero size), which prevented Oozie from picking up the "hadoop.tmp.dir" variable defined in the Oozie workflow. RESOLUTION: Copying hive.xml from a backup copy to "/etc/oozie/conf/action-conf/" resolved the issue.
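The check-and-restore step above can be sketched as a small shell snippet. The config path is the standard HDP location; the backup path is a placeholder for wherever your backup of the Oozie configuration lives.

```shell
# Restore the Oozie action-conf hive.xml if it is empty or missing.
# BACKUP below is a hypothetical backup location - adjust for your environment.
CONF=/etc/oozie/conf/action-conf/hive.xml
BACKUP=/etc/oozie/conf.backup/action-conf/hive.xml

# "-s" is true only when the file exists AND has a size greater than zero,
# so this catches both a missing file and the zero-byte case seen here.
if [ ! -s "$CONF" ]; then
  echo "hive.xml is empty or missing - restoring from backup"
  cp "$BACKUP" "$CONF"
fi
```

Restart Oozie after restoring the file so the action configuration is re-read.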
12-25-2016
11:02 AM
3 Kudos
SYMPTOM: - Enabled the Ranger Kafka plugin via Ambari and restarted the Kafka service. - Kafka logs are still populated with the "Ranger Plugin returned null" error.
- Checked the Ranger logs and could not see any info about a Kafka policy download.
- Checked /etc/ranger/test_kafka/policycache/ and the json files there are empty (zero size):
bash-4.1# cd /etc/ranger/test_kafka/
bash-4.1# cd policycache/
bash-4.1# ls -ltr
total 0
-rw-r--r-- 1 kafka hadoop 0 Mar 2 16:00 kafka_test_kafka.json_old
-rw-r--r-- 1 kafka hadoop 0 Mar 16 11:30 kafka_test_kafka.json_old1
-rw-r--r-- 1 kafka hadoop 0 Mar 16 12:27 kafka_test_kafka.json
- Checked Test Connection for the Kafka repo in Ranger; it was successful. - The Ranger plugin audits did not have any info on the Kafka plugin sync.
- Thus the Kafka plugin is not being synced in this case; policy refresh is not working.
- Tried deleting the default Kafka policy and creating a new one, but the issue still exists.
- Tried to use the REST API to get the policy details, but got no output.
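The REST check above can be done with a direct call to the Ranger Admin policy download endpoint, to confirm whether the server side can serve the policies at all. The host, port, credentials, and the repo name "test_kafka" below are placeholders for your environment.

```shell
# Pull the Kafka repo's policies straight from Ranger Admin.
# An empty or error response here points at the server/repo side;
# a valid JSON body points the investigation back at the broker's classpath.
RANGER_URL="http://ranger-admin.example.com:6080"   # placeholder host/port
REPO="test_kafka"                                   # repo name from Ranger
curl -s -u admin:admin "${RANGER_URL}/service/plugins/policies/download/${REPO}"
```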
ERROR: 2016-03-02 16:47:34,607 ERROR [kafka-request-handler-6] apache.ranger.authorization.kafka.authorizer.RangerKafkaAuthorizer (RangerKafkaAuthorizer.java:202) - Ranger Plugin returned null. Returning false ROOT CAUSE: The classpath of the kafka-broker process was missing /etc/kafka/conf
RESOLUTION: Adding the lines below to the Kafka > Advanced kafka-env > kafka-env template config resolved the plugin issue:
if [ -f /etc/kafka/conf/kafka-ranger-env.sh ]; then
. /etc/kafka/conf/kafka-ranger-env.sh
fi
Then restart Kafka.
12-25-2016
10:46 AM
4 Kudos
SYMPTOM: Recently, we upgraded from Ambari 2.1.2.1 to Ambari 2.2.2.0. Now we would also like to use "SmartSense".
Per Ambari status it is running, but when opening the "SmartSense" view we receive the following error: ERROR: SmartSense Service Unavailable The SmartSense service is currently unavailable. Please make sure the SmartSense Server is up and running.
ROOT CAUSE: The customer had a DNS resolution problem for the SmartSense Server host. RESOLUTION: Using the IP address instead of the hostname resolved the issue.
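A quick DNS sanity check like the one below can confirm whether the hostname resolves before falling back to the IP address. The hostname is a placeholder for your SmartSense Server host.

```shell
# Check whether the SmartSense Server hostname resolves via the system's
# name-service configuration (DNS, /etc/hosts, etc.).
HOST=smartsense-server.example.com   # placeholder hostname
if getent hosts "$HOST" >/dev/null; then
  echo "DNS lookup for $HOST succeeded"
else
  echo "DNS lookup for $HOST failed - use the server's IP address instead"
fi
```

`getent hosts` goes through the same resolver path as most services, so it is a closer test than `ping` or `nslookup`, which can bypass `/etc/hosts` ordering.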
12-25-2016
10:30 AM
4 Kudos
SYMPTOM: Sometimes while performing an Ambari operation, for example adding the Ranger KMS service, the Ambari UI (or even the curl response) might show this error: ERROR: Error 500 status code received on GET method for API: /api/v1/stacks/HDP/versions/2.3/recommendations
Error message: Error occurred during stack advisor command invocation: Cannot create /var/run/ambari-server/stack-recommendations ROOT CAUSE
This is most likely a permission issue with the /var/run/ambari-server/ location for the user who is running the 'ambari-server' process. RESOLUTION
To resolve this, the permissions for /var/run/ambari-server/ should be set up correctly. On a good cluster, where ambari-server runs as 'root', the permissions look like: [root@test ~]# ll -d /var/run/ambari-server/
drwxr-xr-x 4 root root 4096 Dec 4 05:10 /var/run/ambari-server/
[root@test ~]# ll /var/run/ambari-server/
total 12
-rw-r--r-- 1 root root 6 Dec 4 05:10 ambari-server.pid
drwxr-xr-x 4 root root 4096 Dec 4 05:26 bootstrap
drwxr-xr-x 39 root root 4096 Feb 17 18:14 stack-recommendations
Fix the permissions and retry the operation from the Ambari dashboard.
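The fix can be sketched as below, assuming ambari-server runs as root as in the listing above; swap in the actual service account if ambari-server runs as a non-root user.

```shell
# Restore ownership and permissions on the stack-advisor work area so the
# ambari-server process (root here) can create stack-recommendations dirs.
chown -R root:root /var/run/ambari-server
chmod 755 /var/run/ambari-server
mkdir -p /var/run/ambari-server/stack-recommendations

# Verify the result matches the known-good cluster's layout.
ls -ld /var/run/ambari-server/stack-recommendations
```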
12-25-2016
10:13 AM
3 Kudos
PROBLEM:
After upgrading Ambari from v2.1.2.1 to v2.2.2.0, attempting to "Re-install" Grafana did nothing: no task was started and the Ambari UI seemed to hang. ERROR:
The following error messages were found in the Ambari server log file 08 Jul 2016 10:28:38,174 ERROR [qtp-ambari-client-2410] ClusterImpl:2347 - Config inconsistency exists: unknown configType=kerberos-env
08 Jul 2016 10:28:38,174 ERROR [qtp-ambari-client-2410] ClusterImpl:2347 - Config inconsistency exists: unknown configType=krb5-conf
08 Jul 2016 10:28:38,174 ERROR [qtp-ambari-client-2410] ClusterImpl:2347 - Config inconsistency exists: unknown configType=ranger-ugsync-site
08 Jul 2016 10:28:38,174 ERROR [qtp-ambari-client-2410] ClusterImpl:2347 - Config inconsistency exists: unknown configType=admin-properties
08 Jul 2016 10:28:38,174 ERROR [qtp-ambari-client-2410] ClusterImpl:2347 - Config inconsistency exists: unknown configType=usersync-properties
08 Jul 2016 10:28:38,175 ERROR [qtp-ambari-client-2410] ClusterImpl:2347 - Config inconsistency exists: unknown configType=ranger-admin-site
08 Jul 2016 10:28:38,175 ERROR [qtp-ambari-client-2410] ClusterImpl:2347 - Config inconsistency exists: unknown configType=ranger-site SOLUTION:
When a service is deleted, its ServiceConfig entities are deleted. ServiceConfig entities have a CASCADE relationship with ClusterConfig, so those go away as well. This leaves orphaned entries in ClusterConfigMapping. The following two queries clean up the database (the first lists the orphaned entries, the second deletes them); afterwards the errors subsided. >select ccm.type_name from clusterconfigmapping ccm left join clusterconfig cc on ccm.type_name = cc.type_name where ccm.selected = 1 and cc.type_name is NULL;
>delete from clusterconfigmapping where type_name in (select ccm.type_name from clusterconfigmapping ccm left join clusterconfig cc on ccm.type_name = cc.type_name where ccm.selected = 1 and cc.type_name is NULL);
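For a Postgres-backed Ambari database, the cleanup can be driven from the shell as sketched below. The database name and user `ambari` are the Ambari Postgres defaults and may differ in your installation; taking a dump first is strongly advised since this deletes rows directly.

```shell
# Back up the Ambari database before deleting the orphaned mapping rows.
# DB name and user are Ambari's Postgres defaults - adjust for your setup.
DB=ambari
DBUSER=ambari
pg_dump -U "$DBUSER" "$DB" > /tmp/ambari-db-before-cleanup.sql

# Run the delete from the article against the backed-up database.
psql -U "$DBUSER" -d "$DB" -c "delete from clusterconfigmapping where type_name in (select ccm.type_name from clusterconfigmapping ccm left join clusterconfig cc on ccm.type_name = cc.type_name where ccm.selected = 1 and cc.type_name is NULL);"
```

Stop ambari-server before running the delete and start it again afterwards, so no in-flight requests touch the mapping table mid-cleanup.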