Member since: 04-03-2019
Posts: 962
Kudos Received: 1743
Solutions: 146
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 17724 | 03-08-2019 06:33 PM |
| | 7164 | 02-15-2019 08:47 PM |
06-24-2017
12:29 AM
@Matt Clarke - Very nicely explained! 🙂
06-19-2017
07:37 PM
1 Kudo
These steps have been successfully tried and tested on Ambari 2.4.2.0 and HDP 2.5.3.

When you install and configure Solr Cloud via Ambari, or the embedded Solr via Ambari Infra, on a Kerberized cluster, SPNEGO authentication is enabled by default. There is no direct switch to disable only SPNEGO authentication. Please follow the method below to disable it.

Step 1 - Log in to the Ambari server.

Step 2 - Take a backup of the following script:

/var/lib/ambari-server/resources/common-services/SOLR/<version>/package/scripts/setup_solr_kerberos_auth.py

Step 3 - Edit the above script and make the following modification.

Original value:
command += '\'{"authentication":{"class": "org.apache.solr.security.KerberosPlugin"}}\''
Recommended value:
command += '\'{ }\''

Step 4 - Replace the cached script using the command below, then restart ambari-agent followed by the Solr service (via Ambari):

cp /var/lib/ambari-server/resources/mpacks/solr-ambari-mpack-5.5.2.2.5/common-services/SOLR/5.5.2.2.5/package/scripts/setup_solr_kerberos_auth.py /var/lib/ambari-agent/cache/common-services/SOLR/5.5.2.2.5/package/scripts/setup_solr_kerberos_auth.py

You should now be able to access the Solr Web UI without a Kerberos ticket (a quick check is sketched below).

Please comment if you have any feedback/questions/suggestions. Happy Hadooping!!
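As a quick way to confirm SPNEGO is really off (a sketch; the port is an assumption - 8886 is the usual Ambari Infra Solr port, 8983 the standalone Solr Cloud default - adjust to your deployment):

kdestroy
curl -i http://<solr-host>:8886/solr/
# HTTP/1.1 200 (rather than 401 with a WWW-Authenticate: Negotiate header) indicates SPNEGO is disabled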
06-02-2017
12:01 AM
3 Kudos
According to the default Oozie log4j configuration in Ambari, the log is rotated every hour and retention is set to 30 days:

log4j.appender.oozie=org.apache.log4j.rolling.RollingFileAppender
log4j.appender.oozie.RollingPolicy=org.apache.oozie.util.OozieRollingPolicy
log4j.appender.oozie.File=${oozie.log.dir}/oozie.log
log4j.appender.oozie.Append=true
log4j.appender.oozie.layout=org.apache.log4j.PatternLayout
log4j.appender.oozie.layout.ConversionPattern=%d{ISO8601} %5p %c{1}:%L - SERVER[${oozie.instance.id}] %m%n
# The FileNamePattern must end with "-%d{yyyy-MM-dd-HH}.gz" or "-%d{yyyy-MM-dd-HH}" and also start with the
# value of log4j.appender.oozie.File
log4j.appender.oozie.RollingPolicy.FileNamePattern=${log4j.appender.oozie.File}-%d{yyyy-MM-dd-HH}
# The MaxHistory controls how many log files will be retained (720 hours / 24 hours per day = 30 days); -1 to disable
log4j.appender.oozie.RollingPolicy.MaxHistory={{oozie_log_maxhistory}}

If you want to configure DRFA (DailyRollingFileAppender) to roll the log file daily, set the parameters below in the log4j section of the Oozie configuration via Ambari and restart the required services:

log4j.appender.oozie=org.apache.log4j.DailyRollingFileAppender
log4j.appender.oozie.File=${oozie.log.dir}/oozie.log
log4j.appender.oozie.Append=true
log4j.appender.oozie.layout=org.apache.log4j.PatternLayout
log4j.appender.oozie.layout.ConversionPattern=%d{ISO8601} %5p %c{1}:%L - SERVER[${oozie.instance.id}] %m%n
log4j.appender.oozie.DatePattern='.'yyyy-MM-dd

Please note that DRFA does not support MaxBackupIndex; if you need a retention limit, use RFA with size-based rolling and MaxBackupIndex instead (a minimal sketch follows at the end of this article).

Please comment if you have any feedback/questions/suggestions. Happy Hadooping!!
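For reference, a minimal size-based RFA configuration (a sketch only; the 256MB file size and backup count of 30 are illustrative values, not taken from this article):

log4j.appender.oozie=org.apache.log4j.RollingFileAppender
log4j.appender.oozie.File=${oozie.log.dir}/oozie.log
log4j.appender.oozie.Append=true
log4j.appender.oozie.layout=org.apache.log4j.PatternLayout
log4j.appender.oozie.layout.ConversionPattern=%d{ISO8601} %5p %c{1}:%L - SERVER[${oozie.instance.id}] %m%n
# Roll when the file reaches 256MB and keep at most 30 rolled files
log4j.appender.oozie.MaxFileSize=256MB
log4j.appender.oozie.MaxBackupIndex=30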
05-19-2017
01:15 AM
1 Kudo
Short Description: How to run a sample Oozie sqoop action to import data from a MySQL table into HDFS.

Article

Below are the steps to run a sample sqoop action that imports data from a MySQL table into HDFS. Note - Please refer to this article to create a sample MySQL table with dummy data.

1. Configure job.properties. Example:

nameNode=hdfs://<namenode-host>:8020
jobTracker=<rm-host>:8050
queueName=default
examplesRoot=examples
oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/user/${user.name}
oozie.libpath=/user/root

2. Configure workflow.xml. Example:

<?xml version="1.0" encoding="UTF-8"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<workflow-app xmlns="uri:oozie:workflow:0.2" name="sqoop-wf">
<start to="sqoop-node"/>
<action name="sqoop-node">
<sqoop xmlns="uri:oozie:sqoop-action:0.2">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<configuration>
<property>
<name>mapred.job.queue.name</name>
<value>${queueName}</value>
</property>
</configuration>
<command>import --connect jdbc:mysql://<mysql-server-hostname>:3306/<database-name> --username <mysql-database-username> --table <table-name> --driver com.mysql.jdbc.Driver --m 1</command>
</sqoop>
<ok to="end"/>
<error to="fail"/>
</action>
<kill name="fail">
<message>Sqoop failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name="end"/>
</workflow-app>

3. Upload workflow.xml to the "oozie.wf.application.path" defined in job.properties.

4. Run the Oozie workflow with the command below (a status-check sketch follows at the end of this article):

oozie job -oozie http://<oozie-server-hostname>:11000/oozie -config /$PATH/job.properties -run

Please comment if you have any question! Happy Hadooping!! 🙂
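The submit command prints a workflow job ID. To follow the job through to completion (the HDFS path is an assumption: with --table and no --target-dir, sqoop imports into the submitting user's home directory):

# Check workflow and action status (RUNNING / SUCCEEDED / KILLED)
oozie job -oozie http://<oozie-server-hostname>:11000/oozie -info <job-id>
# After success, the imported data should be visible in HDFS
hdfs dfs -ls /user/<username>/<table-name>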
05-15-2017
04:27 AM
1 Kudo
Below are the steps for an Oozie database migration from Derby to PostgreSQL.

Step 1 - Have the PostgreSQL server installed and ready to be configured.

Step 2 - Stop the Oozie service from the Ambari UI.

Step 3 - Install the PostgreSQL JDBC connector:

yum install postgresql-jdbc

Step 4 - On the Ambari server, run the command below:

ambari-server setup --jdbc-db=postgres --jdbc-driver=/usr/share/java/postgresql-jdbc.jar

Note - Please pass the appropriate driver path if /usr/share/java/postgresql-jdbc.jar does not exist.

Step 5 - Log in to the PostgreSQL DB as the postgres user, create a blank 'oozie' database, and grant the required permissions to the 'oozie' user:

[root@ambaview ~]# su - postgres
-bash-4.1$ psql
psql (8.4.20)
Type "help" for help.
postgres=# CREATE DATABASE oozie;
CREATE DATABASE
postgres=# CREATE USER oozie WITH PASSWORD 'oozie';
CREATE ROLE
postgres=# GRANT ALL PRIVILEGES ON DATABASE oozie TO oozie;
GRANT
postgres=#

Step 6 - Add the Oozie server IP address and 'oozie' user information to the pg_hba configuration file and restart the postgresql service. Add the following line to /var/lib/pgsql/data/pg_hba.conf:

host oozie oozie 17X.2X.X9.2X0/0 md5
[root@ambaview ~]# service postgresql restart
Stopping postgresql service: [ OK ]
Starting postgresql service: [ OK ]

Step 7 - Add the Postgres database server details in the Oozie configuration via the Ambari UI.

Step 8 - Copy postgresql-jdbc.jar to Oozie's libext directory:

cp /usr/share/java/postgresql-jdbc.jar /usr/hdp/<hdp-version>/oozie/libext/

Step 9 - Prepare the Oozie war file:

/usr/hdp/<version>/oozie/bin/oozie-setup.sh prepare-war
Note - Run the above command on the Oozie server as the oozie user.
Step 10 - Prepare the Oozie schema using the command below (run it on the Oozie host as the oozie user):

/usr/hdp/<version>/oozie/bin/oozie-setup.sh db create -run

Step 11 - Start the Oozie server via Ambari. A quick post-migration sanity check is sketched at the end of this article.

Happy Hadooping!! Please comment your feedback or questions in the comment section.
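As the promised sanity check (a sketch; exact table names can vary by Oozie version), confirm that the Oozie tables now live in Postgres and that Oozie reports a healthy status:

# List the tables created in the new 'oozie' database
su - postgres -c "psql -d oozie -c '\dt'"
# Oozie should report: System mode: NORMAL
oozie admin -oozie http://<oozie-server-hostname>:11000/oozie -status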
03-06-2017
08:38 PM
@Georg Heiler - Yes. Please use the curl command below for the same:

curl -H "X-Requested-By: ambari" -X GET -u <admin-user>:<admin-password> http://<ambari-server>:8080/api/v1/clusters/<cluster-name>?format=blueprint
03-06-2017
08:31 PM
2 Kudos
In a previous post, we saw how to automate HDP installation with Kerberos authentication on a multi-node cluster using Ambari Blueprints. In this post, we will see how to deploy a multi-node HDP cluster with Resource Manager HA via an Ambari blueprint.

Below are simple steps to install an HDP multi-node cluster with Resource Manager HA using an internal repository via Ambari Blueprints.

Note - From Ambari 2.6.X onwards, we have to register a VDF to register the internal repository, or else Ambari will pick up the latest version of HDP and use the public repos. Please see the document below for more information. For Ambari versions below 2.6.X, this guide works without any modifications.
Document - https://docs.hortonworks.com/HDPDocuments/Ambari-2.6.0.0/bk_ambari-release-notes/content/ambari_relnotes-2.6.0.0-behavioral-changes.html

Step 1: Install the Ambari server using the steps mentioned under the link below:
http://docs.hortonworks.com/HDPDocuments/Ambari-2.4.2.0/bk_ambari-installation/content/ch_Installing_Ambari.html

Step 2: Register ambari-agent manually. Install the ambari-agent package on all the nodes in the cluster and set the hostname to the Ambari server host (FQDN) in /etc/ambari-agent/conf/ambari-agent.ini.

Step 3: Configure blueprints. Please follow the steps below to create the blueprints.

3.1 Create the hostmap.json (cluster creation template) file as shown below.
Note - This file holds information about all the hosts that are part of your HDP cluster. It is also called the cluster creation template in the Apache Ambari documentation.

{
"blueprint" : "hdptest",
"default_password" : "hadoop",
"host_groups" :[
{
"name" : "blueprint1",
"hosts" : [
{
"fqdn" : "blueprint1.crazyadmins.com"
}
]
},
{
"name" : "blueprint2",
"hosts" : [
{
"fqdn" : "blueprint2.crazyadmins.com"
}
]
},
{
"name" : "blueprint3",
"hosts" : [
{
"fqdn" : "blueprint3.crazyadmins.com"
}
]
}
]
}
3.2 Create the cluster_config.json (blueprint) file; it contains the mapping of hosts to HDP components.

{
"configurations" : [
{
"core-site": {
"properties" : {
"fs.defaultFS" : "hdfs://%HOSTGROUP::blueprint1%:8020"
}}
},{
"yarn-site" : {
"properties" : {
"hadoop.registry.rm.enabled" : "false",
"hadoop.registry.zk.quorum" : "%HOSTGROUP::blueprint3%:2181,%HOSTGROUP::blueprint2%:2181,%HOSTGROUP::blueprint1%:2181",
"yarn.log.server.url" : "http://%HOSTGROUP::blueprint3%:19888/jobhistory/logs",
"yarn.resourcemanager.address" : "%HOSTGROUP::blueprint2%:8050",
"yarn.resourcemanager.admin.address" : "%HOSTGROUP::blueprint2%:8141",
"yarn.resourcemanager.cluster-id" : "yarn-cluster",
"yarn.resourcemanager.ha.automatic-failover.zk-base-path" : "/yarn-leader-election",
"yarn.resourcemanager.ha.enabled" : "true",
"yarn.resourcemanager.ha.rm-ids" : "rm1,rm2",
"yarn.resourcemanager.hostname" : "%HOSTGROUP::blueprint2%",
"yarn.resourcemanager.hostname.rm1" : "%HOSTGROUP::blueprint2%",
"yarn.resourcemanager.hostname.rm2" : "%HOSTGROUP::blueprint3%",
"yarn.resourcemanager.webapp.address.rm1" : "%HOSTGROUP::blueprint2%:8088",
"yarn.resourcemanager.webapp.address.rm2" : "%HOSTGROUP::blueprint3%:8088",
"yarn.resourcemanager.recovery.enabled" : "true",
"yarn.resourcemanager.resource-tracker.address" : "%HOSTGROUP::blueprint2%:8025",
"yarn.resourcemanager.scheduler.address" : "%HOSTGROUP::blueprint2%:8030",
"yarn.resourcemanager.store.class" : "org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore",
"yarn.resourcemanager.webapp.address" : "%HOSTGROUP::blueprint2%:8088",
"yarn.resourcemanager.webapp.https.address" : "%HOSTGROUP::blueprint2%:8090",
"yarn.timeline-service.address" : "%HOSTGROUP::blueprint3%:10200",
"yarn.timeline-service.webapp.address" : "%HOSTGROUP::blueprint3%:8188",
"yarn.timeline-service.webapp.https.address" : "%HOSTGROUP::blueprint3%:8190"
}
}
}
],
"host_groups" : [
{
"name" : "blueprint1",
"components" : [
{
"name" : "NAMENODE"
},
{
"name" : "NODEMANAGER"
},
{
"name" : "DATANODE"
},
{
"name" : "ZOOKEEPER_CLIENT"
},
{
"name" : "HDFS_CLIENT"
},
{
"name" : "YARN_CLIENT"
},
{
"name" : "MAPREDUCE2_CLIENT"
},
{
"name" : "ZOOKEEPER_SERVER"
}
],
"cardinality" : 1
},
{
"name" : "blueprint2",
"components" : [
{
"name" : "SECONDARY_NAMENODE"
},
{
"name" : "RESOURCEMANAGER"
},
{
"name" : "NODEMANAGER"
},
{
"name" : "DATANODE"
},
{
"name" : "ZOOKEEPER_CLIENT"
},
{
"name" : "ZOOKEEPER_SERVER"
},
{
"name" : "HDFS_CLIENT"
},
{
"name" : "YARN_CLIENT"
},
{
"name" : "MAPREDUCE2_CLIENT"
}
],
"cardinality" : 1
},
{
"name" : "blueprint3",
"components" : [
{
"name" : "RESOURCEMANAGER"
},
{
"name" : "APP_TIMELINE_SERVER"
},
{
"name" : "HISTORYSERVER"
},
{
"name" : "NODEMANAGER"
},
{
"name" : "DATANODE"
},
{
"name" : "ZOOKEEPER_CLIENT"
},
{
"name" : "ZOOKEEPER_SERVER"
},
{
"name" : "HDFS_CLIENT"
},
{
"name" : "YARN_CLIENT"
},
{
"name" : "MAPREDUCE2_CLIENT"
}
],
"cardinality" : 1
}
],
"Blueprints" : {
"blueprint_name" : "hdptest",
"stack_name" : "HDP",
"stack_version" : "2.5"
}
}
Note - I have kept the Resource Managers on blueprint2 and blueprint3 (per the yarn-site and host_groups above); you can change this according to your requirements.

Step 4: Create an internal repository map.

4.1: HDP repository - copy the contents below, modify base_url to the hostname/IP address of your internal repository server, and save it in a repo.json file.

{
"Repositories":{
"base_url":"http://<ip-address-of-repo-server>/hdp/centos6/HDP-2.5.3.0",
"verify_base_url":true
}
}

4.2: HDP-UTILS repository - copy the contents below, modify base_url to the hostname/IP address of your internal repository server, and save it in a hdputils-repo.json file.

{
"Repositories":{
"base_url":"http://<ip-address-of-repo-server>/hdp/centos6/HDP-UTILS-1.1.0.21",
"verify_base_url":true
}
}

Step 5: Register the blueprint with the Ambari server by executing the command below. Note that the blueprint name in the URL (hdptest) must match the "blueprint" value referenced in hostmap.json.

curl -H "X-Requested-By: ambari" -X POST -u admin:admin http://<ambari-server-hostname>:8080/api/v1/blueprints/hdptest -d @cluster_config.json

Step 6: Set up the internal repos via the REST API by executing the curl calls below (the stack version and repo IDs here match the HDP 2.5 repos from Step 4):

curl -H "X-Requested-By: ambari" -X PUT -u admin:admin http://<ambari-server-hostname>:8080/api/v1/stacks/HDP/versions/2.5/operating_systems/redhat6/repositories/HDP-2.5 -d @repo.json
curl -H "X-Requested-By: ambari" -X PUT -u admin:admin http://<ambari-server-hostname>:8080/api/v1/stacks/HDP/versions/2.5/operating_systems/redhat6/repositories/HDP-UTILS-1.1.0.21 -d @hdputils-repo.json

Step 7: Pull the trigger! The command below will start the cluster installation (a sketch for monitoring its progress follows at the end of this article).

curl -H "X-Requested-By: ambari" -X POST -u admin:admin http://<ambari-server-hostname>:8080/api/v1/clusters/multinode-hdp -d @hostmap.json

Please feel free to comment if you need any further help on this. Happy Hadooping!!
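The cluster-creation POST returns an href to a request resource that tracks provisioning. A sketch for monitoring it (use the request id returned by Step 7 in place of <request-id>):

# progress_percent reaches 100 and request_status becomes COMPLETED on success
curl -H "X-Requested-By: ambari" -X GET -u admin:admin http://<ambari-server-hostname>:8080/api/v1/clusters/multinode-hdp/requests/<request-id>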
02-21-2017
05:26 PM
SYMPTOM: The Oozie sqoop action fails with the error below while inserting data into Hive.

20217 [Thread-30] INFO org.apache.sqoop.hive.HiveImport - Sorry ! hive-shell is disabled use 'Beeline' or 'Hive View' instead. Please contact cluster administrators for further information
20218 [main] ERROR org.apache.sqoop.tool.ImportTool - Encountered IOException running import job: java.io.IOException: Hive exited with status 1
at org.apache.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:389)
at org.apache.sqoop.hive.HiveImport.executeScript(HiveImport.java:342)
at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:246)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:524)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:615)
at org.apache.sqoop.tool.JobTool.execJob(JobTool.java:243)
at org.apache.sqoop.tool.JobTool.run(JobTool.java:298)
at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:225)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
at org.apache.sqoop.Sqoop.main(Sqoop.java:243)
at org.apache.oozie.action.hadoop.SqoopMain.runSqoopJob(SqoopMain.java:202)
at org.apache.oozie.action.hadoop.SqoopMain.run(SqoopMain.java:182)
at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:51)
at org.apache.oozie.action.hadoop.SqoopMain.main(SqoopMain.java:48)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:242)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)

ROOT CAUSE: Sqoop's Hive import uses Hive's CliDriver class rather than the hive shell script; because the Oozie launcher could not find that class on its classpath, it fell back to invoking the hive CLI, which is disabled on this cluster.

WORKAROUND: N/A

RESOLUTION: Add the property below to the job.properties file and re-run the failed Oozie workflow. This puts both the sqoop and hive sharelibs on the action's classpath, so CliDriver becomes available.

oozie.action.sharelib.for.sqoop=sqoop,hive
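Before re-running, you can confirm that both sharelibs are actually present on the Oozie server (the shareliblist admin command ships with the Oozie CLI):

# Both "sqoop" and "hive" should appear in the output
oozie admin -oozie http://<oozie-server-hostname>:11000/oozie -shareliblist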
02-07-2017
07:08 PM
PROBLEM: The Ambari server fails to start because of DB inconsistencies. Sample error:

2017-02-06 05:08:43,975 ERROR - You have non selected configs: zeppelin-ambari-config for service ZEPPELIN from cluster XXXX!
2017-02-06 05:08:43,976 INFO - ******************************* Check database completed *******************************
2017-02-06 05:10:12,834 INFO - Checking DB store version
2017-02-06 05:10:14,094 INFO - DB store version is compatible
2017-02-07 13:50:31,769 INFO - ******************************* Check database started *******************************
2017-02-07 13:50:41,247 INFO - Checking for configs not mapped to any cluster
2017-02-07 13:50:41,322 INFO - Checking for configs selected more than once
2017-02-07 13:50:41,326 INFO - Checking for hosts without state
2017-02-07 13:50:41,330 INFO - Checking host component states count equals host component desired states count
2017-02-07 13:50:41,334 INFO - Checking services and their configs
2017-02-07 13:50:45,793 INFO - Processing HDP-2.5 / SQOOP
2017-02-07 13:50:45,793 INFO - Processing HDP-2.5 / HDFS
2017-02-07 13:50:45,793 INFO - Processing HDP-2.5 / MAPREDUCE2
2017-02-07 13:50:45,793 INFO - Processing HDP-2.5 / TEZ
2017-02-07 13:50:45,793 INFO - Processing HDP-2.5 / SPARK
2017-02-07 13:50:45,793 INFO - Processing HDP-2.5 / HBASE
2017-02-07 13:50:45,793 INFO - Processing HDP-2.5 / ZOOKEEPER
2017-02-07 13:50:45,793 INFO - Processing HDP-2.5 / YARN
2017-02-07 13:50:45,793 INFO - Processing HDP-2.5 / KNOX
2017-02-07 13:50:45,794 INFO - Processing HDP-2.5 / PIG
2017-02-07 13:50:45,794 INFO - Processing HDP-2.5 / RANGER
2017-02-07 13:50:45,794 INFO - Processing HDP-2.5 / HIVE
2017-02-07 13:50:45,794 INFO - Processing HDP-2.5 / SLIDER
2017-02-07 13:50:45,794 INFO - Processing HDP-2.5 / AMBARI_INFRA
2017-02-07 13:50:45,794 INFO - Processing HDP-2.5 / KAFKA
2017-02-07 13:50:45,794 INFO - Processing HDP-2.5 / SMARTSENSE
2017-02-07 13:50:45,809 ERROR - You have non selected configs: zeppelin-ambari-config for service ZEPPELIN from cluster XXXXX!
2017-02-07 13:50:45,810 INFO - ******************************* Check database completed *******************************

BUSINESS IMPACT: It is not recommended to make any changes to service configurations while the backend database is inconsistent.

WORKAROUND:

ambari-server start --skip-database-check

Note - This is not recommended for production clusters; if you do this, please do not make any modifications to service configurations until you resolve the conflicts.

RESOLUTION:

1. Stop the Ambari server:

ambari-server stop

2. Take a backup of the Ambari database. For Postgres, use the pg_dump command; for MySQL, use mysqldump (a minimal sketch follows).
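A backup sketch, assuming the default Ambari database name and user of 'ambari' (substitute your own):

# Postgres
pg_dump -U ambari ambari > /tmp/ambari_db_backup.sql
# MySQL
mysqldump -u ambari -p ambari > /tmp/ambari_db_backup.sql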
3. Run the queries below to resolve the conflicts:

delete from hostcomponentstate where service_name = 'ZEPPELIN';
delete from hostcomponentdesiredstate where service_name = 'ZEPPELIN';
delete from servicecomponentdesiredstate where service_name = 'ZEPPELIN';
delete from servicedesiredstate where service_name = 'ZEPPELIN';
delete from serviceconfighosts where service_config_id in (select service_config_id from serviceconfig where service_name = 'ZEPPELIN');
delete from serviceconfigmapping where service_config_id in (select service_config_id from serviceconfig where service_name = 'ZEPPELIN');
delete from serviceconfig where service_name = 'ZEPPELIN';
delete from requestresourcefilter where service_name = 'ZEPPELIN';
delete from requestoperationlevel where service_name = 'ZEPPELIN';
delete from clusterservices where service_name ='ZEPPELIN';
delete from clusterconfig where type_name like 'zeppelin%';
delete from clusterconfigmapping where type_name like 'zeppelin%';

4. Start the Ambari server; it should come up without any inconsistencies (a quick verification is sketched below).

Please feel free to comment if you need any further help on this. Happy Hadooping!!
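To double-check that the orphaned ZEPPELIN rows are really gone before starting Ambari (queries against the same tables used above; both should return zero rows):

select service_name from clusterservices where service_name = 'ZEPPELIN';
select type_name from clusterconfig where type_name like 'zeppelin%';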