Member since: 06-03-2019
Posts: 49
Kudos Received: 20
Solutions: 2
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 2815 | 02-14-2017 10:34 PM
 | 719 | 02-14-2017 05:31 AM
01-16-2019
11:11 PM
@George Thomas Your question is too generic; you haven't mentioned which component you want to benchmark.
For MapReduce - you can use the hadoop-mapreduce-client-jobclient-tests.jar.
For Hive benchmarking - you could use the following GitHub code: https://github.com/hortonworks/hive-testbench/tree/hdp26
For HBase - you could use the built-in HBase PerformanceEvaluation (pe) utility, for example:
time hbase org.apache.hadoop.hbase.PerformanceEvaluation -nomapred randomWrite 10
or a utility like "ycsb".
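For the MapReduce side, here is a minimal sketch using TestDFSIO from the jobclient tests jar; the jar path is the usual HDP location and the file counts/sizes are only illustrative, so adjust them for your environment.
# Write benchmark: 10 files of 1 GB each (run as a user with HDFS access)
hadoop jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient-tests.jar \
  TestDFSIO -write -nrFiles 10 -fileSize 1GB
# Read benchmark against the same files
hadoop jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient-tests.jar \
  TestDFSIO -read -nrFiles 10 -fileSize 1GB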
09-21-2018
06:11 PM
This article explains how to delete a registered HDP cluster from DPS.
Steps:
1. Download the attached file dp-cluster-remove.txt and store it on the postgres DB server.
2. Run: psql -f ./dp_cluster_remove.txt -d dataplane -v cluster_name=c149
   -d --> <dataplane database name>
   cluster_name --> the HDP cluster name that should be removed from DPS.
Example output:
[root@rraman bin]# docker exec -it dp-database /bin/bash
bash-4.3# su - postgres
de0ff40ad912:~$ ls -lrt
total 8
drwx------ 19 postgres postgres 4096 Sep 21 01:27 data
-rwxrwxrwx 1 postgres postgres 892 Sep 21 17:59 dp_cluster_remove.txt
de0ff40ad912:~$ psql -f ./dp_cluster_remove.txt -d dataplane -v cluster_name=c149
CREATE FUNCTION
remove_cluster_from_dps
-------------------------
(1 row)
DROP FUNCTION
06-13-2018
04:26 PM
2 Kudos
In DPS-1.1.0 we can't edit all the LDAP configuration properties after the initial setup. If we have to correct the LDAP configs, we would need to re-initialize the DPS setup, which can be a painful task. To avoid re-initializing DPS, we can make the changes directly in the dataplane postgres database.
Step 1: Find the container id of dp-database on the DPS machine
docker ps
Step 2: Connect to the container
docker exec -it cf3f4a31e146 /bin/bash
Step 3: Log in to the postgres database (dataplane)
su - postgres
psql -d dataplane
Take a backup of the table:
create table dataplane.ldap_configs_bkp as select * from dataplane.ldap_configs;
To view the existing configuration:
select * from dataplane.ldap_configs;
Sample output (columns): id | url | bind_dn | user_searchbase | usersearch_attributename | group_searchbase | groupsearch_attributename | group_objectclass | groupmember_attributename | user_object_class
Sample row: 1 | ldap://ldap.hortonworks.com:389 | uid=xyz,ou=users,dc=support,dc=hortonworks,dc=com | ou=users,dc=support,dc=hortonworks,dc=com | uid | ou=groups,dc=support,dc=hortonworks,dc=com | cn | posixGroup | memberUid | posixAccount
Step 4: Make the change in the database for the required field
For example, if I need to change usersearch_attributename from uid to cn, I can issue:
update dataplane.ldap_configs set usersearch_attributename='cn';
That's it! The change should reflect immediately on the Dataplane UI.
Note: use this doc only when you are newly installing DPS and made a mistake in the LDAP configs.
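For convenience, here is a minimal sketch that strings the steps above together from the DPS host. The container name dp-database, the dataplane database, and the usersearch_attributename change simply mirror the example above; substitute your own container id and the column/value you actually need to fix.
docker exec -it dp-database /bin/bash
su - postgres
psql -d dataplane -c "create table dataplane.ldap_configs_bkp as select * from dataplane.ldap_configs;"   # backup first
psql -d dataplane -c "update dataplane.ldap_configs set usersearch_attributename='cn';"                   # apply the change
psql -d dataplane -c "select id, url, usersearch_attributename from dataplane.ldap_configs;"              # verify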
04-17-2018
03:36 PM
Just a note of caution: setting "server.jdbc.properties.loglevel=2" adds additional database query logging and should be enabled only while troubleshooting performance issues. We advise removing this property once the troubleshooting is over; we have seen 2x to 3x Ambari performance degradation when it is enabled.
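As a hedged sketch (assuming the default Ambari Server configuration path /etc/ambari-server/conf/ambari.properties), enabling and later removing the property could look like this:
echo "server.jdbc.properties.loglevel=2" >> /etc/ambari-server/conf/ambari.properties
ambari-server restart
# ... troubleshoot, then remove the extra logging and restart again:
sed -i '/^server.jdbc.properties.loglevel=2/d' /etc/ambari-server/conf/ambari.properties
ambari-server restart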
02-02-2018
10:53 PM
1 Kudo
To print GC details, add the following line in Spark --> Configs --> Advanced spark-env --> spark-env template and restart the Spark History Server.
export SPARK_DAEMON_JAVA_OPTS=" -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:{{spark_log_dir}}/spark_history_server.gc.`date +'%Y%m%d%H%M'`"
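After the restart, a quick check that the GC log is being written (assuming spark_log_dir resolves to the usual /var/log/spark; adjust the path for your install):
ls -lt /var/log/spark/spark_history_server.gc.*
tail -5 /var/log/spark/spark_history_server.gc.*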
01-03-2018
07:25 PM
Command to delete old data from a particular Solr collection by query.
Example:
curl "http://<SOLR-HOSTNAME>:8886/solr/audit_logs/update?commit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>evtTime:[* TO NOW-7DAYS]</query></delete>"
In this example, audit_logs is the collection name, and the query evtTime:[* TO NOW-7DAYS] deletes the data older than 7 days.
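Before issuing the delete, you may want to count how many documents match. A hedged sketch against the same collection; numFound in the JSON response is the number of audit events older than 7 days:
curl -G "http://<SOLR-HOSTNAME>:8886/solr/audit_logs/select" \
  --data-urlencode "q=evtTime:[* TO NOW-7DAYS]" \
  --data-urlencode "rows=0" \
  --data-urlencode "wt=json"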
01-03-2018
07:22 PM
Hive metastore DB connection verification from the command line.
You can run the following on any node in the cluster where the Ambari agent is installed. You can use this to validate whether the password stored in Ambari and the actual MySQL DB password are the same. You need the following information to run this test:
1. MySQL DB hostname
2. MySQL DB port number
3. MySQL database name which the Hive metastore uses
4. MySQL username
5. MySQL password
Syntax:
java -cp /usr/lib/ambari-agent/DBConnectionVerification.jar:/usr/share/java/mysql-connector-java.jar -Djava.library.path=/usr/lib/ambari-agent org.apache.ambari.server.DBConnectionVerification "jdbc:mysql://<mysql db hostname>:<mysql db port number>/<mysql database name>" "<mysql username>" "<mysql password>" com.mysql.jdbc.Driver
Example:
/usr/jdk64/jdk1.8.0_112/bin/java -cp /usr/lib/ambari-agent/DBConnectionVerification.jar:/usr/share/java/mysql-connector-java.jar -Djava.library.path=/usr/lib/ambari-agent org.apache.ambari.server.DBConnectionVerification "jdbc:mysql://test.openstacklocal:50001/hive" hive hive com.mysql.jdbc.Driver
Connected to DB Successfully!
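If you need to look up items 1-4 above, a hedged sketch that pulls the metastore JDBC URL and username from the Hive client configuration on the node (these are the standard javax.jdo.option property names; the password itself is held in the Hive configs in Ambari):
grep -A1 -E "javax.jdo.option.Connection(URL|UserName)" /etc/hive/conf/hive-site.xml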
09-21-2017
08:51 PM
@Anwaar Siddiqui Great, it works for you. Please accept the answer. You can't just upgrade one component in the stack; you have to consider moving to the latest HDP-2.6.x version, which has Knox 0.12. Hope this helps.
09-20-2017
02:17 PM
@Khaja Hussain You haven't mentioned the workload timings when the jobs run separately. For example, run the Jan 2017 data processing on its own and record the number of containers it requires and the time it takes (say 10 min). Repeat the same for the Feb 2017 data processing (say 12 min), and compare the number of containers each job demands. Then set the ordering policy to FAIR, set the Minimum User Limit to 50% for the queue, and run both jobs in parallel. Now the resource allocation should be distributed equally, and you should observe an increased run time for both jobs since the number of containers available to each job is reduced. Also, please refer to this article on user-limit-factor in a YARN queue: https://community.hortonworks.com/content/supportkb/49640/what-does-the-user-limit-factor-do-when-used-in-ya.html Accept the answer if this helps with your query.
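As a hedged sketch, the equivalent Capacity Scheduler properties for a hypothetical queue named "analytics" would look like the following; these are normally set through Ambari or the YARN Queue Manager view rather than edited by hand.
yarn.scheduler.capacity.root.analytics.ordering-policy=fair
yarn.scheduler.capacity.root.analytics.minimum-user-limit-percent=50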
09-20-2017
02:42 AM
@Khaja Hussain Please find the attached YARN Queue Manager screenshot as an example.
1. You can control resources by creating separate queues for different applications; with that you can control the resources spent per queue.
2. Within a single queue, you can set the "Minimum User Limit %" to control the resource percentage of a single user. You can also choose the ordering policy "Fair" instead of "FIFO" (first in, first out).
3. You can also control the maximum % of the total cluster capacity a single queue can take.
Hope the attached screenshot helps. yarn-queue-config-to-control-min-max.jpg
09-20-2017
02:23 AM
@Anwaar Siddiqui It seems to be a Knox bug: https://issues.apache.org/jira/browse/KNOX-890
Workaround: append "http.header.Connection=close" to the JDBC connection string. For example, with Beeline, use the following command:
beeline -u "jdbc:hive2://sandbox.hortonworks.com:8443/;ssl=false;sslTrustStore=/tmp/myNewTrustStore.jks;trustStorePassword=changeit;transportMode=http;httpPath=gateway/default/hive;http.header.Connection=close" -n admin -p admin-password
09-14-2017
02:59 AM
@Angel Mondragon Can you check whether there is any PID value in the file /var/run/mysqld/mysqld.pid? Try to grep for that process and see whether a dead process is hanging around; if so, kill it and remove the pid file. You can also delete the socket files /var/lib/mysql/mysql.sock, /tmp/mysql.sock.lock and /tmp/mysql.sock; they will be recreated during the MySQL restart.
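A hedged sketch of that cleanup sequence (the paths are the ones mentioned above; confirm the process is really hung before killing it, and adjust the restart command to your service manager):
cat /var/run/mysqld/mysqld.pid
ps -fp $(cat /var/run/mysqld/mysqld.pid)      # is the process alive, defunct, or gone?
kill -9 $(cat /var/run/mysqld/mysqld.pid)     # only if it is a hung/dead mysqld
rm -f /var/run/mysqld/mysqld.pid /var/lib/mysql/mysql.sock /tmp/mysql.sock /tmp/mysql.sock.lock
service mysqld restart                        # the pid and socket files are recreated on restart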
09-14-2017
02:42 AM
@Aman Verma
1. You are missing "phoenix.table.name" in your Hive external table create command.
2. Also, "phoenix.zookeeper.znode.parent" is missing the "/" before hbase-unsecure; it should be '/hbase-unsecure'.
3. Add the following before the Hive external table creation in the Hive CLI / Beeline:
add jar /usr/hdp/current/phoenix-client/phoenix-client.jar;
add jar /usr/hdp/current/phoenix-client/phoenix-hive.jar;
add jar /usr/hdp/current/phoenix-client/phoenix-server.jar;
add jar /usr/hdp/current/phoenix-client/lib/phoenix-core-4.7.0.2.5.0.50-18.jar;
Example syntax for a Hive external table over the Phoenix table "test_phoenix.test_table":
CREATE external TABLE test(
pk string,
a1 bigint,
a2 bigint)
STORED BY 'org.apache.phoenix.hive.PhoenixStorageHandler'
WITH SERDEPROPERTIES ('serialization.format'='1')
TBLPROPERTIES (
'COLUMN_STATS_ACCURATE'='{\"BASIC_STATS\":\"true\"}',
'phoenix.column.mapping'='pk:PK,a1:A1,a2:A2',
'phoenix.rowkeys'='pk',
'phoenix.table.name'='test_phoenix.test_table',
'phoenix.zookeeper.client.port'='2181',
'phoenix.zookeeper.quorum'='node1',
'phoenix.zookeeper.znode.parent'='/hbase-unsecure'
);
09-14-2017
01:55 AM
@Anitha R A hotfix can only be provided by Hortonworks support. If you have support, please create a case and request a hotfix patch for this bug. The recommended patching process is to apply the hotfix patch with an Ambari upgrade; it will patch only the Zeppelin binaries. If that is not feasible, ask support whether a JAR patch replacement is an option for this bug.
09-14-2017
01:46 AM
@Juan Manuel Nieto Please see if this article helps with your problem: https://community.hortonworks.com/questions/61159/getting-untrusted-proxy-message-while-trying-to-se.html
08-31-2017
04:36 AM
@Suhel Please see the following HCC article, which has similar symptoms; the solution is to upgrade your Ambari version. https://community.hortonworks.com/content/supportkb/49602/dataxceiver-error-processing-unknown-operation-src.html
08-30-2017
07:28 PM
1 Kudo
@Aishwarya Dixit We can use a jceks credential store to secure the password and edit shiro.ini to reference this jceks file. Ex: hadoop credential create contextFactory.systemPassword -provider jceks:///etc/zeppelin/conf/credentials.jceks Slide 25 in the following deck has an example: https://www.slideshare.net/vinnies12/apache-spark-apache-zeppelin-security-for-enterprise-deployments See if this helps.
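To double-check the entry, a hedged sketch that lists the aliases stored in the credential file (depending on your Hadoop version, the provider URI may need the file authority, i.e. jceks://file/etc/zeppelin/conf/credentials.jceks):
hadoop credential list -provider jceks:///etc/zeppelin/conf/credentials.jceks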
08-30-2017
03:19 PM
3 Kudos
@Freemon Johnson You can also use the hdp-select status command as root in the CLI. It will list the services installed on the cluster. Example ("None" means the package is available in the HDP repo for install but not installed):
[root@micprod ~]# hdp-select status
accumulo-client - None
accumulo-gc - None
accumulo-master - None
accumulo-monitor - None
accumulo-tablet - None
accumulo-tracer - None
atlas-client - 2.5.3.51-3
atlas-server - 2.5.3.51-3
falcon-client - 2.5.3.51-3
falcon-server - 2.5.3.51-3
flume-server - 2.5.3.51-3
hadoop-client - 2.5.3.51-3
hadoop-hdfs-datanode - 2.5.3.51-3
hadoop-hdfs-journalnode - 2.5.3.51-3
hadoop-hdfs-namenode - 2.5.3.51-3
hadoop-hdfs-nfs3 - 2.5.3.51-3
hadoop-hdfs-portmap - 2.5.3.51-3
hadoop-hdfs-secondarynamenode - 2.5.3.51-3
hadoop-hdfs-zkfc - 2.5.3.51-3
hadoop-httpfs - None
hadoop-mapreduce-historyserver - 2.5.3.51-3
hadoop-yarn-nodemanager - 2.5.3.51-3
hadoop-yarn-resourcemanager - 2.5.3.51-3
hadoop-yarn-timelineserver - 2.5.3.51-3
hbase-client - 2.5.3.51-3
hbase-master - 2.5.3.51-3
hbase-regionserver - 2.5.3.51-3
hive-metastore - 2.5.3.51-3
hive-server2 - 2.5.3.51-3
hive-server2-hive2 - 2.5.3.51-3
hive-webhcat - 2.5.3.51-3
kafka-broker - 2.5.3.51-3
knox-server - 2.5.3.51-3
livy-server - 2.5.3.51-3
mahout-client - None
oozie-client - 2.5.3.51-3
oozie-server - 2.5.3.51-3
phoenix-client - 2.5.3.51-3
phoenix-server - 2.5.3.51-3
ranger-admin - 2.5.3.51-3
ranger-kms - None
ranger-tagsync - None
ranger-usersync - 2.5.3.51-3
slider-client - 2.5.3.51-3
spark-client - 2.5.3.51-3
spark-historyserver - 2.5.3.51-3
spark-thriftserver - 2.5.3.51-3
spark2-client - 2.5.3.51-3
spark2-historyserver - 2.5.3.51-3
spark2-thriftserver - 2.5.3.51-3
sqoop-client - 2.5.3.51-3
sqoop-server - 2.5.3.51-3
storm-client - None
storm-nimbus - None
storm-slider-client - 2.5.3.51-3
storm-supervisor - None
zeppelin-server - 2.5.3.51-3
zookeeper-client - 2.5.3.51-3
zookeeper-server - 2.5.3.51-3
08-28-2017
06:29 AM
3 Kudos
Ambari database cleanup - speeding it up
ambari-server db-cleanup -d 2016-09-30 --cluster-name=TESTHDP
I ran this against the Ambari database of a 500-node HDP cluster. It ran for more than 15 hours without success. I analyzed the ambari-server logs to check where most of the time was being spent; it turned out to be the batch deletes on the ambari.alert_notice, ambari.alert_current and ambari.alert_history tables. To improve the performance of the db-cleanup, I created an index on the ambari.alert_notice table:
-bash-4.1$ psql -U ambari -d ambari
Password for user ambari:
psql (8.4.20)
Type "help" for help.
ambari=> CREATE INDEX alert_notice_idx ON ambari.alert_notice(history_id);
After this, I reloaded my Ambari database from the backup and re-ran the db-cleanup; it took less than 2 minutes to complete. To reclaim the disk space and reindex after the cleanup, I ran the following commands as the super user "postgres":
vacuum full;
reindex database ambari;
08-02-2017
05:57 AM
3 Kudos
Caution: running bad queries against the AMS HBase tables can crash the AMS collector due to load, so use this for debugging purposes only. To connect to an AMS HBase instance running in distributed mode:
cd /usr/lib/ambari-metrics-collector/bin
./sqlline.py localhost:2181:/ams-hbase-secure
To get the correct znode, check the value of "zookeeper.znode.parent" in the AMS collector configs.
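Once connected, a hedged example of a light, read-only query; the METRIC_RECORD table and its columns are assumptions based on a typical AMS Phoenix schema, and the LIMIT is kept small so the collector is not loaded. Phoenix's sqlline.py also accepts a SQL file as a second argument, which avoids an interactive session:
echo "SELECT METRIC_NAME, HOSTNAME, SERVER_TIME FROM METRIC_RECORD LIMIT 10;" > /tmp/ams_sample.sql
./sqlline.py localhost:2181:/ams-hbase-secure /tmp/ams_sample.sql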
08-01-2017
02:20 AM
@Jerry Cutshaw / @Prashant Kumar From the error you posted, it seems to be a typo in the dependency you have mentioned: org.apache.phoenix:phoenix-core:jar:4.7.0.2.6.0.3-8 The correct one is: org.apache.phoenix:phoenix-core:4.7.0.2.6.0.3-8 Please try this and let me know the outcome. If it doesn't work, upload the JDBC interpreter log and the Zeppelin log, along with a screenshot of the dependency section of the JDBC interpreter. If this fixes your problem, accept this as the best answer.
07-28-2017
08:21 PM
@Prashant Kumar Can you try this:
1) Go to Ambari UI -> Zeppelin -> Configs -> Advanced zeppelin-env -> zeppelin-env_content and add
export ZEPPELIN_INTP_CLASSPATH_OVERRIDES="/etc/zeppelin/conf/external-dependency-conf"
just above
#### Spark interpreter configuration ####
2) In the JDBC interpreter, add the following dependency (use the correct phoenix version number):
org.apache.phoenix:phoenix-core:4.7.0.2.6.0.3-8
Get the phoenix-core version number from the following location:
ls -ltr /usr/hdp/current/zeppelin-server/interpreter/jdbc/phoenix-core-4.7.*
07-28-2017
08:03 PM
@Girish Drabla Please check whether the metastore is staying up and receiving connections. Also check whether you can connect to the Hive metastore database directly and run some select statements against the Hive postgres/mysql database.
07-28-2017
07:54 PM
@rajalaxmi rath The problem seems to be a syntax error in the Oracle query.
1) Can you run the following in your Oracle database and provide the output:
select TO_TIMESTAMP("01.01.1600:00:00","DD.MM.YYHH24:MI:SS") from dual;
2) What is the data type of the Oracle column "RECORD_CREATED_DT"?
A good example of the TO_TIMESTAMP format is the one below; I don't know if this format is acceptable in your NiFi processing.
SELECT TO_TIMESTAMP ('10-Sep-02 14:10:10.123000', 'DD-Mon-RR HH24:MI:SS.FF') FROM DUAL;
3) You can also try the following directly in Oracle:
SELECT COL1, COL2 FROM DB1.TABLE1
WHERE TO_TIMESTAMP(RECORD_CREATED_DT, 'DD.MM.YY HH24:MI:SS') >= TO_TIMESTAMP('01.01.16 00:00:00', 'DD.MM.YY HH24:MI:SS');
07-19-2017
12:10 PM
1 Kudo
Run the following commands as the postgres user (super user). Check the total DB size before and after for comparison:
SELECT pg_size_pretty( pg_database_size('ambari'));
vacuum full;
reindex database ambari;
SELECT pg_size_pretty( pg_database_size('ambari'));
05-25-2017
11:06 PM
@Mathi Murugan Can you check whether the file is corrupted in HDFS:
su - hdfs -c "hdfs fsck /backup/hbase/FullBackup/20170523/Recipients/part-m-00000"
Does this happen only on this file when you retry?
04-26-2017
04:02 PM
@mayki wogno Can you tell us your Ambari version, and check whether there are any font issues on the machine from which you access the Ambari URL? Also check which user runs the ambari-server. Try enabling debug logging (log4j.rootLogger=DEBUG) in the file /etc/ambari-server/conf/log4j.properties and see if you can find anything in the ambari-server log. You can also check in the browser developer tools whether there is any error when loading the page.
03-30-2017
06:18 AM
1 Kudo
By default, GC logs are not enabled for the Hive components. It is good to enable them to troubleshoot GC pauses on HiveServer2 instances.
HiveServer2 / Metastore:
In Ambari, navigate to Services --> Hive --> Configs --> Advanced --> Advanced hive-env --> hive-env template and add the following lines at the beginning:
if [[ "$SERVICE" == "hiveserver2" || "$SERVICE" == "metastore" ]]; then
  HIVE_SERVERS_GC_LOG_OPTS="-Xloggc:{{hive_log_dir}}/gc.log-$SERVICE-`date +'%Y%m%d%H%M'` -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps"
  export HADOOP_OPTS="$HADOOP_OPTS $HIVE_SERVERS_GC_LOG_OPTS"
fi
WebHCat:
In Ambari, navigate to Services --> Hive --> Configs --> Advanced --> Advanced webhcat-env --> webhcat-env template and add the following lines at the bottom:
WEBHCAT_GC_LOG_OPTS="-Xloggc:{{templeton_log_dir}}/gc.log-webhcat-`date +'%Y%m%d%H%M'` -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps"
export HADOOP_OPTS="$HADOOP_OPTS $WEBHCAT_GC_LOG_OPTS"
Save the changes in Ambari and restart the Hive services; GC logging will start being written on restart.
Thanks to the following articles. I changed the GC file name to match the NameNode GC log naming and kept all the GC flags in a single variable for simplicity.
https://community.hortonworks.com/content/supportkb/49404/how-to-setup-gc-log-for-hiveserver2.html
http://stackoverflow.com/questions/39888681/how-to-enable-gc-logging-for-apache-hiveserver2-metastore-server-webhcat-server?newreg=e73d605b7873494e810537edd040dcac
03-27-2017
10:13 PM
@Sergey Soldatov Thanks for the response. Are you aware of any Apache JIRA open to include this feature in future releases?