Member since: 06-03-2019
Posts: 59
Kudos Received: 21
Solutions: 3

My Accepted Solutions
Title | Views | Posted
---|---|---
 | 1106 | 04-11-2023 07:41 AM
 | 7617 | 02-14-2017 10:34 PM
 | 1356 | 02-14-2017 05:31 AM
04-11-2023 07:41 AM
1 Kudo
Follow these docs: https://docs.cloudera.com/cdp-private-cloud-base/7.1.6/migrating-data-into-hive/topics/hive_moving_data_from_databases_to_hive.html

Command example (import from SQL Server to Hadoop). Note that sqoop passes everything after the bare "--" to the SQL Server connector, so the --schema argument has to come last:

sqoop import -Dmapreduce.job.queuename=<your queue> --connect "jdbc:sqlserver://<sqlserver host>:<sqlserver port>;database=<sqlserver database>;username=<sqlserver user>;password=<sqlserver password>" --table "<sqlserver table>" -m 1 --hive-import --hive-database "<hive database>" --hive-table "<hive table>" --hive-overwrite --direct -- --schema <sqlserver schema>
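For illustration, here is the same command with the placeholders filled in; every hostname, credential, queue, and table name below is a made-up example:

# Hypothetical values throughout -- replace with your own environment.
sqoop import -Dmapreduce.job.queuename=etl \
  --connect "jdbc:sqlserver://sqlhost.example.com:1433;database=sales;username=sqoop_user;password=secret" \
  --table "orders" \
  -m 1 \
  --hive-import --hive-database "staging" --hive-table "orders" \
  --hive-overwrite --direct \
  -- --schema dbo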
09-21-2018 06:11 PM
This article explains how to delete a registered HDP cluster from DPS.

Steps:
1. Download the attached file dp_cluster_remove.txt and store it on the postgres DB server.
2. Run:
   psql -f ./dp_cluster_remove.txt -d dataplane -v cluster_name=c149
   -d --> the dataplane database name
   cluster_name --> the HDP cluster name that should be removed from DPS

Example output:
[root@rraman bin]# docker exec -it dp-database /bin/bash
bash-4.3# su - postgres
de0ff40ad912:~$ ls -lrt
total 8
drwx------ 19 postgres postgres 4096 Sep 21 01:27 data
-rwxrwxrwx  1 postgres postgres  892 Sep 21 17:59 dp_cluster_remove.txt
de0ff40ad912:~$ psql -f ./dp_cluster_remove.txt -d dataplane -v cluster_name=c149
CREATE FUNCTION
 remove_cluster_from_dps
-------------------------

(1 row)

DROP FUNCTION
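If you need to do this for several clusters, the steps above can be wrapped in a small helper run from the DPS host; a sketch, where dp_remove_cluster.sh is a hypothetical name, assuming the dp-database container name and script location shown in the example output above:

#!/bin/bash
# Hypothetical helper (dp_remove_cluster.sh): removes one HDP cluster
# from DPS; assumes the dp-database container and the script location
# shown in the example output above.
CLUSTER_NAME="$1"
if [ -z "$CLUSTER_NAME" ]; then
  echo "Usage: $0 <cluster_name>" >&2
  exit 1
fi
docker exec -i dp-database su - postgres -c \
  "psql -f ./dp_cluster_remove.txt -d dataplane -v cluster_name=${CLUSTER_NAME}"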
06-13-2018 04:26 PM
2 Kudos
In DPS-1.1.0 we can't edit all the LDAP configuration properties after the initial setup. If the LDAP configs have to be corrected, you need to re-initialize the DPS setup to change them, which can be a painful task. To avoid re-initializing the DPS setup, we can make the changes directly in the postgres database of dataplane.

Step 1: Find the container id of dp-database on the DPS machine
docker ps

Step 2: Connect to the container
docker exec -it cf3f4a31e146 /bin/bash

Step 3: Log in to the postgres database (dataplane)
su - postgres
psql -d dataplane

Take a backup of the table:
create table dataplane.ldap_configs_bkp as select * from dataplane.ldap_configs;

To view the existing configuration:
select * from dataplane.ldap_configs;

Sample output (expanded display):
id                        : 1
url                       : ldap://ldap.hortonworks.com:389
bind_dn                   : uid=xyz,ou=users,dc=support,dc=hortonworks,dc=com
user_searchbase           : ou=users,dc=support,dc=hortonworks,dc=com
usersearch_attributename  : uid
group_searchbase          : ou=groups,dc=support,dc=hortonworks,dc=com
groupsearch_attributename : cn
group_objectclass         : posixGroup
groupmember_attributename : memberUid
user_object_class         : posixAccount

Step 4: Make the change in the database for the required field. For example, if I need to change usersearch_attributename from uid to cn, I can issue this command:
update dataplane.ldap_configs set usersearch_attributename='cn';

That's it! It should reflect immediately on the dataplane UI.

Note: Use this doc only when you are newly installing DPS and have made a mistake in the LDAP configs.
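The same change can also be applied non-interactively from the DPS host; a sketch, reusing the container id from Step 2 and the example value 'cn':

# Apply the update and verify it, without an interactive psql session.
docker exec -i cf3f4a31e146 su - postgres -c \
  "psql -d dataplane -c \"update dataplane.ldap_configs set usersearch_attributename='cn';\""
docker exec -i cf3f4a31e146 su - postgres -c \
  "psql -d dataplane -c 'select id, usersearch_attributename from dataplane.ldap_configs;'"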
04-17-2018 03:36 PM
Just a caution note: when you set "server.jdbc.properties.loglevel=2", it adds additional database query logging, and it should be enabled only while troubleshooting performance issues. We advise removing this property once the troubleshooting is over: we have seen 2x to 3x Ambari performance degradation while it is enabled.
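For reference, a sketch of how this property is usually toggled; the ambari.properties path below is the default install location and may differ on your server:

# Enable verbose JDBC logging temporarily (default config path assumed):
echo "server.jdbc.properties.loglevel=2" >> /etc/ambari-server/conf/ambari.properties
ambari-server restart
# After troubleshooting, remove the property and restart again:
sed -i '/^server\.jdbc\.properties\.loglevel/d' /etc/ambari-server/conf/ambari.properties
ambari-server restart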
02-02-2018 10:53 PM
1 Kudo
To print GC details, add the following line under Spark --> Configs --> Advanced spark-env --> spark-env template, and restart the Spark History Server:

export SPARK_DAEMON_JAVA_OPTS=" -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:{{spark_log_dir}}/spark_history_server.gc.`date +'%Y%m%d%H%M'`"
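After the restart, a timestamped GC log should show up in the Spark log directory; a quick check, assuming {{spark_log_dir}} resolves to the common default /var/log/spark:

# List the GC logs and tail the newest one (default log dir assumed):
ls -lt /var/log/spark/spark_history_server.gc.*
tail -n 20 "$(ls -t /var/log/spark/spark_history_server.gc.* | head -1)"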
01-03-2018 07:25 PM
Command to delete documents from a particular Solr collection. Example:

curl "http://<SOLR-HOSTNAME>:8886/solr/audit_logs/update?commit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>evtTime:[* TO NOW-7DAYS]</query></delete>"

In this example, audit_logs is the collection name, and the query evtTime:[* TO NOW-7DAYS] deletes the data older than 7 days.
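Before (or after) running the delete, you can sanity-check how many documents match the same range query with the standard select handler; a sketch using the same placeholder host and collection:

# Count documents older than 7 days without fetching them (rows=0);
# look at "numFound" in the response.
curl -G "http://<SOLR-HOSTNAME>:8886/solr/audit_logs/select" \
  --data-urlencode "q=evtTime:[* TO NOW-7DAYS]" \
  --data-urlencode "rows=0"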
01-03-2018 07:22 PM
Hive metastore DB connection verification from the command line:

You can run the following on any node in the cluster where the ambari-agent is installed. You can use this to validate that the password stored in Ambari and the actual mysql DB password are the same. You need the following information to run this test:
1. mysql DB hostname
2. mysql DB port number
3. mysql database name which the hive metastore uses
4. mysql username
5. mysql password

Syntax:
java -cp /usr/lib/ambari-agent/DBConnectionVerification.jar:/usr/share/java/mysql-connector-java.jar -Djava.library.path=/usr/lib/ambari-agent org.apache.ambari.server.DBConnectionVerification "jdbc:mysql://<mysql db hostname>:<mysql db port number>/<mysql database name>" "<mysql username>" "<mysql password>" com.mysql.jdbc.Driver

Example:
/usr/jdk64/jdk1.8.0_112/bin/java -cp /usr/lib/ambari-agent/DBConnectionVerification.jar:/usr/share/java/mysql-connector-java.jar -Djava.library.path=/usr/lib/ambari-agent org.apache.ambari.server.DBConnectionVerification "jdbc:mysql://test.openstacklocal:50001/hive" hive hive com.mysql.jdbc.Driver
Connected to DB Successfully!
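If you run this check often, it can be wrapped in a small script; a sketch, where verify_hms_db.sh is a hypothetical name and the JAR, driver, and library paths are the defaults from this post:

#!/bin/bash
# Hypothetical wrapper (verify_hms_db.sh) around the check above;
# JAR, driver, and library paths are the defaults shown in this post.
DB_HOST="$1"; DB_PORT="$2"; DB_NAME="$3"; DB_USER="$4"; DB_PASS="$5"
java -cp /usr/lib/ambari-agent/DBConnectionVerification.jar:/usr/share/java/mysql-connector-java.jar \
  -Djava.library.path=/usr/lib/ambari-agent \
  org.apache.ambari.server.DBConnectionVerification \
  "jdbc:mysql://${DB_HOST}:${DB_PORT}/${DB_NAME}" "${DB_USER}" "${DB_PASS}" com.mysql.jdbc.Driver

Usage example: ./verify_hms_db.sh test.openstacklocal 50001 hive hive hive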
09-21-2017 08:51 PM
@Anwaar Siddiqui Great, it works for you. Please accept the answer. You can't just upgrade one component in the stack; you have to consider moving to the latest HDP-2.6.x version, which has Knox 0.12. Hope this helps.
09-20-2017 02:17 PM
@Khaja Hussain You haven't mentioned the workload timings when they run separately. For example, run the Jan 2017 data processing separately and record the number of containers it requires and the time it takes (say, 10 min). Repeat the same for the Feb 2017 data processing separately (say, 12 min). Compare the number of containers each job demands. Then set the ordering policy to FAIR, set the minimum user limit to 50% for the queue, and run both jobs in parallel. Now the resource allocation should be distributed equally; observe the increased run time for the jobs, since the number of containers available to each job is reduced. The corresponding scheduler properties are sketched below. Also, please refer to this article on the user-limit-factor in a YARN queue: https://community.hortonworks.com/content/supportkb/49640/what-does-the-user-limit-factor-do-when-used-in-ya.html Accept the answer if this helps with your query.
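For reference, the queue changes described above map to these capacity-scheduler properties; a sketch assuming the queue path is root.default:

# capacity-scheduler.xml equivalents (queue path root.default is an example):
yarn.scheduler.capacity.root.default.ordering-policy=fair
yarn.scheduler.capacity.root.default.minimum-user-limit-percent=50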
09-20-2017 02:42 AM
@Khaja Hussain Please find the attached YARN Queue Manager screenshot as an example.
1. You can control resources by creating separate queues for different applications; with that you can control the resources spent at the queue level.
2. Within a single queue, you can set the "Minimum User Limit %" to control a single user's resource percentage. You can also choose the ordering policy "Fair" instead of "FIFO" (first in, first out).
3. You can also control the maximum percentage of the total cluster capacity a single queue can take; the equivalent configuration properties are sketched below.
Hope the attached screenshot helps. yarn-queue-config-to-control-min-max.jpg
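For illustration, the corresponding capacity-scheduler properties for the three points above; the queue names and percentages below are made-up examples:

# Two example queues splitting the cluster (names and values are placeholders):
yarn.scheduler.capacity.root.queues=analytics,etl
yarn.scheduler.capacity.root.analytics.capacity=60
yarn.scheduler.capacity.root.analytics.maximum-capacity=80
yarn.scheduler.capacity.root.analytics.minimum-user-limit-percent=25
yarn.scheduler.capacity.root.etl.capacity=40
yarn.scheduler.capacity.root.etl.maximum-capacity=60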