Member since: 04-24-2017
82 Posts
11 Kudos Received
1 Solution
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 489 | 01-20-2020 03:17 AM
01-20-2020
03:17 AM
@ChineduLB What is your exact query? You can write count queries in SQL for a Hive table. In general, you can refer to the articles below:
https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.0/performance-tuning/content/hive_prepare_to_tune_performance.html
https://www.qubole.com/blog/5-tips-for-efficient-hive-queries/
Thanks, Tamil Selvan K
01-20-2020
03:13 AM
@mike_bronson7 Please refer to the documents below:
https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.4/cluster-planning/cluster-planning.pdf
https://docs.cloudera.com/HDPDocuments/HDP2/HDP-2.6.0/bk_cluster-planning/content/ch_hardware-recommendations_chapter.html
Thanks, Tamil Selvan K
01-08-2020
07:52 AM
1 Kudo
1. Use Ranger auditing for Hive to check the query details run by a user; Hive does not store this detail in the metastore. https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.4/audit-ref/content/managing_auditing_in_ranger_access.html
2. You can use the query below to get all apps in the FINISHED or KILLED state for a specific user over a specific time period:
GET "http://Resource-Manager-Address:8088/ws/v1/cluster/apps?limit=20&states=FINISHED,KILLED&user=<user-id>&startedTimeBegin={time in epoch}&startedTimeEnd={time in epoch}"
3. If your execution engine is Tez, simply use the Tez View.
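The epoch timestamps in step 2 are in milliseconds. As a minimal sketch (the RM host name and user id below are placeholders for your cluster), the URL for the last 24 hours could be assembled like this:

```shell
#!/bin/sh
# Sketch: build the ResourceManager REST query for apps FINISHED/KILLED by a
# user in the last 24 hours. RM_HOST and USER_ID are placeholders.
RM_HOST="resource-manager-host"
USER_ID="hive-user"
END=$(( $(date +%s) * 1000 ))        # RM expects epoch time in milliseconds
BEGIN=$(( END - 24 * 3600 * 1000 ))  # 24 hours earlier
URL="http://${RM_HOST}:8088/ws/v1/cluster/apps?limit=20&states=FINISHED,KILLED&user=${USER_ID}&startedTimeBegin=${BEGIN}&startedTimeEnd=${END}"
echo "$URL"
# curl -s "$URL"   # run this from a host that can reach the ResourceManager
```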
12-26-2019
06:54 AM
There are a few BI tools that can be used as a layer on top of Hive and can make use of PK constraints. AFAIK, Hive does not check/validate PK constraints: https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.4/using-hiveql/content/hive_constraints.html
They can be added to the table definition so that BI tools can make efficient query decisions and perform checks on key constraints.
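As a sketch of how such a constraint is declared (the table and column names are hypothetical), the key is marked DISABLE NOVALIDATE, so Hive records it without enforcing or validating it, while RELY lets tools trust it for planning:

```sql
-- Hypothetical table: the PRIMARY KEY is declared but not enforced by Hive;
-- BI tools that RELY on it can use it for query planning.
CREATE TABLE customers (
  id   INT,
  name STRING,
  PRIMARY KEY (id) DISABLE NOVALIDATE RELY
);
```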
12-26-2019
06:39 AM
Can you try the below:
beeline --showHeader=true --outputformat=csv2 -u "JDBC_URL" -e "query" > /tmp/output.csv
07-12-2018
01:26 AM
@Liam De Lee Did you get any solution for this issue? Thanks.
06-11-2018
02:11 PM
@wiljan As you are using Ranger, you can create policies for databases and tables in Hive. Refer: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.4/bk_security/content/hive_policy.html
There is also an older way to control permissions using SQL standard-based authorization, where the user who creates a table is its owner. Refer: https://cwiki.apache.org/confluence/display/Hive/SQL+Standard+Based+Hive+Authorization
02-07-2018
06:45 PM
@SMACH H You can follow the below:
1. Lock down the location in HDFS: set permission 700 on /apps/hive/warehouse.
2. Add a policy in Ranger/Hive for database: *, allowing users to create databases. (Note that the ambari-qa user also needs access to database: * to complete the service check.)
3. Allow access to individual databases via Ranger/Hive policies.
This blog post may be of interest: http://hortonworks.com/blog/best-practices-for-hive-authorization-using-apache-ranger-in-hdp-2-2/
You may also explore the options with "hive.server2.enable.doAs".
01-15-2018
07:04 PM
5 Kudos
We can configure Hive View 2.0 to view the Ranger policies associated with any table. This is not configured by default (as of HDP 2.6.3), and we need to follow the steps below:
1. Go to Ambari Dashboard -> admin -> Manage Ambari.
2. On the Manage Ambari page, under the Views section, expand Hive and click on Hive View 2.0.
3. Under the Settings section, provide the details as follows:
- Ranger Service Name: go to the Ranger Admin UI and check the Ranger service name configured for Hive; in this example it is tsk_hive.
- Provide the Ranger admin username and password as you have configured them.
- WebHDFS Authentication for a Kerberos-enabled cluster: auth=KERBEROS;proxyuser=<proxyuser>, providing the Ambari user principal name in place of <proxyuser>.
4. After adding the above details, click "Save" and leave the other settings at their defaults.
Additional steps (only when you are not using the local cluster):
- If you are using the local cluster, it is not required to change any parameters in the Cluster Configuration section.
- If you are using custom settings, provide the cluster details accordingly.
Once you have provided the details, follow the steps below:
5. Go to Hive View 2.0 -> Tables and click on the table for which you want to view the Ranger policies.
6. Click on the "Authorization" tab to view the Ranger policies.
Tags: ambari-views, hive-views, How-To/Tutorial, policies, ranger-hive-plugin, Sandbox & Learning
12-23-2017
07:36 PM
4 Kudos
Beeline is a JDBC client tool used to connect to HiveServer2 or HiveServer2 Interactive (LLAP). Beeline requires access to only one JAR file: hive-jdbc-<version>-standalone.jar.
Hortonworks recommends using HiveServer2 and a JDBC client (such as Beeline) as the primary way to access Hive. This approach uses SQL standard-based authorization or Ranger-based authorization. However, some users may wish to access Hive data from other applications, such as Pig; for these use cases, use the Hive CLI and storage-based authorization.
Connecting to Hive with Beeline
The following examples demonstrate how to use Beeline to connect to Hive in each of the modes below.
Embedded Client
Use the following syntax to connect to Hive from Beeline in embedded mode:
!connect jdbc:hive2://
Remote Client with HiveServer2 TCP Transport Mode and SASL Authentication
Use the following syntax to connect to HiveServer2 in TCP mode from a remote Beeline client:
!connect jdbc:hive2://<hiveserver2host>:<port>/<db>
The default port for HiveServer2 in TCP mode is 10000, and <db> is the name of the database to which you want to connect.
Connecting to HiveServer2 using Zookeeper Discovery Mode
Note: You can also find the HiveServer2 JDBC URL in Ambari on the Hive -> Summary page; use the copy-to-clipboard icon next to the URL. Observe the / after <ZOOKEEPER QUORUM> in each string below.
- HiveServer2 using binary transport mode: jdbc:hive2://<ZOOKEEPER QUORUM>/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2
- HiveServer2 using HTTP transport mode: jdbc:hive2://<ZOOKEEPER QUORUM>/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;transportMode=http;httpPath=cliservice
Connection String for HiveServer2 Interactive
Look for zooKeeperNamespace=hiveserver2-hive2 in the URL below:
jdbc:hive2://<ZOOKEEPER QUORUM>/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2-hive2
To run Hive scripts with Beeline, use the -f option:
beeline -u "jdbc:hive2://master01:2181,master02:2181,master03:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2" -f file.hql
Reference article: https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients
Refer to the above article for Beeline command options and examples to get started with Beeline.
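As a minimal sketch of the -f option together with --hivevar parameter substitution (the JDBC URL, database name, and table name below are placeholders; ${db} inside the script is replaced by the --hivevar value at run time):

```shell
#!/bin/sh
# Sketch: write a parameterized Hive script and show the Beeline invocation.
# The JDBC URL, database, and table names are placeholders for your cluster.
HQL=$(mktemp /tmp/report.XXXXXX.hql)
cat > "$HQL" <<'EOF'
USE ${db};
SELECT COUNT(*) FROM orders;
EOF
JDBC_URL="jdbc:hive2://master01:2181,master02:2181,master03:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2"
# On a cluster host you would run the command below without the leading echo:
echo beeline -u "$JDBC_URL" --hivevar db=sales -f "$HQL"
```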
Tags: beeline, Hive, hiveserver2, How-To/Tutorial, llap, Sandbox & Learning
12-08-2017
11:18 PM
@Ben Green Is Kerberos enabled on the cluster? Also make sure you have a directory for the user under /user in HDFS. For example, if you are logged into Ambari as root, then make sure:
hadoop fs -mkdir /user/root
hadoop fs -chmod -R 755 /user/root
hadoop fs -chown -R root:root /user/root
Let me know if you still face the issue.
12-08-2017
11:10 PM
@Daniel Müller Creation of a Tez session does not depend on the connection type (JDBC or ODBC). Below is the property you can refer to:
tez.session.am.dag.submit.timeout.secs: int value; the time (in seconds) for which the Tez AM should wait for a DAG to be submitted before shutting down. It is only relevant in session mode. You can increase this value if you do not want a new Tez session to be created.
You can also read the document below:
https://tez.apache.org/releases/0.8.2/tez-api-javadocs/configs/TezConfiguration.html
Let me know if this helps.
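A sketch of how this property might be set in tez-site.xml (the 600-second value is illustrative, not a recommendation; pick a value that fits your workload):

```xml
<!-- tez-site.xml sketch: keep an idle Tez AM alive for 10 minutes before it
     shuts down, instead of the default timeout -->
<property>
  <name>tez.session.am.dag.submit.timeout.secs</name>
  <value>600</value>
</property>
```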
12-08-2017
11:02 PM
@Gopi Sharma The number of containers each query uses is determined as described here (https://cwiki.apache.org/confluence/display/TEZ/How+initial+task+parallelism+works), which considers the number of resources available in the current queue. The resources available in a queue are defined by its minimum guaranteed capacity (yarn.scheduler.capacity.root.<queue-name>.capacity), not its maximum capacity (yarn.scheduler.capacity.root.<queue-name>.maximum-capacity). You can also read https://community.hortonworks.com/articles/56636/hive-understanding-concurrent-sessions-queue-alloc.html
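For illustration, a hypothetical queue named etl with a 30% guaranteed capacity and a 60% ceiling would be configured as below in capacity-scheduler.xml; per the above, Tez sizes its initial parallelism against the 30% figure:

```xml
<!-- capacity-scheduler.xml sketch: queue name and percentages are
     hypothetical examples -->
<property>
  <name>yarn.scheduler.capacity.root.etl.capacity</name>
  <value>30</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.etl.maximum-capacity</name>
  <value>60</value>
</property>
```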
11-05-2017
01:33 AM
@Sedat Kestepe Please refer to https://community.hortonworks.com/questions/55051/no-hive-21-in-hdp-25.html I hope that Carter's response in the above article addresses your query.
10-29-2017
05:37 PM
@Warius Unnlauf Yes, Tez is an execution engine. I would recommend you go through the links below, which answer most of your queries.
https://community.hortonworks.com/questions/83394/difference-between-mr-and-tez.html
https://hortonworks.com/blog/introducing-tez-faster-hadoop-processing/
https://www.slideshare.net/Hadoop_Summit/w-235phall1pandey
Thanks, Tamil.
10-01-2017
06:53 AM
@Abizer A There is no specific documentation for the ResourceManager heap. The RM stores some application states to render the UI, controlled by yarn.resourcemanager.max-completed-applications (default 10000), so at any time the RM needs roughly 1 GB of memory to hold these applications. Setting the heap to around 4 GB in your cluster should be enough to store the job status.
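On HDP this is typically set through yarn-env (in Ambari, under Yarn -> Configs). A sketch, with the 4 GB value from above expressed in MB:

```shell
# yarn-env.sh sketch: raise the ResourceManager heap to 4 GB (value in MB)
export YARN_RESOURCEMANAGER_HEAPSIZE=4096
```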
09-30-2017
06:51 AM
@Bhavesh Bhatt Though I have not tried Zeppelin on HDP 2.3.2, I think the docs below will be helpful. Apache Zeppelin was available as a tech preview in HDP 2.3.2. Please check the link below for the official Apache versions:
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.2/bk_HDP_RelNotes/content/ch_relnotes_v232.html
The link https://hortonworks.com/hadoop-tutorial/apache-zeppelin-hdp-2-4/ describes the second Zeppelin technical preview, which works with HDP 2.4 and comes with the following major features: Notebook Import/Export, LDAP Authentication, and Ambari-Managed Installation.
Hope this helps.
09-30-2017
06:34 AM
@Mohammad Shazreen Bin Haini To change the permission from Ambari, you can use the Files View under the Views section.
09-30-2017
06:28 AM
Yes, it does. Can you share the information below:
1. Your current version
2. The error stack trace
3. Confirmation that Interactive Query and ACID transactions are set to true under Hive -> Configurations in Ambari
09-30-2017
06:20 AM
@Vishal Sagar Please follow the steps mentioned in the doc https://hortonworks.com/tutorial/sandbox-deployment-and-install-guide/section/3/
09-30-2017
06:11 AM
1 Kudo
@Sreelakshmi Lingala Yes; please refer to the links below for more about them.
https://community.hortonworks.com/articles/118786/log4j-settings-for-hdp-services.html
https://community.hortonworks.com/articles/8882/how-to-control-size-of-log-files-for-various-hdp-c.html
Let me know if this helps.
09-30-2017
06:01 AM
For example: cluster name, state of the cluster, etc.
Labels: Hortonworks Cloudbreak
08-17-2017
09:50 PM
@nshelke
- /tmp/${user.name} in Hive 0.2.0 through 0.8.0
- /tmp/hive-${user.name} in Hive 0.8.1 through 0.14.0
- /tmp/hive in Hive 0.14.0 and later
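The location is controlled by hive.exec.scratchdir in hive-site.xml; a sketch showing the modern default (override the value only if your cluster needs a different scratch path):

```xml
<!-- hive-site.xml sketch: /tmp/hive is the default in Hive 0.14.0+ -->
<property>
  <name>hive.exec.scratchdir</name>
  <value>/tmp/hive</value>
</property>
```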
06-01-2017
03:52 PM
@Jay SenSharma Thanks for that. And is there a way to go the other direction as well? For a particular rpmlib, can we find the list of dependent HDP packages?
06-01-2017
02:28 PM
Is there a command or tool by which we can check the list of dependent libraries or files between Hadoop and the OS (environment)? For example, when we install Ambari on RHEL, we have to manually install "libtirpc-devel". Thus, is there a way by which we can check the Hadoop packages that depend on "libtirpc-devel"?
05-31-2017
01:45 PM
@Satish Sarapuri Thanks, but when I tried to check its behavior (expecting it to return only the duplicate records), it returned every record in the table. Hence, I wanted to know a simple implementation of it.