Member since 09-25-2015
82 Posts
93 Kudos Received
17 Solutions
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 3826 | 06-06-2017 09:57 AM
 | 1056 | 03-01-2017 10:26 AM
 | 1079 | 11-22-2016 10:32 AM
 | 899 | 08-09-2016 12:05 PM
 | 1570 | 08-08-2016 03:57 PM
08-08-2016
02:31 PM
2 Kudos
@Yaron Idan, you can usually choose which machine to install your service on when you get to the "Assign Masters" page in Ambari; there should be a dropdown box on that page. Did you maybe skip it by accident? Here's an example from my lab cluster (I only have one node, but the principle is the same).
05-11-2016
01:05 PM
3 Kudos
This is a common point of confusion, so I did some tests which I hope will clarify.
TEST 1:
Setup:
1. Create a user directory /user/myuser with 777 permissions in HDFS
2. Make a policy in Ranger that allows user mktg1 only read access to /user/myuser
Result:
1. It always allows mktg1 to write
2. Ranger Audit says "Access Enforcer" is hadoop-acl
This is expected behaviour.
EXPLANATION:
The way Ranger authorization normally works is that it keeps checking until something allows access or it runs out of options. So, in this case, it first checks the Ranger policy and sees that mktg1 can't write, then checks the HDFS permissions, sees that mktg1 CAN write, and allows the write.
To avoid this situation, you must lock down the filesystem permissions completely, with something like chmod 700. Then you can administer access via Ranger policies.
Ranger policies can only allow access; if nothing allows access (including, by default, the HDFS permissions), then access is denied.
TEST 2:
Setup:
1. Create a user directory /user/myuser with 000 permissions in HDFS
2. Make a policy in Ranger that allows user mktg1 read+execute access to /user/myuser
Result:
1. As the user it1:
[it1@sandbox conf]$ hadoop fs -ls /user/myuser
ls: Permission denied: user=it1, access=READ_EXECUTE, inode="/user/myuser":hdfs:hdfs:d---------
2. As the user mktg1:
[mktg1@sandbox conf]$ hadoop fs -ls /user/myuser
Found 10 items
-rw-r--r-- 1 root hdfs 529 2015-06-24 12:30 /user/myuser/test.csv
"Access Enforcer" is xasecure-acl in the Ranger Audit UI
3. As the user mktg1:
[mktg1@sandbox ~]$ hdfs dfs -put test.txt /user/myuser
put: Permission denied: user=mktg1, access=WRITE, inode="/user/myuser":hdfs:hdfs:d---------
The file system permissions mean that no one is allowed to access the directory, but the Ranger policy allows mktg1 to read it, though not to write to it.
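To make the order of checks in the two tests easier to follow, here is a minimal Python sketch of the behaviour described in this post. It is a toy model, not Ranger's actual implementation: the names check_access, other_bits and the policy dictionaries are made up for illustration, only the "other" permission bits are modelled (mktg1 and it1 are neither owner nor group of the directory), and returning None as the enforcer on a denial is a simplification.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical, simplified model of the behaviour described in this post.
# It is NOT Ranger's real implementation: only the "other" permission bits
# are considered, and a policy is a plain (user, path) -> allowed accesses map.
Policy = Dict[Tuple[str, str], Set[str]]


def other_bits(mode: int) -> Set[str]:
    """Return the accesses an octal mode (e.g. 0o777) grants to 'other' users."""
    granted = set()
    if mode & 0o4:
        granted.add("read")
    if mode & 0o2:
        granted.add("write")
    if mode & 0o1:
        granted.add("execute")
    return granted


def check_access(user: str, path: str, access: str,
                 mode: int, policy: Policy) -> Tuple[bool, Optional[str]]:
    """Return (allowed, enforcer); enforcer is None when access is denied."""
    # Step 1: a Ranger policy can only allow access, never deny it.
    if access in policy.get((user, path), set()):
        return True, "xasecure-acl"
    # Step 2: fall back to the HDFS permission bits.
    if access in other_bits(mode):
        return True, "hadoop-acl"
    # Step 3: nothing allowed the access, so it is denied.
    return False, None


# TEST 1: directory is 777, policy grants mktg1 read only.
test1_policy = {("mktg1", "/user/myuser"): {"read"}}
print(check_access("mktg1", "/user/myuser", "write", 0o777, test1_policy))

# TEST 2: directory is 000, policy grants mktg1 read+execute.
test2_policy = {("mktg1", "/user/myuser"): {"read", "execute"}}
print(check_access("it1", "/user/myuser", "read", 0o000, test2_policy))
print(check_access("mktg1", "/user/myuser", "read", 0o000, test2_policy))
print(check_access("mktg1", "/user/myuser", "write", 0o000, test2_policy))
```

In this model the write in TEST 1 is granted by the HDFS bits rather than by the policy, which is why locking the directory down (for example to 700) leaves the Ranger policy as the only thing that can grant access.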
05-11-2016
11:47 AM
1 Kudo
Hi @Vaibhav Yadav, I won't comment on the SQL optimisation (perhaps someone else can), but there is a lot of tuning you can do to improve Hive performance, and there are many articles in the community to advise you. First of all, you should be using Tez as the Hive execution engine instead of MapReduce; from there you can make a lot of further tweaks. See, for example, this article, which provides some advice on tuning Tez. I'll leave the thread open for others to comment further, but this will definitely get you started!
05-05-2016
03:23 PM
1 Kudo
It's likely because there are different configuration files (yarn-site.xml and mapred-site.xml), which need to be configured separately. In the same way, Tez is listed as its own service so that you can configure tez-site.xml, even though it's really just a client.
04-15-2016
03:33 PM
3 Kudos
Hi @krishna sampath, do you want to read the classpath or add to it? To get the whole classpath, you just need to run this command:
hadoop classpath
If you want to add to it, you can do so in hadoop-env.sh. If you're using Ambari, you must update it in Ambari, in the hadoop-env section of the HDFS configs, and not on the filesystem directly, as Ambari will overwrite any changes you make to the config files directly.
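The output of that command is one long colon-separated string, which can be hard to read. As a rough sketch (the helper names split_classpath and read_hadoop_classpath are made up, and the sample paths are hypothetical rather than taken from a real cluster), you could split it into one entry per line like this:

```python
import shutil
import subprocess
from typing import List


def split_classpath(classpath: str) -> List[str]:
    """Split a colon-separated classpath string into its non-empty entries."""
    return [entry.strip() for entry in classpath.split(":") if entry.strip()]


def read_hadoop_classpath() -> List[str]:
    """Run `hadoop classpath` and return its entries, or [] if hadoop is not on PATH."""
    if shutil.which("hadoop") is None:
        return []
    result = subprocess.run(["hadoop", "classpath"],
                            capture_output=True, text=True, check=True)
    return split_classpath(result.stdout)


# Hypothetical sample output; real paths depend on your installation.
sample = "/etc/hadoop/conf:/opt/example/lib/*::/opt/example/extra.jar\n"
for entry in split_classpath(sample):
    print(entry)
```

On a node where the hadoop client is installed, read_hadoop_classpath() returns the real entries, which makes it easy to check whether a jar or directory you added is actually being picked up.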
04-06-2016
01:15 PM
2 Kudos
Hi Raja, the reflect UDF is actually blacklisted by default when running queries through HiveServer2 (beeline, ODBC, JDBC connections), as it can be a security risk. See this JIRA, for example: HIVE-8893. It works in the Hive CLI because the Hive CLI does not use HiveServer2. The change from that JIRA is included in HDP 2.2.4 and higher. You can check this by running the following in beeline:
set hive.server2.builtin.udf.blacklist;
reflect will probably be listed there. You can remove it from the blacklist property, and also make sure this property is empty:
hive.server2.builtin.udf.whitelist
Apply the changes and restart HiveServer2, and you should see the reflect UDF working. See the Apache Hive documentation on these properties for further info.
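If the blacklist contains several UDFs, take care to remove only reflect and leave the rest in place. Here is a small sketch of how the new property value could be derived, assuming the value is a comma-separated list; the helper name remove_from_udf_list and the sample value are hypothetical, so use whatever your own cluster prints:

```python
def remove_from_udf_list(value: str, udf: str) -> str:
    """Remove `udf` from a comma-separated UDF list, ignoring case and spaces."""
    names = [name.strip() for name in value.split(",") if name.strip()]
    kept = [name for name in names if name.lower() != udf.lower()]
    return ",".join(kept)


# Hypothetical current value, as it might be printed by the `set` command.
current = "reflect, reflect2,java_method"
print(remove_from_udf_list(current, "reflect"))
```

The printed string is what you would paste back into hive.server2.builtin.udf.blacklist before restarting HiveServer2.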
04-04-2016
12:43 PM
1 Kudo
Hi @Alex Raj, to clarify, do you just want the "131" output to a file? You can try this in your hql:
INSERT OVERWRITE DIRECTORY '/user/user1/results' select count(*) from sample_table;
This will write the output of your query into the results directory on HDFS.
03-30-2016
04:07 PM
2 Kudos
Hi @Richard Xu, there is a known issue in HDP 2.3.4.0 where the application timeline server leaks threads. See:
https://issues.apache.org/jira/browse/TEZ-3017
https://issues.apache.org/jira/browse/HIVE-12766
The thread leak is fixed in HDP 2.3.4.7, so the recommendation is to upgrade to that version or higher. You can also disable ATS 1.5 and downgrade to ATS 1.0, but the upgrade is recommended.
03-30-2016
03:14 PM
2 Kudos
Hi @Mohana Murali Gurunathan - you really should use Ambari, as it will make your life significantly easier when managing your Ranger installation. If you choose to continue with the non-Ambari installation, please follow our documentation in full: here. As you can see, you have to run yum install ranger_<version>, and usersync, KMS and the plugins are all separate components which also need to be installed.