Member since: 10-24-2015
Posts: 207
Kudos Received: 18
Solutions: 4
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 4438 | 03-04-2018 08:18 PM
 | 4330 | 09-19-2017 04:01 PM
 | 1809 | 01-28-2017 10:31 PM
 | 976 | 12-08-2016 03:04 PM
03-05-2018
02:46 PM
"i put a csv file into hdfs location and do an alter table to add that new location to the partition". Can you please explain this operation?
02-14-2018
04:13 PM
1 Kudo
Never mind, it's gone automatically. The deletion from yesterday was probably still in progress after the files went to trash.
02-09-2018
05:25 PM
Hi: Ranger usersync syncs users from various sources so that they are available when authoring security policies in the Ranger UI. At resource access time, policies are enforced by the Ranger plugins, which depend on the actual service (for example, HiveServer2 for the Hive plugin or the HDFS NameNode for the HDFS plugin) to pass the identity of the user and the groups they belong to. To answer your question, the sync source used for Ranger usersync does not really affect access enforcement. As long as the users in your text file are consistent with the real user source (LDAP/Unix or AD), Ranger policies will work fine. Hope this helps.
02-09-2018
02:22 AM
@PJ Yes, it's the same even for user IDs, but make sure that the user doesn't belong to any other groups. Even if the user does, the first policy will get higher priority. Hope this helps.
02-06-2018
06:26 PM
Hi @PJ, the honest truth is there is no good reason not to use ORC format. You can use another format like Parquet but it won't provide ACID, LLAP cache, or the same level of performance. I would say the decision is similar to not using indexes in a relational system or not running statistics. ORC is simply best practice for high performance data warehousing in Hive. Keep in mind that LLAP will allow you to cache raw text files. This may be an option if you have some strict SLA preventing you from incurring the conversion delay of the text file to ORC.
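For illustration only, the text-to-ORC conversion mentioned above is usually a single CTAS statement. The small Python helper below just builds that HiveQL string; the function name and the table names `sales_text` and `sales_orc` are hypothetical, and the statement would still need to be run through beeline or another Hive client.

```python
def orc_ctas(source_table: str, target_table: str) -> str:
    """Build a HiveQL CTAS statement that copies a text table into ORC.

    Both table names are placeholders supplied by the caller; this sketch
    does not validate or escape them.
    """
    return (
        f"CREATE TABLE {target_table} STORED AS ORC "
        f"AS SELECT * FROM {source_table}"
    )


# Hypothetical table names, used only to show the generated statement.
statement = orc_ctas("sales_text", "sales_orc")
print(statement)
```

A plain CTAS like this only changes the storage format. Whether the resulting table is also transactional depends on table properties that are not covered here.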
08-22-2018
10:03 PM
Is there a retention period we can set for these staging directories in Ambari? It seems they are not being cleaned up automatically.
01-15-2018
05:50 AM
@PJ This might be due to an I/O issue on the JN host "Remote journal x.x.x.x:8485". Is it always the same JN that is lagging at failure? If so, you should check the I/O load on that machine, using iotop for instance. It can also be the result of a very large number of transactions. What is the value of dfs.namenode.accesstime.precision?
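If it helps, here is a minimal Python sketch for reading that property out of an hdfs-site.xml. The XML fragment is a made-up sample, the helper name is hypothetical, and the fallback value assumes the stock default of 3600000 ms from hdfs-default.xml. On a real cluster you would read the file from the NameNode's configuration directory instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical hdfs-site.xml fragment used only for this example.
SAMPLE_HDFS_SITE = """<configuration>
  <property>
    <name>dfs.namenode.accesstime.precision</name>
    <value>0</value>
  </property>
</configuration>"""

PROPERTY_NAME = "dfs.namenode.accesstime.precision"
# Assumed default from hdfs-default.xml: 3600000 ms (one hour).
DEFAULT_PRECISION_MS = 3600000


def access_time_precision(xml_text: str) -> int:
    """Return the configured precision in ms, or the default if it is unset."""
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name", "").strip() == PROPERTY_NAME:
            return int(prop.findtext("value", "").strip())
    return DEFAULT_PRECISION_MS


precision = access_time_precision(SAMPLE_HDFS_SITE)
print(precision)
```

A value of 0 disables access time tracking, so reads no longer add access time updates to the edit log that the JournalNodes have to persist.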
08-17-2018
05:39 PM
See https://community.hortonworks.com/questions/212611/hivepartitionssmall-filesconcatenate.html
12-22-2017
02:54 AM
@bkosaraju Thanks a lot, the splitting part works, but I am still getting only the first match. How do I get all matches?
10-25-2017
03:41 AM
@PJ Yeah, that might be the case. With a large number of records, converting ORC data to CSV format takes a lot of time. By comparison, running the query with insert overwrite directory is much faster and has no such issues. We can also use whatever delimiter we need, and we don't need to worry about the size of the data.
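As a rough sketch of the insert overwrite directory approach, the Python helper below builds the HiveQL statement with a chosen delimiter. The function name, the table name `orders_orc`, and the path `/tmp/orders_csv` are hypothetical, and the statement still has to be submitted through a Hive client such as beeline.

```python
def export_query(table: str, directory: str, delimiter: str = ",") -> str:
    """Build an INSERT OVERWRITE DIRECTORY statement for a delimited export.

    The table, directory, and delimiter are caller-supplied placeholders;
    this sketch does not validate or escape them.
    """
    return (
        f"INSERT OVERWRITE DIRECTORY '{directory}' "
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY '{delimiter}' "
        f"SELECT * FROM {table}"
    )


# Hypothetical table and HDFS path, used only to show the generated statement.
query = export_query("orders_orc", "/tmp/orders_csv", ",")
print(query)
```

Hive writes the result as delimited text files under the target directory, so the export runs as part of the query itself rather than as a separate conversion step.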