Member since: 06-07-2016
Posts: 923
Kudos Received: 322
Solutions: 115
10-12-2016
01:13 PM
@ARUN Please see the following link. This issue has been answered before. https://community.hortonworks.com/questions/11779/hbase-master-shutting-down-with-zookeeper-delete-f.html
10-11-2016
08:43 PM
@Sunile Manjee
Integration between Spark and HBase relies on HBaseContext, which provides the HBase configuration to the Spark executors. So, to answer which protocol is used: the answer is simply RPC. Please check the following link for more details: https://hbase.apache.org/book.html#spark and here is the GitHub link to the HBase Spark module: https://github.com/apache/hbase/tree/master/hbase-spark/src/main/scala/org/apache/hadoop/hbase/spark
10-11-2016
08:23 PM
@mohamed sabri marnaoui Is it hanging or just waiting in the queue to run?
10-10-2016
02:51 PM
@Vikram Rathod That should be straightforward. Use Apache Ranger to create groups for the different organizations and set authorization permissions. At the HDFS level, you can create directories like /region/US, /region/UK, /region/APAC, with subdirectories under each to separate the data. Each of these directories and subdirectories can have more granular permissions set through Ranger, and you can configure the cluster with Atlas for auditing and lineage information. You can also use HDFS storage quotas if you want, but it appears that you don't need them to start with. As for resource distribution, use YARN.
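As a rough illustration of the directory layout described above, here is a minimal Python sketch that only builds the `hdfs dfs -mkdir -p` commands and does not run them. The region names come from the post; the subdirectory names (`raw`, `curated`) and the helper `mkdir_commands` are hypothetical, chosen just for the example.

```python
# Regions taken from the post above.
REGIONS = ("US", "UK", "APAC")
# Hypothetical subdirectory names, for illustration only.
DATASETS = ("raw", "curated")


def mkdir_commands(regions=REGIONS, datasets=DATASETS, root="/region"):
    """Build one `hdfs dfs -mkdir -p` command per region/dataset directory."""
    commands = []
    for region in regions:
        for dataset in datasets:
            path = f"{root}/{region}/{dataset}"
            commands.append(["hdfs", "dfs", "-mkdir", "-p", path])
    return commands


# Print the commands so they can be reviewed before running them on a cluster.
for cmd in mkdir_commands():
    print(" ".join(cmd))
```

Permissions on these paths would then be set through Ranger policies, as described in the post, rather than in this script.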
10-10-2016
01:43 PM
1 Kudo
@Vikram Rathod Are you saying you will have just one cluster to serve all these regions? Your question has almost no details. Can you please share your requirements? Please remember that one cluster will not span more than one data center. If you will have one cluster for all regions, then you still just size it based on your volume and SLAs and set the right expectations for users. For example, if your only cluster is in the US, then users in the UK and APAC should expect slower response times due to network latency. I don't think it affects cluster size. Please provide more details so we can help answer your question.
10-08-2016
06:21 PM
@Cruz DSouza Let's start with the following. Check the user permissions in your MySQL. Log in to the MySQL shell and look at the entries in the user table for user "hive":
SELECT User, Host FROM mysql.user WHERE User = 'hive';
Or try to find the permissions for user hive and which hosts it can log in from:
SHOW GRANTS FOR 'hive'@'%';
If you don't see permissions for user hive, especially for the host you are logging in from, you can run the statement below. Note that this statement gives hive permission to connect from any client host. You might not need this; in that case, customize it according to your requirements.
GRANT ALL PRIVILEGES ON hive.* TO 'hive'@'%';
10-06-2016
04:21 PM
Negative. If you check the JIRAs, they are unresolved. We don't ship unresolved issues in our product. So your only option right now is to download the patch and apply it to your installation. That will affect your support, if you have it, because you would be applying a non-Hortonworks patch. I would suggest that you simply distcp the file and then compress it. The patch only saves you a step; it doesn't save you any time or give better performance.
10-06-2016
03:16 AM
@Tran Quyet Thang According to the Hive documentation, "filepath can refer to a file (in which case Hive will move the file into the table) or it can be a directory (in which case Hive will move all the files within that directory into the table)." I don't think wildcards are allowed in the path. https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-Loadingfilesintotables
10-05-2016
08:42 PM
@Vaibhav Kumar Two things here. 1. I don't understand your use of "a.id = b.id where b.id is null". When b.id is null, a.id and b.id will never be equal. However, it's your query and you probably know more about it, so you can ignore my comment if you know what I am talking about. 2. I think you need to use the ROW_NUMBER function and then select the third row. This link describes the usage of ROW_NUMBER() for SQL 2005, but it's the same for Hive.
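To make the ROW_NUMBER suggestion concrete, here is a minimal sketch that runs the query shape against an in-memory SQLite database from Python. SQLite only stands in for Hive here so the example is runnable (it needs SQLite 3.25 or later for window functions); the table `scores`, its columns, and the ordering by `score` are all assumptions, since the original query is not shown. The inner SELECT with `ROW_NUMBER() OVER (ORDER BY ...)` and the outer `WHERE rn = 3` should carry over to Hive largely unchanged.

```python
import sqlite3


def third_row(rows):
    """Return the row ranked third by score (descending) using ROW_NUMBER()."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table and columns, for illustration only.
    conn.execute("CREATE TABLE scores (id INTEGER, score INTEGER)")
    conn.executemany("INSERT INTO scores VALUES (?, ?)", rows)
    query = """
        SELECT id, score
        FROM (
            SELECT id, score,
                   ROW_NUMBER() OVER (ORDER BY score DESC) AS rn
            FROM scores
        ) ranked
        WHERE rn = 3
    """
    result = conn.execute(query).fetchone()
    conn.close()
    return result


print(third_row([(1, 50), (2, 90), (3, 70), (4, 80)]))
```

If the ranking should restart for each group, a `PARTITION BY` clause can be added inside `OVER (...)`.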
10-05-2016
07:59 PM
2 Kudos
@Arun Reddy distcp doesn't compress data, and this feature is still not available in Hadoop by default. You can apply a patch, though. The following JIRA gives you all the details, including the patch to download: https://issues.apache.org/jira/browse/HADOOP-8065 The newer JIRA is https://issues.apache.org/jira/browse/HADOOP-13114 (use this one if you decide to apply the patch).