Member since: 09-15-2015
Posts: 457
Kudos Received: 507
Solutions: 90

My Accepted Solutions
| Title | Views | Posted |
| --- | --- | --- |
| | 16847 | 11-01-2016 08:16 AM |
| | 12459 | 11-01-2016 07:45 AM |
| | 11376 | 10-25-2016 09:50 AM |
| | 2436 | 10-21-2016 03:50 AM |
| | 5095 | 10-14-2016 03:12 PM |
01-23-2016
05:59 AM
Agree! Upgrades are more complicated than just adding a service or changing some configuration. From an ops perspective I want to see what's happening and control the upgrade process.
01-21-2016
06:14 AM
2 Kudos
I don't think HBase should be a data lake (storing many files with different sizes and formats), but you can certainly use HBase to store the content of your small files (depending on the content; what's in those files?). HBase is massively scalable; look at this example: https://www.facebook.com/notes/facebook-engineering/the-underlying-technology-of-messages/454991608919/ Facebook stores billions of messages in its HBase (HydraBase) setup, and Bloomberg uses HBase to store terabytes of data and serve about 5 billion requests per day (http://www.slideshare.net/HBaseCon/case-studies-session-4a-35937605).
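As a rough sketch of that pattern in the hbase shell (table name, column family, and row key below are made up for illustration), you could keep one row per small file:

# Create a table with a single column family for file contents (names are hypothetical)
create 'small_files', 'f'
# Store a small file's content under a row key derived from its path
put 'small_files', '/data/in/file-0001.xml', 'f:content', '<doc>...</doc>'
# Read it back
get 'small_files', '/data/in/file-0001.xml'

Using the file path as the row key gives you fast point lookups by path; just watch out for hot-spotting if your paths share long sequential prefixes.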
01-21-2016
06:05 AM
2 Kudos
HiveServer2 instances register themselves in ZooKeeper, so you can connect to ZooKeeper...

zookeeper-client -server <zookeeper-server>

...and read the "hiveserver2" znode (note: the znode name is configured via hive.server2.zookeeper.namespace):

ls /hiveserver2

As a result you get all the available HiveServer2 instances:

[serverUri=horton02.example.com:10000;version=1.2.1.2.3.2.0-2950;sequence=0000000014, serverUri=horton03.example.com:10000;version=1.2.1.2.3.2.0-2950;sequence=0000000015]
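The same registration enables dynamic service discovery from clients. As a sketch (reusing the <zookeeper-server> placeholder and assuming the default "hiveserver2" namespace), beeline can connect through ZooKeeper instead of a fixed host:

# HiveServer2 is resolved via ZooKeeper instead of a hard-coded host:port
beeline -u "jdbc:hive2://<zookeeper-server>:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2"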
01-20-2016
06:50 PM
1 Kudo
That's not possible; you need some kind of middleware between your frontend (HTML/jQuery) and your data service (Hive). So you basically have to create a backend, e.g. with Spring or Play, which takes requests from your frontend, queries Hive, and sends the result back to your frontend once the Hive query has finished. You can also use ODBC; take a look at this: http://hortonworks.com/hadoop-tutorial/how-use-excel-2013-to-analyze-hadoop-data/ Is the website (HTML/jQuery) only used to display data for a single user, or will this be something like: one of many users visits a private page, and individual data is pulled from Hive and displayed on the frontend?
01-20-2016
06:35 PM
13 Kudos
@Vipin Rathor Great question 🙂 I have implemented a script at one of my customers that automatically adds Ranger policies and HDFS directories as soon as a new user joins an AD group, so here is the part about how to use Ranger's REST API to add policies.

HDFS Policy Template:

{
"policyName": "name_of_policy",
"resourceName": "/path1,/path2/blub",
"description": "",
"repositoryName": "",
"repositoryType": "hdfs",
"isEnabled": "true",
"isRecursive": "true",
"isAuditEnabled": "true",
"permMapList": [{
"groupList": ["somegroup"],
"permList": ["Read","Execute", "Write", "Admin"]
}]
}
Curl:

curl -iv -u <user>:<password> -d @<policy payload> -H "Content-Type: application/json" -X POST http://<RANGER-Host>:6080/service/public/api/policy/

Hive Policy Template:

{
"policyName":"name_of_policy",
"databases":"db1,db2",
"tables":"mytable,yourtable",
"columns":"",
"udfs":"",
"description":"",
"repositoryName":"",
"repositoryType":"hive",
"tableType":"Inclusion",
"columnType":"Inclusion",
"isEnabled":"true",
"isAuditEnabled":"true",
"permMapList": [{
"groupList": ["somegroup"],
"permList": ["Select"]
}]
}
Curl:

curl -iv -u <user>:<password> -d @<policy payload> -H "Content-Type: application/json" -X POST http://<RANGER-Host>:6080/service/public/api/policy/

Getting Policies

I just tested the REST API to get some of my policies from Ranger; it worked. Make sure the policy ID is valid, otherwise you'll get a "Data not found" error.

Curl:

curl -iv -u <user>:<password> -H "Content-Type: application/json" -X GET http://horton01.example.com:6080/service/public/api/policy/2

Result:

{
  "id": 2,
  "createDate": "2015-11-21T07:03:21Z",
  "updateDate": "2015-12-08T05:54:24Z",
  "owner": "Admin",
  "updatedBy": "Admin",
  "policyName": "Ranger_audits",
  "resourceName": "/apps/solr/ranger_audits",
  "description": "",
  "repositoryName": "bigdata_hadoop",
  "repositoryType": "hdfs",
  "permMapList": [{
    "userList": ["solr"],
    "groupList": [],
    "permList": ["Read", "Write", "Execute"]
  }],
  "isEnabled": true,
  "isRecursive": true,
  "isAuditEnabled": false,
  "version": "5",
  "replacePerm": false
}
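For completeness: in the same legacy public API, policies can also be updated and deleted by ID. A sketch (verify the verbs and paths against your Ranger version):

# Update an existing policy (here: policy ID 2) with a modified JSON payload
curl -iv -u <user>:<password> -d @<policy payload> -H "Content-Type: application/json" -X PUT http://<RANGER-Host>:6080/service/public/api/policy/2

# Delete a policy by ID
curl -iv -u <user>:<password> -X DELETE http://<RANGER-Host>:6080/service/public/api/policy/2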
Let me know if you have any questions.
01-20-2016
05:29 AM
Awesome, good to hear. Good luck with your Coursera course 🙂
01-19-2016
06:13 PM
2 Kudos
Unfortunately, this is one of the remaining YARN components that does not support HA at the moment. However, there are already plans for a new Timeline Server (v2), which will be more scalable and reliable. If your Timeline Server is unavailable, the client will retry publishing the application data a couple of times before it gives up. This can be configured via "yarn.timeline-service.client.max-retries" (defaults to 30). Check out this page: https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/TimelineServer.html
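As a sketch, that retry setting lives in yarn-site.xml; the value 60 below is just an illustrative override of the default 30:

<!-- yarn-site.xml: number of retries before the timeline client gives up -->
<property>
  <name>yarn.timeline-service.client.max-retries</name>
  <value>60</value>
</property>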
01-19-2016
06:15 AM
3 Kudos
I am not familiar with this Coursera course and Hadoop setup. What course is this? You are getting a "permission denied" error because you are trying to access a folder that is owned by the hdfs user, and the permissions do not allow write access for others.

A) You could run your application/script as the hdfs user:

su hdfs

or

export HADOOP_USER_NAME=hdfs

B) Change the owner of the mp2 folder (note: to change the owner you have to be a superuser or the current owner => hdfs):

hdfs dfs -chown -R <username_of_new_owner> /mp2
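To decide which option you need, check the current owner and permissions first; a quick sketch (the output line is illustrative):

hdfs dfs -ls / | grep mp2
# illustrative output: drwxr-xr-x - hdfs hdfs 0 2016-01-10 10:00 /mp2
# owner hdfs, group hdfs, and no write access for others => permission denied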
01-19-2016
06:03 AM
1 Kudo
Could you post some of your heap configurations? How much memory is available on the machine? An OOM error usually means the heap configuration is not correct or there is not enough memory available on the machine. You also might want to check the open-files limit (ulimit -a); if it's too low, it can cause OOM errors (see https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.2/bk_installing_manually_book/content/ref-729d1fb0-6d1b-459f-a18a-b5eba4540ab5.1.html). Even though you might be able to run Hadoop on a 32-bit system, I wouldn't recommend it. You should use a 64-bit system (see http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.2/bk_installing_manually_book/content/meet-min-system-requirements.html)
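A sketch of that check, plus the persistent fix along the lines of the HDP docs (the 32768/65536 values are the commonly documented recommendations; adjust for your workload):

# Check the limits for the user that runs the Hadoop daemons
ulimit -a | grep -E 'open files|max user processes'

# Raise them permanently in /etc/security/limits.conf, e.g. for the hdfs user:
# hdfs - nofile 32768
# hdfs - nproc  65536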