Member since
09-18-2015
3274
Posts
1159
Kudos Received
426
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 45456 | 02-09-2016 06:13 PM |
12-23-2015
11:07 AM
5 Kudos
Original post
A web-based notebook that enables interactive data analytics. You can make beautiful data-driven, interactive, and collaborative documents with SQL, Scala, and more. In a few words: "It's a really cool tool for interacting with data."
HDFS, Hive, Spark, Kylin, Flink
This is from the latest HDP Sandbox.
Continue to Blog 3 on NiFi
Let's analyze Star Wars data
Hive
Demo
Table definition and top 10 users based on tweet count
Top 10 users who used the word "love" in #starwars
The word "hate" used in #starwars
The word "yoda" used in #starwars
You can see the tweet sent by my ID in the Zeppelin output.
Spark
I used this for the sentiment analysis. Replace %hive with %sql (assuming that you have set up Zeppelin correctly).
Links
Zeppelin
Hortonworks and Zeppelin
Happy Hadooping!!!
12-12-2015
01:29 PM
4 Kudos
Download the connector: http://hortonworks.com/hdp/addons/
Extract the tar file.
Copy the jars into sqoop-client/lib:
cp *.jar /usr/hdp/current/sqoop-client/lib/
Create the tables in Teradata.
We will export data from the HDFS location /tmp/test into Teradata.
Run the Sqoop export:
sqoop export --connect jdbc:teradata://Terdatahost/Database=DBName --connection-manager org.apache.sqoop.teradata.TeradataConnManager --username user --password passwd --table test --export-dir /tmp/test/ --batch
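If the export needs to be scripted, the command above can be assembled as an argument list. This is only a minimal sketch: the helper name build_sqoop_export is hypothetical, the host, database, and credentials are the placeholders from the post, and only the flags shown in the post are used.

```python
import shlex


def build_sqoop_export(host, database, user, password, table, export_dir):
    """Assemble the sqoop export command from the post as an argument list.

    build_sqoop_export is a hypothetical helper; every flag comes from the
    command shown in the post, nothing else is added.
    """
    return [
        "sqoop", "export",
        "--connect", f"jdbc:teradata://{host}/Database={database}",
        "--connection-manager", "org.apache.sqoop.teradata.TeradataConnManager",
        "--username", user,
        "--password", password,
        "--table", table,
        "--export-dir", export_dir,
        "--batch",
    ]


# Placeholder values copied from the post; replace them with real ones.
cmd = build_sqoop_export("Terdatahost", "DBName", "user", "passwd", "test", "/tmp/test/")

# Print the shell-quoted command for review before running it.
print(shlex.join(cmd))
```

The list can be handed to subprocess.run(cmd, check=True) on a host where the Sqoop client and the Teradata connector jars are installed.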
11-26-2015
04:21 PM
1 Kudo
The Hadoop Ecosystem Table https://hadoopecosystemtable.github.io/
11-26-2015
12:15 PM
5 Kudos
https://github.com/apache/ambari/blob/trunk/ambari-server/src/main/resources/host_scripts/alert_disk_space.py
vi /var/lib/ambari-server/resources/host_scripts/alert_disk_space.py
# defaults in case no script parameters are passed
MIN_FREE_SPACE_DEFAULT = 5000000000L #5GB
PERCENT_USED_WARNING_DEFAULT = 50
PERCENT_USED_CRITICAL_DEFAULT = 80
You can change these defaults if the critical alert should fire at a different threshold, for example 85% or 90% used instead of 80%.
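To see the effect of raising the critical threshold, here is a minimal sketch of how two percent-used thresholds could map to an alert state. The two constants are copied from the post; the function classify_disk_usage and its >= comparisons are assumptions for illustration, not the logic of the actual Ambari script.

```python
# Threshold values copied from the post.
PERCENT_USED_WARNING_DEFAULT = 50
PERCENT_USED_CRITICAL_DEFAULT = 80


def classify_disk_usage(percent_used,
                        warning=PERCENT_USED_WARNING_DEFAULT,
                        critical=PERCENT_USED_CRITICAL_DEFAULT):
    """Map a percent-used value to an alert state.

    Hypothetical helper: the comparison rules here are an assumption and
    may differ from what alert_disk_space.py really does.
    """
    if percent_used >= critical:
        return "CRITICAL"
    if percent_used >= warning:
        return "WARNING"
    return "OK"


# A disk at 82% used is critical with the default threshold of 80 ...
print(classify_disk_usage(82))
# ... but only a warning once the critical threshold is raised to 90.
print(classify_disk_usage(82, critical=90))
```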
11-26-2015
11:49 AM
2 Kudos
Caused by: java.lang.OutOfMemoryError: Java heap space
This particular case is related to "Reducer tasks of hive job fails with Out Of Memory error during shuffle fetcher stage".
Fix:
Increase hive.tez.container.size if it is set too low.
Change tez.runtime.shuffle.memory.limit.percent from the default value of 0.7 to 0.4.
Decrease tez.runtime.shuffle.fetch.buffer.percent from the default of 0.25 to 0.15 if needed. (Different values between 0.25 and 0.10 were tested.)
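To get a feel for what these two settings do, the following sketch computes a rough shuffle memory budget. It assumes that the shuffle buffer is the task heap multiplied by tez.runtime.shuffle.fetch.buffer.percent, and that a single fetched segment may occupy at most that buffer multiplied by tez.runtime.shuffle.memory.limit.percent. The function name and the 4000 MB heap are hypothetical; the percentages are the ones quoted in the post.

```python
def shuffle_memory_budget(heap_mb, fetch_buffer_percent, memory_limit_percent):
    """Estimate shuffle memory figures in MB (illustrative assumption only).

    shuffle_buffer_mb: heap share reserved for fetched map outputs.
    single_fetch_limit_mb: largest single segment kept in memory.
    """
    shuffle_buffer_mb = heap_mb * fetch_buffer_percent
    single_fetch_limit_mb = shuffle_buffer_mb * memory_limit_percent
    return shuffle_buffer_mb, single_fetch_limit_mb


HEAP_MB = 4000  # hypothetical reducer heap size, not taken from the post

# Values the post describes as defaults.
before = shuffle_memory_budget(HEAP_MB, fetch_buffer_percent=0.25, memory_limit_percent=0.7)
# Values after the fix described in the post.
after = shuffle_memory_budget(HEAP_MB, fetch_buffer_percent=0.15, memory_limit_percent=0.4)

print("before: buffer=%.0f MB, single fetch limit=%.0f MB" % before)
print("after:  buffer=%.0f MB, single fetch limit=%.0f MB" % after)
```

Under these assumptions, both changes shrink the amount of heap the shuffle fetchers can hold at once, which leaves more headroom for the rest of the reducer.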
11-26-2015
10:59 AM
1 Kudo
Use case
There are 2 groups, Analytics and DW. We want to split the cluster resources between these 2 groups.
User neeraj belongs to the Analytics group. User dwuser belongs to the DW group.
User neeraj is not allowed to use the default and DW queues. By default, all jobs submitted by user neeraj must go to its assigned queue.
User dwuser is not allowed to use the default and Analytics queues. By default, all jobs submitted by user dwuser must go to its assigned queue.
Environment
HDP 2.3 (Hortonworks Data Platform) and Ambari 2.1
This tutorial is completely independent of the Hadoop distribution. YARN is a must, i.e. Hadoop 2.x.
I will be using the Capacity Scheduler view to configure the queues.
yarn.scheduler.capacity.maximum-am-resource-percent=0.2
yarn.scheduler.capacity.maximum-applications=10000
yarn.scheduler.capacity.node-locality-delay=40
yarn.scheduler.capacity.queue-mappings=u:neeraj:Analytics,u:dwuser:DW
yarn.scheduler.capacity.queue-mappings-override.enable=true
yarn.scheduler.capacity.root.accessible-node-labels=*
yarn.scheduler.capacity.root.acl_administer_queue=yarn
yarn.scheduler.capacity.root.acl_submit_applications=yarn
yarn.scheduler.capacity.root.Analytics.acl_administer_queue=yarn
yarn.scheduler.capacity.root.Analytics.acl_submit_applications=neeraj
yarn.scheduler.capacity.root.Analytics.capacity=60
yarn.scheduler.capacity.root.Analytics.maximum-capacity=60
yarn.scheduler.capacity.root.Analytics.minimum-user-limit-percent=100
yarn.scheduler.capacity.root.Analytics.ordering-policy=fifo
yarn.scheduler.capacity.root.Analytics.state=RUNNING
yarn.scheduler.capacity.root.Analytics.user-limit-factor=1
yarn.scheduler.capacity.root.capacity=100
yarn.scheduler.capacity.root.default.acl_administer_queue=yarn
yarn.scheduler.capacity.root.default.acl_submit_applications=yarn
yarn.scheduler.capacity.root.default.capacity=10
yarn.scheduler.capacity.root.default.maximum-capacity=100
yarn.scheduler.capacity.root.default.state=RUNNING
yarn.scheduler.capacity.root.default.user-limit-factor=1
yarn.scheduler.capacity.root.DW.acl_administer_queue=yarn
yarn.scheduler.capacity.root.DW.acl_submit_applications=dwuser
yarn.scheduler.capacity.root.DW.capacity=30
yarn.scheduler.capacity.root.DW.maximum-capacity=30
yarn.scheduler.capacity.root.DW.minimum-user-limit-percent=100
yarn.scheduler.capacity.root.DW.ordering-policy=fifo
yarn.scheduler.capacity.root.DW.state=RUNNING
yarn.scheduler.capacity.root.DW.user-limit-factor=1
yarn.scheduler.capacity.root.maximum-capacity=100
yarn.scheduler.capacity.root.queues=Analytics,DW,default
[root@nsfed01 ~]# su - neeraj
[neeraj@nsfed01 ~]$ mapred queue -showacls
15/08/18 14:45:03 INFO impl.TimelineClientImpl: Timeline service address: http://nsfed03.cloud.hortonworks.com:8188/ws/v1/timeline/
15/08/18 14:45:03 INFO client.RMProxy: Connecting to ResourceManager at nsfed03.cloud.hortonworks.com/172.24.64.22:8050
Queue acls for user : neeraj
Queue Operations
=====================
root
Analytics SUBMIT_APPLICATIONS
DW
default
[neeraj@nsfed01 ~]$
[root@nsfed01 ~]# su - neeraj
[neeraj@nsfed01 ~]$ yarn jar /usr/hdp/2.3.0.0-2557/hadoop-mapreduce/hadoop-mapreduce-examples.jar pi 20 1000000009
Number of Maps = 20
Samples
[root@nsfed03 yarn]# su - dwuser
[dwuser@nsfed03 ~]$ yarn jar /usr/hdp/2.3.0.0-2557/hadoop-mapreduce/hadoop-mapreduce-examples.jar pi 20 1000000009
Number of Maps = 20
CS view
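Before saving a queue layout like the one above, it can help to sanity-check it offline. The sketch below copies a few of the properties from the post into a dictionary, checks that the child queue capacities under root add up to 100 (which the Capacity Scheduler requires), and parses the user-to-queue mappings. The helper names child_capacities and user_queue_mappings are hypothetical and not part of any YARN API.

```python
PREFIX = "yarn.scheduler.capacity."

# Subset of the properties listed in the post.
CAPACITY_PROPS = {
    PREFIX + "root.queues": "Analytics,DW,default",
    PREFIX + "root.Analytics.capacity": "60",
    PREFIX + "root.DW.capacity": "30",
    PREFIX + "root.default.capacity": "10",
    PREFIX + "queue-mappings": "u:neeraj:Analytics,u:dwuser:DW",
}


def child_capacities(props, parent="root"):
    """Return {queue: capacity} for the direct children of a parent queue."""
    queues = props[f"{PREFIX}{parent}.queues"].split(",")
    return {q: float(props[f"{PREFIX}{parent}.{q}.capacity"]) for q in queues}


def user_queue_mappings(props):
    """Parse 'u:<user>:<queue>' entries into a {user: queue} dictionary."""
    mappings = {}
    for entry in props.get(PREFIX + "queue-mappings", "").split(","):
        entry = entry.strip()
        if not entry:
            continue
        kind, name, queue = entry.split(":")
        if kind == "u":  # 'g:' group mappings are ignored in this sketch
            mappings[name] = queue
    return mappings


capacities = child_capacities(CAPACITY_PROPS)
total = sum(capacities.values())
mappings = user_queue_mappings(CAPACITY_PROPS)

print("capacities:", capacities, "total:", total)
print("mappings:", mappings)
```

With queue-mappings-override.enable=true, as in the post, the mapping takes precedence over a queue named by the user at submit time, so the two mappings are what route neeraj and dwuser to their queues.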
11-26-2015
10:55 AM
3 Kudos
yum install expect*
#!/usr/bin/expect
spawn ambari-server sync-ldap --existing
expect "Enter Ambari Admin login:"
send "admin\r"
expect "Enter Ambari Admin password:"
send "admin\r"
expect eof
11-23-2015
11:37 PM
Thanks @Chris Nauroth