Member since: 09-24-2015
Posts: 816
Kudos Received: 488
Solutions: 189
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 3173 | 12-25-2018 10:42 PM |
| | 14192 | 10-09-2018 03:52 AM |
| | 4763 | 02-23-2018 11:46 PM |
| | 2481 | 09-02-2017 01:49 AM |
| | 2912 | 06-21-2017 12:06 AM |
04-24-2016
04:06 AM
Thanks for the code! We'll definitely keep it in mind, but for the task at hand we don't have access to the Spark source, so we'll go with GraphiteSink or JmxSink.
04-23-2016
03:41 AM
What's your max counter now, and what does the error message say? You can try increasing tez.counters.max: the Tez default is 2000, but in the latest versions of Ambari it's set to 10000. Also, make sure you are using the Pig 0.15 packaged in one of the latest versions of HDP; in Pig 0.14, tez_local mode was unstable. You can change tez.counters.max in Ambari, or set it per Pig run: pig -D tez.counters.max=10000 -x tez_local. By the way, what happens if you run your command in Tez mode, on a cluster?
04-22-2016
05:03 AM
Hi @drussell and @Andrew Grande, thank you for your responses. If Camus is dead, is Gobblin the best way to move data from Kafka to HDFS? Camus has some nice features like topic discovery, partitioning, load balancing, etc. I hope Gobblin offers them too.
04-22-2016
04:47 AM
3 Kudos
What's the best way to monitor Spark jobs? The Spark History Server (SHS) provides some information, but not in a very user-friendly manner. Has anybody tried Prometheus and Grafana? Spark is running on YARN, and 80% of the cluster's jobs/apps are based on Spark.
Labels:
- Apache Spark
04-21-2016
05:27 AM
1 Kudo
What's the status of Camus? It has been considered a great new way to transfer and sync data from Kafka to HDFS. Do we have any working examples, and what's the prevailing opinion? Is it fully compatible with the Kafka packaged in HDP, given that it's distributed together with the Confluent version of Kafka? And finally, there is now Gobblin: is it just a new name for Camus, or a new project?
Labels:
- Apache Hadoop
- Apache Kafka
04-19-2016
01:33 AM
You can find start-up commands for all HDP 2.3.0 services in the so-called Non-Ambari Cluster Installation Guide; for example, the commands for the HBase Master and RegionServers are on this page. There are such guides for every HDP version, but the commands to start basic services are more or less the same across all recent versions.
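As a quick sketch of what those guide pages look like, the manual start-up commands for the HBase Master and a RegionServer are roughly the following (assuming the standard /usr/hdp/current symlinks and the hbase service user; check the guide for your exact HDP version):

```shell
# Start the HBase Master as the hbase user (assumed default service user):
su -l hbase -c "/usr/hdp/current/hbase-master/bin/hbase-daemon.sh start master"

# Start a RegionServer on each RS node:
su -l hbase -c "/usr/hdp/current/hbase-regionserver/bin/hbase-daemon.sh start regionserver"
```

Replace "start" with "stop" to shut the daemons down; the same hbase-daemon.sh pattern applies across recent HDP versions.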
04-18-2016
10:10 AM
2 Kudos
Next try: log in as user hdfs and run "hdfs dfs -ls /apps/hbase/data/WALs". Move all directories (5 of them?) that refer to the old port 60020 to a folder under /user/hdfs, and restart the HBase Master. In the same listing, identify any directories having "splitting" in their name. If there are any, check that each of them contains only one file, with "meta" in its name. Move all those "splitting" directories to /user/hdfs as well, and restart HBase.
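The steps above could look roughly like this (a sketch, assuming the default HDP WAL location and a hypothetical backup folder /user/hdfs/wal-backup; the RegionServer directory name below is an invented example, not a real one from your cluster):

```shell
# As the hdfs user, list the WAL directories:
hdfs dfs -ls /apps/hbase/data/WALs

# Create a backup location under /user/hdfs:
hdfs dfs -mkdir -p /user/hdfs/wal-backup

# Move each directory whose name refers to the old port 60020
# (example name; substitute the actual entries from the listing):
hdfs dfs -mv "/apps/hbase/data/WALs/host1.example.com,60020,1234567890123" /user/hdfs/wal-backup/
```

Then restart the HBase Master and re-check the listing; repeat the same `-mv` for any "splitting" directories after verifying their contents.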
04-18-2016
08:08 AM
1 Kudo
HBase ports changed in HDP 2.3.4, details here. Region servers have logical names which include the RS port; an example RS name: "sandbox.hortonworks.com,16020,1460965964168". Before, the RS port was 60020, and now it's 16020. So, if you have 5 machines running RegionServers, between the pre-upgrade and post-upgrade names there are 10 RS entries to be taken care of, and in your case the HBase Master may still think the RS names with 60020 are still in use. Restarting HBase (or just the HBase Master) is supposed to remove them and solve your issue. Before and after the restart you can check the region servers in your HBase Web UI (HBase --> Quick Links).
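The logical name is just `hostname,port,startcode`, so you can tell at a glance which port an entry refers to; a minimal sketch splitting the sandbox example from above into its three fields:

```shell
# RS logical name from the post: hostname,port,startcode
rs_name="sandbox.hortonworks.com,16020,1460965964168"

# Split on commas into the three components:
host=$(echo "$rs_name" | cut -d',' -f1)
port=$(echo "$rs_name" | cut -d',' -f2)
startcode=$(echo "$rs_name" | cut -d',' -f3)

echo "host=$host port=$port startcode=$startcode"
```

Any entry whose second field is 60020 is a stale pre-upgrade RegionServer name; 16020 entries are the post-upgrade ones.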
04-18-2016
05:56 AM
1 Kudo
For Spark on HDP, please check the HDP Spark Guide. For general Spark programming, check the Spark Programming Guide, with all examples given in Scala, Java, and Python. If you want to program in Java, you'll need a development environment like Eclipse or IntelliJ.