Member since: 04-12-2016
Posts: 30
Kudos Received: 12
Solutions: 1
My Accepted Solutions
Title | Views | Posted |
---|---|---|
2175 | 05-02-2016 06:42 AM |
03-08-2017
02:27 PM
FYI: Following the description above, we were able to upgrade to version 2.5.3 without any Kafka cluster downtime. We only had some issues with a Kafka client written in Go.
01-18-2017
04:56 AM
Hi @lraheja, our HDP 2.4 cluster was installed with Ambari. Hence we must follow the Ambari Upgrade Guide to perform the HDP 2.4 to HDP 2.5.0 upgrade. I don't think a manual upgrade is an option.
01-17-2017
12:22 PM
Hi,
We are planning a rolling upgrade from HDP 2.4.0.0 to 2.5.3.0. Avoiding downtime during the upgrade is especially crucial for the Kafka cluster:
1) Is the update of the Kafka brokers also rolling?
2) Will clients (producer and consumer) from Kafka 0.9.0.2.4 releases work with brokers from Kafka 0.10.0 releases?
If the answer to 1) and/or 2) is no, what is the best practice to guarantee no downtime?
Thank you in advance,
Christian
Labels: Apache Kafka
07-05-2016
09:24 AM
1 Kudo
Below are our findings:
As shown in the DDL above, bucketing is used in the problematic tables. The bucket number is determined by a hashing algorithm: out of 10 buckets, each insert writes the actual data file to 1 bucket, and the other 9 buckets get a file with the same name and zero size. During this hash calculation a race condition occurs when new rows are inserted into the bucketed table from multiple threads/processes, because 2 or more threads/processes try to create the same bucket file.
In addition, as discussed here, the current architecture is not really recommended, since over time there would be millions of files on HDFS, which would create extra overhead on the NameNode. A select * statement would also take a long time, as it has to merge all the files from the buckets.
Solutions which solved both issues:
- Removed the buckets from the two problematic tables, hence the probability of race conditions is much lower
- Added hive.support.concurrency=true before the insert statements
- Weekly Oozie workflow that uses the implicit Hive concatenate command on both tables to mitigate the small file problem
FYI @Ravi Mutyala
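The last two points could look roughly like the sketch below. It only writes two HiveQL files and does not run them. The database and table names (my_db.events, my_db.events_staging, my_db.table_a, my_db.table_b) and the file names are hypothetical, not taken from the original tables. It also assumes that a lock manager is configured for Hive and that the tables use a file format that supports CONCATENATE (RCFile or ORC).

```shell
#!/bin/sh
# Sketch only: all table and file names below are hypothetical placeholders.
set -eu

# Insert script: enable Hive concurrency support before the insert statement.
cat > insert_with_lock.hql <<'EOF'
SET hive.support.concurrency=true;
INSERT INTO TABLE my_db.events SELECT * FROM my_db.events_staging;
EOF

# Weekly compaction script: merge the small files of both tables.
cat > weekly_concatenate.hql <<'EOF'
ALTER TABLE my_db.table_a CONCATENATE;
ALTER TABLE my_db.table_b CONCATENATE;
EOF

echo "Wrote insert_with_lock.hql and weekly_concatenate.hql"
# To run them by hand (assumes the hive CLI is on the PATH):
#   hive -f insert_with_lock.hql
#   hive -f weekly_concatenate.hql
```

In a weekly Oozie workflow the second file would typically be referenced from a Hive action instead of being run by hand.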
05-31-2016
09:57 AM
Yes, we see this issue only when running multiple Oozie workflows in parallel.
05-31-2016
09:55 AM
There is no KMS used in those scenarios.
05-31-2016
09:52 AM
Backup Hue
/etc/init.d/hue stop
su - hue
mkdir ~/hue_backup
cd /var/lib/hue
sqlite3 desktop.db .dump > ~/hue_backup/desktop.bak
Backup the Hue Configuration
cp -RL /etc/hue/conf ~/hue_backup
05-03-2016
07:18 AM
Hi,
The ShareLib concept is well described here. Below is an example that works with HDP 2.2.4:
<workflow-app name="jar-test" xmlns="uri:oozie:workflow:0.4">
    <start to="db-import"/>
    <action name="db-import">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <command>list-tables --connect jdbc:mysql://<db-host>/hive --username hive --password hive</command>
            <archive>/user/<username>/wf-test/lib/mysql-connector-java.jar</archive>
        </sqoop>
        <ok to="end"/>
        <error to="kill"/>
    </action>
    <kill name="kill">
        <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
Hope it helps,
Chris
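The workflow above expects ${jobTracker} and ${nameNode} to be supplied at submission time. A minimal sketch of a matching job.properties follows. The host names, ports and the Oozie server URL are hypothetical placeholders, and the application path /user/${user.name}/wf-test is an assumption inferred from the archive path in the workflow.

```shell
#!/bin/sh
# Sketch only: host names, ports and paths are hypothetical placeholders.
set -eu

# The quoted heredoc delimiter keeps ${...} literal so that Oozie resolves it.
cat > job.properties <<'EOF'
nameNode=hdfs://namenode.example.com:8020
jobTracker=resourcemanager.example.com:8050
# Add the system ShareLib (including the Sqoop jars) to the action classpath.
oozie.use.system.libpath=true
# HDFS directory that contains workflow.xml (assumed location).
oozie.wf.application.path=${nameNode}/user/${user.name}/wf-test
EOF

echo "Wrote job.properties"
# Submit the workflow (adjust the Oozie URL to your cluster):
#   oozie job -oozie http://oozie.example.com:11000/oozie -config job.properties -run
```

With oozie.use.system.libpath=true the Sqoop action takes its own jars from the ShareLib, so only the MySQL connector has to be shipped with the workflow, as the archive element does.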
05-03-2016
06:39 AM
@simram: What HDP version are you using? Is the Oozie service check in Ambari successful?