Member since 01-09-2019
Posts: 401
Kudos Received: 163
Solutions: 80
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2082 | 06-21-2017 03:53 PM |
| | 3160 | 03-14-2017 01:24 PM |
| | 1991 | 01-25-2017 03:36 PM |
| | 3167 | 12-20-2016 06:19 PM |
| | 1591 | 12-14-2016 05:24 PM |
05-05-2016
06:06 PM
If it is a small cluster, you can skip passwordless ssh and do a manual ambari-agent install. Steps for that are here. While this is not a solution to your passwordless ssh issue, it works very well on smaller clusters (I have done manual registration on larger clusters too, since policies there disallowed passwordless ssh for the superuser).
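As a rough sketch of the manual route (assumes a yum-based OS; the Ambari server hostname is a placeholder, not from the original post):

```shell
# On each host: install and configure the agent by hand instead of
# letting Ambari bootstrap it over SSH.
sudo yum install -y ambari-agent

# Point the agent at the Ambari server (hostname is a placeholder)
# by rewriting the hostname line in ambari-agent.ini.
sudo sed -i 's/^hostname=.*/hostname=ambari-server.example.com/' \
  /etc/ambari-agent/conf/ambari-agent.ini

sudo ambari-agent start
```

The host can then be added through the Add Hosts wizard with "Perform manual registration" selected, so no SSH key is ever needed.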
05-04-2016
02:53 AM
Thank you @bsaini it worked great.
03-27-2019
12:46 AM
The link is invalid. JOAO's link is valid now.
07-05-2016
09:24 AM
1 Kudo
Below are our findings:

As shown in the DDL above, bucketing is used in the problematic tables. The bucket number is decided by a hashing algorithm: out of the 10 buckets, each insert writes the actual data file into one bucket, while the other 9 buckets get a file with the same name and zero size. A race condition occurs during this hash calculation when multiple threads/processes insert new rows into the bucketed table at the same time, so two or more threads/processes end up trying to create the same bucket file.

In addition, as discussed here, the current architecture is not really recommended: over time there would be millions of files on HDFS, which creates extra overhead on the NameNode. A select * statement would also take a long time, since it has to merge all the files from the buckets.

Solutions which solved both issues:

- Removed buckets from the two problematic tables, hence the probability of race conditions is much lower
- Added hive.support.concurrency=true before the insert statements
- Added a weekly Oozie workflow that uses the Hive concatenate command on both tables to mitigate the small-file problem

FYI @Ravi Mutyala
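A minimal sketch of those two mitigations from the Hive CLI (the table name events_tbl is a placeholder, and CONCATENATE assumes an ORC/RCFile table):

```shell
# Take Hive locks so concurrent inserts serialize instead of racing
# (set before the insert statements, as described above).
hive -e "SET hive.support.concurrency=true;
         INSERT INTO TABLE events_tbl VALUES (1, 'example');"

# Periodic compaction step (e.g. from a weekly Oozie action) to merge
# the small files the inserts leave behind.
hive -e "ALTER TABLE events_tbl CONCATENATE;"
```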
10-08-2017
10:34 PM
It is easy to integrate NiFi -> Kafka -> Spark, Storm, Flink, or Apex. Also NiFi -> S2S (site-to-site) -> Spark / Flink / ...
04-30-2016
12:54 AM
Try ulimit -n 8096, then restart the DataNode and NameNode and see if that helps. I haven't seen your DataNode logs, but it looks like you are running into a max-open-files issue.
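To check whether the limit really is the culprit before changing anything (the values and the hdfs user below are illustrative, not from the original post):

```shell
# Inspect the current per-process open-file limits.
ulimit -n      # soft limit for this shell
ulimit -Hn     # hard limit

# Raise the soft limit for this session (affects processes started from it):
# ulimit -n 8096

# To persist across reboots, add lines like these to
# /etc/security/limits.conf for the HDFS service user:
#   hdfs  soft  nofile  8096
#   hdfs  hard  nofile  8096
```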
04-29-2016
01:08 PM
As @nyadav pointed out, you need to use the URL jdbc:sqlserver://xx.xx.x.xxx:1433;databaseName=sample instead of the form you are entering for SQL Server. Listing databases worked because you hadn't specified a database in the JDBC URL there.
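For illustration, assuming the tool in question is Sqoop (a guess based on the "list databases" mention) and using placeholder host and credentials, a connect string that names the database would look like:

```shell
# Hypothetical Sqoop call; host, port, database, and username are placeholders.
# Note the ;databaseName=... property in the SQL Server JDBC URL.
sqoop list-tables \
  --connect "jdbc:sqlserver://10.0.0.5:1433;databaseName=sample" \
  --username sa -P
```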
07-24-2017
02:42 PM
For a comparison between compression formats take a look at this link: http://comphadoop.weebly.com/
05-13-2016
06:52 PM
@alain TSAFACK Please accept the answer that actually resolved your question. Avoid accepting your own answer unless you did further research after asking and found the solution yourself.
06-01-2016
07:31 AM
Thanks Ravi. This solved my problem also.