Member since: 12-09-2015
Posts: 35
Kudos Received: 13
Solutions: 1
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 4175 | 06-27-2016 01:05 AM |
10-11-2023 09:46 AM
@Srinivascnu As this is an older post, you would have a better chance of receiving a resolution by starting a new thread. This will also be an opportunity to provide details specific to your environment that could aid others in assisting you with a more accurate answer to your question. You can link this thread as a reference in your new post. Thanks.
04-10-2017 10:38 PM
Thanks for clearing it up @Bryan Bende, much appreciated.
10-10-2016 01:21 AM
1 Kudo
It refers to the active NameNode. For distcp you provide two paths, the source and the target files or directories, each on the HDFS of its respective cluster. With some additional configuration you can refer to the NameNode nameservice instead, so you don't need to care which NameNode is active.
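To make the two addressing styles concrete, here is a small sketch that builds the `hadoop distcp` argument list both ways. The host names, port 8020, the nameservice IDs `clusterA` and `clusterB`, the paths, and the helper name `distcp_command` are all hypothetical. The nameservice form assumes the client's hdfs-site.xml defines both nameservices and their NameNodes.

```python
from typing import List


def distcp_command(src_authority: str, src_path: str,
                   dst_authority: str, dst_path: str) -> List[str]:
    """Build a `hadoop distcp` argument list from two HDFS locations.

    The authority is either an active NameNode as host:port or an
    HA nameservice ID (hypothetical values are used below).
    """
    return [
        "hadoop", "distcp",
        f"hdfs://{src_authority}{src_path}",
        f"hdfs://{dst_authority}{dst_path}",
    ]


# Addressing the active NameNode of each cluster directly (hypothetical hosts).
by_host = distcp_command("nn1.example.com:8020", "/data/in",
                         "nn2.example.com:8020", "/data/out")

# Addressing each cluster by its nameservice ID (hypothetical IDs), so the
# command does not change when a NameNode fails over.
by_nameservice = distcp_command("clusterA", "/data/in",
                                "clusterB", "/data/out")

print(" ".join(by_host))
print(" ".join(by_nameservice))
```

The list form can be passed to `subprocess.run` on a machine with a Hadoop client installed; here it is only printed.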
07-14-2016 04:11 AM
@Emily Sharpe thanks for the insights.
03-29-2016 07:36 AM
@Emily Sharpe, if the original question has been answered, please accept the best answer.
05-17-2016 01:30 PM
1 Kudo
Hi @Neeraj Sabharwal, I am trying to save my output in Spark using saveAsTextFile(""). The result is multiple part files (part-00000, part-00001, and so on) along with .crc files in the output directory. Do you have any idea how I can avoid creating the .crc files?
12-10-2015 03:08 PM
4 Kudos
I believe hard-coding PARALLEL in your Pig script is generally a bad idea. With PARALLEL 1, a single reducer effectively performs the whole job, which can hurt scalability and performance. I would allow the default parallelism and merge the output afterwards with hdfs dfs -getmerge.
From an input point of view, here is a tip to combine small files.
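As a rough illustration of the merge step, the sketch below concatenates the part-* files of an output directory in name order, similar in spirit to what hdfs dfs -getmerge produces. It works on a local directory only and is not how Hadoop implements it. The function name `merge_parts` and the directory layout are assumptions for illustration.

```python
import shutil
from pathlib import Path


def merge_parts(output_dir: str, merged_file: str) -> int:
    """Concatenate part-* files from output_dir into merged_file.

    Files are appended in sorted name order. Hidden checksum files such
    as .part-00000.crc and markers such as _SUCCESS are skipped because
    their names do not start with "part-". Returns the number of part
    files merged.
    """
    parts = sorted(
        p for p in Path(output_dir).iterdir()
        if p.is_file() and p.name.startswith("part-")
    )
    with open(merged_file, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                # Stream each part so large files are not loaded into memory.
                shutil.copyfileobj(src, out)
    return len(parts)
```

This keeps the job itself parallel across reducers and moves the single-file requirement to a cheap post-processing step.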
12-11-2015 05:37 AM
Thanks to @Deepesh for the workaround. For information, these steps will not be required after the HDP upgrade. We will use ALTER TABLE activeTable CONCATENATE; to combine the many smaller ORC files into fewer larger ones (possible from Hive 0.14+). https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-AlterTable/PartitionConcatenate