Member since: 05-02-2019
Posts: 319
Kudos Received: 145
Solutions: 59
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 6998 | 06-03-2019 09:31 PM
 | 1671 | 05-22-2019 02:38 AM
 | 2123 | 05-22-2019 02:21 AM
 | 1321 | 05-04-2019 08:17 PM
 | 1628 | 04-14-2019 12:06 AM
09-04-2017
09:18 PM
The 1-day essentials course is available for free at http://public.hortonworksuniversity.com/hdp-overview-apache-hadoop-essentials-self-paced-training in a self-paced format. Enjoy and good luck on the exam!
09-04-2017
09:15 PM
As https://hortonworks.com/services/training/certification/hca-certification/ states, "the HCA certification is a multiple-choice exam that consists of 40 questions with a passing score of 75%". Good luck!!
07-31-2017
12:34 PM
Yep, this could work, but for a big cluster I could imagine it being time-consuming. The initial recursive listing (especially since it goes all the way down to the file level) could be quite large for a file system of any real size. The more time-consuming part would be running the "hdfs dfs -count" command over and over. But, like you said, this should work. Ideally, I'd want the NN to offer a "show me all quota details" option, or at least "show me directories w/quotas". Since this function is not present, maybe there is a performance hit for the NN to determine this that I'm not considering, even though it seems lightweight to me. Thanks for your suggestion.
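A rough sketch of the approach discussed here, in Python: pull the directory paths out of a recursive listing, then group them into a handful of "hdfs dfs -count -q -v" invocations instead of one per directory. The function names, the batch size, and the assumption that the listing uses the usual eight-column "hdfs dfs -ls -R" layout are mine, not something from this thread, so treat it as a starting point only.

```python
from typing import Iterator, List


def directories_from_listing(listing: str) -> List[str]:
    """Return directory paths from `hdfs dfs -ls -R` style output.

    Assumes eight columns: permissions, replication, owner, group,
    size, date, time, path. Directory rows start with 'd'.
    """
    dirs = []
    for line in listing.splitlines():
        fields = line.split(None, 7)  # maxsplit keeps spaces inside the path
        if len(fields) == 8 and fields[0].startswith("d"):
            dirs.append(fields[7])
    return dirs


def count_commands(paths: List[str], batch_size: int = 100) -> Iterator[List[str]]:
    """Yield one `hdfs dfs -count -q -v` argument list per batch of paths."""
    for start in range(0, len(paths), batch_size):
        batch = paths[start:start + batch_size]
        yield ["hdfs", "dfs", "-count", "-q", "-v"] + batch
```

Each yielded list could be handed to subprocess.run(cmd, capture_output=True, text=True) on a node with the HDFS client installed. Batching only cuts the number of client launches; the NN still has to compute the counts for every path.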
07-31-2017
09:20 AM
1 Kudo
The HDFS Quota Guide, http://hadoop.apache.org/docs/r2.7.3/hadoop-project-dist/hadoop-hdfs/HdfsQuotaAdminGuide.html, shows how to list the details of a quota on the specific directory where it is set, but is there a way to see all quotas with one command (or at least a way to list all directories that have quotas, something like the way you can list all snapshottable dirs, which I could then programmatically iterate through to check individual quotas)? My "hunch" was that I could just check the / directory and see a roll-up of the two specific quotas shown first, but as expected it only shows the details of that dir's own quota (if it exists).
[hdfs@node1 ~]$ hdfs dfs -count -v -q /user/testeng
QUOTA REM_QUOTA SPACE_QUOTA REM_SPACE_QUOTA DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME
400 399 none inf 1 0 0 /user/testeng
[hdfs@node1 ~]$ hdfs dfs -count -v -q /user/testmar
QUOTA REM_QUOTA SPACE_QUOTA REM_SPACE_QUOTA DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME
none inf 134352500 134352500 1 0 0 /user/testmar
[hdfs@node1 ~]$
[hdfs@node1 ~]$
[hdfs@node1 ~]$ hdfs dfs -count -v -q /
QUOTA REM_QUOTA SPACE_QUOTA REM_SPACE_QUOTA DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME
9223372036854775807 9223372036854775735 none inf 49 23 457221101 /
[hdfs@node1 ~]$
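For the "iterate and check" idea, here is a minimal Python sketch that parses "hdfs dfs -count -q -v" output like the above and keeps only the paths that actually carry a quota. The names (QuotaInfo, parse_count_output, dirs_with_quotas) are hypothetical, and treating 9223372036854775807 as "no quota" is my assumption based on how / is reported above. It also assumes the -h flag is not used, so every numeric field is a plain integer.

```python
from typing import List, NamedTuple, Optional

# Value reported for / above; assumed here to mean "no name quota set".
LONG_MAX = 2**63 - 1


class QuotaInfo(NamedTuple):
    path: str
    name_quota: Optional[int]
    space_quota: Optional[int]


def _parse_quota(field: str) -> Optional[int]:
    """Map 'none' (and the assumed LONG_MAX sentinel) to None."""
    if field == "none":
        return None
    value = int(field)  # raises ValueError on headers or shell prompts
    return None if value == LONG_MAX else value


def parse_count_output(text: str) -> List[QuotaInfo]:
    """Parse data rows; silently skip headers, prompts, and blank lines."""
    rows = []
    for line in text.splitlines():
        fields = line.split(None, 7)  # 7 numeric columns, then PATHNAME
        if len(fields) != 8:
            continue
        try:
            name_quota = _parse_quota(fields[0])
            space_quota = _parse_quota(fields[2])
        except ValueError:
            continue
        rows.append(QuotaInfo(fields[7].rstrip(), name_quota, space_quota))
    return rows


def dirs_with_quotas(text: str) -> List[QuotaInfo]:
    """Keep only the rows that have a name quota or a space quota."""
    return [r for r in parse_count_output(text)
            if r.name_quota is not None or r.space_quota is not None]
```

Fed the three outputs above, dirs_with_quotas would keep /user/testeng and /user/testmar and drop /. It still depends on something else supplying the candidate directories, which is the part the NN does not offer directly.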
Labels: Apache Hadoop
07-06-2017
01:18 PM
Great question. Unfortunately, I don't think there is a well agreed upon formula/calculator out there, as "it depends" is so often the rule. One consideration is that the datanode doesn't really know about the directory structure; it just stores (and copies, deletes, etc.) blocks as directed by the namenode (often indirectly, since clients write the actual blocks). Additionally, the block-level checksums are stored on disk alongside the block files they cover. It looks like there's some good info in the following HCC Q's that might be of help to you: https://community.hortonworks.com/questions/64677/datanode-heapsize-computation.html https://community.hortonworks.com/questions/45381/do-i-need-to-tune-java-heap-size.html https://community.hortonworks.com/questions/78981/data-node-heap-size-warning.html Good luck and happy Hadooping!
06-22-2017
06:37 PM
Instead of "year as (year:int)", try "(int) year as castedYear:int".
06-07-2017
08:06 PM
Excellent. Truthfully, the case sensitivity is a bit weird in Pig -- kind of like the rules of the English language. Hehe!
06-06-2017
03:25 PM
Regarding our on-demand offerings, we do have an HDP Essentials course, but currently it is only available via the larger, bundled Self-Paced Learning Library described at https://hortonworks.com/self-paced-learning-library/. We are working towards offering individual on-demand courses, but we are not there yet. You could register for it individually via our live (remote in most cases) delivery options shown at https://hortonworks.com/services/training/class/hadoop-essentials/.
06-04-2017
08:52 PM
I'd raise a separate HCC question for help with that. That way we'll get the targeted audience and your Q's won't be buried within this one that most will read as a cert question. That's a fancy way to say I haven't set that particular version up myself and wouldn't be much help until after I got my hands dirty with it. 😉
06-04-2017
04:38 AM
It did the trick for me. I sure hope it helps out @Joan Viladrosa, too! Thanks, Sriharsha!