About MindGlass

czounaidou · ‎06-29-2017

If the agent node is registered and available, you can log in to that node and run "hst unregister-agent".

1qaz5222 · ‎08-11-2017

In my case, I navigated to the folder /data/user/flamingo/.ivy2/jars, which the log output reports as the jar location:
... Ivy Default Cache set to: /data/user/flamingo/.ivy2/cache
The jars for the packages stored in: /data/user/flamingo/.ivy2/jars ...
I copied all the jars from there to the directory where I wanted to store them, then ran the Spark command like this:
SPARK_MAJOR_VERSION=2 bin/spark-shell --jars="/path/to/jars"
That seems to have worked!
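For anyone repeating this, note that --jars expects a comma-separated list of jar files, so a directory of copied jars has to be expanded into that list. A minimal sketch in Python, assuming the jars were copied into a single flat directory; the function name build_jars_arg is hypothetical:

```python
from pathlib import Path


def build_jars_arg(jar_dir):
    """Return a comma-separated list of every .jar file directly inside jar_dir."""
    jars = sorted(str(p) for p in Path(jar_dir).glob("*.jar") if p.is_file())
    return ",".join(jars)
```

The returned string can be pasted after --jars in the spark-shell command above, with the directory the jars were copied to as the argument.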

stockblog · ‎10-16-2017

Thanks! That's really useful!

varun_rathinam · ‎04-25-2017

Thanks, all, for your responses. I reassigned ownership once again, and it works!
## hdfs dfs -chown -R admin:hadoop /user/admin

drussell · ‎04-24-2017

Hi @Peter Kim, Rack Awareness is only used by HDFS to ensure it accurately places replicas of data off-rack; therefore, only the DataNodes need to be listed. Hope that helps!
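To make that concrete: rack awareness is commonly configured through a topology script (set with net.topology.script.file.name), which Hadoop calls with one or more host names or IP addresses as arguments and which prints one rack path per argument. A minimal Python sketch, assuming a hand-maintained table of DataNodes; the host names and rack names are made up:

```python
import sys

# Hypothetical DataNode-to-rack table; only DataNodes need an entry.
RACKS = {
    "datanode1.example.com": "/rack1",
    "datanode2.example.com": "/rack1",
    "datanode3.example.com": "/rack2",
}
DEFAULT_RACK = "/default-rack"


def resolve_racks(hosts):
    """Map each host name or IP to its rack, falling back to the default rack."""
    return [RACKS.get(h, DEFAULT_RACK) for h in hosts]


if __name__ == "__main__":
    # Hadoop passes the hosts to resolve as command-line arguments.
    print("\n".join(resolve_racks(sys.argv[1:])))
```

Any host missing from the table, such as a master node, simply falls back to the default rack.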

shivkumar82015 · ‎03-25-2017

You need to update /etc/hosts as well. 1) /etc/sysconfig/network: this is used for internode communication. 2) /etc/hosts: this is used for client connections. We need to set the FQDN in both files; this is the ideal way.
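A quick way to sanity-check this is to confirm that each /etc/hosts entry lists a fully qualified name first. A rough sketch in Python, assuming the usual "IP canonical-name aliases" layout; the function name and the "contains a dot" test for an FQDN are simplifications of mine:

```python
def entries_without_fqdn(hosts_text):
    """Return (ip, first_name) pairs whose first host name has no dot in it."""
    problems = []
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # strip comments and whitespace
        if not line:
            continue  # blank or comment-only line
        fields = line.split()
        if len(fields) < 2:
            continue  # malformed entry: an address with no name
        ip, first_name = fields[0], fields[1]
        if ip.startswith("127.") or ip == "::1":
            continue  # skip loopback entries such as localhost
        if "." not in first_name:
            problems.append((ip, first_name))
    return problems
```

Running it on the contents of /etc/hosts on each node should return an empty list if every entry puts the FQDN first.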

MindGlass · ‎11-21-2016

The --split-by option can be used on a text column once the required setting is added to sqoop-site.xml in Ambari or passed on the command line.
I don't think the Oracle record count determines the size of the split files, because the actual file size depends on the column count, the column types, and the size of the column values in each record.
Here are my interesting sqoop import results. One file, total size: 2.2 GB
sqoop import ... --direct --fetch-size 1000 --num-mappers 10 --split-by EMP_NO (TEXT)
Result: 0 bytes each for 3 mappers, and 1.1 GB for 1 mapper.
I then re-tested with the same values except for the option below.
--split-by REEDER_ID (NUMBER)
In my opinion, Sqoop mappers simply process the selected query results in parallel without regard to file size, so the output files are not split evenly. Also, --split-by on a NUMBER column is more useful than on a TEXT column, which does not give accurate split file sizes.
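The uneven sizes are easier to see with a toy model. By default, Sqoop looks up the minimum and maximum of the --split-by column and divides that range evenly among the mappers, so the rows are balanced only if the column values are spread uniformly. The Python sketch below imitates that idea; it is a simplification, not Sqoop's code, and the sample values are invented:

```python
def split_ranges(lo, hi, num_mappers):
    """Cut [lo, hi] into num_mappers contiguous integer ranges of near-equal width."""
    width = (hi - lo + 1) / num_mappers
    bounds = [lo + round(i * width) for i in range(num_mappers)] + [hi + 1]
    return [(bounds[i], bounds[i + 1] - 1) for i in range(num_mappers)]


def rows_per_mapper(values, num_mappers):
    """Count how many values fall into each mapper's range."""
    ranges = split_ranges(min(values), max(values), num_mappers)
    return [sum(1 for v in values if start <= v <= end) for start, end in ranges]


# Invented, skewed ids: most rows sit at the low end, a few at the high end.
skewed_ids = list(range(1, 91)) + [400, 800, 1000]
print(rows_per_mapper(skewed_ids, 4))  # [90, 1, 0, 2]
```

With skewed values, one mapper gets almost all the rows and another gets none, which matches the pattern of empty output files next to one very large file.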

MindGlass · ‎10-04-2016

Thanks Ashnee. I didn't notice that.

ArpitAgarwal · ‎07-17-2016

Hi @Peter Kim, the NameNode selects a set of DataNodes for placing replicas of a newly allocated block. Each DataNode independently selects the target disk for its replica using a round-robin policy. So replica placement looks like your case 1, i.e.:
Case1. BlockPool - blk_,,,,,, blk_...meta -> Datanode1 - disk1 | Datanode8 - disk2 | Datanode3 - disk6 ......
There is no good way to redistribute blocks across disks on a DN, as @Hari Rongali mentioned. However, a Disk Balancer feature is under development to address this use case. Also, if I understand correctly, you have two DN storage directories on one physical volume. We do not recommend doing that, as it will affect your performance. You should have a one-to-one relation between storage directories and physical volumes (assuming you are using disks in the recommended JBOD configuration).
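To illustrate the round-robin choice each DataNode makes, here is a minimal Python sketch. It is a toy model rather than HDFS code: the class name is made up, and it ignores details such as skipping disks that lack free space.

```python
from itertools import cycle


class RoundRobinDiskChooser:
    """Toy model of a DataNode handing out its disks in rotation."""

    def __init__(self, disks):
        if not disks:
            raise ValueError("at least one disk is required")
        self._disks = cycle(disks)

    def choose(self):
        """Return the disk that should receive the next replica."""
        return next(self._disks)


chooser = RoundRobinDiskChooser(["disk1", "disk2", "disk3"])
placements = [chooser.choose() for _ in range(5)]
print(placements)  # ['disk1', 'disk2', 'disk3', 'disk1', 'disk2']
```

Because the rotation only counts replicas and not bytes, disks of different sizes or with pre-existing data can end up unevenly filled.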

abajwa · ‎07-10-2016

@Peter Kim great! Please mark the answer as accepted to close the issue...so it doesn't show up on our list of open questions

Member Since	‎06-24-2016 10:23 AM
Last Visited	‎06-25-2016 02:41 AM
Posts	111
Kudos received	8

Cloudera Community

Re: How to delete smartsense registered agents in ...

Re: How can I change spark package repository?

Re: Hive Metastore does not start

Re: Hive Write permission denied

Re: Hadoop Rack-Awareness is only for datanode ser...

Re: What is the most idle way for a hadoop cluster...

Re: How to set equivalent output file size after s...

Re: How to delete ambari-server's stack version in...

Re: HDFS Datanode Replication Policy on multiple d...

Re: Zeppelin Bug in HDP 2.4.2.0-258