Member since: 06-24-2016
Posts: 111
Kudos Received: 8
Solutions: 0
06-29-2017
06:17 PM
If the agent node is registered and available, you can log in to that node and run "hst unregister-agent".
08-11-2017
01:12 AM
In my case, I navigated to the folder /data/user/flamingo/.ivy2/jars, taken from this output:
...
Ivy Default Cache set to: /data/user/flamingo/.ivy2/cache
The jars for the packages stored in: /data/user/flamingo/.ivy2/jars
...
Then copy all the jars in that folder to the directory where you want to store them, and run the Spark command like this:
SPARK_MAJOR_VERSION=2 bin/spark-shell --jars="/path/to/jars"
That seemed to work!
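As far as I know, --jars expects a comma-separated list of jar files rather than a bare directory, so it can help to build that value from the folder the jars were copied into. A minimal Python sketch, assuming that behaviour; the function name build_jars_arg is hypothetical and not part of Spark.

```python
from pathlib import Path


def build_jars_arg(jar_dir):
    """Return a comma-separated list of the .jar files found in jar_dir."""
    # Sorting keeps the order stable between runs.
    jars = sorted(str(p) for p in Path(jar_dir).glob("*.jar"))
    return ",".join(jars)
```

Calling build_jars_arg("/path/to/jars") gives a string that can be passed as the value of --jars.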
04-25-2017
03:30 PM
1 Kudo
Thanks, all, for your responses. I reassigned ownership once more, and it works!
## hdfs dfs -chown -R admin:hadoop /user/admin
04-24-2017
08:35 AM
Hi @Peter Kim, Rack Awareness is only used by HDFS to ensure it places replicas of data off-rack accurately, so only the datanodes need to be listed. Hope that helps!
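For illustration, here is a rough sketch of a topology script that lists only datanodes, assuming the cluster uses a script-based mapping configured through net.topology.script.file.name. The IP addresses and rack names are made up, and the fallback rack name is an assumption.

```python
import sys

# Hypothetical DataNode addresses and racks; replace with the real ones.
RACKS = {
    "10.0.0.11": "/rack1",
    "10.0.0.12": "/rack1",
    "10.0.0.21": "/rack2",
}
DEFAULT_RACK = "/default-rack"  # assumed fallback for unlisted hosts


def resolve(hosts):
    """Map each host or IP passed to the script to its rack path."""
    return [RACKS.get(h, DEFAULT_RACK) for h in hosts]


if __name__ == "__main__":
    # The script receives addresses as arguments and prints one rack per address.
    print("\n".join(resolve(sys.argv[1:])))
```

Hosts that are not datanodes simply fall through to the default rack, so they do not need an entry.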
03-25-2017
03:51 PM
You need to update /etc/hosts as well.
1) /etc/sysconfig/network: used for inter-node communication.
2) /etc/hosts: used for client connections.
We need to set the FQDN in both files; this is the ideal way.
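A quick way to spot /etc/hosts entries that lack an FQDN could look like the sketch below. It is only a heuristic that treats any name containing a dot as fully qualified, and the function name hosts_missing_fqdn is hypothetical.

```python
def hosts_missing_fqdn(hosts_text):
    """Return the IPs whose hosts-file entry has no dotted name."""
    missing = []
    for line in hosts_text.splitlines():
        # Drop trailing comments and skip blank lines.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        ip, *names = line.split()
        # Heuristic: a name with a dot is assumed to be fully qualified.
        if not any("." in name for name in names):
            missing.append(ip)
    return missing
```

Passing it the contents of /etc/hosts, for example open("/etc/hosts").read(), returns the addresses that still need an FQDN.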
11-21-2016
01:29 AM
The --split-by option can be used on a text column after you add the corresponding setting to sqoop-site.xml in Ambari, or pass that option on the command line.

I think the Oracle record count is not related to the size of the split files, because the actual file size depends on the number of columns, the column types, and the size of the column values in each record.

Here are my interesting sqoop import results. Total size: 2.2 GB

sqoop import ... --direct --fetch-size 1000 --num-mappers 10 --split-by EMP_NO (TEXT)

Result: 0 bytes each for 3 mappers, and 1.1 GB for 1 mapper.

I then re-tested with the same values except for the option below.

--split-by REEDER_ID (NUMBER)

In my opinion, Sqoop mappers only process in parallel, without regard to the size of the selected query results in Oracle, so the output files are not split evenly by size. Also, --split-by on a NUMBER column is more useful than on a TEXT column, which does not give accurately sized splits.
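To illustrate why the files come out uneven, here is a simplified model of range-based splitting: the key range between the minimum and maximum is cut into equal-width intervals, one per mapper, regardless of how many rows fall in each. This is a sketch of the general idea, not Sqoop's actual splitter code, and the function names are hypothetical.

```python
def split_ranges(lo, hi, num_mappers):
    """Cut [lo, hi] into num_mappers contiguous ranges of nearly equal width."""
    width = (hi - lo + 1) / num_mappers
    bounds = [lo + round(i * width) for i in range(num_mappers)] + [hi + 1]
    return [(bounds[i], bounds[i + 1] - 1) for i in range(num_mappers)]


def rows_per_split(keys, num_mappers):
    """Count how many keys land in each equal-width range."""
    ranges = split_ranges(min(keys), max(keys), num_mappers)
    return [sum(1 for k in keys if start <= k <= end) for start, end in ranges]
```

With keys 1 to 90 plus a single outlier of 1000 and four mappers, almost every row lands in the first range and two mappers receive nothing, which matches the kind of skew described above.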
10-04-2016
01:28 AM
Thanks Ashnee. I didn't notice that.
07-17-2016
06:21 PM
1 Kudo
Hi @Peter Kim,

The NameNode selects a set of DataNodes for placing replicas of a newly allocated block. Each DataNode independently selects the target disk for its replica using a round-robin policy. So replica placement looks like your case 1, i.e.

Case 1. BlockPool - blk_,,,,,, blk_...meta -> Datanode1 - disk1 | Datanode8 - disk2 | Datanode3 - disk6 ......

There is no good way to redistribute blocks across disks on a DN, as @Hari Rongali mentioned. However, a Disk Balancer feature is under development to address this use case.

Also, if I understand correctly, you have two DN storage directories on one physical volume. We do not recommend doing that, as it will affect your performance. You should have a one-to-one relation between storage directories and physical volumes (assuming you are using disks in the recommended JBOD configuration).
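The round-robin disk selection described above can be sketched roughly as follows. This is a toy model, not the DataNode's real implementation: it ignores free-space checks, and the class name and disk paths are hypothetical.

```python
from itertools import cycle


class RoundRobinVolumeChooser:
    """Hand out storage directories in turn, one per new block replica."""

    def __init__(self, volumes):
        # cycle() repeats the volume list endlessly in the same order.
        self._volumes = cycle(volumes)

    def choose(self):
        """Return the storage directory for the next replica."""
        return next(self._volumes)
```

For three directories, successive calls to choose() return the first, second, third, then the first again, so new blocks spread evenly across disks but existing blocks are never moved.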
07-10-2016
07:55 PM
@Peter Kim Great! Please mark the answer as accepted to close the issue, so it doesn't show up on our list of open questions.