Member since: 01-25-2017
Posts: 396
Kudos Received: 28
Solutions: 11
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 854 | 10-19-2023 04:36 PM
 | 4396 | 12-08-2018 06:56 PM
 | 5512 | 10-05-2018 06:28 AM
 | 19998 | 04-19-2018 02:27 AM
 | 20020 | 04-18-2018 09:40 AM
10-08-2018
06:21 PM
Thanks @bgooley, I solved this by upgrading the OS and the Kerberos version. It works fine for me now. Thanks for your help.
10-06-2018
01:46 PM
Hi, the solution is that /var/hdfs-sockets/dn should not be created as a directory; dn is created automatically as a file (the DataNode's domain socket). My mistake was that I had created dn as a directory as well, and that is why I was getting the error. Thank you for the assistance.
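For anyone hitting the same error, a minimal sketch of the setup, assuming dfs.domain.socket.path is set to /var/hdfs-sockets/dn and the DataNode runs as the hdfs user:

# Create only the parent directory and give it to the DataNode user.
sudo mkdir -p /var/hdfs-sockets
sudo chown hdfs:hdfs /var/hdfs-sockets
# Do NOT pre-create the socket path itself; the DataNode creates it on startup.
# sudo mkdir /var/hdfs-sockets/dn   <-- this was the mistake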
10-05-2018
07:47 AM
awesome! thank you!
10-04-2018
11:26 PM
1 Kudo
I have figured out that this is coming from the third-party tool, so it has nothing to do with the Simba driver. Thanks
09-25-2018
07:45 PM
@mdjedaini This is not really specific to Cloudera, as there are many other tools available in the market for this. I am not sure how big your environment is, but in general, those who run big environments with many nodes use tools like Chef, Puppet, Terraform or Ansible to achieve this (for the cloud there is a different set of tools again, such as CloudFormation). At a high level, you can divide them into two categories, push-based and pull-based: a. Tools like Puppet and Chef are pull-based: an agent/client on each server periodically checks a central server (the master) for configuration information. b. Ansible is push-based: the central server pushes the configuration to the target servers, so you control when the changes are made (see the sketch below).
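As a rough illustration of the push model, an ad-hoc Ansible run from the central server (the inventory file and the "datanodes" host group are hypothetical names):

# Push a package state to every host in the 'datanodes' group, right now:
ansible datanodes -i inventory.ini -m yum -a "name=ntp state=present" --become
# Or push a whole playbook of configuration in one go:
ansible-playbook -i inventory.ini site.yml --limit datanodes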
09-06-2018
07:57 PM
1 Kudo
There are a few cons to raising your block size:

- Increased cost of recovery during write failures: When a client is writing a new block into the DataNode pipeline and one of the DataNodes fails, an enabled-by-default recovery feature will attempt to fill the gap in the replica pipeline by transferring the partially written block from one of the remaining good DataNodes to a new DataNode. While this happens, the client is blocked (the outputstream.write(...) caller is blocked in the API code). With an increased block size, the wait time also increases greatly, depending on how much of the partial block had been written before the failure occurred. A worst-case wait would involve network-copying 1.99 GiB for a 2 GiB block size, because an involved DataNode may have failed at exactly that point.

- Cost of re-replication caused by DataNode loss or decommission: When a DataNode is lost or is being decommissioned, the system has to react by filling the gaps this creates in the replica counts. With smaller block sizes this activity is easy to spread randomly across the cluster, as many different nodes can take part in the re-replication process. With larger blocks, only a few DataNodes can participate, and another consequence can be more lopsided space usage across DataNodes.

That said, the use of 1-2 GiB blocks is not unheard of, and I've seen a few large clusters apply that as their default block size. It's just worth being aware of the cons, looking out for such impact, and tuning accordingly as you go (for example by experimenting per file, as sketched below). HDFS certainly functions at its best with large files, and your usage seems in accordance with that.
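A minimal sketch of such a per-file experiment, assuming a Hadoop 2+ client where the property is dfs.blocksize (file and path names are placeholders):

# Write one file with a 2 GiB block size without touching the cluster-wide default:
hdfs dfs -D dfs.blocksize=2147483648 -put bigfile.dat /data/bigfile.dat
# Inspect how the file was split into blocks and where the replicas landed:
hdfs fsck /data/bigfile.dat -files -blocks -locations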
07-09-2018
02:46 AM
The only issue I can see is the way you are writing request_pool: it should be in all caps, else it will not be set to the specific pool. Kindly use:
impala-shell -k -i hostname:portnum -B -q 'set REQUEST_POOL=pool_name;'
Hope this resolves the issue. NOTE: I am assuming that Impala Admission Control is enabled and that the pool name is one created using Cloudera Manager.
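Note that the option only lasts for the session, so set it and run the statement in the same invocation; a sketch with placeholder host, pool, and table names:

# Set the pool and run the query in one impala-shell session:
impala-shell -k -i impalad-host:21000 -B -q 'set REQUEST_POOL=root.dev_pool; select count(*) from my_table;'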
06-20-2018
01:48 PM
@Fawze You said that any role can be moved. How is that done for roles like the HDFS Balancer or the JobHistory Server? These roles can't be added/moved via CM as far as I can see.
06-09-2018
03:00 AM
While running a WordCount program I am getting the following error:

[cloudera@localhost ~]$ hadoop jar WordCount.jar WordCount /inputnew2/inputfile.txt /output_new
18/06/09 00:29:06 INFO ipc.Client: Retrying connect to server: localhost.localdomain/127.0.0.1:8021. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
18/06/09 00:29:07 INFO ipc.Client: Retrying connect to server: localhost.localdomain/127.0.0.1:8021. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
18/06/09 00:29:08 INFO ipc.Client: Retrying connect to server: localhost.localdomain/127.0.0.1:8021. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
18/06/09 00:29:09 INFO ipc.Client: Retrying connect to server: localhost.localdomain/127.0.0.1:8021. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
18/06/09 00:29:10 INFO ipc.Client: Retrying connect to server: localhost.localdomain/127.0.0.1:8021. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
18/06/09 00:29:11 INFO ipc.Client: Retrying connect to server: localhost.localdomain/127.0.0.1:8021. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
18/06/09 00:29:12 INFO ipc.Client: Retrying connect to server: localhost.localdomain/127.0.0.1:8021. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
18/06/09 00:29:13 INFO ipc.Client: Retrying connect to server: localhost.localdomain/127.0.0.1:8021. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
18/06/09 00:29:14 INFO ipc.Client: Retrying connect to server: localhost.localdomain/127.0.0.1:8021. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
18/06/09 00:29:15 INFO ipc.Client: Retrying connect to server: localhost.localdomain/127.0.0.1:8021. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
18/06/09 00:29:15 ERROR security.UserGroupInformation: PriviledgedActionException as:cloudera (auth:SIMPLE) cause:java.net.ConnectException: Call From localhost.localdomain/127.0.0.1 to localhost.localdomain:8021 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
Exception in thread "main" java.net.ConnectException: Call From localhost.localdomain/127.0.0.1 to localhost.localdomain:8021 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:782)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:729)
at org.apache.hadoop.ipc.Client.call(Client.java:1241)
at org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:225)
at org.apache.hadoop.mapred.$Proxy10.getStagingAreaDir(Unknown Source)
at org.apache.hadoop.mapred.JobClient.getStagingAreaDir(JobClient.java:1324)
at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:102)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:951)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:945)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:945)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:566)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:596)
at WordCount.main(WordCount.java:132)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:207)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:528)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:492)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:509)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:603)
at org.apache.hadoop.ipc.Client$Connection.access$2100(Client.java:252)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1290)
at org.apache.hadoop.ipc.Client.call(Client.java:1208)
... 18 more
05-13-2018
07:24 PM
@Fawze thanks for letting me know about dr-elephant, I will try it. You said: "122 GB in big data is not considered too much, and it depends on the logic you are doing in the map; normally it is the reducer whose memory you should consider increasing." I agree with you, but I just ran a simple query. Thanks