
Impala daemon fails to start on just one datanode - /var/run/hdfs-sockets/dn does not exist


Hi,

 

I have a fresh install on 10 machines: 2 masters and 8 datanodes.

 

On one of the datanodes the Impala daemon did not start, as shown in the log below:

 

@ERROR: short-circuit local reads is disabled because
  - Impala cannot read or execute the parent directory of dfs.domain.socket.path

    @           0x78ce83  (unknown)
    @           0x98812b  (unknown)
    @           0x9a53bc  (unknown)
    @           0x9a6ad1  (unknown)
    @           0x769d0b  (unknown)
    @     0x7f189c65ed5d  __libc_start_main
    @           0x769989  (unknown)


ERROR: short-circuit local reads is disabled because
  - Impala cannot read or execute the parent directory of dfs.domain.socket.path

I checked all the other datanodes and found that this is the only one without /var/run/hdfs-sockets/dn.

 

I created the directory /var/run/hdfs-sockets by hand, as shown below:

 

[root@bdsecslave0001 run]# ll
total 80
-rw-r--r--  1 root         root            5 Out 15 12:00 auditd.pid
drwxr-xr-x  5 cloudera-scm cloudera-scm 4096 Out 19 08:31 cloudera-scm-agent
-rw-r--r--  1 root         root            5 Out 19 08:31 cloudera-scm-agent.pid
drwxr-xr-x. 2 root         root         4096 Out 15  2014 console
-rw-r--r--  1 root         root            5 Out 15 12:00 crond.pid
----------  1 root         root            0 Out 15 12:00 cron.reboot
drwxr-xr-x. 2 root         root         4096 Out 15  2014 faillock
drwxr-xr-x  2 hdfs         hadoop       4096 Out 21 15:14 hdfs-sockets
drwx--x---. 2 root         apache       4096 Out  9 18:02 httpd
drwx------. 2 root         root         4096 Out 15  2014 lvm
drwx------. 2 root         root         4096 Out 16  2014 mdadm
drwxrwxr-x. 2 root         root         4096 Out 16  2014 netreport
-rw-r--r--  1 root         root            5 Out 21 09:21 ntpd.pid
drwxr-xr-x. 2 root         root         4096 Ago 11  2014 plymouth
drwxr-xr-x. 2 root         root         4096 Mar 25  2015 saslauthd
drwxrwxr-x. 2 root         screen       4096 Out  9 18:15 screen
drwxr-xr-x. 2 root         root         4096 Out 15  2014 sepermit
drwxr-xr-x. 2 root         root         4096 Out 15  2014 setrans
-rw-r--r--  1 root         root            5 Out 19 08:28 sshd.pid
-rw-------  1 root         root            5 Out 15 12:00 syslogd.pid
-rw-rw-r--  1 root         utmp         3840 Out 21 15:07 utmp

However, it looks like the dn entry inside this directory is a socket rather than a regular file or directory, so I am not able to create it myself.
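A minimal sketch of a check that could be run on this node and on a healthy one to compare them. It tests the two things the error message and the listing above point at: whether the parent directory of the socket path is readable and executable by the current user, and whether the dn entry exists as a socket. The function name `check_socket_path` and its return codes are made up for illustration. As far as I understand, the socket itself is created by the DataNode process when it starts, so this only reports state and does not try to create anything.

```shell
#!/bin/sh
# Hypothetical helper (not part of Impala or HDFS): report the state of a
# domain socket path and its parent directory for the current user.
check_socket_path() {
    sock_path="$1"
    parent_dir=$(dirname "$sock_path")

    # The parent directory has to exist before a socket can live in it.
    if [ ! -d "$parent_dir" ]; then
        echo "parent missing: $parent_dir"
        return 1
    fi

    # The Impala error complains about read/execute access on the parent.
    if [ ! -r "$parent_dir" ] || [ ! -x "$parent_dir" ]; then
        echo "parent not readable/executable: $parent_dir"
        return 2
    fi

    # -S is true only if the path exists and is a socket.
    if [ -S "$sock_path" ]; then
        echo "socket present: $sock_path"
        return 0
    fi

    echo "socket missing: $sock_path"
    return 3
}
```

For example, after sourcing the function, `check_socket_path /var/run/hdfs-sockets/dn` could be run as the user the Impala daemon runs as. On the listing above, where hdfs-sockets exists with mode drwxr-xr-x but contains no dn entry, I would expect it to report the socket as missing.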

 

 

I also tried removing the Impala service from the entire cluster and installing it again, and the same error happened.

 

One workaround would be to remove the Impala daemon instance from this datanode, but then this datanode would not be used by Impala and I would lose one processing machine.

 

I could remove the entire machine from the cluster and start from scratch, but I don't know if that is a good approach.

 

Any help?

 

Thanks

 

Fabricio

 

 

 

 
