Created on 01-19-2017 08:40 AM
DESCRIPTION:
We received frequent connection-timeout alerts for a JournalNode. A connectivity check against the JournalNode HTTP port (8480) timed out with the output below.
curl -v http://123.example.com:8480 --max-time 4 | tail -4
* About to connect() to 123.example.com:8480 port 8480 (#0)
* Trying 10.24.16.11... connected
* Connected to 123.example.com (10.24.16.11) port 8480 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.18 Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2
> Host: 123.example.com:8480
> Accept: */*
>
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:--  0:00:04 --:--:--     0* Operation timed out after 4000 milliseconds with 0 bytes received
  0     0    0     0    0     0      0      0 --:--:--  0:00:04 --:--:--     0* Closing connection #0
curl: (28) Operation timed out after 4000 milliseconds with 0 bytes received
Checking netstat for port 8480 then showed a large number of connections in the CLOSE_WAIT state.
[root@123 ~]# netstat -putane | grep -i 8480
tcp        0      0 0.0.0.0:8480          0.0.0.0:*             LISTEN      72383  1586576877  1719/java
tcp        1      0 10.24.16.11:8480      10.24.17.11:46572     CLOSE_WAIT  72383  1587407492  1719/java
tcp        1      0 10.24.16.11:8480      10.24.17.11:57944     CLOSE_WAIT  72383  1586744345  1719/java
tcp        1      0 10.24.16.11:8480      10.24.17.11:57462     CLOSE_WAIT  72383  1586708412  1719/java
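CLOSE_WAIT means the remote end has already closed the connection, but the local process (here the JournalNode, PID 1719) has not yet closed its side of the socket.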
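To track how many such sockets pile up over time, the netstat check above can be reduced to a single number. The following is a minimal sketch, not part of the original troubleshooting: the function name count_close_wait is hypothetical, and it assumes the standard netstat column layout seen in the output above (local address in column 4, state in column 6).

```shell
#!/bin/sh
# Count sockets in CLOSE_WAIT on a given local port.
# Reads netstat-style lines from standard input.
# Assumption: column 4 is the local address and column 6 is the TCP state,
# as in the `netstat -putane` output shown in this post.
# count_close_wait is a hypothetical helper name.
count_close_wait() {
    port="$1"
    awk -v port="$port" '
        # Match only lines whose local address ends in ":<port>"
        $6 == "CLOSE_WAIT" && $4 ~ (":" port "$") { n++ }
        END { print n + 0 }   # prints 0 when nothing matched
    '
}

# Usage on a live host (8480 is the JournalNode HTTP port from this post):
#   netstat -tan | count_close_wait 8480
```

A count that keeps growing between runs suggests the process is not closing its sockets, which matches the symptom described here.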
ROOT CAUSE:
An edits_in_progress file had been left behind as an orphan for the past two months, even though the edits recorded in it had already been captured in other, completed edits files. Because of this, connections on port 8480 of the affected JournalNode process were ending up in CLOSE_WAIT, as the sockets were not being closed properly.
SOLUTION:
Removed the orphaned edits_in_progress file and restarted the JournalNodes.
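Before removing anything, it helps to list candidate files by age. The sketch below makes several assumptions that are not in the original post: the helper name list_stale_inprogress is hypothetical, the on-disk files are assumed to follow the edits_inprogress_<txid> naming pattern, the 7-day default threshold is arbitrary, and the directory you pass must be your own JournalNode edits directory (the value of dfs.journalnode.edits.dir in your configuration).

```shell
#!/bin/sh
# List in-progress edit segments that have not been modified for N days.
# Assumptions (not from the original post):
#   - files are named edits_inprogress_<txid>
#   - $1 is the JournalNode edits directory for your cluster
#   - $2 is an age threshold in days (default 7, chosen arbitrarily)
# list_stale_inprogress is a hypothetical helper name.
list_stale_inprogress() {
    dir="$1"
    days="${2:-7}"
    # -mtime +N matches files last modified more than N days ago
    find "$dir" -type f -name 'edits_inprogress_*' -mtime +"$days" -print
}

# Usage (the path is a placeholder, substitute your own):
#   list_stale_inprogress /path/to/journalnode/edits 7
```

This only lists files; it deletes nothing. A healthy JournalNode normally has one active in-progress segment that is modified frequently, so only an old, untouched one is a candidate, and it is worth confirming that its transactions are covered by completed edits files (and taking a backup) before removing it.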
Created on 06-20-2018 04:39 PM
This worked.