Hey, we are running a NiFi cluster that connects to a Hive database to pull some data. After a period of time we get the error below.
: java.net.SocketException: Broken pipe (Write failed).: java.sql.SQLException: org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe (Write failed)
Our current workaround is to restart the Hive controller services. This refreshes the connection and the process works again, but it is obviously not ideal for a production implementation. We have also set the Kerberos user's timeout to a large number of days in hive-site.xml, but the issue still persists. We are running NiFi version 1.12.1. Any help is very much appreciated.
Created 12-01-2021 06:30 AM
I was having the same problem, but I couldn't find any properties in the driver documentation that would help.
In my case, this error occurred because the connection to the database had been broken by an infrastructure or service failure.
However, I noticed that it was possible to solve this problem in two ways using the controller service DBCPConnectionPool:
- You can set a generic query in the Validation Query parameter. The pool runs this query to check a connection before handing it out, and a connection that fails the check is dropped and replaced with a new one. This can be a good alternative, but if your flow performs many queries, it will slow down processing slightly.
- You can change the Max Connection Lifetime parameter to a value other than -1 or 0. This value is the maximum age of a connection; once it is exceeded, the connection is closed and a new one is opened. For example, if you set 5 min, a connection will be replaced after at most five minutes.
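To make the two options above more concrete, here is a rough toy model of what a pool does when a connection is borrowed. It is only a sketch for illustration, not NiFi's or DBCP's actual code: `FakeConnection`, `TinyPool`, and the `broken` flag are made-up names, and the pool holds just one connection to keep it short.

```python
import time


class FakeConnection:
    """Hypothetical stand-in for a JDBC connection (not a real driver)."""

    def __init__(self, clock):
        self.created_at = clock()
        self.broken = False  # set to True to simulate a dropped socket

    def execute(self, sql):
        if self.broken:
            raise ConnectionError("Broken pipe (Write failed)")
        return "ok"


class TinyPool:
    """Toy single-connection pool illustrating the two settings."""

    def __init__(self, validation_query=None, max_lifetime_s=-1,
                 clock=time.monotonic):
        self.validation_query = validation_query
        self.max_lifetime_s = max_lifetime_s  # <= 0 means "never expire"
        self.clock = clock
        self.conn = None
        self.created = 0  # how many connections were opened so far

    def _is_usable(self, conn):
        # Option 2: retire connections older than the maximum lifetime.
        age = self.clock() - conn.created_at
        if self.max_lifetime_s > 0 and age >= self.max_lifetime_s:
            return False
        # Option 1: run the validation query before handing it out.
        if self.validation_query:
            try:
                conn.execute(self.validation_query)
            except ConnectionError:
                return False
        return True

    def borrow(self):
        # Replace the connection only when it is missing or unusable.
        if self.conn is None or not self._is_usable(self.conn):
            self.conn = FakeConnection(self.clock)
            self.created += 1
        return self.conn


if __name__ == "__main__":
    pool = TinyPool(validation_query="SELECT 1")
    first = pool.borrow()
    first.broken = True                    # the socket dies while idle
    second = pool.borrow()                 # validation fails, pool reconnects
    print(second.execute("SELECT 1"))
```

Without a validation query or a lifetime limit, the toy pool keeps handing out the dead connection, which matches the behaviour in the question where only a restart of the controller service helped.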
In my case this fixed the problem. I hope it helps.