
Broken pipe error while running a Spark job from the NameNode

Expert Contributor

While running a Spark job, I get the error shown below.

py4j.protocol.Py4JJavaError: An error occurred while calling o42.load. : java.sql.SQLRecoverableException: Io exception: Broken pipe

sparkrun.txt: the file I run as a shell script from the NameNode.

spark-err1.txt: the error log I get when running the Spark job.

stest-py.txt: the Python file referenced in my shell script, sparkrun.txt.

Please help with this, as I have not been able to find any clue.

Jay SenSharma

1 ACCEPTED SOLUTION

Master Mentor

@hardik desai

Since the issue is intermittent, it is probably not a script-related issue.

However, based on the error pattern, it looks like the driver is trying to create a new connection and the DBMS is breaking the socket between it and the driver at a very early stage of the process. This is likely either a network issue or, more probably, a DBMS issue: if too many connection requests arrive at once or in a short burst, the DBMS listener process can get overloaded and sever some of the incoming sockets.

So we need to check why the connection was not established:

- Load on the DB. Check the DB logs for the same timestamp.

- Load on your machine. Check the SAR report for the historical OS data at that timestamp.

- Network drops might also be a reason. Check "/var/log/messages" to see if anything unusual happened at the time of the error.
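To line failures up with the DB logs, the SAR report, and /var/log/messages, it may help to record exactly when the listener port stops accepting connections. Below is a minimal sketch using only the Python standard library; `probe` and `probe_loop` are hypothetical helper names, not part of Spark or the Oracle driver. Note that a successful TCP connect only shows the port is reachable. The stack trace in this thread fails later, during logon, so a clean probe log would not rule out listener overload.

```python
import socket
import time
from datetime import datetime


def probe(host, port, timeout=5.0):
    """Attempt one TCP connection and return (ok, detail)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except OSError as exc:  # includes timeouts and refused/reset connections
        return False, str(exc)


def probe_loop(host, port, attempts, interval_s, timeout=5.0):
    """Probe repeatedly; print and return timestamped results."""
    results = []
    for i in range(attempts):
        ok, detail = probe(host, port, timeout)
        stamp = datetime.now().isoformat(timespec="seconds")
        results.append((stamp, ok, detail))
        print(stamp, "OK" if ok else "FAIL", detail)
        if i < attempts - 1:
            time.sleep(interval_s)
    return results


# Example (host and port taken from the thread), one probe per minute for an hour:
# probe_loop("10.77.1.147", 1521, attempts=60, interval_s=60)
```

Any FAIL timestamps can then be compared with the DB listener log and the OS logs for the same minute.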


4 REPLIES

Master Mentor

@hardik desai

The error seems to occur during logon to the Oracle DB.

df = sqlContext.read.format("jdbc").option("driver", "oracle.jdbc.OracleDriver").option("url","jdbc:oracle:thin:NE/Network_147@10.77.1.147:1521/ELLDEV").option("dbtable","NE.INTER_APP_EVENT").load()

py4j.protocol.Py4JJavaError: An error occurred while calling o42.load.
: java.sql.SQLRecoverableException: Io exception: Broken pipe
        at oracle.jdbc.driver.SQLStateMapping.newSQLException(SQLStateMapping.java:101)
        at oracle.jdbc.driver.DatabaseError.newSQLException(DatabaseError.java:133)
        at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:199)
        at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:263)
        at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:521)
        at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:418)
        at oracle.jdbc.driver.PhysicalConnection.<init>(PhysicalConnection.java:508)
        at oracle.jdbc.driver.T4CConnection.<init>(T4CConnection.java:203)


You should check a few things:

1. Are you able to connect to the mentioned URL using sqlplus or another Oracle client tool?

jdbc:oracle:thin:NE/Network_147@10.77.1.147:1521/ELLDEV

2. Try telnet from the machine where you are executing the script to check port access.

telnet 10.77.1.147 1521

3. A Broken pipe error usually indicates broken communication, mostly due to abrupt termination of the connection from the other end or a network issue.

4. Check that the Oracle credentials you are using are correct and not expired or blocked.

5. Can you try passing the credentials using the "user" and "password" options instead of passing them in the DB URL?

.option("user", "USER").option("password", "PASS")
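Item 5 could look roughly like the following in PySpark. This is a sketch under assumptions, not a tested fix: `build_jdbc_options` and `load_table` are hypothetical helper names, `"USER"` and `"PASS"` are placeholders, and the host, port, service name, and table are the ones from the post. The URL uses the thin driver's `@//host:port/service` form with the credentials left out.

```python
def build_jdbc_options(host, port, service, table, user, password):
    """Return JDBC reader options with credentials kept out of the URL."""
    return {
        "driver": "oracle.jdbc.OracleDriver",
        # Thin-driver URL in host:port/service form, without user/password.
        "url": "jdbc:oracle:thin:@//{}:{}/{}".format(host, port, service),
        "dbtable": table,
        "user": user,
        "password": password,
    }


def load_table(sql_context, options):
    """Read the table through Spark's JDBC data source."""
    return sql_context.read.format("jdbc").options(**options).load()


# Placeholder credentials; substitute the real ones.
opts = build_jdbc_options("10.77.1.147", 1521, "ELLDEV",
                          "NE.INTER_APP_EVENT", "USER", "PASS")

# With the sqlContext from the original script:
# df = load_table(sqlContext, opts)
```

Keeping the password out of the URL also keeps it out of any log line that prints the URL.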


In general:

The driver is trying to create a new connection and the DBMS is breaking the socket between it and the driver at a very early stage of the process. This is likely either a network issue or, more probably, a DBMS issue: if too many connection requests arrive at once or in a short burst, the DBMS listener process can get overloaded and sever some of the incoming sockets.
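If the listener really is dropping connections during bursts, one possible workaround (it does not address the root cause) is to retry the load with an increasing delay. The helper below is a hypothetical sketch, not part of Spark or the Oracle driver. It catches any `Exception` for brevity; in the real job it would be better to narrow this to the `Py4JJavaError` seen in the trace.

```python
import time


def retry_with_backoff(action, attempts=4, base_delay_s=2.0, sleep=time.sleep):
    """Call action(); on failure wait base_delay_s * 2**n, then try again.

    Re-raises the last error if every attempt fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for n in range(attempts):
        try:
            return action()
        except Exception as exc:  # broad on purpose in this sketch
            last_error = exc
            if n < attempts - 1:
                sleep(base_delay_s * (2 ** n))
    raise last_error


# With the reader from the original script (assumed names):
# df = retry_with_backoff(
#     lambda: sqlContext.read.format("jdbc").options(**opts).load())
```

The `sleep` parameter exists only so the delay can be replaced in tests.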

Expert Contributor

Jay SenSharma, thanks for the reply.

I am able to telnet as you mentioned. I am trying to fetch data from the Oracle DB into HDFS using the script mentioned here. Note also that the job runs every hour: it succeeds on a few attempts (4-5 times a day), but all the other times it fails with the error given here.

Could you please provide any more troubleshooting help, if possible?

Thanks.


Expert Contributor

Jay SenSharma, thanks. I will check the logs to find out why the connections are being dropped and let you know.