Broken pipe error while running a Spark job from the NameNode
- Labels: Apache Spark
Created ‎06-05-2017 06:53 AM
While running a Spark job, I got the error shown below.
py4j.protocol.Py4JJavaError: An error occurred while calling o42.load. : java.sql.SQLRecoverableException: Io exception: Broken pipe
sparkrun.txt - the shell script that I run from the NameNode.
spark-err1.txt - the error log that I get while running the Spark job.
stest-py.txt - the Python file that is called from the shell script sparkrun.txt.
Please help with this, as I have not been able to find any clue about the cause.
Created ‎06-08-2017 05:15 AM
The error appears to occur during logon to the Oracle DB (the stack trace ends in T4CConnection.logon).
df = sqlContext.read.format("jdbc").option("driver", "oracle.jdbc.OracleDriver").option("url","jdbc:oracle:thin:NE/Network_147@10.77.1.147:1521/ELLDEV").option("dbtable","NE.INTER_APP_EVENT").load()
py4j.protocol.Py4JJavaError: An error occurred while calling o42.load.
: java.sql.SQLRecoverableException: Io exception: Broken pipe
    at oracle.jdbc.driver.SQLStateMapping.newSQLException(SQLStateMapping.java:101)
    at oracle.jdbc.driver.DatabaseError.newSQLException(DatabaseError.java:133)
    at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:199)
    at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:263)
    at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:521)
    at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:418)
    at oracle.jdbc.driver.PhysicalConnection.<init>(PhysicalConnection.java:508)
    at oracle.jdbc.driver.T4CConnection.<init>(T4CConnection.java:203)
You should check a few things:
1. Are you able to connect to the database in the URL below using sqlplus or another Oracle client tool?
jdbc:oracle:thin:NE/Network_147@10.77.1.147:1521/ELLDEV
2. Try telnet from the machine where you execute the script to verify that the port is reachable.
telnet 10.77.1.147 1521
3. A "Broken pipe" error usually indicates that communication was interrupted, mostly because the other end terminated the connection abruptly or because of a network issue.
4. Check that the Oracle credentials you are using are correct and not expired or blocked.
5. Can you try passing the credentials with the "user" and "password" options instead of embedding them in the DB URL?
.option("user", "USER").option("password", "PASS")
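If telnet is not installed on the node, the port check from step 2 can be approximated from Python. This is only a minimal sketch: the helper name can_connect is made up for illustration, and the host and port in the usage comment are the ones from the JDBC URL in this thread.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection performs the TCP handshake, much like `telnet host port`.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Usage, with the listener address from the question:
# print(can_connect("10.77.1.147", 1521))
```

A True result only shows that the listener port accepts TCP connections. It says nothing about whether the Oracle logon itself succeeds, so it does not rule out the DB-side causes.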
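For step 5, the read could look roughly like the sketch below, with the credentials moved out of the URL. The helper names oracle_jdbc_options and load_table are hypothetical, "USER" and "PASS" are placeholders, and the URL form jdbc:oracle:thin:@//host:port/service is assumed to be accepted by your Oracle thin driver version.

```python
def oracle_jdbc_options(host, port, service, table, user, password):
    """Build Spark JDBC reader options with the credentials kept out of the URL."""
    return {
        "driver": "oracle.jdbc.OracleDriver",
        # Assumed service-name URL form; no user/password embedded.
        "url": "jdbc:oracle:thin:@//{}:{}/{}".format(host, port, service),
        "dbtable": table,
        "user": user,
        "password": password,
    }


def load_table(sql_context, options):
    """Load the table through an existing SQLContext (or SparkSession)."""
    return sql_context.read.format("jdbc").options(**options).load()


# Usage, reusing the host, service and table from the question:
# opts = oracle_jdbc_options("10.77.1.147", 1521, "ELLDEV",
#                            "NE.INTER_APP_EVENT", "USER", "PASS")
# df = load_table(sqlContext, opts)
```

Keeping the credentials as separate options also makes it easier to rule out special characters in the password breaking the URL.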
In general:
The driver is trying to create a new connection, and the DBMS is breaking the socket between itself and the driver at a very early stage of the process. This points to either a network issue or, more likely, a DBMS issue: for example, if too many connection requests arrive at once or in a short burst, the DBMS listener process can get overloaded and sever some of the incoming sockets.
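If the listener really is dropping connections during short bursts, one possible stopgap while the root cause is investigated is to retry the load with an increasing delay. The sketch below is only an illustration: retry_with_backoff is a made-up helper, the attempt count and delays are arbitrary, and it retries on any exception, which may be too broad for a production job.

```python
import time


def retry_with_backoff(action, attempts=4, base_delay=5.0, sleep=time.sleep):
    """Call `action` until it succeeds, waiting longer after each failure."""
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception as exc:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # Double the wait each time: base_delay, 2x, 4x, ...
            delay = base_delay * 2 ** (attempt - 1)
            print("attempt {} failed ({}); retrying in {}s".format(attempt, exc, delay))
            sleep(delay)


# Usage, wrapping the JDBC read from the question (reader is the configured
# sqlContext.read.format("jdbc")... chain without the final .load()):
# df = retry_with_backoff(lambda: reader.load())
```

This does not fix the underlying problem. The DB logs should still be checked to see why the sockets are being closed.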
Created ‎06-08-2017 05:24 AM
Jay SenSharma, thanks for the reply.
I am able to telnet as you suggested. I am trying to fetch data from the Oracle DB into HDFS using the script mentioned here. Note also that the job runs every hour: it succeeds on a few attempts (4-5 times a day), but all the other times it fails with the error given here.
Can you please provide any more troubleshooting help, if possible?
Thanks.
Created ‎06-08-2017 05:30 AM
As the issue is intermittent, it may not be a script-related issue.
However, based on the error pattern, it looks like the driver is trying to create a new connection and the DBMS is breaking the socket between itself and the driver at a very early stage of the process. This points to either a network issue or, more likely, a DBMS issue: for example, if too many connection requests arrive at once or in a short burst, the DBMS listener process can get overloaded and sever some of the incoming sockets.
So we need to find out why the connection was not established:
- Load on the DB (check the DB logs for the same timestamp).
- Load on your machine (check the SAR report for the historical OS data at that timestamp).
- Network drops might also be a reason. Check "/var/log/messages" to see if anything unusual happened at the time of the error.
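To narrow down the /var/log/messages check, the log can be filtered to the minutes around a failed run. The sketch below assumes the classic syslog line prefix (for example "Jun  8 05:29:58") and an English locale for month names. The helper name lines_near and the timestamp in the usage comment are placeholders, not values from the actual logs.

```python
from datetime import datetime, timedelta


def lines_near(lines, when, window_minutes=5):
    """Yield syslog-style lines stamped within `window_minutes` of `when`."""
    window = timedelta(minutes=window_minutes)
    for line in lines:
        try:
            # Classic syslog lines carry no year, so borrow it from `when`.
            stamp = datetime.strptime(
                "{} {}".format(when.year, line[:15]), "%Y %b %d %H:%M:%S"
            )
        except ValueError:
            continue  # not a timestamped line
        if abs(stamp - when) <= window:
            yield line.rstrip("\n")


# Usage (the failure time here is a placeholder):
# with open("/var/log/messages") as fh:
#     for hit in lines_near(fh, datetime(2017, 6, 8, 5, 30)):
#         print(hit)
```

The same window can then be compared against the DB logs and the SAR report for that timestamp.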
Created ‎06-08-2017 05:42 AM
Jay SenSharma, thanks. I will check the logs to find out why the connections are being dropped and let you know.
