Member since: 06-28-2017
Posts: 12
Kudos Received: 1
Solutions: 0
07-16-2018
10:40 AM
If I execute the following code:

spark.read.format("org.apache.phoenix.spark") \
    .option("table", "data_table") \
    .option("zkUrl", zkUrl) \
    .load().createOrReplaceTempView("table")

spark.sql("select * from table where date>='2018-07-02 00:00:00' and date<'2018-07-04 00:00:00'").createTempView("c")
spark.sql("select * from c").explain(True)

then I get the following explanation:

== Parsed Logical Plan ==
'Project [*]
+- 'UnresolvedRelation `c`
== Analyzed Logical Plan ==
DATE: timestamp, ID: string, SESSIONID: string, IP: string, NAME: string, BYTES_SENT: int, DELTA: int, W_ID: string
Project [DATE#48, ID#49, SESSIONID#50, IP#51, NAME#52, BYTES_SENT#53, DELTA#54, W_ID#55]
+- SubqueryAlias c
+- Project [DATE#48, ID#49, SESSIONID#50, IP#51, NAME#52, BYTES_SENT#53, DELTA#54, W_ID#55]
+- Filter ((cast(date#48 as string) >= 2018-07-02 00:00:00) && (cast(date#48 as string) < 2018-07-04 00:00:00))
+- SubqueryAlias table
+- Relation[DATE#48,ID#49,SESSIONID#50,IP#51,NAME#52,BYTES_SENT#53,DELTA#54,W_ID#55] PhoenixRelation(data_table,10.10.5.20,10.10.5.21,10.10.5.22,10.10.5.23:2181,false)
== Optimized Logical Plan ==
Filter ((isnotnull(date#48) && (cast(date#48 as string) >= 2018-07-02 00:00:00)) && (cast(date#48 as string) < 2018-07-04 00:00:00))
+- Relation[DATE#48,ID#49,SESSIONID#50,IP#51,NAME#52,BYTES_SENT#53,DELTA#54,W_ID#55] PhoenixRelation(data_table,10.10.5.20,10.10.5.21,10.10.5.22,10.10.5.23:2181,false)
== Physical Plan ==
*(1) Filter (((cast(DATE#48 as string) >= 2018-07-02 00:00:00) && (cast(DATE#48 as string) < 2018-07-04 00:00:00)) && isnotnull(DATE#48))
+- *(1) Scan PhoenixRelation(data_table,10.10.5.20,10.10.5.21,10.10.5.22,10.10.5.23:2181,false) [DATE#48,ID#49,SESSIONID#50,IP#51,NAME#52,BYTES_SENT#53,DELTA#54,W_ID#55] PushedFilters: [IsNotNull(DATE)], ReadSchema: struct<DATE:timestamp,ID:string,SESSIONID:string,IP:string,NAME:string,BYTES_SENT:int,DELTA:int,W...

In this explanation, does PhoenixRelation(data_table,10.10.5.20,10.10.5.21,10.10.5.22,10.10.5.23:2181,false) mean that it is sending a request to the Phoenix table? If so, that is understandable, since selecting all data from the table sends a request to Phoenix. What confuses me is the following code:

spark.sql("select * from c where date>='2018-07-03 00:00:00' and date<'2018-07-04 00:00:00'").createTempView("d")
spark.sql("select * from d").explain(True) which gives me following explanation == Parsed Logical Plan ==
'Project [*]
+- 'UnresolvedRelation `d`
== Analyzed Logical Plan ==
DATE: timestamp, ID: string, SESSIONID: string, IP: string, NAME: string, BYTES_SENT: int, DELTA: int, W_ID: string
Project [DATE#48, ID#49, SESSIONID#50, IP#51, NAME#52, BYTES_SENT#53, DELTA#54, W_ID#55]
+- SubqueryAlias d
+- Project [DATE#48, ID#49, SESSIONID#50, IP#51, NAME#52, BYTES_SENT#53, DELTA#54, W_ID#55]
+- Filter ((cast(date#48 as string) >= 2018-07-03 00:00:00) && (cast(date#48 as string) < 2018-07-04 00:00:00))
+- SubqueryAlias c
+- Project [DATE#48, ID#49, SESSIONID#50, IP#51, NAME#52, BYTES_SENT#53, DELTA#54, W_ID#55]
+- Filter ((cast(date#48 as string) >= 2018-07-02 00:00:00) && (cast(date#48 as string) < 2018-07-04 00:00:00))
+- SubqueryAlias table
+- Relation[DATE#48,ID#49,SESSIONID#50,IP#51,NAME#52,BYTES_SENT#53,DELTA#54,W_ID#55] PhoenixRelation(data_table,10.10.5.20,10.10.5.21,10.10.5.22,10.10.5.23:2181,false)
== Optimized Logical Plan ==
Filter (((isnotnull(date#48) && (cast(date#48 as string) >= 2018-07-02 00:00:00)) && (cast(date#48 as string) < 2018-07-04 00:00:00)) && (cast(date#48 as string) >= 2018-07-03 00:00:00))
+- Relation[DATE#48,ID#49,SESSIONID#50,IP#51,NAME#52,BYTES_SENT#53,DELTA#54,W_ID#55] PhoenixRelation(data_table,10.10.5.20,10.10.5.21,10.10.5.22,10.10.5.23:2181,false)
== Physical Plan ==
*(1) Filter ((((cast(DATE#48 as string) >= 2018-07-02 00:00:00) && (cast(DATE#48 as string) < 2018-07-04 00:00:00)) && (cast(DATE#48 as string) >= 2018-07-03 00:00:00)) && isnotnull(DATE#48))
+- *(1) Scan PhoenixRelation(data_table,10.10.5.20,10.10.5.21,10.10.5.22,10.10.5.23:2181,false) [DATE#48,ID#49,SESSIONID#50,IP#51,NAME#52,BYTES_SENT#53,DELTA#54,W_ID#55] PushedFilters: [IsNotNull(DATE)], ReadSchema: struct<DATE:timestamp,ID:string,SESSIONID:string,IP:string,NAME:string,BYTES_SENT:int,DELTA:int,W...

Here again we can see PhoenixRelation(data_table,10.10.5.20,10.10.5.21,10.10.5.22,10.10.5.23:2181,false). Is it again trying to connect to the Phoenix table? Did it not load the data the first time? I thought Spark would load the data and store it in a DataFrame, which, if I am not mistaken, is an in-memory table; and if I then fetch data from another DataFrame, why is Spark connecting to Phoenix again?

I have a problem that requires some calculation over a DataFrame, and I need to do it for 1,000 loop iterations. If I load the data from a CSV file, the whole operation takes about 2 seconds, but when I fetch the data from the Phoenix table it takes a long time. I ran explain on that loop too, and each time I get the same Phoenix relation as above. I must be doing something wrong here, but I am unable to figure it out. Is this the expected behavior of phoenix-spark?
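For reference, here is a minimal sketch of the workaround I am now trying, on the assumption that Spark evaluates DataFrames lazily and re-runs the whole plan down to the Phoenix source on every action unless the result is explicitly persisted (the zkUrl value below is a placeholder):

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("phoenix-cache-sketch").getOrCreate()

# Load lazily from Phoenix; nothing is fetched yet.
df = spark.read.format("org.apache.phoenix.spark") \
    .option("table", "data_table") \
    .option("zkUrl", "zk-host:2181") \
    .load()

# Filter once, then pin the result in executor memory.
window = df.filter("date >= '2018-07-02 00:00:00' and date < '2018-07-04 00:00:00'")
window.cache()
window.count()  # an action forces the one-time Phoenix scan

window.createOrReplaceTempView("c")
# Queries on "c" (including the 1,000-iteration loop) should now read
# from the cache instead of re-contacting Phoenix.
spark.sql("select * from c where date >= '2018-07-03 00:00:00'").show()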
Labels:
- Apache Phoenix
- Apache Spark
08-18-2017
11:05 AM
Is it possible to create a partition like 01 from a date like '2017-01-02', where 01 is the month? I have daily sales records and I need to run queries like select * from sales where month = '01', so it would be better if I could partition my daily sales by month. But my data has dates in the format 2017-01-01, and doing create table tl (columns ...) partitioned by (date <datatype>) would create one partition per day, which is the last thing I want. I need to create the partitions dynamically.
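To make the question concrete, here is a sketch of what I am imagining, using Hive dynamic partitioning through Spark SQL; the table and column names (sales_raw, sale_date, amount) are made up for illustration:

# Sketch: derive a month partition from the date string at insert time.
from pyspark.sql import SparkSession

spark = SparkSession.builder.enableHiveSupport().getOrCreate()

spark.sql("SET hive.exec.dynamic.partition=true")
spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")

spark.sql("""
    CREATE TABLE IF NOT EXISTS sales_by_month (
        sale_date STRING,
        amount    DOUBLE
    )
    PARTITIONED BY (month STRING)
""")

# Hive routes each row to its partition because `month` is the last
# selected column; substr('2017-01-02', 6, 2) yields '01'.
spark.sql("""
    INSERT OVERWRITE TABLE sales_by_month PARTITION (month)
    SELECT sale_date, amount, substr(sale_date, 6, 2) AS month
    FROM sales_raw
""")

spark.sql("SELECT * FROM sales_by_month WHERE month = '01'").show()

The idea is that the partition value need not exist as a physical column; it can be derived from the date string during the insert.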
Labels:
- Apache Hive
08-03-2017
11:14 AM
When I run the code

String s = in.readLine();
System.out.println("readline -------- " + s + "\n");
flatJson = JSONFlattener.parseJson(s);
Path target = new Path(hdfsUri + dirName + "/" + filename);
if (hdfs.exists(target)) {
    // File already exists: append, then close the writer so the block is finalized.
    outStream = hdfs.append(target);
    BufferedWriter br = new BufferedWriter(new OutputStreamWriter(outStream));
    br.append(CSVWriter.getCSV(flatJson));
    br.close();
} else {
    // First write: create the file, then close the stream so later
    // appends start from a finalized block.
    outStream = hdfs.create(target);
    IOUtils.write((CharSequence) CSVWriter.getCSV(flatJson), outStream);
    outStream.close();
}
I get

org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.HadoopIllegalArgumentException): Missing storageIDs: It is likely that the HDFS client, who made this call, is running in an older version of Hadoop which does not support storageIDs. datanodeID.length=1, src=/ekbana2/text.csv, oldBlock=BP-1259408151-10.10.10.235-1501128132270:blk_1073741868_1057, newBlock=BP-1259408151-10.10.10.235-1501128132270:blk_1073741868_1058, clientName=DFSClient_NONMAPREDUCE_-615707839_1
at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.getDatanodeStorageInfos(DatanodeManager.java:526)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.updatePipelineInternal(FSNamesystem.java:5608)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.updatePipeline(FSNamesystem.java:5572)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.updatePipeline(NameNodeRpcServer.java:917)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.updatePipeline(ClientNamenodeProtocolServerSideTranslatorPB.java:986)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:447)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:845)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:788)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2455)
at org.apache.hadoop.ipc.Client.call(Client.java:1347)
at org.apache.hadoop.ipc.Client.call(Client.java:1300)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
at com.sun.proxy.$Proxy7.updatePipeline(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy7.updatePipeline(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.updatePipeline(ClientNamenodeProtocolTranslatorPB.java:791)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1047)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:520)
What is that "Missing storageIDs" exception? I have a two-node cluster and both nodes run hadoop-2.8.0, so why am I getting that error? Here is my Hadoop setup:

hdfs = FileSystem.get(URI.create(hdfsUri), con);
hdfs.mkdirs(new Path(hdfsUri + "/" + dirName));
hdfs.setReplication(new Path(hdfsUri + "/" + dirName + "/" + filename), (short) 1);

I am fetching data from a URL on a daily basis and writing it to HDFS. Since the per-day data is small, I thought it would be better to build one large file by appending rather than writing many small individual files, but appending is not working. I have set the replication factor to 1.
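One workaround I am considering (an assumption on my part, not something I have verified on this cluster) is to do the daily create-or-append over WebHDFS, which bypasses the DataNode write pipeline where this updatePipeline call fails. A sketch with the Python hdfs package; host, port, user, and the sample data are placeholders:

# Sketch: create-or-append a daily CSV over WebHDFS instead of the native
# pipeline. Assumes WebHDFS is enabled and the `hdfs` PyPI package is installed.
from hdfs import InsecureClient

client = InsecureClient("http://namenode-host:50070", user="saurab")
path = "/ekbana2/text.csv"
csv_chunk = "a,b,c\n"  # stand-in for CSVWriter.getCSV(flatJson)

if client.status(path, strict=False) is None:
    # First write of the file: create it with replication factor 1.
    client.write(path, data=csv_chunk, replication=1)
else:
    # Subsequent daily writes: append to the existing file.
    client.write(path, data=csv_chunk, append=True)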
Labels:
- Apache Hadoop
- Cloudera DataFlow (CDF)
08-02-2017
08:44 AM
@Jay SenSharma The hadoop-common jar is present in WEB-INF/lib, yet I am still getting this error.
08-02-2017
08:14 AM
I wanted to access a Hive table from a servlet. When I hit the URL, I get

javax.servlet.ServletException: Servlet execution threw an exception
org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)

Root cause:

java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.security.UserGroupInformation
org.apache.spark.util.Utils$anonfun$getCurrentUserName$1.apply(Utils.scala:2391)
org.apache.spark.util.Utils$anonfun$getCurrentUserName$1.apply(Utils.scala:2391)
scala.Option.getOrElse(Option.scala:121)
org.apache.spark.util.Utils$.getCurrentUserName(Utils.scala:2391)
org.apache.spark.SparkContext.<init>(SparkContext.scala:295)
org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2320)
org.apache.spark.sql.SparkSession$Builder$anonfun$6.apply(SparkSession.scala:868)
org.apache.spark.sql.SparkSession$Builder$anonfun$6.apply(SparkSession.scala:860)
scala.Option.getOrElse(Option.scala:121)
org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:860)
HiveRead.doGet(HiveRead.java:30)
javax.servlet.http.HttpServlet.service(HttpServlet.java:635)
javax.servlet.http.HttpServlet.service(HttpServlet.java:742)
org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)

I am running this code:

SparkSession spark = SparkSession.builder()
        .appName("Java Spark SQL basic example")
        .enableHiveSupport()
        .config("spark.sql.warehouse.dir", "hdfs://saurab:9000/user/hive/warehouse")
        .config("mapred.input.dir.recursive", true)
        .config("hive.mapred.supports.subdirectories", true)
        .config("hive.vectorized.execution.enabled", true)
        .master("local")
        .getOrCreate();
response.getWriter().println("olo");

I looked at this question and, just to test, added

export SPARK_CLASSPATH=$CLASS_PATH:/home/saurab/hadoopec/spark/jars/hadoop-auth-2.7.3

since I was not using Shark. Though my aim is to access Hive tables and run SQL using Spark, I am running this code only to test; it has nothing to do with a Hive table, but even this much code throws the exception above. I guess it has something to do with authentication, but I can't figure out what. This is my pom.xml. Is there a better way to access Hive tables and run queries from servlets?
Labels:
- Apache Hive
- Apache Spark
07-12-2017
08:23 AM
I encountered a weird behaviour with Hive over JDBC. When I create a table like create table my_table (...other good stuff...), it gets created inside the database named default, but when I do create table mydb.mytable (...stuff...), it gets created inside mydb. I am using Spark and Hive. Previously I would do:

Connection con = DriverManager.getConnection("jdbc:hive2://localhost:10000/", "hiveuser", "hivepassword");
Statement stmt = con.createStatement();

Here I could specify the database in the URL:

DriverManager.getConnection("jdbc:hive2://localhost:10000/mydb", "hiveuser", "hivepassword");

but now I am using Spark, so I am doing:

SparkSession spark = SparkSession
.builder()
.appName("Java Spark SQL basic example")
.enableHiveSupport()
.config("spark.sql.warehouse.dir", "hdfs://saurab:9000/user/hive/warehouse")
.config("hive.metastore.warehouse.dir", "hdfs://saurab:9000/user/hive/warehouse")
.master("local")
.getOrCreate();

I see no configuration option to specify the database name; that is why I have to address the database explicitly for any CRUD. So how does Spark connect to Hive?
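For framing, what I have found so far (unverified on my setup) is that the database is not part of the session config at all; it is selected after the session is created. A minimal sketch in PySpark; if I am not mistaken, the Java API exposes the same calls:

from pyspark.sql import SparkSession

spark = SparkSession.builder.enableHiveSupport().getOrCreate()

# Option 1: switch the session's current database with plain SQL.
spark.sql("USE mydb")
spark.sql("CREATE TABLE mytable (id INT, name STRING)")  # lands in mydb

# Option 2: the catalog API does the same thing.
spark.catalog.setCurrentDatabase("mydb")
print(spark.catalog.currentDatabase())  # -> 'mydb'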
Labels:
- Apache Hive
07-11-2017
07:48 AM
@nkumar I am using HDFS. I also tried setting hive.execution.engine to spark and got the exact same error. Now I am thinking the problem is with YARN, but I can't find what it is.
07-10-2017
08:03 AM
@Sindhu I deleted nm-local-dir and restarted the NodeManager, but the error still persists.
(The issue seems to be on a specific NodeManager.) Where did you get that idea from? I have been looking at the log for an hour now; damn, I must have missed something important. This line only says exit, and the line above it says it is launching the container:

org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Exit code from container container_1499673613607_0001_02_000001 is : 1
07-10-2017
07:40 AM
I wanted to run Hive queries through JDBC, but I am getting

java.sql.SQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask

Then I looked at the NodeManager log. Here are some key lines to consider:

1) Container container_1499666177243_0001_02_000001 transitioned from RUNNING to EXITED_WITH_FAILURE
2) RESULT=FAILURE DESCRIPTION=Container failed with state: EXITED_WITH_FAILURE

And here is the complete stack trace:

2017-07-10 11:41:34,149 WARN org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Exception from container-launch with container ID: container_1499666177243_0001_02_000001 and exit code: 1
ExitCodeException exitCode=1:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:972)
at org.apache.hadoop.util.Shell.run(Shell.java:869)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1170)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:236)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:305)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:84)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
2017-07-10 11:41:34,152 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Exception from container-launch.
2017-07-10 11:41:34,152 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Container id: container_1499666177243_0001_02_000001
2017-07-10 11:41:34,152 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Exit code: 1
2017-07-10 11:41:34,152 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Stack trace: ExitCodeException exitCode=1:
2017-07-10 11:41:34,152 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.util.Shell.runCommand(Shell.java:972)
2017-07-10 11:41:34,152 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.util.Shell.run(Shell.java:869)
2017-07-10 11:41:34,152 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1170)
2017-07-10 11:41:34,152 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:236)
2017-07-10 11:41:34,152 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:305)
2017-07-10 11:41:34,152 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:84)
2017-07-10 11:41:34,152 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at java.util.concurrent.FutureTask.run(FutureTask.java:266)
2017-07-10 11:41:34,152 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
2017-07-10 11:41:34,153 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
2017-07-10 11:41:34,153 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at java.lang.Thread.run(Thread.java:748)
2017-07-10 11:41:34,153 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Container exited with a non-zero exit code 1
2017-07-10 11:41:34,156 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1499666177243_0001_02_000001 transitioned from RUNNING to EXITED_WITH_FAILURE
2017-07-10 11:41:34,156 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Cleaning up container container_1499666177243_0001_02_000001
2017-07-10 11:41:34,199 WARN org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=saurab OPERATION=Container Finished - Failed TARGET=ContainerImpl RESULT=FAILURE DESCRIPTION=Container failed with state: EXITED_WITH_FAILURE APPID=application_1499666177243_0001 CONTAINERID=container_1499666177243_0001_02_000001
2017-07-10 11:41:34,200 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting absolute path : /home/saurab/hadoopec/hadoop/tmp/hadoop-tmp-dir/nm-local-dir/usercache/saurab/appcache/application_1499666177243_0001/container_1499666177243_0001_02_000001
2017-07-10 11:41:34,202 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1499666177243_0001_02_000001 transitioned from EXITED_WITH_FAILURE to DONE
2017-07-10 11:41:34,203 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl: Removing container_1499666177243_0001_02_000001 from application application_1499666177243_0001
2017-07-10 11:41:34,204 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Stopping resource-monitoring for container_1499666177243_0001_02_000001
2017-07-10 11:41:34,204 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event CONTAINER_STOP for appId application_1499666177243_0001
2017-07-10 11:41:35,208 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Removed completed containers from NM context: [container_1499666177243_0001_02_000001]
2017-07-10 11:41:35,209 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl: Application application_1499666177243_0001 transitioned from RUNNING to APPLICATION_RESOURCES_CLEANINGUP
2017-07-10 11:41:35,210 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting absolute path : /home/saurab/hadoopec/hadoop/tmp/hadoop-tmp-dir/nm-local-dir/usercache/saurab/appcache/application_1499666177243_0001
2017-07-10 11:41:35,210 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event APPLICATION_STOP for appId application_1499666177243_0001
2017-07-10 11:41:35,211 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl: Application application_1499666177243_0001 transitioned from APPLICATION_RESOURCES_CLEANINGUP to FINISHED
2017-07-10 11:41:35,211 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.loghandler.NonAggregatingLogHandler: Scheduling Log Deletion for application: application_1499666177243_0001, with delay of 10800 seconds
2017-07-10 11:43:26,431 INFO SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for appattempt_1499666177243_0002_000002 (auth:SIMPLE)
2017-07-10 11:43:26,438 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Start request for container_1499666177243_0002_02_000001 by user saurab
2017-07-10 11:43:26,438 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Creating a new application reference for app application_1499666177243_0002
2017-07-10 11:43:26,439 INFO org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=saurab IP=10.10.10.149 OPERATION=Start Container Request TARGET=ContainerManageImpl RESULT=SUCCESS APPID=application_1499666177243_0002 CONTAINERID=container_1499666177243_0002_02_000001
2017-07-10 11:43:26,440 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl: Application application_1499666177243_0002 transitioned from NEW to INITING
2017-07-10 11:43:26,440 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl: Adding container_1499666177243_0002_02_000001 to application application_1499666177243_0002
2017-07-10 11:43:26,440 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl: Application application_1499666177243_0002 transitioned from INITING to RUNNING
2017-07-10 11:43:26,441 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1499666177243_0002_02_000001 transitioned from NEW to LOCALIZING
2017-07-10 11:43:26,441 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event CONTAINER_INIT for appId application_1499666177243_0002
2017-07-10 11:43:26,441 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event APPLICATION_INIT for appId application_1499666177243_0002
2017-07-10 11:43:26,442 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got APPLICATION_INIT for service mapreduce_shuffle
2017-07-10 11:43:26,442 INFO org.apache.hadoop.mapred.ShuffleHandler: Added token for job_1499666177243_0002
2017-07-10 11:43:26,444 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource hdfs://saurab:9000/tmp/hive/saurab/_tez_session_dir/fed51831-bf68-45b0-abea-11fb2b007c2f/.tez/application_1499666177243_0002/tez-conf.pb transitioned from INIT to DOWNLOADING
2017-07-10 11:43:26,444 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource hdfs://saurab:9000/tmp/hive/saurab/_tez_session_dir/fed51831-bf68-45b0-abea-11fb2b007c2f/.tez/application_1499666177243_0002/tez.session.local-resources.pb transitioned from INIT to DOWNLOADING
2017-07-10 11:43:26,446 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Created localizer for container_1499666177243_0002_02_000001
2017-07-10 11:43:26,448 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Writing credentials to the nmPrivate file /home/saurab/hadoopec/hadoop/tmp/hadoop-tmp-dir/nm-local-dir/nmPrivate/container_1499666177243_0002_02_000001.tokens. Credentials list:
2017-07-10 11:43:26,449 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Initializing user saurab
2017-07-10 11:43:26,450 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Copying from /home/saurab/hadoopec/hadoop/tmp/hadoop-tmp-dir/nm-local-dir/nmPrivate/container_1499666177243_0002_02_000001.tokens to /home/saurab/hadoopec/hadoop/tmp/hadoop-tmp-dir/nm-local-dir/usercache/saurab/appcache/application_1499666177243_0002/container_1499666177243_0002_02_000001.tokens
2017-07-10 11:43:26,450 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Localizer CWD set to /home/saurab/hadoopec/hadoop/tmp/hadoop-tmp-dir/nm-local-dir/usercache/saurab/appcache/application_1499666177243_0002 = file:/home/saurab/hadoopec/hadoop/tmp/hadoop-tmp-dir/nm-local-dir/usercache/saurab/appcache/application_1499666177243_0002
2017-07-10 11:43:26,643 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource hdfs://saurab:9000/tmp/hive/saurab/_tez_session_dir/fed51831-bf68-45b0-abea-11fb2b007c2f/.tez/application_1499666177243_0002/tez-conf.pb(->/home/saurab/hadoopec/hadoop/tmp/hadoop-tmp-dir/nm-local-dir/usercache/saurab/appcache/application_1499666177243_0002/filecache/10/tez-conf.pb) transitioned from DOWNLOADING to LOCALIZED
2017-07-10 11:43:26,675 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource hdfs://saurab:9000/tmp/hive/saurab/_tez_session_dir/fed51831-bf68-45b0-abea-11fb2b007c2f/.tez/application_1499666177243_0002/tez.session.local-resources.pb(->/home/saurab/hadoopec/hadoop/tmp/hadoop-tmp-dir/nm-local-dir/usercache/saurab/appcache/application_1499666177243_0002/filecache/11/tez.session.local-resources.pb) transitioned from DOWNLOADING to LOCALIZED
2017-07-10 11:43:26,676 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1499666177243_0002_02_000001 transitioned from LOCALIZING to LOCALIZED
2017-07-10 11:43:26,715 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1499666177243_0002_02_000001 transitioned from LOCALIZED to RUNNING
2017-07-10 11:43:26,715 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Starting resource-monitoring for container_1499666177243_0002_02_000001
2017-07-10 11:43:26,718 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: launchContainer: [nice, -n, 0, bash, /home/saurab/hadoopec/hadoop/tmp/hadoop-tmp-dir/nm-local-dir/usercache/saurab/appcache/application_1499666177243_0002/container_1499666177243_0002_02_000001/default_container_executor.sh]
2017-07-10 11:43:26,868 WARN org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Exit code from container container_1499666177243_0002_02_000001 is : 1
2017-07-10 11:43:26,868 WARN org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Exception from container-launch with container ID: container_1499666177243_0002_02_000001 and exit code: 1
ExitCodeException exitCode=1:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:972)
at org.apache.hadoop.util.Shell.run(Shell.java:869)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1170)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:236)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:305)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:84)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
2017-07-10 11:43:26,868 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Exception from container-launch.
2017-07-10 11:43:26,868 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Container id: container_1499666177243_0002_02_000001
2017-07-10 11:43:26,868 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Exit code: 1
2017-07-10 11:43:26,868 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Stack trace: ExitCodeException exitCode=1:
2017-07-10 11:43:26,868 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.util.Shell.runCommand(Shell.java:972)
2017-07-10 11:43:26,868 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.util.Shell.run(Shell.java:869)
2017-07-10 11:43:26,868 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1170)
2017-07-10 11:43:26,868 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:236)
2017-07-10 11:43:26,868 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:305)
2017-07-10 11:43:26,868 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:84)
2017-07-10 11:43:26,868 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at java.util.concurrent.FutureTask.run(FutureTask.java:266)
2017-07-10 11:43:26,868 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
2017-07-10 11:43:26,868 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
2017-07-10 11:43:26,868 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at java.lang.Thread.run(Thread.java:748)
2017-07-10 11:43:26,868 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Container exited with a non-zero exit code 1
2017-07-10 11:43:26,868 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1499666177243_0002_02_000001 transitioned from RUNNING to EXITED_WITH_FAILURE
2017-07-10 11:43:26,868 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Cleaning up container container_1499666177243_0002_02_000001
2017-07-10 11:43:26,898 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting absolute path : /home/saurab/hadoopec/hadoop/tmp/hadoop-tmp-dir/nm-local-dir/usercache/saurab/appcache/application_1499666177243_0002/container_1499666177243_0002_02_000001
2017-07-10 11:43:26,899 WARN org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=saurab OPERATION=Container Finished - Failed TARGET=ContainerImpl RESULT=FAILURE DESCRIPTION=Container failed with state: EXITED_WITH_FAILURE APPID=application_1499666177243_0002 CONTAINERID=container_1499666177243_0002_02_000001
2017-07-10 11:43:26,900 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1499666177243_0002_02_000001 transitioned from EXITED_WITH_FAILURE to DONE
2017-07-10 11:43:26,900 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl: Removing container_1499666177243_0002_02_000001 from application application_1499666177243_0002
2017-07-10 11:43:26,900 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Stopping resource-monitoring for container_1499666177243_0002_02_000001
2017-07-10 11:43:26,900 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event CONTAINER_STOP for appId application_1499666177243_0002
2017-07-10 11:43:27,904 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Removed completed containers from NM context: [container_1499666177243_0002_02_000001]
2017-07-10 11:43:27,905 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl: Application application_1499666177243_0002 transitioned from RUNNING to APPLICATION_RESOURCES_CLEANINGUP
2017-07-10 11:43:27,905 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting absolute path : /home/saurab/hadoopec/hadoop/tmp/hadoop-tmp-dir/nm-local-dir/usercache/saurab/appcache/application_1499666177243_0002
2017-07-10 11:43:27,905 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event APPLICATION_STOP for appId application_1499666177243_0002
2017-07-10 11:43:27,905 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl: Application application_1499666177243_0002 transitioned from APPLICATION_RESOURCES_CLEANINGUP to FINISHED
2017-07-10 11:43:27,905 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.loghandler.NonAggregatingLogHandler: Scheduling Log Deletion for application: application_1499666177243_0002, with delay of 10800 seconds
Surprisingly, this error only comes up when I SET hive.execution.engine=tez; it works fine with SET hive.execution.engine=mr.
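In case it helps, the way I am now trying to find the real failure reason (my assumption being that the container's own stderr, not the NodeManager log, says why exit code 1 happened) is to pull the aggregated container logs:

# Sketch: fetch the failed container's own stdout/stderr. Assumes YARN log
# aggregation is enabled; otherwise the files sit under the NodeManager's
# local userlogs directory.
import subprocess

subprocess.run(
    ["yarn", "logs", "-applicationId", "application_1499666177243_0002"],
    check=True,
)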
Labels:
- Apache YARN
06-29-2017
02:01 PM
1 Kudo
I was facing a "failed to replace bad datanode" error while appending new data to a file, and the suggested workaround was to set dfs.replication to less than 3, so I set it to 1 just to test it. But I still got the same error. I looked at the Hadoop web interface and, surprisingly, the replication factor was still 3; but when I ran hdfs dfs -setrep 1 <file_name>, the replication was set to 1 and I could append to the file. Why is this happening? Can I not set the default replication factor? I tried formatting the NameNode; still no change.

Here's my hdfs-site.xml:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
<property>
<name>dfs.support.append</name>
<value>true</value>
</property>
</configuration>
I tried to follow the steps from this question, but my replication factor is still 3. I am running Hadoop in a single-node cluster.
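For what it's worth, my current understanding (which may be wrong) is that dfs.replication is a client-side default applied when a file is created, so a client whose Configuration never loads this hdfs-site.xml still creates files with factor 3, and files that already exist keep their old factor until setrep changes them. A sketch of pinning the factor at create time with the Python hdfs package; host, user, and path are placeholders:

# Sketch: set the replication factor explicitly at create time so the
# client-side dfs.replication default no longer matters. Assumes WebHDFS
# is enabled and the `hdfs` PyPI package is installed.
from hdfs import InsecureClient

client = InsecureClient("http://namenode-host:50070", user="saurab")

# Create the file with replication 1 explicitly...
client.write("/data/test.csv", data="a,b,c\n", replication=1)

# ...and confirm what the NameNode actually recorded.
print(client.status("/data/test.csv")["replication"])  # expect 1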
Labels:
- Apache Hadoop