Explorer
Posts: 10
Registered: ‎07-04-2016

data load to hdfs using parquet file


Hi,

 

We are trying to load data into HDFS as Parquet files, taking Avro as the input format.
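For context, the write itself is a short Spark SQL step. A minimal sketch, assuming the spark-avro data source is on the classpath; "nameservice1" and the paths below are placeholders, not our real values:

```java
// Sketch only: convert Avro input to Parquet on HDFS with Spark SQL.
// Assumes the com.databricks:spark-avro package is on the classpath;
// "nameservice1" and both paths are placeholders.
SQLContext sqlContext = SparkContextFactory.getSQLContextInstance();
DataFrame input = sqlContext.read()
        .format("com.databricks.spark.avro")
        .load("hdfs://nameservice1/user/syds/input/");
input.write()
        .mode(SaveMode.Overwrite)
        .parquet("hdfs://nameservice1/user/syds/output/");
```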

 

The cluster is HA-enabled and we specify the nameservice name for HDFS, but the HDFS write call goes to the standby NameNode and the operation fails with the error below.
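For reference, client-side HA resolution normally depends on hdfs-site.xml entries along these lines (a sketch; "nameservice1", "nn1"/"nn2", and the host names are placeholders for our actual values):

```xml
<!-- Sketch of client-side HA configuration in hdfs-site.xml.
     "nameservice1", "nn1"/"nn2" and the hosts are placeholders. -->
<property>
  <name>dfs.nameservices</name>
  <value>nameservice1</value>
</property>
<property>
  <name>dfs.ha.namenodes.nameservice1</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.nameservice1.nn1</name>
  <value>namenode1.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.nameservice1.nn2</name>
  <value>namenode2.example.com:8020</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.nameservice1</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
```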

 

Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (HDFS_DELEGATION_TOKEN token 165655 for syxxxx) can't be found in cache
at org.apache.hadoop.ipc.Client.call(Client.java:1466)
at org.apache.hadoop.ipc.Client.call(Client.java:1403)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
at com.sun.proxy.$Proxy22.mkdirs(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.mkdirs(ClientNamenodeProtocolTranslatorPB.java:539)
at sun.reflect.GeneratedMethodAccessor51.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
at com.sun.proxy.$Proxy23.mkdirs(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:3075)
at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:3042)
at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:956)
at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:952)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(DistributedFileSystem.java:952)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:945)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1856)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:606)
at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:282)
at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:648)
at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:124)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:56)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:144)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:523)
at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61)
at com.barclays.rna.riskpub.radial.sparkcontext.SparkContextFactory.createSQLContext(SparkContextFactory.java:139)
at com.barclays.rna.riskpub.radial.sparkcontext.SparkContextFactory.getSQLContextInstance(SparkContextFactory.java:239)
at com.barclays.rna.riskpub.riskstore.parquetnormaliser.SparkParquetBuilder$1.run(SparkParquetBuilder.java:166)
... 3 more

at com.barcap.radial.avro.exporter.AvroExporter.completeValid(AvroExporter.java:316)
at com.barcap.radial.avro.exporter.AvroExporter.complete(AvroExporter.java:324)
at com.barcap.radial.actor.impl.SliceImpl.processComplete(SliceImpl.java:777)
at com.barcap.radial.actor.impl.SliceImpl.process(SliceImpl.java:328)
at com.barcap.radial.RadialProcessorRunnable.run(RadialProcessorRunnable.java:39)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
2016-07-04 01:02:47,740 - hdfs.WebHDFS - ERROR Processor : 8877 - Error writing to name node: http://:50070/webhdfs/v1/user/syds/sourcesystemtype=INSIGHT/sourcesysteminstance=uat2b/sessionid=07bda546-a703-4a14-b36d-0812d78601c3/table=impliedmarketdata/0.avro?op=CREATE&overwrite=false
Jul 04, 2016 1:02:48 AM com.solacesystems.jcsmp.impl.SessionModeSupportClient getMessageProducer
INFO: Closing existing XMLMessageProducer, new instance requested.

 

 

Please advise.

Cloudera Employee
Posts: 7
Registered: ‎02-08-2015

Re: data load to hdfs using parquet file

Hi LeoS,

 

Can you advise what command you're running that results in this error?
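Also, since the stack shows an InvalidToken during Spark application submission on what appears to be a Kerberized cluster, it would help to know whether you pass a keytab at submit time. For long-running jobs, Spark can re-obtain HDFS delegation tokens when submitted with a principal and keytab, along these lines (the principal, keytab path, class, and jar names are placeholders):

```shell
# Sketch: submitting with a principal/keytab so Spark can re-obtain
# HDFS delegation tokens; principal, keytab path, class and jar
# names are placeholders.
spark-submit \
  --master yarn-client \
  --principal syxxxx@EXAMPLE.COM \
  --keytab /etc/security/keytabs/syxxxx.keytab \
  --class com.example.YourMainClass \
  your-application.jar
```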

 

Thanks!
