Member since: 05-19-2017
Posts: 5
Kudos Received: 0
Solutions: 0
11-06-2018 05:50 PM
Found the details here: https://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-hdfs/HdfsQuotaAdminGuide.html

NSQUOTA: The name quota is a hard limit on the number of file and directory names in the tree rooted at that directory.
  1 = a quota of one forces a directory to remain empty (e.g. /apps/hive/warehouse)
  -1 = no quota assigned
  null = always null for the root directory, i.e. /
  0 = ???

DSQUOTA: The space quota is a hard limit on the number of bytes used by files in the tree rooted at that directory.
  0 = a quota of zero still permits files to be created, but no blocks can be added to the files
  -1 = no quota assigned
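These values can be set and inspected with the standard quota commands; a minimal sketch (the paths here are just examples):

  # Set a name quota of 1 (forces the directory to stay empty):
  hdfs dfsadmin -setQuota 1 /user/example/locked

  # Set a space quota of 10 GB:
  hdfs dfsadmin -setSpaceQuota 10g /user/example/data

  # Clear the quotas again (the fields revert to -1, i.e. no quota assigned):
  hdfs dfsadmin -clrQuota /user/example/locked
  hdfs dfsadmin -clrSpaceQuota /user/example/data

  # Show the QUOTA and SPACE_QUOTA (i.e. NSQUOTA/DSQUOTA) for a directory:
  hdfs dfs -count -q /user/example/data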
11-06-2018 12:03 AM
I have taken a dump of the fsimage using the OIV tool, and the NSQUOTA and DSQUOTA fields always contain one of the following values:

NSQUOTA: 1, null, -1, 0
DSQUOTA: -1, 0

What do NSQUOTA and DSQUOTA represent, and how should these values be interpreted?
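For context, a dump with these fields can be produced by the offline image viewer's Delimited processor; a sketch, where the fsimage filename is just an example:

  # One row per inode; the delimited output includes NSQUOTA and DSQUOTA columns:
  hdfs oiv -p Delimited -i fsimage_0000000000000000042 -o fsimage.tsv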
08-05-2018 02:51 AM
I am getting the following error when trying to read a file from HDFS using Spark from a Zeppelin notebook:

org.apache.hadoop.ipc.RemoteException: token (HDFS_DELEGATION_TOKEN token 294488 for zeppelin) can't be found in cache
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1554)
at org.apache.hadoop.ipc.Client.call(Client.java:1498)
at org.apache.hadoop.ipc.Client.call(Client.java:1398)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
at com.sun.proxy.$Proxy12.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:818)
at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:291)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:203)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:185)
at com.sun.proxy.$Proxy13.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2165)
at org.apache.hadoop.hdfs.DistributedFileSystem$26.doCall(DistributedFileSystem.java:1442)
at org.apache.hadoop.hdfs.DistributedFileSystem$26.doCall(DistributedFileSystem.java:1438)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1438)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1447)
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$14.apply(DataSource.scala:381)
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$14.apply(DataSource.scala:370)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.immutable.List.foreach(List.scala:381)
at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
at scala.collection.immutable.List.flatMap(List.scala:344)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:370)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:152)
at org.apache.spark.sql.DataFrameReader.csv(DataFrameReader.scala:415)
at org.apache.spark.sql.DataFrameReader.csv(DataFrameReader.scala:352)
... 64 elided
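For what it's worth, the read itself is plain; the bottom of the trace (DataFrameReader.csv) corresponds to something like the following, shown here via spark-shell rather than Zeppelin (the HDFS path is an assumption):

  # Hypothetical reproduction of the failing call; it fails during
  # resolveRelation (see FileSystem.exists in the trace), before any action runs:
  echo 'spark.read.csv("hdfs:///tmp/example.csv")' | spark-shell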
10-13-2017 03:41 PM
When we ingest the following data:

2015-11-01 21:10:00.1
2015-11-01 21:10:00.1190011
2015-11-01 21:10:00.12
2015-11-01 21:10:00.123
2015-11-01 21:10:00.1234
2015-11-01 21:10:00.12345
2015-11-01 21:10:00.123456789
2015-11-01 21:10:00.490155
2015-11-01 21:10:00.1234567890
2015-11-01 21:10:00.1234567890123456789
When I run the select below, I get NULL for the last two rows instead of the extra digits simply being truncated. This is HDP 2.6.1 with Hive 1.2.1000. (A sketch of the assumed setup follows the output.)

select * from test_timestamp;
+--------------------------------+--+
| test_timestamp.col |
+--------------------------------+--+
| 2015-11-01 21:10:00.1 |
| 2015-11-01 21:10:00.1190011 |
| 2015-11-01 21:10:00.12 |
| 2015-11-01 21:10:00.123 |
| 2015-11-01 21:10:00.1234 |
| 2015-11-01 21:10:00.12345 |
| 2015-11-01 21:10:00.123456789 |
| 2015-11-01 21:10:00.490155 |
| NULL |
| NULL |
+--------------------------------+--+
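For reference, a minimal sketch of the assumed setup (the table name test_timestamp and column col come from the output above; the DDL and input filename are hypothetical):

  # Hypothetical single-column text table holding the sample values:
  hive -e "CREATE TABLE test_timestamp (col TIMESTAMP) STORED AS TEXTFILE;"

  # Load the sample rows (filename assumed):
  hive -e "LOAD DATA LOCAL INPATH 'timestamps.txt' INTO TABLE test_timestamp;"

  # Rows with more than nine fractional digits come back as NULL:
  hive -e "SELECT * FROM test_timestamp;"

Hive timestamps support at most nanosecond (nine-digit) fractional precision, which lines up with the ten- and nineteen-digit rows parsing to NULL rather than being truncated.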
05-19-2017 11:47 PM
Can you elaborate on why you had to create a custom processor? The GitHub project has no documentation or README.