Member since 11-21-2013
13 Posts
1 Kudos Received
0 Solutions
11-14-2016
03:04 AM
I'm fairly sure this configuration should go not into the Balancer's "Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml", but into the DataNode "Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml" for all DataNodes, and the DataNodes should be restarted after the configuration change.
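For illustration only, a minimal sketch of the kind of snippet that goes into the DataNode safety valve; the property shown (dfs.datanode.balance.bandwidthPerSec, a DataNode-side balancer bandwidth setting) and its value are assumptions, not taken from this thread:

<!-- Hypothetical example; property name and value are assumed, not from this thread -->
<property>
  <name>dfs.datanode.balance.bandwidthPerSec</name>
  <value>10485760</value>
</property>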
07-22-2015
12:34 AM
Thanks for your reply! We are generating HFiles and then doing this:

LoadIncrementalHFiles loader = new LoadIncrementalHFiles(conf);
loader.doBulkLoad(new Path(hfilePath), table);

Running 'chown' on the files before the bulkload is one way to do that, thanks for the information! I just wonder if it's the best way. Since not all of our clusters use secure mode, is it possible to use SecureBulkLoad on a non-secure cluster?
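As a rough illustration of the chown approach mentioned above, here is a minimal sketch that hands ownership of the generated HFiles to the hbase user before calling doBulkLoad. It assumes the HBase 0.98-era API shipped with CDH 5.3 and that the workflow user is allowed to change ownership in HDFS (for example, an HDFS superuser); the class name, argument handling, and hbase:hbase owner/group are made up for the example.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;

public class ChownThenBulkLoad {

  // Recursively hand ownership of the generated HFiles to hbase:hbase so the
  // HMaster can later archive/compact them without AccessControlException.
  static void chownRecursively(FileSystem fs, Path path) throws Exception {
    for (FileStatus status : fs.listStatus(path)) {
      if (status.isDirectory()) {
        chownRecursively(fs, status.getPath());
      } else {
        fs.setOwner(status.getPath(), "hbase", "hbase");
      }
    }
    fs.setOwner(path, "hbase", "hbase");
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Path hfileDir = new Path(args[0]);                       // directory produced by the HFile-generating job
    HTable table = new HTable(conf, TableName.valueOf(args[1]));
    try {
      chownRecursively(FileSystem.get(conf), hfileDir);      // chown before the bulk load
      LoadIncrementalHFiles loader = new LoadIncrementalHFiles(conf);
      loader.doBulkLoad(hfileDir, table);
    } finally {
      table.close();
    }
  }
}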
07-21-2015
07:03 AM
Hello, we are on the CDH 5.3.3 distribution. I have the following error messages repeating in the HBase master logs:

2015-07-21 16:36:20,964 WARN org.apache.hadoop.hbase.backup.HFileArchiver: Failed to archive class org.apache.hadoop.hbase.backup.HFileArchiver$FileablePath, file:hdfs://server:8020/hbase/data/default/olap_cube/4dc9d309b1a90895260662f493799e45/metric/fcd52caf2c364269939dff6c4962e76b_SeqId_168130436_ on try #2
org.apache.hadoop.security.AccessControlException: Permission denied: user=hbase, access=WRITE, inode="/hbase/data/default/olap_cube/4dc9d309b1a90895260662f493799e45/metric/fcd52caf2c364269939dff6c4962e76b_SeqId_168130436_":justAnotherUser:hadoop:-rw-r--r--
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkFsPermission(DefaultAuthorizationProvider.java:257)
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:238)
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkPermission(DefaultAuthorizationProvider.java:151)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:138)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6287)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6269)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPathAccess(FSNamesystem.java:6194)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setTimesInt(FSNamesystem.java:2159)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setTimes(FSNamesystem.java:2137)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.setTimes(NameNodeRpcServer.java:991)
at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.setTimes(AuthorizationProviderProxyClientProtocol.java:576)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.setTimes(ClientNamenodeProtocolServerSideTranslatorPB.java:885)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:587)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
at sun.reflect.GeneratedConstructorAccessor14.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
at org.apache.hadoop.hdfs.DFSClient.setTimes(DFSClient.java:2756)
at org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1304)
at org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1300)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.setTimes(DistributedFileSystem.java:1300)
at org.apache.hadoop.hbase.util.FSUtils.renameAndSetModifyTime(FSUtils.java:1678)
at org.apache.hadoop.hbase.backup.HFileArchiver$File.moveAndClose(HFileArchiver.java:586)
at org.apache.hadoop.hbase.backup.HFileArchiver.resolveAndArchiveFile(HFileArchiver.java:425)
at org.apache.hadoop.hbase.backup.HFileArchiver.resolveAndArchive(HFileArchiver.java:335)
at org.apache.hadoop.hbase.backup.HFileArchiver.resolveAndArchive(HFileArchiver.java:347)
at org.apache.hadoop.hbase.backup.HFileArchiver.resolveAndArchive(HFileArchiver.java:284)
at org.apache.hadoop.hbase.backup.HFileArchiver.archiveRegion(HFileArchiver.java:137)
at org.apache.hadoop.hbase.backup.HFileArchiver.archiveRegion(HFileArchiver.java:75)
at org.apache.hadoop.hbase.master.CatalogJanitor.cleanParent(CatalogJanitor.java:333)
at org.apache.hadoop.hbase.master.CatalogJanitor.scan(CatalogJanitor.java:254)
at org.apache.hadoop.hbase.master.CatalogJanitor.chore(CatalogJanitor.java:101)
at org.apache.hadoop.hbase.Chore.run(Chore.java:87)
at java.lang.Thread.run(Thread.java:724)

I change the ownership of these files manually to the 'hbase' user, but after a while the errors return, just with different files (for other HBase tables too). I suspect this problem occurs because HBase wants to compact these files but the permissions are incorrect. Our Oozie workflows containing the bulkload run as 'justAnotherUser' instead of the 'hbase' user, because we prepare the data and load it into HBase in the same workflow, and we require that the running user is 'justAnotherUser'. Did I guess it right? What should I do to load data into HBase correctly? Should I separate the data preparation and the bulkload into different workflows? Maybe it is possible to specify a user for the bulkload alone? Or is it a completely different issue?
Labels:
- Apache HBase
08-25-2014
03:46 AM
The thing is that this query only selects one partition. Since the Parquet table is identical (except for the file format), we are expecting only one partition to be written to as well. It's a shame that so much time has passed since the first answer; I am not able to check whether the SHUFFLE and NOSHUFFLE keywords will help in this situation, but I will accept this answer.
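For reference, a sketch of where Impala's plan hint would sit in the INSERT from the original question; the [SHUFFLE] / [NOSHUFFLE] placement follows Impala's documented hint syntax, but as noted above it has not been verified against this particular workload.

set PARQUET_COMPRESSION_CODEC=gzip;
-- [NOSHUFFLE] would go in the same position as [SHUFFLE] below
INSERT INTO TABLE t2 PARTITION(dt) [SHUFFLE]
SELECT * FROM t WHERE dt='2014-05-27-00';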
05-30-2014
06:48 AM
1 Kudo
Platform info: CDH 4.6.0 (without CM). Server version: impalad version 1.3.1-cdh4 RELEASE (build 907481bf45b248a7bb3bb077d54831a71f484e5f)

Query that hangs:

set PARQUET_COMPRESSION_CODEC=gzip;
INSERT INTO TABLE t2 PARTITION(dt) SELECT * FROM t WHERE dt='2014-05-27-00';

Info about the tables: t is in Parquet format without any compression, ~9.9 GB of data; t2 has its schema copied from t and is also in Parquet format. I have inserted the same data into another table with set PARQUET_COMPRESSION_CODEC=snappy; and it worked well, but gzip compression somehow hangs the whole query.

The query profile log hangs on this: http://pastebin.com/MWcpUQiA

impala-server.log has this to say:

FSDataOutputStream#close error:
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): No lease on /user/hive/warehouse/db.db/table/.impala_insert_staging/ad4de1e28a843230_b5aeba0046576e96/.ad4de1e28a843230-b5aeba0046576e97_1592503968_dir/dt=2014-05-27-00/ad4de1e28a843230-b5aeba0046576e97_1536053034_data.2: File does not exist. Holder DFSClient_NONMAPREDUCE_-1006280791_1 does not have any open files.
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2543)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:2360)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2273)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:501)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:299)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44954)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1752)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1748)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1746)
at org.apache.hadoop.ipc.Client.call(Client.java:1238)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
at $Proxy9.addBlock(Unknown Source)
at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
at $Proxy9.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:291)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1177)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1030)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:488)
E0530 15:40:03.365649 19261 impala-beeswax-server.cc:380] unknown query id: ad4de1e28a843230:b5aeba0046576e96

P.S. The insert code button opens an empty popup.
03-03-2014
12:46 AM
If by "catalog service's hive-site.xml" you mean /etc/imapala/hive-site.xml then I just tried it, restarted catalog service and it still gives me the same error message. And that message pops in an instance after pressing "enter" to submit query (describe tableX), like there is no timeout set.
02-26-2014
06:47 AM
[hostname:21000] > describe tableX;
Query: describe tableX
ERROR: AnalysisException: Failed to load metadata for table: default.tableX
CAUSED BY: TableLoadingException: TableLoadingException: Failed to load metadata for table: tableX
CAUSED BY: TTransportException: java.net.SocketTimeoutException: Read timed out
CAUSED BY: SocketTimeoutException: Read timed out

When I involve any table with many partitions (like tableX, with 6k+ partitions) in any query, it always fails with a "Read timed out" error. Other queries run fine. I know that having a lot of partitions is not a good idea, but since doing anything about the partition count isn't an option, is there anything I can do for now? I have tried REFRESH tableX; still, I can't even execute the describe tableX query. Using Impala 1.2.3 + CDH 4.4.0.
Labels: