Member since: 03-04-2019
Posts: 59
Kudos Received: 24
Solutions: 5
My Accepted Solutions
Title | Views | Posted |
---|---|---|
| | 5264 | 07-26-2018 08:10 PM |
| | 5922 | 07-24-2018 09:49 PM |
| | 2862 | 10-08-2017 08:00 PM |
| | 2438 | 07-31-2017 03:17 PM |
| | 834 | 12-05-2016 11:24 PM |
07-19-2017
11:39 PM
1 Kudo
All the dependency libs appear to be present on the master as well as on all the RegionServer nodes, so I just added the parameter 'hbase.table.sanity.checks' to hbase-site.xml and set it to 'false' in Ambari. After restarting the HBase Master and all the RegionServers, Phoenix started working.
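For reference, the same property change can also be scripted; a minimal sketch using Ambari's bundled configs.sh helper, where the Ambari host, cluster name, and credentials are placeholders (the property can just as well be added under Custom hbase-site in the Ambari UI):

```bash
# Hedged sketch: push hbase.table.sanity.checks=false into the hbase-site config type
# via the configs.sh script shipped with Ambari (host, cluster, and credentials below
# are placeholders; restart HBase Master and all RegionServers afterwards).
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p admin \
  set ambari.example.com MyCluster hbase-site \
  "hbase.table.sanity.checks" "false"
```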
07-19-2017
10:54 PM
1 Kudo
I seem to be running into a connection issue between Phoenix and HBase; below is the error:

[root@dsun5 bin]# ./sqlline.py dsun0.field.hortonworks.com:2181:/hbase-unsecure
Setting property: [incremental, false]
Setting property: [isolation, TRANSACTION_READ_COMMITTED]
issuing: !connect jdbc:phoenix:dsun0.field.hortonworks.com:2181:/hbase-unsecure none none org.apache.phoenix.jdbc.PhoenixDriver
Connecting to jdbc:phoenix:dsun0.field.hortonworks.com:2181:/hbase-unsecure
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/2.6.1.0-129/phoenix/phoenix-4.7.0.2.6.1.0-129-client.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.6.1.0-129/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
17/07/19 22:46:09 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/07/19 22:46:11 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
Error: org.apache.hadoop.hbase.DoNotRetryIOException: Class org.apache.phoenix.coprocessor.MetaDataEndpointImpl cannot be loaded Set hbase.table.sanity.checks to false at conf or table descriptor if you want to bypass sanity checks
at org.apache.hadoop.hbase.master.HMaster.warnOrThrowExceptionForFailure(HMaster.java:1878)
at org.apache.hadoop.hbase.master.HMaster.sanityCheckTableDescriptor(HMaster.java:1746)
at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1652)
at org.apache.hadoop.hbase.master.MasterRpcServices.createTable(MasterRpcServices.java:483)
at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:59846)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2141)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:187)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:167) (state=08000,code=101)

Note: the Phoenix server jar (phoenixserver.jar) has been manually installed on the HBase Master.
Labels:
- Apache HBase
- Apache Phoenix
07-19-2017
02:40 PM
1 Kudo
@sysadmin CreditVidya There are several approaches I can think of that might help:
1. It appears the MR intermediate data is not being purged properly by Hadoop itself. You can manually delete the files/folders under the directories configured in mapreduce.cluster.local.dir after the MR jobs complete, say anything older than 3 days; a cron job works well for that (see the sketch below).
2. Make sure to implement the cleanup() method in each mapper/reducer class, which cleans up local resources and aggregates before the task exits.
3. Run the HDFS balancer regularly, normally weekly or bi-weekly, so that some nodes don't end up with much more HDFS data than the others (MR jobs always try to use the local copy of the data first), and always keep an eye on 'disk usage' for each host in Ambari.
Hope that helps.
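A minimal sketch of points 1 and 3 above; the local directory path is a placeholder, so substitute whatever your mapreduce.cluster.local.dir (or yarn.nodemanager.local-dirs) actually points to:

```bash
# Point 1: delete intermediate files older than 3 days under the local scratch
# directories on each worker node (put this in a daily cron job).
# /hadoop/yarn/local is a placeholder path.
find /hadoop/yarn/local -type f -mtime +3 -delete

# Point 3: run the HDFS balancer as the hdfs user; -threshold 10 allows per-node
# utilization to deviate up to 10% from the cluster average.
sudo -u hdfs hdfs balancer -threshold 10
```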
07-18-2017
09:31 PM
2 Kudos
One approach you can take is to disable Hive impersonation by setting 'hive.server2.enable.doAs=false' in the Hive configs. That way the Hive-related HDFS folders are accessible only to the 'hive' user, and other users can't reach the HDFS files directly. In your case, I assume you have doAs set to true; then the user running a Hive query needs permissions defined for both HDFS and Hive in Ranger, which becomes a problem if you have many tables, because all your tables are managed under the hive/warehouse directory rather than the users' home folders, and for each table you would need to grant the user access to the table location through an HDFS policy in Ranger. Even with 'doAs' set to false, you will still see the actual end user in the Ranger audit logs; it's just that the HDFS-related tasks run as the 'hive' user.
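As a quick way to confirm which mode you are actually in, you can check the rendered hive-site.xml on the HiveServer2 host; a minimal sketch, assuming the standard HDP config path:

```bash
# Show the current hive.server2.enable.doAs value as rendered by Ambari
# (/etc/hive/conf is the usual HDP location; adjust if your layout differs).
grep -A1 "hive.server2.enable.doAs" /etc/hive/conf/hive-site.xml
```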
07-18-2017
05:34 PM
Did you set up the proper Hive resource access policies for the users/groups in Ranger? Here is a good tutorial on how to set them up: https://hortonworks.com/blog/best-practices-for-hive-authorization-using-apache-ranger-in-hdp-2-2/
07-18-2017
03:10 PM
1 Kudo
You should set the permission of the Hive warehouse directory to 700 instead of 000, so that normal users are unable to access the secured tables directly, and let Ranger control the Hive policies. In addition, make sure 'hive.warehouse.subdir.inherit.perms=true', which ensures that newly created table directories inherit the 700 permission. Hope that helps.
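A minimal sketch of the permission change, assuming the default HDP warehouse location /apps/hive/warehouse (check hive.metastore.warehouse.dir for the actual path on your cluster):

```bash
# Restrict the warehouse directory to its owner (the hive user) so other users
# cannot read table data directly; Ranger then governs access through Hive.
# /apps/hive/warehouse is the HDP default path.
sudo -u hdfs hdfs dfs -chmod 700 /apps/hive/warehouse
```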
07-18-2017
01:51 PM
@sysadmin CreditVidya Assuming you are referring to 'Non DFS Used:' on the NameNode UI page: that figure is the total across the whole cluster, so it could be in the TBs depending on the size of your total storage. Also, the number means 'how much of the configured DFS capacity is occupied by non-DFS use'; here is a good article about it: https://stackoverflow.com/questions/18477983/what-exactly-non-dfs-used-means Hope that helps.
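For reference, as the linked article explains, the figure works out to roughly Non DFS Used = Configured Capacity - DFS Used - DFS Remaining, and those inputs can be pulled straight from an HDFS report:

```bash
# Cluster-wide capacity summary (Configured Capacity, DFS Used, DFS Remaining);
# the per-DataNode sections further down also list Non DFS Used for each node.
sudo -u hdfs hdfs dfsadmin -report | head -n 20
```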
07-17-2017
05:10 PM
1 Kudo
As you have heterogeneous worker nodes, I'd recommend setting up two separate host config groups first, then managing the HDFS settings for each group separately. Here is how to set up config groups in Ambari: https://docs.hortonworks.com/HDPDocuments/Ambari-2.5.1.0/bk_ambari-operations/content/using_host_config_groups.html For each host group, you can control the non-DFS headroom by setting a proper value for 'dfs.datanode.du.reserved' (in bytes per volume); normally it should be 20%-25% of the disk storage (see the sketch below). Also keep in mind that non-DFS usage can grow beyond what is reserved and eat into DFS storage, so regularly delete logs and other non-HDFS data that take up a lot of local space; I normally use a command like 'du -hsx * | sort -rh | head -10' to identify the top 10 largest folders.
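A minimal sketch of sizing the reservation, where /grid/0 is a placeholder mount point for one data volume; the result is a candidate dfs.datanode.du.reserved value (per volume) for that host config group:

```bash
# Print 20% of the volume's total size in bytes as a starting point for
# dfs.datanode.du.reserved; repeat per data volume / per host config group.
df -B1 /grid/0 | awk 'NR==2 {printf "%.0f\n", $2 * 0.20}'
```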