Member since 09-29-2015
28 Posts
14 Kudos Received
3 Solutions
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 1687 | 01-03-2017 10:36 PM
 | 4214 | 12-30-2016 12:05 AM
 | 5610 | 07-14-2016 06:51 PM
06-04-2018
04:37 PM
I have recently run into multiple issues where ORC files on Hive are not getting compacted. Several parameters are required to enable concatenation on ORC:

SET hive.merge.tezfiles=true;
SET hive.execution.engine=tez;
SET hive.merge.mapredfiles=true;
SET hive.merge.size.per.task=256000000;
SET hive.merge.smallfiles.avgsize=256000000;
SET hive.merge.mapfiles=true;
SET hive.merge.orcfile.stripe.level=true;
SET mapreduce.input.fileinputformat.split.minsize=256000000;
SET mapreduce.input.fileinputformat.split.maxsize=256000000;
SET mapreduce.input.fileinputformat.split.minsize.per.node=256000000;
SET mapreduce.input.fileinputformat.split.minsize.per.rack=256000000;

For an external table, temporarily switch it to managed, run the concatenation, and then switch it back:

ALTER TABLE <table_name> SET TBLPROPERTIES('EXTERNAL'='FALSE');
ALTER TABLE <table_name> PARTITION (file_date_partition='<partition_info>') CONCATENATE;
ALTER TABLE <table_name> SET TBLPROPERTIES('EXTERNAL'='TRUE');

mapreduce.input.fileinputformat.split.minsize.per.node
Specifies the minimum number of bytes that each input split should contain within a data node. The default value is 0, meaning that there is no minimum size.

mapreduce.input.fileinputformat.split.minsize.per.rack
Specifies the minimum number of bytes that each input split should contain within a single rack. The default value is 0, meaning that there is no minimum size.

Do not concatenate ORC files that were generated by Spark: there is a known issue, HIVE-17403, and the operation is therefore disabled in later versions. An example is a table/partition containing 2 different files (part-m-00000_1417075294718 and part-m-00018_1417075294718). Although these are completely different files, Hive thinks they were generated by separate instances of the same task (because of failure or speculative execution). Hive will end up removing one of these files.
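As a rough illustration of the sequence above, the following Python sketch assembles the same statements into one script for a given table and partition. The function name `build_concat_script`, the table name `sales_orc`, and the partition value are hypothetical and not from the post; the settings and their values are copied from it. The sketch only builds text and does not validate identifiers or run anything against Hive.

```python
# Minimal sketch: build the HiveQL concatenation script described in the post.
# Assumption: the caller passes trusted table/partition names (no escaping is done).

# Settings and values exactly as listed in the post.
CONCAT_SETTINGS = [
    ("hive.merge.tezfiles", "true"),
    ("hive.execution.engine", "tez"),
    ("hive.merge.mapredfiles", "true"),
    ("hive.merge.size.per.task", "256000000"),
    ("hive.merge.smallfiles.avgsize", "256000000"),
    ("hive.merge.mapfiles", "true"),
    ("hive.merge.orcfile.stripe.level", "true"),
    ("mapreduce.input.fileinputformat.split.minsize", "256000000"),
    ("mapreduce.input.fileinputformat.split.maxsize", "256000000"),
    ("mapreduce.input.fileinputformat.split.minsize.per.node", "256000000"),
    ("mapreduce.input.fileinputformat.split.minsize.per.rack", "256000000"),
]


def build_concat_script(table, partition_column, partition_value, external=True):
    """Return the SET statements plus the ALTER TABLE ... CONCATENATE sequence."""
    lines = [f"SET {key}={value};" for key, value in CONCAT_SETTINGS]
    if external:
        # Temporarily mark the external table as managed, as in the post.
        lines.append(f"ALTER TABLE {table} SET TBLPROPERTIES('EXTERNAL'='FALSE');")
    lines.append(
        f"ALTER TABLE {table} PARTITION ({partition_column}='{partition_value}') CONCATENATE;"
    )
    if external:
        # Restore the external flag afterwards.
        lines.append(f"ALTER TABLE {table} SET TBLPROPERTIES('EXTERNAL'='TRUE');")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical table and partition value, for illustration only.
    print(build_concat_script("sales_orc", "file_date_partition", "2018-06-01"))
```

The generated text could be saved to a file and reviewed before being run with whatever Hive client is in use.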
05-25-2018
06:12 PM
3 Kudos
PROBLEM
Users are able to drop tables in Hive even though they are not the table owners. Metastore server security needs to be enabled in order to use storage-based authorization.

SOLUTION
To enable metastore security, set the following parameters:

hive.metastore.pre.event.listeners
Set to org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener
This turns on metastore-side security.

hive.security.metastore.authorization.manager
Set to org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider
This tells Hive which metastore-side authorization provider to use. The default setting uses DefaultHiveMetastoreAuthorizationProvider, which implements the standard Hive grant/revoke model. To use an HDFS permission-based model (recommended) for authorization, use StorageBasedAuthorizationProvider as shown here.

hive.security.metastore.authenticator.manager
Set to org.apache.hadoop.hive.ql.security.HadoopDefaultMetastoreAuthenticator

hive.security.metastore.authorization.auth.reads
When this is set to true, Hive metastore authorization also checks for read access. It is set to true by default. Read authorization checks were introduced in Hive 0.14.0.
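To make the settings above easier to review, here is a small Python sketch that renders them as `<property>` entries in the usual Hadoop-style `<configuration>` XML layout. It assumes the values end up in hive-site.xml (on a managed cluster they would normally be entered through the management UI instead); the helper name `render_properties` is hypothetical, while the property names and values are the ones from the post.

```python
# Minimal sketch: render the metastore security settings as Hadoop-style XML.
import xml.etree.ElementTree as ET

# Property names and values as given in the post.
METASTORE_SECURITY = {
    "hive.metastore.pre.event.listeners":
        "org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener",
    "hive.security.metastore.authorization.manager":
        "org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider",
    "hive.security.metastore.authenticator.manager":
        "org.apache.hadoop.hive.ql.security.HadoopDefaultMetastoreAuthenticator",
    "hive.security.metastore.authorization.auth.reads": "true",
}


def render_properties(props):
    """Return a <configuration> document with one <property> per setting."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    # ET.indent is only available on Python 3.9+; skip pretty-printing otherwise.
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(render_properties(METASTORE_SECURITY))
```

The metastore typically has to be restarted for such changes to take effect.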
04-18-2017
06:38 PM
@sanket patel intermittent ZooKeeper issues can lead to the cleaner chores failing. See https://issues.apache.org/jira/browse/HBASE-15234
03-28-2017
12:14 AM
I would recommend splitting up the file and then running your MR job on each of the resulting files.
02-10-2017
05:58 PM
@Subramanian Santhanam can you please add more details with screenshots and logs?
02-05-2017
05:57 PM
Can you provide the container logs?
01-05-2017
03:53 AM
Please check whether the heap configuration follows the recommendation here: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_installing_manually_book/content/ref-80953924-1cbf-4655-9953-1e744290a6c3.1.html
01-03-2017
10:36 PM
@Mahen Jay can you please elaborate on the use case here? Do you already have 3 ZooKeeper nodes and are looking to add more at a later stage? If so, then yes, you can always add more ZK nodes after the cluster is created. Or do you mean skipping the ZooKeeper nodes for now? If so, I do not think that is possible, because ZooKeeper is a service that other services depend on: you need to have ZK nodes at the time of cluster creation. You can always move the ZK nodes to other machines at a later stage. Please let me know if the use case is different.
01-03-2017
10:24 PM
@David Sheard great that it worked. Can you please accept the answer 🙂
01-03-2017
10:22 PM
1 Kudo
When a user tries to decommission or recommission nodes from the Ambari UI, nothing happens in the UI and the operation appears not to go through.

Ambari-server Logs:
WARN [C3P0PooledConnectionPoolManager[identityToken->2s8bny9j1mxgjkn9oj5d8|79679221]-HelperThread-#0] StatementUtils:223 - Statement close FAILED.
com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'OPTION SQL_SELECT_LIMIT=DEFAULT' at line 1
at sun.reflect.GeneratedConstructorAccessor198.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
at com.mysql.jdbc.Util.getInstance(Util.java:386)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1052)

ROOT CAUSE
The default JDBC driver installed with Ambari doesn't support MySQL 5.6.25.

NOTE BEFORE THE WORKAROUND CAN BE FOLLOWED
Make sure to delete the triggers from the Ambari DB before following the steps in the workaround section. Otherwise, if too many triggers are waiting in the DB to fire once the connector version is fixed, this might result in an outage.

Ambari DB tables to check:
qrtz_calendars
qrtz_fired_triggers
qrtz_job_details
qrtz_locks
qrtz_paused_trigger_grps
qrtz_scheduler_state
qrtz_simple_triggers
qrtz_simprop_triggers
qrtz_triggers

WORKAROUND
Update the MySQL connector from http://mvnrepository.com/artifact/mysql/mysql-connector-java. Once it is updated, run:
ambari-server setup --jdbc-db=mysql --jdbc-driver=/usr/share/java/mysql-connector-java.jar
where --jdbc-driver is the path to the new driver.
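One way to check the tables listed above is to look at their row counts before touching anything. The Python sketch below only generates the `SELECT COUNT(*)` statements for that check; it does not connect to the database or delete anything. The schema name `ambari` and the helper name `count_queries` are assumptions for illustration, so substitute the actual Ambari database name.

```python
# Minimal sketch: generate row-count queries for the scheduler tables named in the post.
# Assumption: the Ambari database/schema is called "ambari".

QUARTZ_TABLES = [
    "qrtz_calendars",
    "qrtz_fired_triggers",
    "qrtz_job_details",
    "qrtz_locks",
    "qrtz_paused_trigger_grps",
    "qrtz_scheduler_state",
    "qrtz_simple_triggers",
    "qrtz_simprop_triggers",
    "qrtz_triggers",
]


def count_queries(tables, schema="ambari"):
    """Return one read-only COUNT(*) query per table."""
    return [
        f"SELECT '{table}' AS table_name, COUNT(*) AS row_count FROM {schema}.{table};"
        for table in tables
    ]


if __name__ == "__main__":
    for query in count_queries(QUARTZ_TABLES):
        print(query)
```

Tables that report a large number of rows are the ones to review before updating the connector.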