Member since: 07-16-2015
Posts: 177
Kudos Received: 28
Solutions: 19
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 14053 | 11-14-2017 01:11 AM
 | 60526 | 11-03-2017 06:53 AM
 | 4298 | 11-03-2017 06:18 AM
 | 13507 | 09-12-2017 05:51 AM
 | 1982 | 09-08-2017 02:50 AM
12-09-2016
08:30 AM
2 Kudos
Yes there is. In Cloudera Manager:
- Go to the Key-Value Store Indexer configuration > Service-Wide > Advanced
- In the "Key-Value Store Indexer Service Environment Advanced Configuration Snippet (Safety Valve)", add the following line:
HBASE_INDEXER_CLASSPATH=<your_classpath>
Restart the service.
Regards,
Mathieu
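For example (the path here is hypothetical; point it at wherever your extra JARs actually live), the safety-valve entry could look like:

HBASE_INDEXER_CLASSPATH=/opt/myapp/indexer-libs/*

The indexer processes read the variable from their environment at startup, which is why the restart is needed.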
10-11-2016
12:25 AM
Hi, You're right, this is not something you will want to automate. It is only a workaround for when you can't afford a restart of Hive (for example, when other queries not related to the particular lock are being processed). And yes, I was referring to the professional support, where you can submit an issue you're facing for Cloudera to analyze. Good luck.
10-10-2016
12:45 AM
Hi, When it happens again you can work around the issue by deleting the lock inside ZooKeeper. This will be easier and quicker than restarting Hive, but it will not solve the underlying issue. For this kind of tricky issue I would open a ticket with Cloudera support. Regards, Mathieu
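As a sketch (the parent znode is whatever hive.zookeeper.namespace is set to on your cluster, often hive_zookeeper_namespace_hive on CDH; the database and table names are placeholders):

zookeeper-client -server <zk_host>:2181
# list the LOCK-... znodes currently held on the table
ls /hive_zookeeper_namespace_hive/<db>/<table>
# recursively remove the stale lock entries
rmr /hive_zookeeper_namespace_hive/<db>/<table>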
10-05-2016
05:37 AM
1 Kudo
Hi, Maybe you could share the Java source code doing the connection? Here is a really small working sample:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;
import org.apache.log4j.Logger;

public class ManageHive {

    private static String driverName = "org.apache.hive.jdbc.HiveDriver";
    private static Logger logger = Logger.getLogger(ManageHive.class);

    // LoadProperties is a small helper of ours (not shown) that reads the
    // JDBC URL, e.g. jdbc:hive2://<host>:10000, from a properties file.
    public static Connection getConnection(LoadProperties prop, String user)
            throws ClassNotFoundException, SQLException {
        String hiveJdbc = prop.getPropertyByName("hive_jdbc");
        try {
            // Make sure the Hive JDBC driver is on the classpath.
            Class.forName(driverName);
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
            throw e;
        }
        // Connect to the "extraction" database with an empty password.
        return DriverManager.getConnection(hiveJdbc + "/extraction", user, "");
    }

    public static void execSql(LoadProperties prop, String user, String sql)
            throws SQLException, ClassNotFoundException {
        Connection conn = getConnection(prop, user);
        Statement stmt = conn.createStatement();
        int result = stmt.executeUpdate(sql);
        if (result == Statement.EXECUTE_FAILED) {
            throw new SQLException("Execution error.");
        }
    }
}
09-08-2016
03:03 AM
After some more testing I found that the following command works:

split '<namespace>:<table_name>', 'NEW_SPLIT_VALUE'

I just need to call it once per "pre-split" value I need.
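As a sketch (the split values are placeholders), the per-value calls can be scripted from the OS shell so an empty table ends up pre-split at every boundary in one pass:

for v in value1 value2 value3; do
  echo "split '<namespace>:<table_name>', '$v'" | hbase shell
done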
09-08-2016
02:36 AM
Hi, I'm using CDH 5.5.2, so my version of HBase is not the same. I did try the | trick in the hbase shell command. I'm trying to "pre-split" an existing empty table, but the following command does not seem to be correct for this version of HBase:

alter '<namespace>:<table_name>', { SPLITS => ['value1','value2'] }

I get the following message, so I guess the SPLITS part is not taken into account:

Unknown argument ignored: SPLITS
Updating all regions with the new schema...
1/2 regions updated.
2/2 regions updated.
Done.
0 row(s) in 2.4780 seconds

Does anyone know the syntax for pre-splitting an already existing (empty) table? CDH 5.5.2 ships HBase 1.0.0, I think.
09-06-2016
03:02 AM
Ok, I managed to make an HBase bulk load using Hive. There is a wiki article on that: https://cwiki.apache.org/confluence/display/Hive/HBaseBulkLoad The procedure described there does not work as-is; I guess it was written for older versions of Hive and HBase. With some work to adapt the procedure, I managed to load an HBase table using completebulkload. Here is a working sample:
sudo -u hdfs hdfs dfs -put -f /opt/cloudera/parcels/CDH/lib/hive/lib/hbase-client.jar /user/hive/
sudo -u hdfs hdfs dfs -put -f /opt/cloudera/parcels/CDH/lib/hive/lib/hbase-server.jar /user/hive/
sudo -u hdfs hdfs dfs -put -f /opt/cloudera/parcels/CDH/lib/hive/lib/hbase-common.jar /user/hive/
sudo -u hdfs hdfs dfs -put -f /opt/cloudera/parcels/CDH/lib/hive/lib/hbase-protocol.jar /user/hive/
sudo -u hdfs hdfs dfs -put -f /opt/cloudera/parcels/CDH/lib/hive/lib/hive-hbase-handler.jar /user/hive/
# These JARs need to be added to HiveServer2 with the property hive.aux.jars.path
sudo -u hdfs hdfs dfs -chmod 554 /user/hive/*.jar
sudo -u hdfs hdfs dfs -chown hive:hive /user/hive/*.jar
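# Count the source rows; the total drives the split-point computation below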
total=`beeline -n sp35517 -p "" -u "jdbc:hive2://dn060001:10000/default" --outputformat=csv2 --silent=true -e "SELECT count(*) FROM default.operation_client_001;"`
total=`echo $total | cut -d ' ' -f 2- `
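# Keep one sorted row id every round(total/12) rows; these become the region/HFile split keys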
hdfs dfs -rm -r /tmp/hb_range_keys
hdfs dfs -mkdir /tmp/hb_range_keys
beeline -n sp35517 -p "" -u "jdbc:hive2://dn060001:10000/default" -e "CREATE EXTERNAL TABLE IF NOT EXISTS default.hb_range_keys(transaction_id_range_start string) row format serde 'org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe' stored as inputformat 'org.apache.hadoop.mapred.TextInputFormat' outputformat 'org.apache.hadoop.hive.ql.io.HiveNullValueSequenceFileOutputFormat' location '/tmp/hb_range_keys';"
beeline -n sp35517 -p "" -u "jdbc:hive2://dn060001:10000/default" -e "add jar /opt/cloudera/parcels/CDH/lib/hive/lib/hive-contrib.jar; create temporary function row_sequence as 'org.apache.hadoop.hive.contrib.udf.UDFRowSequence'; INSERT OVERWRITE TABLE default.hb_range_keys SELECT a.id FROM ( SELECT row_sequence() as num, t.id FROM default.operation_client_001 t order by t.id) a WHERE ( a.num % ( round( ${total} / 12) ) ) = 0;"
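# total.order.partitioner.path must point to a single file, so copy the generated key file to /tmp/hb_range_key_list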
hdfs dfs -rm -r /tmp/hb_range_key_list;
hdfs dfs -cp /tmp/hb_range_keys/* /tmp/hb_range_key_list;
hdfs dfs -rm -r /tmp/hbsort;
hdfs dfs -mkdir /tmp/hbsort;
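# Hive table whose output format writes HFiles for column family 'ti' under /tmp/hbsort/ti instead of regular rows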
beeline -n sp35517 -p "" -u "jdbc:hive2://dn060001:10000/default" -e "CREATE TABLE IF NOT EXISTS default.hbsort (id string, id_courtier string, cle_recherche string, cle_recherche_contrat string, nom_sous string, nom_d_usage string, prenom_sous string, date_naissance_sous string, id_contrat string, num_contrat string, produit string, fiscalite string, dt_maj string, souscription timestamp, epargne double, dt_ope_ct timestamp, type_ope_ct string, montant string, frais string, dt_ope_ct_export string, souscription_export string, montant_export string, frais_export string, montant_encours_gbl_ct_export string ) STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.hbase.HiveHFileOutputFormat' TBLPROPERTIES ('hfile.family.path' = '/tmp/hbsort/ti');"
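# The set commands are session-scoped, so they go in the same beeline call as the INSERT; each of the 12 reducers writes one sorted HFile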
beeline -n sp35517 -p "" -u "jdbc:hive2://dn060001:10000/default" -e "set mapred.reduce.tasks=12; set hive.mapred.partitioner=org.apache.hadoop.mapred.lib.TotalOrderPartitioner; set total.order.partitioner.path=/tmp/hb_range_key_list; set hfile.compression=gz; INSERT OVERWRITE TABLE hbsort select t.* from default.operation_client_001 t cluster by t.id;"
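# The bulk load runs as the hbase user, so make the generated HFiles accessible to it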
sudo -u hdfs hdfs dfs -chgrp -R hbase /tmp/hbsort
sudo -u hdfs hdfs dfs -chmod -R 775 /tmp/hbsort
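# completebulkload moves the HFiles into the regions of the target HBase table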
export HADOOP_CLASSPATH=`hbase classpath`
hadoop jar /opt/cloudera/parcels/CDH/lib/hive/lib/hbase-server.jar completebulkload /tmp/hbsort default_operation_client_001 c
08-17-2016
12:26 AM
2 Kudos
Hi, I think Sentry checks whether your user has a specific permission on the "LOCATION" URI you have provided (and this is not related to HDFS ACLs). Try to grant that permission too, in Sentry. For example: GRANT ALL ON URI 'hdfs://hdfscluster/user/testuser/part' TO ROLE <a_role>; Regards, Mathieu
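If the role does not exist yet, a fuller sequence (role, group, and host names here are hypothetical) run through beeline by a user with Sentry admin rights would be:

beeline -u "jdbc:hive2://<hs2_host>:10000/default" -e "CREATE ROLE etl_role; GRANT ALL ON URI 'hdfs://hdfscluster/user/testuser/part' TO ROLE etl_role; GRANT ROLE etl_role TO GROUP testgroup;"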
08-16-2016
02:05 AM
For those interested: the issue was confirmed by support, with no workaround until the JIRA ticket listed is fixed.