Member since: 07-31-2013
Posts: 1924
Kudos Received: 462
Solutions: 311
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 1968 | 07-09-2019 12:53 AM |
|  | 11846 | 06-23-2019 08:37 PM |
|  | 9132 | 06-18-2019 11:28 PM |
|  | 10109 | 05-23-2019 08:46 PM |
|  | 4566 | 05-20-2019 01:14 AM |
09-09-2016
09:34 AM
Hi, I installed a new cluster from scratch using the m4 instance type and could not reproduce the error. Thanks.
09-08-2016
04:07 AM
I was using this repo: http://archive.cloudera.com/kafka/redhat/7/x86_64/kafka/cloudera-kafka.repo, which contains an outdated path (baseurl=http://archive.cloudera.com/kafka/redhat/7/x86_64/kafka/1/). I changed it to point to 2.0.2 and got Kafka 0.9. Thanks!
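For reference, a minimal sketch of the fix, assuming the repo file sits at the default /etc/yum.repos.d/cloudera-kafka.repo location:
# Point the baseurl at the 2.0.2 repo instead of the stale /1/ path
sudo sed -i 's#/kafka/1/#/kafka/2.0.2/#' /etc/yum.repos.d/cloudera-kafka.repo
sudo yum clean all    # drop cached metadata so yum re-reads the new baseurl
sudo yum install kafka    # now resolves to the Kafka 0.9 packages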
09-06-2016
03:02 AM
Ok, I managed to do an HBase bulk load using Hive. There is a wiki article on that: https://cwiki.apache.org/confluence/display/Hive/HBaseBulkLoad The procedure described there does not work as-is; I guess it was written for older versions of Hive and HBase. With some work to adapt the procedure, I managed to load an HBase table using completebulkload. Here is a working sample:
# Publish the HBase and Hive HBase-handler JARs to HDFS so HiveServer2 can use them
sudo -u hdfs hdfs dfs -put -f /opt/cloudera/parcels/CDH/lib/hive/lib/hbase-client.jar /user/hive/
sudo -u hdfs hdfs dfs -put -f /opt/cloudera/parcels/CDH/lib/hive/lib/hbase-server.jar /user/hive/
sudo -u hdfs hdfs dfs -put -f /opt/cloudera/parcels/CDH/lib/hive/lib/hbase-common.jar /user/hive/
sudo -u hdfs hdfs dfs -put -f /opt/cloudera/parcels/CDH/lib/hive/lib/hbase-protocol.jar /user/hive/
sudo -u hdfs hdfs dfs -put -f /opt/cloudera/parcels/CDH/lib/hive/lib/hive-hbase-handler.jar /user/hive/
# These JARs need to be added to HiveServer2 with the property hive.aux.jars.path
sudo -u hdfs hdfs dfs -chmod 554 /user/hive/*.jar
sudo -u hdfs hdfs dfs -chown hive:hive /user/hive/*.jar
# Row count of the source table; it drives the split-point computation below
total=`beeline -n sp35517 -p "" -u "jdbc:hive2://dn060001:10000/default" --outputformat=csv2 --silent=true -e "SELECT count(*) FROM default.operation_client_001;"`
# Strip the csv2 header so only the numeric count remains
total=`echo $total | cut -d ' ' -f 2- `
# Recreate the staging directory for the sampled range keys
hdfs dfs -rm -r /tmp/hb_range_keys
hdfs dfs -mkdir /tmp/hb_range_keys
# Table holding one sampled row id per region boundary (every total/12th id, for 12 reducers)
beeline -n sp35517 -p "" -u "jdbc:hive2://dn060001:10000/default" -e "CREATE EXTERNAL TABLE IF NOT EXISTS default.hb_range_keys(transaction_id_range_start string) row format serde 'org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe' stored as inputformat 'org.apache.hadoop.mapred.TextInputFormat' outputformat 'org.apache.hadoop.hive.ql.io.HiveNullValueSequenceFileOutputFormat' location '/tmp/hb_range_keys';"
beeline -n sp35517 -p "" -u "jdbc:hive2://dn060001:10000/default" -e "add jar /opt/cloudera/parcels/CDH/lib/hive/lib/hive-contrib.jar; create temporary function row_sequence as 'org.apache.hadoop.hive.contrib.udf.UDFRowSequence'; INSERT OVERWRITE TABLE default.hb_range_keys SELECT a.id FROM ( SELECT row_sequence() as num, t.id FROM default.operation_client_001 t order by t.id) a WHERE ( a.num % ( round( ${total} / 12) ) ) = 0;"
# TotalOrderPartitioner expects a single file, so copy the sample out of the directory
hdfs dfs -rm -r /tmp/hb_range_key_list;
hdfs dfs -cp /tmp/hb_range_keys/* /tmp/hb_range_key_list;
# Staging directory that will receive the generated HFiles
hdfs dfs -rm -r /tmp/hbsort;
hdfs dfs -mkdir /tmp/hbsort;
# Staging table whose output format writes HFiles for column family 'ti'
beeline -n sp35517 -p "" -u "jdbc:hive2://dn060001:10000/default" -e "set mapred.reduce.tasks=12; set hive.mapred.partitioner=org.apache.hadoop.mapred.lib.TotalOrderPartitioner; set total.order.partitioner.path=/tmp/hb_range_key_list; set hfile.compression=gz; CREATE TABLE IF NOT EXISTS default.hbsort (id string, id_courtier string, cle_recherche string, cle_recherche_contrat string, nom_sous string, nom_d_usage string, prenom_sous string, date_naissance_sous string, id_contrat string, num_contrat string, produit string, fiscalite string, dt_maj string, souscription timestamp, epargne double, dt_ope_ct timestamp, type_ope_ct string, montant string, frais string, dt_ope_ct_export string, souscription_export string, montant_export string, frais_export string, montant_encours_gbl_ct_export string ) STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.hbase.HiveHFileOutputFormat' TBLPROPERTIES ('hfile.family.path' = '/tmp/hbsort/ti');"
# Populate the staging table, clustering by row key so each reducer writes sorted HFiles
beeline -n sp35517 -p "" -u "jdbc:hive2://dn060001:10000/default" -e "INSERT OVERWRITE TABLE hbsort select t.* from default.operation_client_001 t cluster by t.id;"
# Hand the generated HFiles over to the hbase user before loading them
sudo -u hdfs hdfs dfs -chgrp -R hbase /tmp/hbsort
sudo -u hdfs hdfs dfs -chmod -R 775 /tmp/hbsort
# Bulk-load the HFiles into the target HBase table
export HADOOP_CLASSPATH=`hbase classpath`
hadoop jar /opt/cloudera/parcels/CDH/lib/hive/lib/hbase-server.jar completebulkload /tmp/hbsort default_operation_client_001 c
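Not part of the wiki procedure, but a quick sanity check after the load is to count the rows from the hbase shell (table name as loaded above):
# Row count should match the $total computed earlier
echo "count 'default_operation_client_001'" | hbase shell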
08-29-2016
08:56 PM
I'd recommend looking for WARN-or-higher log messages containing the string "Checkpoint" to find out why it frequently aborts mid-way. There were some timeout-related issues in the very early CDH4 period, but I've not seen this issue recur with CDH5, even for very large fsimages.
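As a concrete starting point, something like the following grep narrows it down quickly (a sketch; the log path and file-name pattern assume a default CM-managed CDH layout):
# Log location and naming are assumptions for a CM-managed cluster; adjust to yours
grep -iE "(WARN|ERROR|FATAL).*checkpoint" /var/log/hadoop-hdfs/*SECONDARYNAMENODE*.log*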
08-29-2016
02:44 AM
Thanks for this clarification. I got my answer.
08-24-2016
09:04 PM
1 Kudo
Adding onto @dice's post, this WARN does not impair any functionality your HDFS is currently performing. It can be ignored until you are able to grab the bug fix via an update to 5.7.2 or higher. See also this past community topic on the same question: http://community.cloudera.com/t5/Storage-Random-Access-HDFS/quot-Report-from-the-DataNode-datanodeUuid-is-unsorted-quot/m-p/41943#M2188
08-22-2016
11:24 PM
1 Kudo
Hi Harsh, the issue was gone once I packaged the JRE's sunjce_provider.jar into the lib folder. Thanks. BR, Paul
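For anyone hitting the same thing: the fix amounts to shipping the JCE provider JAR inside the job JAR's lib/ directory. A sketch, assuming a JDK whose JRE keeps the provider under jre/lib/ext, and a hypothetical job JAR named myjob.jar:
# Both the jre/lib/ext location and myjob.jar are assumptions, not from the thread
cp "$JAVA_HOME/jre/lib/ext/sunjce_provider.jar" lib/
jar uf myjob.jar lib/sunjce_provider.jar    # add it under lib/ inside the job JAR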
08-19-2016
09:58 PM
Thanks, good to know.
08-16-2016
01:53 AM
Thanks, you are right. I just discovered that there were two kadmin packages installed for some unknown reason. Maybe it is because I once changed the PATH variable and installed kadmin in a different location from the default path set in CM. I solved the problem by correcting the PATH variable and reinstalling the package. Once again, thank you for your help.
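For reference, a quick way to spot this kind of duplicate (a sketch for RHEL/CentOS; substitute the actual paths that which reports):
which -a kadmin    # list every kadmin reachable on the PATH
rpm -qf /usr/bin/kadmin    # ask which package owns each hit; path here is an example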