Member since: 09-24-2015
Posts: 816
Kudos Received: 488
Solutions: 189
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 3173 | 12-25-2018 10:42 PM |
| | 14195 | 10-09-2018 03:52 AM |
| | 4764 | 02-23-2018 11:46 PM |
| | 2481 | 09-02-2017 01:49 AM |
| | 2914 | 06-21-2017 12:06 AM |
05-10-2016
02:48 PM
Mark, here is what I did: starting from a CSV file as input, I created a Hive text table, loaded the CSV file into the text table, created an Avro table "STORED AS AVRO", and inserted all records from the text table into the Avro table. I also tried to create an Avro table using a schema file. Then I tested external tables using the Avro table's data and schema file. I'll create an article and post it this week.
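In HiveQL, the steps above look roughly like this (table names, columns, and paths are made up for illustration):

```sql
-- Sketch of the CSV -> text table -> Avro table workflow; all names are illustrative.
CREATE TABLE my_text_tbl (id INT, name STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- Load the CSV file (already in HDFS) into the text table.
LOAD DATA INPATH '/user/example/input.csv' INTO TABLE my_text_tbl;

-- Create the Avro-backed table with the same columns.
CREATE TABLE my_avro_tbl (id INT, name STRING)
STORED AS AVRO;

-- Copy all records from the text table into the Avro table.
INSERT INTO TABLE my_avro_tbl SELECT * FROM my_text_tbl;
```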
05-10-2016
02:23 PM
You are missing the "LOCATION" for your external table. Upload your Avro file somewhere in HDFS and provide the directory where the file is located as your LOCATION. There is also no need to declare name, time, etc.; they are given in your avsc file. See my answer. After that, try "SELECT * FROM tweets LIMIT 10;".
05-10-2016
02:19 PM
No, those are the defaults; they are valid even if you haven't specified them in your hdfs-site file. No need to change anything, just go ahead and install SmartSense.
05-10-2016
02:10 PM
As far as I know there is no need to change anything in your cluster in order to install SmartSense. The properties you mentioned have completely acceptable defaults:
dfs.namenode.safemode.threshold-pct=0.999f
dfs.namenode.replication.min=1
I usually set threshold-pct to 0.99f, but 0.999 is fine as well. One of the purposes of SmartSense is to give you advice on your setup, so you can just go ahead, install it, and try it.
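For reference, this is what those two properties would look like if you did set them explicitly in hdfs-site.xml (normally unnecessary, since these are the defaults):

```xml
<!-- Illustrative hdfs-site.xml fragment; these are the default values,
     so you normally don't need to set them at all. -->
<property>
  <name>dfs.namenode.safemode.threshold-pct</name>
  <value>0.999f</value>
</property>
<property>
  <name>dfs.namenode.replication.min</name>
  <value>1</value>
</property>
```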
05-10-2016
11:08 AM
If you already have your Avro file and Avro schema, upload them to HDFS and use:
CREATE EXTERNAL TABLE my_avro_tbl
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION '/user/...'
TBLPROPERTIES ('avro.schema.url'='hdfs://name-node.fqdn:8020/user/.../schema.avsc');
If your Avro file already contains the schema in its header you can just say CREATE EXTERNAL TABLE tbl_name(... declarations ...) STORED AS AVRO LOCATION '...'; without specifying the schema. I have been testing this over the last few days and can confirm that it works on HDP-2.4 (Hive-1.2) for all scalar types like string, int, float, double, boolean, etc. If you are using complex types (like union) it might not work.
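A concrete sketch of that shorthand form (table name, columns, and path are illustrative):

```sql
-- Shorthand form when the Avro file carries its own schema in the header;
-- the declared columns must still match the schema in the file.
CREATE EXTERNAL TABLE my_avro_tbl (id INT, name STRING)
STORED AS AVRO
LOCATION '/user/example/avro_data';
```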
05-10-2016
08:26 AM
No problem, it can happen to the best of us. Also glad to hear that it worked for you, because many people have been having trouble with Oozie Sqoop actions recently. By the way, for me it worked without specifying "archive" in the workflow file; I just uploaded the mysql-connector jar to the Oozie Sqoop share lib. Happy sqooping!
05-10-2016
07:02 AM
1 Kudo
Can you try running in debug mode, "hbase shell -d", and see if there are any clues? Is your ZooKeeper up and running? If you are using Ambari, try the ZK and HBase service checks. hbase shell is supposed to work even if HBase is stopped (but you won't be able to see any tables).
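The checks above can be run from the command line like this (hostname and port are placeholders for your setup):

```shell
# Run the HBase shell in debug mode to surface connection errors.
hbase shell -d

# Quick ZooKeeper health check: a healthy server answers "imok".
echo ruok | nc zk-host.fqdn 2181
```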
05-10-2016
01:28 AM
If you have only one master, no HA for any components, and a few slaves, I'd use only one ZK on the master. If you have 2 masters and, say, 5-6 slaves, you can configure NN and RM HA and install 3 ZKs: one on each of the two masters and one on one of the slaves. So you can decide based on HA in the cluster: with no HA, 1 ZK is enough; with HA, such as NN HA, use 3 ZKs.
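For the 3-node case, the ensemble definition in zoo.cfg would look something like this (hostnames are made up for illustration):

```properties
# Illustrative zoo.cfg fragment for a 3-node ZooKeeper ensemble:
# two servers on the masters, one on a slave.
server.1=master1.fqdn:2888:3888
server.2=master2.fqdn:2888:3888
server.3=slave1.fqdn:2888:3888
```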
05-10-2016
01:01 AM
3 Kudos
Your sqoop command begins with "sqoop", which is why you get: "No such sqoop tool: sqoop. See 'sqoop help'. Intercepting System.exit(1)". Remove "sqoop" from the command so that it starts with "import", and retry.
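For example, inside an Oozie Sqoop action the command should not repeat the "sqoop" prefix (connection string, credentials, and table name below are placeholders):

```
Wrong (inside the Oozie Sqoop action's command):
  sqoop import --connect jdbc:mysql://db-host/mydb --username user --table mytable

Right (drop the leading "sqoop"):
  import --connect jdbc:mysql://db-host/mydb --username user --table mytable
```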
05-09-2016
01:05 PM
Please try what I suggested: replace VARCHAR with STRING in the Hive declaration. Also, make sure the number and types of the fields match those in your DB. I believe your Avro files are correct, but Hive cannot read them using your table definition. It would also be a good idea to test everything on a smaller table, if you have one, covering the same types, in your case string, int, and boolean. And, by the way, Hive is not using the .avsc file here.
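The suggested change looks like this (table and column names are illustrative, not from the original question):

```sql
-- Before (fails to read Avro-backed data in some Hive versions):
--   CREATE EXTERNAL TABLE t (name VARCHAR(50), cnt INT, flag BOOLEAN) STORED AS AVRO ...
-- After: use STRING instead of VARCHAR in the declaration.
CREATE EXTERNAL TABLE t (name STRING, cnt INT, flag BOOLEAN)
STORED AS AVRO
LOCATION '/user/example/avro_data';
```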