Member since: 07-31-2013
Posts: 1924
Kudos Received: 462
Solutions: 311

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1984 | 07-09-2019 12:53 AM |
| | 11940 | 06-23-2019 08:37 PM |
| | 9196 | 06-18-2019 11:28 PM |
| | 10189 | 05-23-2019 08:46 PM |
| | 4610 | 05-20-2019 01:14 AM |
02-28-2016
10:55 PM
1 Kudo
Let's say you want to execute "script.sh":

1. If you have script.sh inside your WF/lib/ path on HDFS, you only need:
   <exec>script.sh</exec>
2. If you have script.sh at an arbitrary path on HDFS, you need:
   <exec>script.sh</exec>
   <file>/path/to/script.sh#script.sh</file>
3. Using the form below together with (1) is redundant, but the second form is useful when you want to invoke the script under a different name:
   <exec>script.sh</exec>
   <file>script.sh#script.sh</file>

   <exec>linked-script-name.sh</exec>
   <file>original-script-name.sh#linked-script-name.sh</file>
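As a hedged sketch of case (2) above, the shell action might look like the following inside a workflow (the action name and the `${jobTracker}`/`${nameNode}` property names are placeholders, not from the original post):

```xml
<action name="shell-node">
  <shell xmlns="uri:oozie:shell-action:0.2">
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <!-- exec names the script to run; file localizes it from HDFS
         under the name after the '#' -->
    <exec>script.sh</exec>
    <file>/path/to/script.sh#script.sh</file>
    <capture-output/>
  </shell>
  <ok to="end"/>
  <error to="fail"/>
</action>
```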
02-28-2016
08:53 AM
1 Kudo
Note: CDH3 is long past its supported lifetime. That said, the Netezza JDBC driver is worth trying with the CDH3 Sqoop version. I don't recall whether it worked without a specialised connector, but the generic JDBC connector should likely get it through.
02-28-2016
08:19 AM
3 Kudos
What's the DESCRIBE output of your avro_test table? If it includes a VOID column type, HCatalog currently does not support that.
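As a hedged illustration (the table and column names here are hypothetical), a VOID column typically appears when a table is created from a bare NULL, and an explicit cast avoids it:

```sql
-- A bare NULL in a CTAS gives the column type "void",
-- which HCatalog cannot handle
CREATE TABLE avro_bad AS SELECT name, NULL AS extra FROM src;

-- Casting the NULL gives the column a concrete type instead
CREATE TABLE avro_ok AS SELECT name, CAST(NULL AS STRING) AS extra FROM src;
```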
02-28-2016
07:10 AM
There's no way to do this today, aside from scripting it: run the regular SHOW GRANT commands, parse the output into a file, and then load that into a table.
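As a minimal sketch of the parsing step, assuming tab-separated SHOW GRANT output captured to a string (the exact column layout varies by Hive version and client, so verify the field positions against your own output before relying on this):

```python
def parse_show_grant(raw):
    """Parse tab-separated SHOW GRANT output lines into grant dicts.

    Assumed column order (verify for your Hive version):
    database, table, partition, column, principal_name,
    principal_type, privilege, grant_option, grant_time, grantor
    """
    grants = []
    for line in raw.strip().splitlines():
        fields = line.split("\t")
        if len(fields) < 7:
            continue  # skip headers or malformed lines
        grants.append({
            "database": fields[0],
            "table": fields[1],
            "principal": fields[4],
            "privilege": fields[6],
        })
    return grants

# Fabricated sample line for illustration
sample = "default\tweb_logs\t\t\tanalyst\tROLE\tSELECT\tfalse\t0\thive"
print(parse_show_grant(sample))
```

From there the dicts can be written out as a delimited file and loaded into a table with a plain LOAD DATA statement.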
02-28-2016
05:47 AM
I don't see the tab character. There are, however, lots of interleaved null characters in the file, and the closest I can guess your delimiter to be is a double null-byte sequence: \0\0. If so, you may want to reformat the file before using it with Hive. Something like the following in Python can do the cleanup, assuming your delimiter really is a double sequence of null bytes:

data = data.replace('\0\0', '\t').replace('\0', '')
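A slightly fuller sketch of the same cleanup, working on raw bytes (the double-null delimiter is still a guess, and the sample row below is fabricated for illustration):

```python
def clean_null_delimited(data: bytes) -> bytes:
    """Replace the assumed \x00\x00 field delimiter with tabs,
    then drop any stray single null bytes."""
    return data.replace(b"\x00\x00", b"\t").replace(b"\x00", b"")

# Fabricated example row: two fields separated by a double null byte
row = b"alice\x00\x00accounting\n"
print(clean_null_delimited(row))
```

Run over the whole file opened in binary mode, this yields a tab-delimited file that matches a FIELDS TERMINATED BY '\t' table definition.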
02-28-2016
02:15 AM
1 Kudo
CDH Hive sources are available either via GitHub at https://github.com/cloudera/hive/tree/cdh5.4.5-release/, or in tarball form under http://archive.cloudera.com/cdh5/cdh/5/. You can use the "patch" command to apply the latest patch from the JIRA, and then use "mvn" to build the updated jars. If you are a Cloudera Enterprise subscriber, please log a case with Support for any patch requests instead: custom-patching a component will render it unsupported. P.S. ACID features are currently not supported in CDH Hive: http://www.cloudera.com/documentation/enterprise/latest/topics/cdh_rn_hive_ki.html
02-28-2016
01:58 AM
1 Kudo
You can run an EXPLAIN on a query to see how Hive plans to run it (how many phases), which gives you a sense of "how many jobs" or something close to it. Your query as written is invalid HiveQL, but with GROUP BY clauses added for col1 and col2 to make it legal, it would take a single job.
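As a hedged illustration (the table and column names are hypothetical stand-ins for those in the original question), the corrected query and its plan check would look like:

```sql
-- GROUP BY on the non-aggregated columns makes the query legal HiveQL;
-- EXPLAIN shows the stages Hive plans without executing anything
EXPLAIN
SELECT col1, col2, COUNT(*)
FROM some_table
GROUP BY col1, col2;
```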
02-28-2016
01:29 AM
The Hive "Streaming" feature is built upon its unsupported [1] transactional features: https://cwiki.apache.org/confluence/display/Hive/Streaming+Data+Ingest

This feature (the ACID one) uses the tables you've mentioned when DbTxnManager is in use, as per the suggested configs. Cloudera does not recommend the use of ACID features currently, because they are experimental in stability/quality upstream [1].

In any case, looking at the code [2], if all data in your table has been compacted, the entries under COMPLETED_TXN_COMPONENTS should be deleted away. Do you see any messages such as "Unable to delete compaction record" in your HMS log? Or any WARN+ log from the CompactionTxnHandler class in general? Finding that and then working through the error should help you solve this.

[1] - http://www.cloudera.com/documentation/enterprise/latest/topics/cdh_rn_hive_ki.html, specific quote:
"""
Hive ACID is not supported
Hive ACID is an experimental feature and Cloudera does not currently support it.
"""
[2] - https://github.com/cloudera/hive/blob/cdh5.5.2-release/metastore/src/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java#L320, etc.
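If you want to eyeball the backlog directly, a hedged sketch of a query against the metastore's backing database might look like the following (the table and column names follow the upstream metastore transaction schema; verify them against your metastore version before running anything):

```sql
-- Count per-table entries still sitting in COMPLETED_TXN_COMPONENTS
-- in the metastore's backing RDBMS (not in Hive itself)
SELECT ctc_database, ctc_table, COUNT(*) AS pending
FROM COMPLETED_TXN_COMPONENTS
GROUP BY ctc_database, ctc_table;
```

A table that keeps growing here after compactions complete would corroborate a cleanup failure in the HMS log.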
02-28-2016
01:24 AM
The first CREATE TABLE specification looks correct to me for your described file. Can you also double-check your /path/file.csv with the command "head -n1 /path/file.csv | od -c" to ensure it really does have an actual \t character between each field (rather than relying on a visual editor)?
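If you'd rather check programmatically, a small Python sketch (the sample rows below are fabricated for illustration; feed it your file's raw bytes) that looks for a real tab byte in the first line:

```python
def first_line_has_tab(data: bytes) -> bool:
    """Check whether the first line of raw file bytes contains a \t byte."""
    first_line = data.split(b"\n", 1)[0]
    return b"\t" in first_line

# Fabricated samples: one genuinely tab-delimited, one comma-delimited
print(first_line_has_tab(b"a\tb\tc\nrow2"))  # tab-delimited
print(first_line_has_tab(b"a,b,c\nrow2"))    # no tab present
```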
02-27-2016
09:35 PM
1 Kudo
Hive provides a skip-header/footer feature when creating your table (as part of its table properties). See the release notes on https://issues.apache.org/jira/browse/HIVE-5795

"""
CREATE TABLE testtable (name STRING, message STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
TBLPROPERTIES ("skip.header.line.count"="1");

LOAD DATA LOCAL INPATH '/tmp/header-inclusive-file.csv' INTO TABLE testtable;
"""