Member since: 01-27-2015
Posts: 16
Kudos Received: 5
Solutions: 4
My Accepted Solutions
Views | Posted |
---|---|
19861 | 05-11-2015 09:37 AM |
2019 | 05-07-2015 08:34 AM |
6772 | 04-28-2015 09:02 AM |
6962 | 03-17-2015 11:39 AM |
05-11-2015
09:37 AM
COLUMNS_OLD is a deprecated table where columns used to be stored. Hive might still have some information there for some reason. You can search either COLUMNS_OLD or COLUMNS_V2 when looking for your column.
05-07-2015
08:34 AM
1 Kudo
I recommend the 2nd option, with only 3 columns: (PK, DATE, MEASURE). You cannot update records in Hive, so with 365 columns each row would leave 364 of them unused, and those empty columns still take extra space in your files (separator characters, schema information, etc.). Read performance is also better with 3 columns than with 365. Hive reads the full record every time you run a query, then selects the columns you want and applies the filter from the WHERE clause. This select/filter step happens whether there are 3 or 365 columns, so 3 will be faster. Your queries will also be shorter, since you only need to filter by date (instead of looking for the columns that hold measure data). And if you use a columnar storage format (like Parquet), this filter may be faster still.
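To make the 3-column layout concrete, here is a small sketch. It uses an in-memory SQLite database from Python purely as a stand-in for Hive, and the table name `measures`, the column name `dt`, the sample values, and the helper `measures_on` are all hypothetical. It only illustrates that a narrow table answers a per-date question with a single WHERE filter, without needing to know which of 365 columns holds the value.

```python
import sqlite3

# In-memory SQLite is only a stand-in for Hive here (assumption for the demo).
conn = sqlite3.connect(":memory:")

# Narrow layout: one row per (key, date) instead of one column per day.
conn.execute("CREATE TABLE measures (pk INTEGER, dt TEXT, measure REAL)")
conn.executemany(
    "INSERT INTO measures VALUES (?, ?, ?)",
    [
        (1, "2015-01-01", 10.5),   # invented sample rows
        (1, "2015-01-02", 11.0),
        (2, "2015-01-01", 7.25),
    ],
)


def measures_on(date):
    """Return (pk, measure) pairs recorded on the given ISO date."""
    rows = conn.execute(
        "SELECT pk, measure FROM measures WHERE dt = ? ORDER BY pk", (date,)
    )
    return rows.fetchall()


print(measures_on("2015-01-01"))
```

In Hive the same shape would be the (PK, DATE, MEASURE) table described above; I used `dt` in the sketch only to avoid clashing with the DATE type name.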
05-07-2015
08:14 AM
1 Kudo
Indexing won't work to connect two tables. Indexes are used to speed up searches on a table. I was taking a look at the tables in the metastore, and there are tables like SKEWED_COL_NAMES, PART_COL_PRIVS, etc. that contain the column name as well. How is the table you're looking for configured? Is it partitioned? Is it skewed?
05-06-2015
12:43 PM
You can use the following statement: ALTER TABLE TABLE_NAME ADD INDEX (COLUMN_NAME); Is the query slow on your system? Could you paste the output of 'EXPLAIN SELECT ...' for your query?
05-01-2015
08:51 AM
1 Kudo
Here's the query you can use on the metastore: select TBL_NAME, COLUMN_NAME, TYPE_NAME from TBLS left join COLUMNS_V2 on CD_ID = TBL_ID where COLUMN_NAME like 'column'; where 'column' is the column name you're looking for.
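One caveat on the join in that query: in the metastore schemas I'm aware of, TBLS reaches COLUMNS_V2 through the storage descriptor table SDS (TBLS.SD_ID to SDS.SD_ID, then SDS.CD_ID to COLUMNS_V2.CD_ID), so going through SDS may be more reliable than matching CD_ID to TBL_ID directly. The sketch below is only an illustration under that assumption. It builds a toy copy of the three tables in an in-memory SQLite database (a real metastore usually lives in MySQL or PostgreSQL and has many more columns), and the sample rows and the helper `find_tables_with_column` are invented.

```python
import sqlite3

# Toy stand-in for the metastore; only the columns needed for the join.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TBLS (TBL_ID INTEGER, SD_ID INTEGER, TBL_NAME TEXT);
    CREATE TABLE SDS (SD_ID INTEGER, CD_ID INTEGER);
    CREATE TABLE COLUMNS_V2 (CD_ID INTEGER, COLUMN_NAME TEXT, TYPE_NAME TEXT);
    -- Invented sample rows; TBL_ID and CD_ID deliberately differ.
    INSERT INTO TBLS VALUES (1, 10, 'orders'), (2, 20, 'customers');
    INSERT INTO SDS VALUES (10, 100), (20, 200);
    INSERT INTO COLUMNS_V2 VALUES
        (100, 'order_id', 'int'), (100, 'customer_id', 'int'),
        (200, 'customer_id', 'int'), (200, 'name', 'string');
""")

# Join TBLS to COLUMNS_V2 through SDS instead of comparing CD_ID to TBL_ID.
FIND_COLUMN = """
    SELECT t.TBL_NAME, c.COLUMN_NAME, c.TYPE_NAME
    FROM TBLS t
    JOIN SDS s ON t.SD_ID = s.SD_ID
    JOIN COLUMNS_V2 c ON s.CD_ID = c.CD_ID
    WHERE c.COLUMN_NAME LIKE ?
    ORDER BY t.TBL_NAME
"""


def find_tables_with_column(pattern):
    """Return (table, column, type) rows whose column name matches pattern."""
    return conn.execute(FIND_COLUMN, (pattern,)).fetchall()


print(find_tables_with_column("customer_id"))
```

On a real metastore you would run only the SELECT, with the column name in place of the `?` placeholder. Check the table definitions in your own metastore version first, since I can't confirm the schema is identical across releases.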
04-28-2015
09:02 AM
1 Kudo
Hi, This issue has been fixed in CDH 5.3.3 and 5.4.0.
03-17-2015
03:43 PM
You're welcome. Btw, we now expect this fix to be included in 5.3.3 instead.
03-17-2015
11:39 AM
1 Kudo
Hi, You just hit a bug related to the encryption changes added in CDH 5.3.2. It happens because an internal function tries to check whether the table location is encrypted. That check works only for HDFS files, not for external locations. I'm afraid the only way to make it work is to downgrade to CDH 5.3.1; I don't see a workaround to bypass this error. The upcoming CDH 5.4.0 release will include the bugfix.