Member since: 05-02-2017
Posts: 360
Kudos Received: 65
Solutions: 22
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 13344 | 02-20-2018 12:33 PM |
| | 1503 | 02-19-2018 05:12 AM |
| | 1859 | 12-28-2017 06:13 AM |
| | 7136 | 09-28-2017 09:25 AM |
| | 12166 | 09-25-2017 11:19 AM |
04-03-2017
02:09 PM
Thanks @mqureshi. I completely understand the concept of an external table. What I was looking for is the reasons why the data underneath the Hive warehouse directory can still exist after a managed Hive table is dropped.
04-03-2017
01:48 PM
Hi @mqureshi, what might be the reason that the HDFS files are not removed when a Hive managed table is dropped? Thanks in advance.
04-03-2017
02:28 AM
1 Kudo
Hi @Dinesh Chitlangia, yes, as said by @Deepesh, it should work. But if your query is something like create table as select col1,col2 from mytable; and col2 is 100% null in mytable, then the table will be created successfully with void as the datatype of col2. However, you will not be allowed to append any more data with valid values into it.
04-02-2017
01:56 PM
Thanks @Scott Shaw. Does it mean I have to update the metadata each time after I truncate a partition? Even if the metadata exists, the query should not display wrong results. In my case, select distinct country from mytable should display only India.
04-02-2017
09:25 AM
I have a Hive managed table that is partitioned by country. I have inserted data for different countries: India, Japan, and China. The total count of records in the Hive table is 1000. I then truncated one partition, Japan, and the row count dropped to 700. If I now run the query select distinct country from mytable, it displays India, China, and Japan, even though I have no data in the partition country=Japan. I have checked the HDFS file location as well, and there are no files underneath the partition. The distinct values of the partition column still include Japan, which should not happen. The distinct values of a table should be returned based on the data available in the partition folders created in HDFS, not on the folder names in HDFS.
Labels:
- Apache Hadoop
04-02-2017
07:41 AM
@sherri cheng Do you mean that you want to delete the folders on which the Hive table is created? If it's a managed table, then dropping the Hive table will delete the folders underneath the warehouse. But if it is an external table, then you have to manually delete the folders/files underneath.
04-01-2017
01:40 PM
@Sai Deepthi You can either delete it or fix that particular row and load it. It will solve your problem.
04-01-2017
11:12 AM
@Sai Deepthi It seems there is an invisible character. Could you try viewing the file with hadoop fs -cat filename | cat -v? It should allow you to see the invisible character.
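If the terminal output is hard to read, a small Python sketch like the one below can make hidden characters explicit. It is only an illustration: the function name show_invisible and the sample string are made up for this example, and it assumes you have already copied a few lines of the file to a place where Python can read them.

```python
def show_invisible(line: str) -> str:
    """Return the line with every non-printable character shown as an escape."""
    parts = []
    for ch in line:
        if ch.isprintable():
            # Ordinary visible characters (and the ASCII space) pass through.
            parts.append(ch)
        else:
            # Tabs, carriage returns, zero-width characters, etc. become
            # escapes such as \t, \r or \u200b.
            parts.append(ch.encode("unicode_escape").decode("ascii"))
    return "".join(parts)


if __name__ == "__main__":
    # Hypothetical sample: a zero-width space after "msg" and a trailing \r.
    sample = 'msg\u200b:"hi"\r'
    print(show_invisible(sample))
```

Running it on a suspect row shows exactly where the stray character sits, which makes it easier to decide whether to fix or drop that row.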
03-31-2017
08:17 PM
@Sai Deepthi I can see that the text is not terminated with quotes in the data, and that's the reason for the error. I believe the data below for the column "msg" is not terminated: "tweet_id":847684241814978560,"created_unixtime":1490938647117,"created_time":"Fri Mar 31 05:37:27 +0000 2017","lang":"it","displayname":"PaolaGlmnn","time_zone":"Rome","msg":"Una crisalidedonde usc? If not, could you share your data? A "string unterminated" error means that the data/command is incomplete because the closing quotes are missing.
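To locate rows like this before loading, one option is a quick scan with Python's standard json module. This is a rough sketch, not a description of how Hive parses the data: it assumes the file holds one JSON object per line, and the function name find_bad_json_lines and the two sample rows are invented for illustration.

```python
import json


def find_bad_json_lines(lines):
    """Yield (line_number, error_message) for every line that is not valid JSON."""
    for number, line in enumerate(lines, start=1):
        text = line.strip()
        if not text:
            # Skip blank lines instead of reporting them as errors.
            continue
        try:
            json.loads(text)
        except json.JSONDecodeError as err:
            yield number, err.msg


if __name__ == "__main__":
    # Hypothetical rows: the second one has a "msg" string with no closing quote.
    rows = [
        '{"lang":"it","msg":"ok"}',
        '{"lang":"it","msg":"Una crisalide',
    ]
    for number, message in find_bad_json_lines(rows):
        print(number, message)
```

The reported line numbers point at the rows that need to be fixed or removed before the load is retried.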
03-31-2017
01:32 PM
@Kumar Veerapan I'm not sure whether I got your question right. But if you are asking for a UI to check the rate of file growth, then do check https://www.datadoghq.com/product/. Grafana (https://grafana.com/grafana) also helped me a lot with time series metrics.