Member since 03-20-2016 · 21 Posts · 5 Kudos Received · 0 Solutions
04-04-2017 05:47 PM
@amankumbare Please see the attached screenshot. As you can see, 18.8 GB is in use - the same amount as before I deleted the table. I would have expected this to decrease by about 9 GB once I dropped the table and removed it from the trash.
... View more
04-04-2017 02:54 PM
Hi all, I have been using Hive on the sandbox for a college project. Everything was working fine up until yesterday, when I noticed that disk space was running out because the data I am using is rather large. To free up space I dropped a table I no longer need, then went into Files_View/user/admin/.Trash/<table name> and deleted the table from the trash folder. However, even after doing so, HDFS usage is still full and has not reduced at all. I also checked Files_View/Apps/hive/warehouse/<database I'm using>/<table_name> to confirm the table was deleted, and it is gone. Does anyone know how I can permanently delete the table so that the space is actually freed in HDFS? Thanks in advance.
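A minimal sketch of how the space is usually reclaimed, assuming a hypothetical table name (my_large_table); PURGE needs Hive 0.14 or later, and the dfs lines run HDFS shell commands from inside the Hive CLI:

-- Drop the table and bypass the trash entirely (Hive 0.14+).
DROP TABLE IF EXISTS my_large_table PURGE;

-- Checkpoint the trash and permanently remove expired checkpoints;
-- files in .Trash only disappear for good once their checkpoint expires.
dfs -expunge;

-- Confirm that warehouse usage has actually gone down.
dfs -du -h /apps/hive/warehouse;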
04-17-2016 06:59 PM
@Benjamin Leonhardi - This was indeed part of the reason. Thank you very much for your help!
04-17-2016 06:58 PM
@Jitendra Yadav Thanks very much for your help. This issue has been resolved.
04-14-2016 03:54 PM
@Jitendra Yadav Thanks for the quick response! Are these things I can do from the Hortonworks console, or do I need to ssh into the instance? I am new to Hadoop, so apologies if the above question seems elementary!
04-14-2016 03:32 PM
Hi all, I have a question about space allocation within HDFS. I am currently trying to run a large query in Hive (a word split on a large file). However, I am unable to complete it because I keep running out of disk space. I have deleted any unnecessary files from HDFS and reduced my starting disk usage to 38%. However, I am wondering what "non-DFS" usage is, as it appears to be taking up the majority of my disk space. How can I go about reducing the disk space that non-DFS usage takes up? Any help is greatly appreciated. Thanks in advance.
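For context, a hedged sketch of where to look (shown via the Hive CLI's dfs built-in; hdfs dfs from a shell works the same): "non-DFS used" is roughly configured capacity minus DFS used minus DFS remaining, i.e. local disk on the node consumed by files that live outside HDFS (logs, OS files, and so on), so reducing it means cleaning up the local filesystem rather than HDFS:

-- Overall filesystem capacity and usage; non-DFS used is the gap
-- between capacity and (used + available).
dfs -df -h /;

-- Which HDFS directories are actually holding the data.
dfs -du -h /;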
03-29-2016 09:21 AM
Thanks a lot, Benjamin - I did realise after posting the above that I needed a UDF to use the rank function on its own. It's working now, so thank you.
03-28-2016 01:10 AM · 1 Kudo
Hi all, I have a table with the fields user_id and value. I want to order the values in descending order within each user_id and then emit only the top 100 records for each user_id. This is the code I am attempting to use:

DROP TABLE IF EXISTS mytable2
CREATE TABLE mytable2 AS
SELECT * FROM
  (SELECT *, rank(user_id) AS rank
   FROM
     (SELECT * FROM mytable
      DISTRIBUTE BY user_id
      SORT BY value DESC) a) b
WHERE rank < 101
ORDER BY rank;

However, when I run this query, I get the following error:

Error while compiling statement: FAILED: SemanticException [Error 10247]: Missing over clause for function : rank [ERROR_STATUS]

Can anyone help? Thanks in advance.
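For reference, a hedged sketch of a version that compiles (same hypothetical table names as above): rank is a windowing function in Hive, so it needs an OVER clause naming the partitioning and ordering. The DISTRIBUTE BY / SORT BY subquery then becomes unnecessary, since the OVER clause handles both:

-- rank() must be given its partition and ordering in an OVER clause.
DROP TABLE IF EXISTS mytable2;

CREATE TABLE mytable2 AS
SELECT user_id, value
FROM (
  SELECT user_id, value,
         rank() OVER (PARTITION BY user_id ORDER BY value DESC) AS rnk
  FROM mytable
) ranked
WHERE rnk <= 100;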
03-27-2016 10:08 PM
Thanks Scott - my problem is sorted now!
03-27-2016 09:37 PM
Issue is sorted now that I finally know how to use PuTTY! Thanks, all.
03-26-2016 05:40 PM
Thanks for the quick response. I am going to use PuTTY to ssh in - so is it correct to type my public DNS for the sandbox, followed by :8080, into the Host Name field in PuTTY? I have done that, logged in as root when prompted, and then entered "hadoop" as the password. I am now getting an access denied message. Any ideas? Thanks!
03-26-2016 05:09 PM
Hi Artem - Apologies as this question might seem elementary but I am very new to Sandbox. Where can I access the terminal? Thank you.
03-26-2016 04:17 PM
Hi all, I have been using Hive on Sandbox for the past few days. It was working fine up until yesterday, when I noticed that my queries were taking an unusually long time to run or, more annoyingly, not running at all. On further investigation, I checked the 'History' tab and noticed a large number of queries that are still running. I have been trying to terminate/kill the sessions without success (it says "Stopping" but never changes to "Killed"). I have also tried rebooting and redeploying my VM. Does anyone know how I can stop all running processes in Hive? Thanks in advance.
03-26-2016 09:43 AM · 1 Kudo
Hi all, I am trying to perform a version of the word count function in Hive. I have the following fields: Owner_key and Post. I want to split each post into its individual words and then group by Owner_key, giving a count of each word. For example, say this was my data:

Owner_key  Post
1          apple orange apple
2          melon kiwi

I would like the following output:

Owner_key  word    count
1          apple   2
1          orange  1
2          melon   1
2          kiwi    1

The code I have attempted is below. Hive is not necessarily giving me an error message; however, it never shows me any results, even when the status is at 100%. Can anyone help? Thanks in advance.

SELECT owner_key, word, count(*)
FROM stackdata_updtd
LATERAL VIEW explode(split(lower(post), '\\W+')) t1 AS word
GROUP BY owner_key, word;
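For what it's worth, a hedged variant for sanity-checking the query (same table and column names as above): split() on '\\W+' can emit empty strings (for example, when a post starts with punctuation), and a LIMIT keeps the first run cheap:

-- Filter empty tokens and cap output while testing.
SELECT owner_key, word, count(*) AS cnt
FROM stackdata_updtd
LATERAL VIEW explode(split(lower(post), '\\W+')) t1 AS word
WHERE word != ''
GROUP BY owner_key, word
LIMIT 20;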
03-24-2016 10:07 PM
Hi - I am having the same problem and have restarted my VM without success. My tables have still not reappeared, even though it has been some time. Can anyone help? Thanks
03-24-2016 08:33 PM
Hi there, I am new to the Hive QL language and am trying to solve the following problem. I have a set of data with user IDs, each with a corresponding score. An example of the kind of data I have is below:

stackdata_clean.owneruserid  stackdata_clean.score
1                            5
2                            6
3                            5
1                            4
2                            4

I want to find the top 10 users by score. In other words, I want code to build a table like the one below and then pick the top 10 users with the highest aggregate score from it:

stackdata_clean.owneruserid  stackdata_clean.score
2                            10
1                            9
3                            5

My table name is stackdata_clean and the code I am trying to use is:

SELECT stackdata_clean.owneruserid,
       SUM(stackdata_clean.score) OVER (PARTITION BY stackdata_clean.owneruserid)
FROM stackdata_clean
GROUP BY stackdata_clean.owneruserid
ORDER BY SUM(stackdata_clean.score) DESC LIMIT 10;

I am being returned the following error:

Error while compiling statement: FAILED: SemanticException Failed to breakup Windowing invocations into Groups. At least 1 group must only depend on input columns. Also check for circular dependencies. Underlying error: org.apache.hadoop.hive.ql.parse.SemanticException: Line 2:20 Invalid column reference 'score' [ERROR_STATUS]

Can anyone help solve this problem? Any help is greatly appreciated! Thanks in advance 🙂
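For reference, a hedged sketch of a version that compiles (same table and column names as above): the windowed SUM returns one row per input row, which conflicts with the GROUP BY and triggers the windowing error; since one row per user is wanted, a plain aggregate is enough:

-- A plain GROUP BY aggregate; no OVER clause needed.
SELECT owneruserid, SUM(score) AS total_score
FROM stackdata_clean
GROUP BY owneruserid
ORDER BY total_score DESC
LIMIT 10;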
03-21-2016 07:04 PM · 1 Kudo
Ah - understood now. This worked! Thank you 🙂
03-21-2016 06:16 PM · 1 Kudo
Brilliant - that works. Thanks!
03-21-2016 11:14 AM · 1 Kudo
Hi Artem - thanks for the response. Can you please explain "As long as you use HDP and you have pig client installed on your edgenode"? Is this something additional I need to do/install? I cannot locate the folder "/usr" on HDFS. Thanks for your help, Maeve
03-21-2016 12:14 AM
Hi all, I am new to Sandbox and am trying to run Pig on Microsoft Azure. To load one of my tables, I need to use the piggybank jar. I have downloaded it and saved it to HDFS at the path /tmp/stackexchange. Here is the code I am trying to run:

REGISTER /tmp/stackexchange/piggybank.jar;

RAW_LOGS1 = LOAD 'Query_1-50000.csv'
  USING org.apache.pig.piggybank.storage.CSVExcelStorage(',', 'YES_MULTILINE')
  AS (Id:long, PostTypeID:chararray, AcceptedAnswerID:chararray, ParentID:chararray,
      CreationDate:chararray, DeletionDate:chararray, Score:long, ViewCount:long,
      Body:chararray, OwnerUserID:chararray, OwnerDisplayName:chararray,
      LastEditorUserId:chararray, LastEditorDisplayName:chararray,
      LastEditDate:chararray, LastActivityDate:chararray, Title:chararray,
      Tags:chararray, AnswerCount:int, CommentCount:int, FavoriteCount:int,
      ClosedDate:chararray, CommunityOwnedDate:chararray);

However, I am being returned the error message:

2016-03-20 17:22:48,506 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 101: file '/tmp/stackexchange/piggybank.jar' does not exist.

Does anyone know what could be wrong? Am I missing a step required to register the piggybank file, perhaps? Any help is greatly appreciated - thanks in advance.
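A quick check, sketched with the same Hive-CLI dfs built-in used in the examples above (any HDFS client works): confirm the jar actually landed at that HDFS path. Note also, as a hedged possibility, that a bare REGISTER path may be resolved against the local filesystem rather than HDFS, in which case registering the jar with an explicit hdfs:// URI, or pointing REGISTER at a local copy of piggybank.jar, are common workarounds:

-- Verify the jar exists at the expected HDFS path.
dfs -ls /tmp/stackexchange;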