Member since: 08-04-2017
Posts: 20
Kudos Received: 0
Solutions: 0
03-03-2021
11:19 PM
1 Kudo
The YARN ResourceManager (RM) keeps writing the status of each running and finished application to the state store. The state store is usually kept either in ZooKeeper or on the local file system, depending on the configuration. When an RM transitions from standby to active, it looks for the latest state committed by the other RM and loads it. If this information is lost at any point, the RM will fail to load the application information.
04-02-2020
04:46 AM
How do I get the entire list of tables in Hive across the cluster, not just for a particular table or database?
01-23-2020
09:19 AM
Output of my table with two dates:
==========================
**** hdfs://*** 29GB 30233MB 1/23/2020
**** hdfs://*** 8GB 9040MB 1/23/2020
**** hdfs://*** 911GB 933122MB 1/23/2020
**** hdfs://*** 29GB 29795MB 1/23/2020
-(MINUS)
**** hdfs://*** 129GB 130233MB 1/24/2020
**** hdfs://*** 18GB 19040MB 1/24/2020
**** hdfs://*** 1911GB 1933122MB 1/24/2020
**** hdfs://*** 129GB 129795MB 1/24/2020

The output I need looks like this:
**** hdfs://*** 29GB 30233MB
**** hdfs://*** 8GB 9040MB
**** hdfs://*** 911GB 933122MB
**** hdfs://*** 29GB 29795MB

I have tried creating two views and joining them in a select, but it is not working:

create view alerts_archive.dbsizevw1 AS select * from alerts_archive.dbsize where date='Jan-23-2020';
create view alerts_archive.dbsizevw2 AS select * from alerts_archive.dbsize where date='Jan-24-2020';

select db_name, location, (a.size_in_mb-b.size_in_mb) as variance_MB, (a.size_in_gb-b.size_in_gb) as variance_GB
from alerts_archive.dbsizevw2 a join alerts_archive.dbsizevw1 b;

select (a.size_in_mb-b.size_in_mb) as variance_MB, (a.size_in_gb-b.size_in_gb) as variance_GB
from alerts_archive.dbsizevw2 a join alerts_archive.dbsizevw1 b;
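One possible reading of the requirement is a per-database difference between the two dates. The joins in the post have no join condition, so rows are not matched by database, and the unqualified db_name and location columns exist on both sides of the join. The sketch below shows the same query shape with an explicit ON clause. It is only an illustration under assumptions: it runs against an in-memory SQLite database through Python's sqlite3 module rather than Hive, the size columns are assumed to be numeric, and the database names and locations are made-up sample values.

```python
import sqlite3

# Hypothetical sample rows; column names follow the queries in the post.
rows = [
    ("db_a", "hdfs://example/db_a", 29, 30233, "Jan-23-2020"),
    ("db_b", "hdfs://example/db_b", 8, 9040, "Jan-23-2020"),
    ("db_a", "hdfs://example/db_a", 129, 130233, "Jan-24-2020"),
    ("db_b", "hdfs://example/db_b", 18, 19040, "Jan-24-2020"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dbsize (db_name TEXT, location TEXT, "
    "size_in_gb INTEGER, size_in_mb INTEGER, date TEXT)"
)
conn.executemany("INSERT INTO dbsize VALUES (?, ?, ?, ?, ?)", rows)

# Self-join the table: 'a' is the later date, 'b' the earlier one.
# The ON clause pairs each database with itself instead of with every row.
query = """
SELECT a.db_name,
       a.location,
       a.size_in_gb - b.size_in_gb AS variance_gb,
       a.size_in_mb - b.size_in_mb AS variance_mb
FROM dbsize a
JOIN dbsize b
  ON a.db_name = b.db_name AND a.location = b.location
WHERE a.date = 'Jan-24-2020'
  AND b.date = 'Jan-23-2020'
ORDER BY a.db_name
"""
variance = conn.execute(query).fetchall()
for row in variance:
    print(row)
```

If the size columns are actually stored as strings with a GB or MB suffix, they would need to be converted to numbers before subtracting.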
04-01-2019
01:47 PM
Hi @RajeshMadurai

Ans 1: The exception posted is very generic. We need the complete error message that appeared on the terminal when MSCK was run to see what could have gone wrong.

Suggestions: By default, managed tables store their data in HDFS under the path "/user/hive/warehouse/<table_name>" or "/user/hive/warehouse/<db_name>.db/<table_name>". So if you created a managed table and manually loaded the data into some other HDFS path, i.e. one outside "/user/hive/warehouse", the table's metadata will not get refreshed when you run MSCK REPAIR on it. This could be one reason why MSCK REPAIR worked as expected when you created the table as an external table.

Ans 2: For an unpartitioned table, all the data of the table is stored in a single directory in HDFS. For example, a table T1 in the default database with no partitions has all its data stored in the HDFS path "/user/hive/warehouse/T1/". Even when MSCK is not executed, queries against this table will work, since the metadata already has the HDFS location from which the files need to be read. A partitioned table, on the other hand, has a separate directory for each partition. If a new partition is added manually by creating the directory and placing the file in HDFS, an MSCK is needed to refresh the table's metadata so that it knows about the newly added data.

Hope this helps!
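To illustrate the partitioned-table case, here is a rough model in Python of the comparison a repair has to make: which key=value directories exist under the table location but are not yet known to the metastore. This is a conceptual sketch, not Hive's implementation; the function name, the table path, and the partition column dt are hypothetical.

```python
def missing_partitions(hdfs_dirs, table_location, known_partitions):
    """Return partition specs found as directories but absent from the known set."""
    prefix = table_location.rstrip("/") + "/"
    missing = []
    for path in hdfs_dirs:
        if not path.startswith(prefix):
            continue  # directory is outside the table location
        relative = path[len(prefix):].strip("/")
        parts = relative.split("/")
        # Partition directories are named key=value at every level.
        if not relative or not all("=" in p for p in parts):
            continue
        spec = tuple(tuple(p.split("=", 1)) for p in parts)
        if spec not in known_partitions and spec not in missing:
            missing.append(spec)
    return missing


# Hypothetical listing: one partition is registered, one was added by hand.
dirs = [
    "/user/hive/warehouse/t2/dt=2019-04-01",
    "/user/hive/warehouse/t2/dt=2019-04-02",
    "/user/hive/warehouse/t2/_tmp",
]
known = {(("dt", "2019-04-01"),)}
print(missing_partitions(dirs, "/user/hive/warehouse/t2", known))
```

For an unpartitioned table there are no key=value directories to compare, which matches the point above that no repair is needed there.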
12-21-2018
05:32 AM
1 Kudo
Hello @RajeshMadurai

Thank you for posting your update here. Do you see any other special characters (other than $) at the end of the filenames in your command's output?

# hadoop fs -ls <path> > /tmp/hdfslist
# cat -e /tmp/hdfslist
or
# cat -v /tmp/hdfslist

Also, you can refer to the community thread below:
http://community.cloudera.com/t5/Storage-Random-Access-HDFS/Duplicate-Directories-in-HDFS/m-p/37319

Hope this helps.
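If it is easier to inspect the names programmatically, the same check can be sketched in Python. This is only an illustration of the idea behind cat -e and cat -v, not a replacement for them; the function name hidden_chars and the sample names are made up.

```python
import unicodedata
from typing import List, Tuple


def hidden_chars(name: str) -> List[Tuple[int, str]]:
    """Return (position, description) for characters that are easy to miss in a listing."""
    found = []
    for i, ch in enumerate(name):
        category = unicodedata.category(ch)
        # Cc = control, Cf = format (e.g. zero-width space), Z* = separators.
        suspicious = category in ("Cc", "Cf") or (category.startswith("Z") and ch != " ")
        # A plain space is only flagged at the very start or end of a name.
        edge_space = ch == " " and i in (0, len(name) - 1)
        if suspicious or edge_space:
            # Control characters have no Unicode name, so fall back to the code point.
            found.append((i, unicodedata.name(ch, f"U+{ord(ch):04X}")))
    return found


# Hypothetical names that would look identical in a plain listing.
for sample in ["/data/dir", "/data/dir ", "/data/dir\r", "/data/\u200bdir"]:
    print(repr(sample), hidden_chars(sample))
```

Two directories that look like duplicates will usually differ in the list this returns.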
07-03-2017
01:01 PM
The NameNode service is not up.