Member since: 08-04-2017
Posts: 20
Kudos Received: 0
Solutions: 0
04-02-2020
04:46 AM
How can I get the complete list of tables across all databases in Hive (the entire cluster), not just for a particular table or database?
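For reference, a minimal shell sketch that builds a cluster-wide table list by looping over every database with the Hive CLI (the use of the plain hive CLI is an assumption; with beeline the invocation flags differ):

#!/bin/bash
# List every table in every Hive database, prefixed with its database name.
for db in $(hive -S -e 'SHOW DATABASES;'); do
    hive -S -e "SHOW TABLES IN ${db};" | sed "s/^/${db}./"
done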
01-23-2020
09:19 AM
Output of my table with two dates:
==========================
**** hdfs://*** 29GB 30233MB 1/23/2020
**** hdfs://*** 8GB 9040MB 1/23/2020
**** hdfs://*** 911GB 933122MB 1/23/2020
**** hdfs://*** 29GB 29795MB 1/23/2020
-(MINUS)
**** hdfs://*** 129GB 130233MB 1/24/2020
**** hdfs://*** 18GB 19040MB 1/24/2020
**** hdfs://*** 1911GB 1933122MB 1/24/2020
**** hdfs://*** 129GB 129795MB 1/24/2020

The output I need looks like this:
**** hdfs://*** 29GB 30233MB
**** hdfs://*** 8GB 9040MB
**** hdfs://*** 911GB 933122MB
**** hdfs://*** 29GB 29795MB

I have tried creating two views and joining them with SELECTs, but it is not working:

create view alerts_archive.dbsizevw1 AS select * from alerts_archive.dbsize where date='Jan-23-2020';
create view alerts_archive.dbsizevw2 AS select * from alerts_archive.dbsize where date='Jan-24-2020';

select db_name, location, (a.size_in_mb-b.size_in_mb) as variance_MB, (a.size_in_gb-b.size_in_gb) as variance_GB
from alerts_archive.dbsizevw2 a join alerts_archive.dbsizevw1 b;

select (a.size_in_mb-b.size_in_mb) as variance_MB, (a.size_in_gb-b.size_in_gb) as variance_GB
from alerts_archive.dbsizevw2 a join alerts_archive.dbsizevw1 b;
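For reference, a minimal self-join sketch that computes the day-over-day variance without the intermediate views; it assumes db_name plus location identify a row within each day, that the string-typed size columns cast cleanly to BIGINT, and that the date literals match how the values are actually stored (the missing piece in the attempts above is the join condition):

SELECT a.db_name,
       a.location,
       CAST(a.size_in_mb AS BIGINT) - CAST(b.size_in_mb AS BIGINT) AS variance_mb,
       CAST(a.size_in_gb AS BIGINT) - CAST(b.size_in_gb AS BIGINT) AS variance_gb
FROM alerts_archive.dbsize a
JOIN alerts_archive.dbsize b
  ON a.db_name = b.db_name
 AND a.location = b.location
WHERE a.`date` = '1/24/2020'   -- later snapshot
  AND b.`date` = '1/23/2020';  -- earlier snapshot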
01-23-2020
09:09 AM
Below is my table description.
hive> describe dbsize;
OK
db_name       string
location      string
size_in_mb    string
size_in_gb    string
date          string

# Partition Information
# col_name    data_type    comment

date          string

Time taken: 0.335 seconds, Fetched: 10 row(s)
I have to subtract (minus) the size_in_mb column across two different dates using a WHERE condition.

Here is a sample SQL query from an RDBMS environment:

select DB_NAME, LOCATION,
       c2-c1 "difference size in MB",
       c4-c3 "difference size in GB"
from dbsize
     (select size_in_mb c1 from dbsize where date='01-23-2020'),
     (select size_in_mb c2 from dbsize where date='01-25-2020'),
     (select size_in_gb c3 from dbsize where date='01-23-2020'),
     (select size_in_gb c4 from dbsize where date='01-25-2020');

I need help with a Hive HQL query that subtracts these columns for two different dates within the same single table.

Please advise.
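As a starting point, a minimal sketch using conditional aggregation over the single table (an alternative to a self-join); it assumes db_name and location identify a row per date, and it casts the string-typed size columns before subtracting:

SELECT db_name,
       location,
       MAX(CASE WHEN `date` = '01-25-2020' THEN CAST(size_in_mb AS BIGINT) END)
     - MAX(CASE WHEN `date` = '01-23-2020' THEN CAST(size_in_mb AS BIGINT) END) AS diff_size_in_mb,
       MAX(CASE WHEN `date` = '01-25-2020' THEN CAST(size_in_gb AS BIGINT) END)
     - MAX(CASE WHEN `date` = '01-23-2020' THEN CAST(size_in_gb AS BIGINT) END) AS diff_size_in_gb
FROM dbsize
WHERE `date` IN ('01-23-2020', '01-25-2020')
GROUP BY db_name, location;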
Tags:
- Hive
Labels:
- Apache Hive
04-22-2019
09:52 PM
hdfs dfs -getfacl -R / will display all sub-directories and files. I particularly need to check only the /user directory, e.g.:

hdfs dfs -getfacl /user
/user/aaa
/user/bbb
/user/ccc
/user/ddd

Is there any other way to check the four individual directories above?
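A minimal shell sketch for checking exactly those four directories (paths taken from the example above); alternatively, hdfs dfs -getfacl -R /user recurses only under /user instead of the whole root:

# One getfacl call per directory of interest.
for dir in /user/aaa /user/bbb /user/ccc /user/ddd; do
    hdfs dfs -getfacl "$dir"
done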
03-22-2019
09:50 PM
Is there a hadoop getfacl command to get output for all sub-directories of a location? E.g.: hdfs dfs -getfacl / or hdfs dfs -getfacl /user
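For reference, getfacl accepts a recursive flag that lists the ACLs of a directory and everything beneath it:

# Recursively list ACLs for /user and all of its sub-directories and files.
hdfs dfs -getfacl -R /user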
02-21-2019
11:49 AM
A non-partitioned table has multiple files in its table location. When a SELECT statement is triggered, it works. How does it fetch the data without running the MSCK REPAIR command?
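A small illustration of the behavior (the database, table, and file names are hypothetical): for a non-partitioned table, Hive scans every file under the table's single location at query time, so there is no partition metadata for MSCK REPAIR to reconcile:

# Drop an extra data file into a non-partitioned table's location...
hdfs dfs -put extra_rows.csv /user/hive/warehouse/mydb.db/mytable/
# ...and a plain SELECT picks it up immediately, with no MSCK REPAIR.
hive -e "SELECT COUNT(*) FROM mydb.mytable;"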
02-13-2019
02:39 AM
We took a backup of one production database's data and moved it to the development local filesystem. In development, we moved the data from the local mount point to the Hive database's HDFS location.

Question 1: Hive MSCK REPAIR on a managed, partitioned table failed with the error message below. What does this exception mean?

hive> msck repair table testsb.xxx_bk1;
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask

After dropping the table and re-creating it as an external table, it worked successfully:

hive> use testsb;
OK
Time taken: 0.032 seconds
hive> msck repair table XXX_bk1;
xxx_bk1:payloc=YYYY/client_key=MISSDC/trxdate=20140109
.
.
Repair: Added partition to metastore xxx_bk1:payloc=0002/client_key=MISSDC/trxdate=20110105
.
.
Time taken: 16347.793 seconds, Fetched: 94156 row(s)

Can you please confirm why it did not work on the managed table?

Question 2: Why is a plain select * from table; query able to fetch data from a non-partitioned table? We created the testsb database and the table with a DDL script, then moved the data from local into the table's HDFS location.
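Regarding Question 1, one thing worth checking (an assumption; the one-line DDLTask error does not pinpoint a cause): MSCK commonly fails this way when a directory under the table location does not match the expected partition naming, and Hive's path validation can be relaxed to test that theory:

-- Sketch: tell MSCK to ignore directories that do not match the
-- partition naming convention instead of failing the whole repair.
set hive.msck.path.validation=ignore;
msck repair table testsb.xxx_bk1;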
12-21-2018
05:13 AM
"How are the directories created? Is it by some script / code?"
Yes, through the shell script, which was executed twice (the first time the hive prompt was given wrongly). But on any second run it should report that the file already exists.

# cat -e /tmp/hdfslist
(you can see the end of each file name, and also any other characters, if present, besides the $ sign)

After taking the output from the above command, I have to remove the duplicate files:

hdfs dfs -rm rajesh/int_rajeshlake/Retirement/RS_Raw$

Please confirm.
11-28-2018
02:47 AM
#1. The HDFS location below has duplicate directories:
======================================================
hdfs dfs -ls /rajesh/int_datalake/Retirement/
drwxrwxr-x+ - root supergroup 0 2018-10-29 03:24 /rajesh/int_datalake/Retirement/RS_Access
drwxrwxr-x+ - root supergroup 0 2018-10-29 03:17 /rajesh/int_datalake/Retirement/RS_Access
drwxrwxr-x+ - root supergroup 0 2018-10-29 03:17 /rajesh/int_datalake/Retirement/RS_Raw
drwxrwxr-x+ - root supergroup 0 2018-11-27 01:35 /rajesh/int_datalake/Retirement/RS_Raw1
drwxrwxrwx+ - root supergroup 0 2018-11-27 01:39 /rajesh/int_datalake/Retirement/RS_Raw_bk
drwxrwxr-x+ - root supergroup 0 2018-10-29 03:24 /rajesh/int_datalake/Retirement/RS_Repos
drwxrwxr-x+ - root supergroup 0 2018-10-29 03:17 /rajesh/int_datalake/Retirement/RS_Repos
drwxrwxr-x+ - root supergroup 0 2018-10-29 03:24 /rajesh/int_datalake/Retirement/RS_Stage
drwxrwxr-x+ - root supergroup 0 2018-10-29 03:17 /rajesh/int_datalake/Retirement/RS_Stage
drwxrwxr-x+ - root supergroup 0 2018-10-29 03:24 /rajesh/int_datalake/Retirement/RS_Work
drwxrwxr-x+ - root supergroup 0 2018-10-29 03:17 /rajesh/int_datalake/Retirement/RS_Work

First we moved the directory to a _bk name and tried to remove the original, but it did not work:

[root@ ]# hdfs dfs -mv /rajesh/int_rajeshlake/Retirement/RS_Raw /rajesh/int_rajeshlake/Retirement/RS_Raw_bk
[root@ ]# hdfs dfs -rm -r /rajesh/int_datalake/Retirement/RS_Raw
rm: `/rajesh/int_datalake/Retirement/RS_Raw': No such file or directory
hdfs dfs -mv /rajesh/int_datalake/Retirement/RS_Raw /rajesh/int_datalake/Retirement/RS_Raw123
mv: `/rajesh/int_datalake/Retirement/RS_Raw': No such file or directory

#2. Finally removed the directory with the rmdir command below and recreated it:
================================================================================
hdfs dfs -rmdir /rajesh/int_rajeshlake/Retirement/RS_Wor*
hdfs dfs -ls /rajesh/int_rajeshlake/Retirement/
Found 9 items
drwxrwxr-x+ - root supergroup 0 2018-10-29 03:24 /rajesh/int_rajeshlake/Retirement/RS_Access
drwxrwxr-x+ - root supergroup 0 2018-10-29 03:17 /rajesh/int_rajeshlake/Retirement/RS_Access
drwxrwxrwx+ - root supergroup 0 2018-11-27 01:39 /rajesh/int_rajeshlake/Retirement/RS_Raw
drwxrwxr-x+ - root supergroup 0 2018-10-29 03:17 /rajesh/int_rajeshlake/Retirement/RS_Raw
drwxrwxr-x+ - root supergroup 0 2018-11-27 01:35 /rajesh/int_rajeshlake/Retirement/RS_Raw1
drwxrwxr-x+ - root supergroup 0 2018-10-29 03:24 /rajesh/int_rajeshlake/Retirement/RS_Repos
drwxrwxr-x+ - root supergroup 0 2018-10-29 03:17 /rajesh/int_rajeshlake/Retirement/RS_Repos
drwxrwxr-x+ - root supergroup 0 2018-10-29 03:24 /rajesh/int_rajeshlake/Retirement/RS_Stage
drwxrwxr-x+ - root supergroup 0 2018-10-29 03:17 /rajesh/int_rajeshlake/Retirement/RS_Stage

#3. Tried to create a duplicate manually with mkdir; the command executed with no "directory already exists" error, but no duplicate was created:
=======================================================================================================================
hdfs dfs -ls /rajesh/data/Corporate
drwxrwxr-x - root supergroup 0 2018-11-27 04:23 /rajesh/data/Corporate/Corp_Access
drwxrwxr-x - root supergroup 0 2018-11-27 04:24 /rajesh/data/Corporate/Corp_Raw
drwxrwxr-x - root supergroup 0 2018-11-27 04:25 /rajesh/data/Corporate/Corp_Repos
drwxrwxr-x - root supergroup 0 2018-11-27 04:25 /rajesh/data/Corporate/Corp_Stage
drwxrwxr-x - root supergroup 0 2018-11-27 04:26 /rajesh/data/Corporate/Corp_Work
hdfs dfs -mkdir -p /rajesh/data/Corporate/Corp_Work
hdfs dfs -ls /rajesh/data/Corporate
drwxrwxr-x - root supergroup 0 2018-11-27 04:23 /rajesh/data/Corporate/Corp_Access
drwxrwxr-x - root supergroup 0 2018-11-27 04:24 /rajesh/data/Corporate/Corp_Raw
drwxrwxr-x - root supergroup 0 2018-11-27 04:25 /rajesh/data/Corporate/Corp_Repos
drwxrwxr-x - root supergroup 0 2018-11-27 04:25 /rajesh/data/Corporate/Corp_Stage
drwxrwxr-x - root supergroup 0 2018-11-27 04:26 /rajesh/data/Corporate/Corp_Work

Please confirm: how were the duplicate directories created? The directories were empty, so we removed and recreated them. If a directory holds a table or data, what steps should we follow?
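For what it's worth, a short sketch for investigating names that look identical but differ by a hidden character, which is one plausible explanation for the apparent duplicates (the trailing-character assumption is mine):

# Reveal non-printing characters in the listing ($ marks each end of line).
hdfs dfs -ls /rajesh/int_datalake/Retirement/ | cat -A
# If a name carries one stray trailing character, a single-character glob
# matches it without having to type the invisible character:
hdfs dfs -rm -r '/rajesh/int_datalake/Retirement/RS_Raw?'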
Labels:
- Cloudera Manager
- HDFS
10-07-2018
10:01 PM
During a manual failover of the ResourceManager, will my scheduled and running application IDs move to the standby ResourceManager, and how? For the HDFS NameNode, the JournalNodes keep the edit logs; in the ResourceManager, which daemon is doing that monitoring?
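For reference, a sketch of the relevant yarn-site.xml settings: rather than a JournalNode-style daemon, an HA ResourceManager persists application state to a pluggable state store (typically the ZooKeeper-backed one), and the newly active RM recovers scheduled and running applications from it after a failover:

<property>
  <name>yarn.resourcemanager.recovery.enabled</name>
  <value>true</value>
</property>
<property>
  <!-- Application and attempt state is written here, so the RM that
       becomes active can recover applications after failover. -->
  <name>yarn.resourcemanager.store.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>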
10-07-2018
08:48 PM
In an RDBMS, the database block size is 8 KB, while the Hadoop block size is 64 MB. In my Sqoop import example, the RDBMS table size is 300 MB. So will it be split across 5 mappers? Please confirm.
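A small sketch for context (the connection string, credentials, and column names are hypothetical): Sqoop's parallelism comes from -m/--num-mappers (default 4) and from splitting the --split-by column's value range into per-mapper intervals, not from dividing the table size by the HDFS block size:

# Hypothetical import: 5 map tasks, each reading one range of the id column.
sqoop import \
  --connect jdbc:mysql://dbhost:3306/salesdb \
  --username etl_user -P \
  --table orders \
  --split-by id \
  --num-mappers 5 \
  --target-dir /user/rajesh/orders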
09-26-2018
02:44 AM
In a Sqoop import, how does MapReduce work with key/value pairs when reading structured data from RDBMS tables?
Please explain.
08-04-2017
05:31 AM
While transferring files from one cluster to another using distcp, I configured both clusters' nameservices in the hdfs-site.xml file; after that, the NameNode service does not start. When starting the HDFS service, the first node's NameNode service goes down while the other HA node's DataNode service starts.

1. Added both HA clusters' nameservices to the hdfs-site.xml file.
2. Stopped and started all Hadoop services.
3. The NameNode service is down with the error message below:

org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /home/rajesh/tmp/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.

4. Checked whether the cluster services are active or standby; it failed:

[rajesh@mashdp ~]$ hdfs haadmin -getServiceState nn1
Illegal argument: Unable to determine the nameservice id.
[rajesh@mashdp ~]$ hdfs haadmin -getServiceState nn2
Illegal argument: Unable to determine the nameservice id.
[rajesh@mashdp ~]$ hdfs haadmin -getServiceState mn1
Illegal argument: Unable to determine the nameservice id.
[rajesh@mashdp ~]$ hdfs haadmin -getServiceState mn2
Illegal argument: Unable to determine the nameservice id.

5. hdfs commands are also down:

[rajesh@mashdp ~]$ hdfs dfs -ls -R /
17/06/29 16:31:16 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/06/29 16:31:16 WARN hdfs.DFSUtil: Namenode for raccluster remains unresolved for ID mn1. Check your hdfs-site.xml file to ensure namenodes are configured properly.
17/06/29 16:31:16 WARN hdfs.DFSUtil: Namenode for raccluster remains unresolved for ID mn2. Check your hdfs-site.xml file to ensure namenodes are configured properly.
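Two hedged observations based on these messages: with more than one nameservice configured, haadmin must be told which nameservice to query, and the "remains unresolved" warnings suggest the raccluster hostnames are not resolving from this machine. A sketch, with the nameservice and NameNode IDs taken from the post:

# Tell haadmin which nameservice each NameNode ID belongs to.
hdfs haadmin -ns mycluster -getServiceState nn1
hdfs haadmin -ns raccluster -getServiceState mn1
# "remains unresolved for ID mn1/mn2" usually means namnod1/namnod2 are not
# resolvable; check DNS or /etc/hosts on every node.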
Labels:
- HDFS
07-03-2017
01:01 PM
The NameNode service is not up.
07-03-2017
12:14 PM
hdfs-site.xml
06-29-2017
02:00 PM
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>/rajesh/app/bigdata/data/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/rajesh/app/bigdata/data/datanode</value>
</property>
<property>
<name>dfs.nameservices</name>
<value>mycluster,raccluster</value>
</property>
<property>
<name>dfs.ha.namenodes.mycluster</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn1</name>
<value>mashdp:8020</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn2</name>
<value>slave2:8020</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.nn1</name>
<value>mashdp:50070</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.nn2</name>
<value>slave2:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://mashdp:8485;slave2:8485;hpeco:8485/mycluster</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>mashdp:2181,slave2:2181,hpeco:2181</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
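<!-- Note: yarn.resourcemanager.store.class below is a YARN property and is
     normally set in yarn-site.xml; it has no effect in hdfs-site.xml. -->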
<property>
<name>yarn.resourcemanager.store.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.mycluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/rajesh/.ssh/id_rsa</value>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.connect-timeout</name>
<value>3000</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.raccluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.namenodes.raccluster</name>
<value>mn1,mn2</value>
</property>
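<!-- The namnod1/namnod2 hosts below must be resolvable (DNS or /etc/hosts)
     on every node; the "remains unresolved for ID mn1/mn2" warnings point
     at name resolution for exactly these entries. -->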
<property>
<name>dfs.namenode.rpc-address.raccluster.mn1</name>
<value>namnod1:8020</value>
</property>
<property>
<name>dfs.namenode.rpc-address.raccluster.mn2</name>
<value>namnod2:8020</value>
</property>
<property>
<name>dfs.namenode.http-address.raccluster.mn1</name>
<value>namnod1:50070</value>
</property>
<property>
<name>dfs.namenode.http-address.raccluster.mn2</name>
<value>namnod2:50070</value>
</property>
</configuration>
06-29-2017
02:00 PM
Trying a file transfer from one cluster to another cluster in an HA environment.

1. Added both HA clusters' nameservices to the hdfs-site.xml file.
2. Stopped and started all Hadoop services.
3. The NameNode service is down with the error message below:

org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /home/rajesh/tmp/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.

4. Checked whether the cluster services are active or standby; it failed:

[rajesh@mashdp ~]$ hdfs haadmin -getServiceState nn1
Illegal argument: Unable to determine the nameservice id.
[rajesh@mashdp ~]$ hdfs haadmin -getServiceState nn2
Illegal argument: Unable to determine the nameservice id.
[rajesh@mashdp ~]$ hdfs haadmin -getServiceState mn1
Illegal argument: Unable to determine the nameservice id.
[rajesh@mashdp ~]$ hdfs haadmin -getServiceState mn2
Illegal argument: Unable to determine the nameservice id.

5. hdfs commands are also down:

[rajesh@mashdp ~]$ hdfs dfs -ls -R /
17/06/29 16:31:16 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/06/29 16:31:16 WARN hdfs.DFSUtil: Namenode for raccluster remains unresolved for ID mn1. Check your hdfs-site.xml file to ensure namenodes are configured properly.
17/06/29 16:31:16 WARN hdfs.DFSUtil: Namenode for raccluster remains unresolved for ID mn2. Check your hdfs-site.xml file to ensure namenodes are configured properly.
Labels:
- Apache Hadoop
06-28-2017
12:29 PM
I configured the hdfs-site.xml file on both clusters' HA nodes. When I run start-all.sh, it starts the other (second) cluster's services, and after that the NameNode goes down.