Created on 04-01-2020 02:58 AM - last edited on 04-01-2020 04:55 AM by VidyaSargur
Hi,
I have a Linux box on which I configured an HDFS client; it can connect to the cluster, and I am able to put files into HDFS directories.
Note that the machine is not part of the cluster.
1. Is it possible to run Hive queries through the Hive CLI from this client machine? If so, what configuration do I need to do?
2. I also need some understanding about pushing files to HDFS. If I push a file directly into a Hive table's location (as below) in a clustered environment, will there be any side effects? Mine is currently a single-node cluster, so I am using the commands below.
hdfs dfs -rm -r /user/hive/warehouse/schema.db/table
hdfs dfs -copyFromLocal -f /scratch/mano/table.dat /user/hive/warehouse/schema.db/table
Thanks in advance.
Manoj
Created 04-01-2020 05:56 AM
Hi Manoj,
Question 1: Yes, it is possible to install the client packages on a separate machine and connect to a Hadoop cluster. Follow the link (https://docs.cloudera.com/documentation/enterprise/5-3-x/topics/cm_mc_client_config.html), which has detailed information about how to download and deploy the client configuration files manually.
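Just as a rough sketch (the zip and directory names below are what I recall from Cloudera Manager, so verify them against what you actually unzip, and the paths are placeholders):
# In Cloudera Manager: HDFS service > Actions > Download Client Configuration
unzip -o hdfs-clientconfig.zip -d /opt/cloudera-client
export HADOOP_CONF_DIR=/opt/cloudera-client/hadoop-conf
hdfs dfs -ls /    # sanity check from the client machine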
Question 2: If you push a file from the local filesystem to HDFS, a normal HDFS write operation is performed, and HDFS takes care of block replication for you. Check out the link (https://hadoop.apache.org/docs/r1.2.1/hdfs_design.html) for detailed information.
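For example, after a write you can see how HDFS stored and replicated the file with standard HDFS commands (using the paths from your post):
hdfs dfs -stat %r /user/hive/warehouse/schema.db/table/table.dat    # replication factor of the file
hdfs fsck /user/hive/warehouse/schema.db/table -files -blocks -locations    # which DataNodes hold the blocks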
Let's Hadoop
Rajkumar
Created 04-06-2020 01:22 AM
Hi MRajkumar,
I added the machine to the cluster,
but I am interested in knowing: if I do not add the machine to the cluster and only set up the client configuration for HDFS (I am already able to reach HDFS and update files there), can I also configure Hive so that I can run Hive queries from the command line, like the ones below?
USE schemaName;
LOAD DATA INPATH '${HDFS_LOAD_DIR}/table.csv' OVERWRITE INTO TABLE table;
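For example, I imagine running that script non-interactively with Beeline, somewhat like this (the host, the script name load_table.hql, and the placeholders in angle brackets are made up by me; 10000 is the default HiveServer2 port):
beeline -u "jdbc:hive2://<hiveserver2-host>:10000/schemaName" -n <username> --hivevar HDFS_LOAD_DIR=<hdfs-load-dir> -f load_table.hql
where load_table.hql contains the two statements above.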
My second question: when I used the HDFS client commands to place the data file directly under the Hive warehouse table directory, I could see the same data when querying the table in Hue. Note that mine is a single-node cluster, i.e., only one DataNode is present. What if it were a multi-node cluster with many DataNodes? Would the file be replicated across all of them? If I perform the commands below, will the change be propagated and replicated on all the DataNodes?
hdfs dfs -rm -r $TABLE_PATH/*
hdfs dfs -copyFromLocal -f $dat/csvfilepath $TABLE_PATH
My $TABLE_PATH is the warehouse directory + schemaname.db + /tablename:
$TABLE_PATH=/user/hive/warehouse/schema1.db/table
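If it helps, I was planning to verify the result after the copy with standard HDFS commands like these (assuming they behave the same way on a multi-node cluster; <csv-file> is a placeholder for my actual file name):
hdfs dfs -stat %r $TABLE_PATH/<csv-file>    # expected replication factor
hdfs fsck $TABLE_PATH -files -blocks -locations    # DataNodes actually holding the blocks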
Thank you.