Member since
03-22-2016
27
Posts
9
Kudos Received
3
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 3550 | 09-02-2016 05:00 PM |
| | 2650 | 08-16-2016 06:58 AM |
| | 4821 | 06-08-2016 12:19 PM |
06-28-2017
02:33 PM
@Sami Ahmad For some tables, Hive answers the query from the table metadata (statistics) instead of scanning the data, and those values might not have been updated. There are two ways to approach this.
1. Run ANALYZE TABLE pa_lane_txn COMPUTE STATISTICS and then run the select count(*) statement. This will give you the correct value.
2. Force Hive to run a MapReduce job to count the rows by setting fetch task conversion to none: hive> set hive.fetch.task.conversion=none;
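Both approaches can be sketched from the shell. This is a minimal sketch, assuming the hive CLI is on the PATH and that pa_lane_txn is not partitioned; the function names are hypothetical helpers, not part of Hive.

```shell
#!/bin/sh
# Option 1: refresh the table statistics, then count.
# (Hypothetical helper; assumes an unpartitioned table.)
count_after_analyze() {
    table="$1"
    hive -e "ANALYZE TABLE ${table} COMPUTE STATISTICS; SELECT COUNT(*) FROM ${table};"
}

# Option 2: turn off fetch-task conversion for this session only,
# then count, as suggested in the answer above.
count_with_job() {
    table="$1"
    hive -e "SET hive.fetch.task.conversion=none; SELECT COUNT(*) FROM ${table};"
}
```

For example, count_after_analyze pa_lane_txn runs the first option. The SET in the second option only applies to that one hive -e session.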
06-28-2017
04:12 AM
2 Kudos
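@Anishkumar Valsalam You can run the command desc database <db_name>. Its output includes the location of the HDFS directory where the database lives, and you can then check that directory's timestamp. From HDFS you can run this command: hdfs dfs -stat /apps/hive/warehouse/<db_name.db>. Note that -stat reports the directory's last modification time; HDFS does not record a separate creation time, so this matches the creation date only if nothing has been added to or removed from the directory since.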
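The two steps can be wrapped as small shell helpers. This is a sketch, assuming the hive and hdfs CLIs are on the PATH; the function names are hypothetical, and the directory path must come from the DESCRIBE DATABASE output for your own database.

```shell
#!/bin/sh
# Step 1: show the database description, which includes its HDFS location.
db_location() {
    hive -e "DESCRIBE DATABASE $1;"
}

# Step 2: print the modification time (%y) of the database directory.
db_dir_mtime() {
    hdfs dfs -stat "%y" "$1"
}
```

Usage would look like db_location <db_name> followed by db_dir_mtime /apps/hive/warehouse/<db_name.db>.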
06-27-2017
02:27 PM
1 Kudo
@PJ Even with the replication factor set to 1, the data is split into blocks that are distributed across different datanodes. So, in case of a datanode failure, you will only be able to retrieve part of the data, because the blocks on the failed node have no other copy. Another advantage of setting the replication factor > 1 is parallel processing, i.e. copies of the data exist on multiple machines, and all of them can process the data simultaneously.
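To see how a file's blocks are spread across datanodes, or to raise its replication factor, something like the following can be used. It is a sketch, assuming the hdfs CLI is on the PATH; the function names and any paths passed in are hypothetical.

```shell
#!/bin/sh
# List the blocks of a path and the datanodes that hold each replica.
show_block_locations() {
    hdfs fsck "$1" -files -blocks -locations
}

# Set the replication factor of a path and wait (-w) until it is reached.
set_replication() {
    factor="$1"
    path="$2"
    hdfs dfs -setrep -w "$factor" "$path"
}
```

With a replication factor of 1, show_block_locations lists exactly one datanode per block, which is why losing that node loses the block.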
06-27-2017
12:35 PM
@Vishal Gupta You might not have added the principal kadmin/fqdn@DOMAIN, or the legacy fallback kadmin/admin@DOMAIN. You can add them using kadmin.local; see https://web.mit.edu/kerberos/krb5-1.13/doc/admin/admin_commands/kadmin_local.html
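A minimal sketch of adding both principals on the KDC host, assuming MIT Kerberos and root access. The function name, the KADMIN override variable, and the choice of -randkey are assumptions for illustration; adjust them to your realm's policy.

```shell
#!/bin/sh
# Add the host-specific and legacy kadmin principals.
# KADMIN can be overridden (hypothetical knob); it defaults to kadmin.local.
add_kadmin_principals() {
    fqdn="$1"
    realm="$2"
    "${KADMIN:-kadmin.local}" -q "addprinc -randkey kadmin/${fqdn}@${realm}"
    "${KADMIN:-kadmin.local}" -q "addprinc -randkey kadmin/admin@${realm}"
}
```

It would be called with the KDC's fully qualified hostname and the realm name, in that order.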
06-27-2017
05:51 AM
@Leenurs Quadras Hive installation is independent of the NameNode/Secondary NameNode location. In the configuration file you just need to specify where Hadoop is installed so that Hive can reach the job tracker to submit MapReduce programs. Theoretically you can set up HiveServer, the MetaStore server, Hive clients, etc. all on the master node. However, in a production scenario placing them on a master or slave node is not a good idea. You should set up HiveServer on a dedicated node so it doesn't compete for resources with the NameNode processes (it gets quite busy in a multi-user scenario). The MetaStore is a separate daemon which can either be embedded with HiveServer (in this case it uses Derby and isn't ideal for production use) or be set up separately, backed by a dedicated database service (the recommended way). Beeline (the Hive client) can run in embedded mode as well as remotely. Remote HiveServer2 mode is recommended for production use, as it is more secure and doesn't require granting users direct HDFS/metastore access. Hope this answers all your questions.
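Connecting Beeline to a remote HiveServer2 can be sketched as below. The function name, host, and user are hypothetical; port 10000 and the default database are assumed defaults that may differ on your cluster.

```shell
#!/bin/sh
# Open a Beeline session against a remote HiveServer2 over JDBC.
# Arguments: host, user, optional port (assumed 10000), optional database.
connect_hs2() {
    host="$1"
    user="$2"
    port="${3:-10000}"
    db="${4:-default}"
    beeline -u "jdbc:hive2://${host}:${port}/${db}" -n "$user"
}
```

Because the client only talks to HiveServer2, the user running connect_hs2 does not need direct access to HDFS or the metastore.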
09-02-2016
05:00 PM
1 Kudo
This is a known issue. Please refer to this link: http://gethue.com/hadoop-tutorial-oozie-workflow-credentials-with-a-hive-action-with-kerberos/
08-16-2016
08:55 PM
You can change the ownership of that file by logging in as the user 'hdfs'.
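A minimal sketch of the ownership change, assuming sudo is allowed to run commands as the hdfs user. The function name is hypothetical, and the owner, group, and path are placeholders for your own values.

```shell
#!/bin/sh
# Change the owner (and optionally group) of an HDFS path as the hdfs superuser.
# Arguments: owner or owner:group, then the HDFS path.
chown_as_hdfs() {
    owner="$1"
    path="$2"
    sudo -u hdfs hdfs dfs -chown "$owner" "$path"
}
```

Adding -R after -chown would apply the change recursively to a directory.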
08-16-2016
06:58 AM
Try running this command as root: /usr/lib/hue/build/env/bin/hue passwd <username>
06-08-2016
12:19 PM
1 Kudo
You can use your VM's bridged adapter (Settings->Network->Adapter 2->Enable Network Adapter and set Attached To to Bridged Adapter). You can then run ifconfig to find the address and connect to this network (in my case it is 192.168.xxx.xxx). You can also use the CLI to submit an Oozie job, e.g.: oozie job -oozie http://localhost:8080/oozie -config job.properties -run
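The CLI submission can be wrapped so the server URL is easy to change once the VM is reachable over the bridged network. This is a sketch; the function name is hypothetical, and the default URL is simply the one from the answer above.

```shell
#!/bin/sh
# Submit and start an Oozie job from a properties file.
# OOZIE_URL overrides the server address; the default is taken from the answer above.
submit_oozie_job() {
    props="$1"
    oozie job -oozie "${OOZIE_URL:-http://localhost:8080/oozie}" -config "$props" -run
}
```

For example, exporting OOZIE_URL with the VM's bridged address and then calling submit_oozie_job job.properties submits the job from the host machine.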