Member since
11-02-2017
51
Posts
6
Kudos Received
4
Solutions
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 247 | 09-22-2022 04:16 AM
 | 2044 | 03-09-2018 02:34 PM
 | 10886 | 02-01-2018 06:15 AM
 | 3857 | 11-13-2017 12:34 PM
09-22-2022
04:16 AM
@yagoaparecidoti It looks like this particular user does not have permission to connect to HMS. You can add this user (or "*") under this configuration: CM -> Hive -> Configuration -> search "hive_proxy_user_groups_list". Then restart Hive and run "show databases". Another possibility is that hive-site.xml cannot be located on the node that needs to connect to the Hive Metastore.
04-12-2018
01:19 PM
@Gaurang Shah Can you please verify that you have the proper permissions on the '/tmp/test_special_char/' directory?
04-12-2018
12:52 PM
@Guillaume Roger This error says you already have an existing jar with the same name in the classpath. Can you delete the old jar from the classpath before adding the new one? Please refer to HiveResources for the DELETE JAR command.
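As a rough sketch of the sequence (the jar path here is hypothetical, not from the original thread):

```sql
-- Drop the previously registered jar from the session classpath
DELETE JAR /tmp/my_udfs.jar;

-- Then register the new version
ADD JAR /tmp/my_udfs.jar;

-- LIST JARS shows what is currently registered in the session
LIST JARS;
```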
03-26-2018
08:40 AM
@pk reddy
Can you share the values for hive.tez.container.size, hive.tez.java.opts, tez.runtime.io.sort.mb, and tez.runtime.unordered.output.buffer.size-mb?
03-19-2018
05:39 PM
@Ashish Wadnerkar You can grep hiveserver2.log for the string "Parsing command" to get the complete Hive query.
03-17-2018
03:31 PM
@Sooraj Antony Since you have skewed data in the join column, enable skew join optimization: set hive.optimize.skewjoin=true; set hive.skewjoin.key=5000; You can tune it further with the number of mapper tasks and the split size via the hive.skewjoin.mapjoin.map.tasks and hive.skewjoin.mapjoin.min.split properties.
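A minimal sketch of the settings above in one session (the table and column names are hypothetical, and the tuning values are illustrative defaults, not recommendations):

```sql
SET hive.optimize.skewjoin=true;
SET hive.skewjoin.key=5000;            -- rows per key above which the key is treated as skewed
-- optional further tuning
SET hive.skewjoin.mapjoin.map.tasks=10000;
SET hive.skewjoin.mapjoin.min.split=33554432;

-- hypothetical join on the skewed column
SELECT a.id, b.name
FROM big_table a
JOIN dim_table b ON a.id = b.id;
```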
03-16-2018
04:20 AM
@Girish Jaiswal Here, regex is used as one option for processing CSV data: in the first step, the complete record is loaded as a string into a staging table. You can also load the CSV file directly into the target table (without any staging table) using a comma delimiter.
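For the direct-load approach, a minimal sketch (the table name, columns, and file path are hypothetical):

```sql
-- Target table reads comma-delimited text directly, no staging step
CREATE TABLE target_csv (
  id INT,
  name STRING,
  amount DOUBLE
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- Load the CSV file straight into the target table
LOAD DATA LOCAL INPATH '/tmp/data.csv' INTO TABLE target_csv;
```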
03-15-2018
05:21 PM
You can use the --validate option with your import and export commands to validate the number of rows between the source and target tables (it works only for a single table). If you want to validate every single value, you can compare checksums of the original and the sqooped data in the source system.
03-12-2018
05:56 PM
@Sam Red Can you try putting quotes around your connection string, or try connecting through the beeline prompt: 1. Enter beeline 2. !connect jdbc:hive2://serverip:10000
03-12-2018
04:35 PM
@srini You can import XML directly using com.ibm.spss.hive.serde2.xml.XmlSerDe, which is detailed here: https://community.hortonworks.com/articles/972/hive-and-xml-pasring.html The other option is to load the entire record into a string and then access it using the xpath UDFs:
1. create table employee (employee_info string);
2. load data local inpath '/home/hduser/sample.xml' into table employee;
3. create view employee_xml_view as select xpath_int(employee_info, 'code/root/root1/id'), xpath_string(employee_info, 'code/root/root1/joiningdate'), ... from employee;
4. select * from employee_xml_view;
03-09-2018
03:28 PM
@Guillaume Roger Heap size should be about 80% of the Tez container size, and io.sort.mb about 40%. Can you verify the configurations below: set hive.tez.java.opts=-Xmx3276m; set tez.runtime.io.sort.mb=1638; Can you also try disabling map-side join before executing the query: set hive.auto.convert.join=false;
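To make the ratios concrete: the values above imply a container of 4096 MB (an assumption inferred from 3276 ≈ 80% and 1638 ≈ 40% of 4096, not stated in the thread):

```sql
-- assumed container size: 4096 MB
SET hive.tez.container.size=4096;
-- heap = ~80% of the container size (0.8 * 4096 = 3276)
SET hive.tez.java.opts=-Xmx3276m;
-- sort buffer = ~40% of the container size (0.4 * 4096 = 1638)
SET tez.runtime.io.sort.mb=1638;
-- disable map-side join while debugging the failure
SET hive.auto.convert.join=false;
```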
03-09-2018
02:34 PM
@Santanu Ghosh During a Sqoop export, the staging table should be identical in structure to the destination table. I see that in your tables the datatype of the third column is different. Can you update it and then try again?
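For illustration only (the table and column names are hypothetical), the staging table must mirror the destination table column for column:

```sql
-- destination table in the RDBMS
CREATE TABLE orders (
  id       INT,
  customer VARCHAR(64),
  amount   DECIMAL(10,2)   -- third column: the type must match in both tables
);

-- staging table: identical column names, order, and types
CREATE TABLE orders_stage (
  id       INT,
  customer VARCHAR(64),
  amount   DECIMAL(10,2)
);
```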
03-08-2018
03:11 PM
Did you find any details in the logs?
03-08-2018
01:12 PM
@Satish Anjaneyappa You would find more details in the Hive logs indicating why the move operation failed.
03-08-2018
01:00 PM
@Satish Anjaneyappa This error occurs after the MapReduce/Tez job completes, when data is moved from the staging directory to the destination directory. The move can fail due to permission issues, or due to lack of space in the case of a huge table. Can you check the permissions, and whether you have a space quota set at the directory level?
03-08-2018
03:49 AM
@Dmitro Vasilenko Can you try this query without map join? It looks like the data size is too large for the hybrid grace hash join. set hive.auto.convert.join=false; and then execute the query again.
03-07-2018
02:21 PM
@Dmitro Vasilenko This is a known YARN issue: https://issues.apache.org/jira/browse/YARN-952 Please refer to this article: https://community.hortonworks.com/content/supportkb/151825/errortrying-to-reinitialize-roothive-from-roothive.html
03-07-2018
02:19 PM
@Dmitro Vasilenko I see frequent full GCs in the logs. Can you share the YARN application log and the query that you executed? Also, share the explain plan for the query.
03-06-2018
09:13 AM
@tomoya yoshida It seems you have increased hive.tez.container.size, but yarn.scheduler.maximum-allocation-mb is still set to 2250 MB. Can you try increasing yarn.scheduler.maximum-allocation-mb to more than 3750 MB? You will also need to set yarn.nodemanager.resource.memory-mb accordingly. Let me know if it doesn't work after changing this.
03-06-2018
08:53 AM
@Bharath Kumar K Looks like your Ambari server is not running. Can you log in to the Ambari master node and run the command below: ambari-server status If it is not running, please start the Ambari server. Also, check the status of all Ambari agents.
03-06-2018
07:55 AM
1 Kudo
@rmr1989 RM does not choose NMs to launch containers. It only launches the first container, which is the ApplicationMaster for the submitted application. RM also informs the AM about the minimum and maximum resource capabilities of the cluster. It is then the ApplicationMaster's job to negotiate for resources: it sends a resource request to the RM for a specific number of containers on a specific node. A ResourceRequest has the following form: <resource-name, priority, resource-requirement, number-of-containers> These components are described as follows: resource-name is the hostname or rackname where the resource is desired, or * to indicate no preference. priority is an intra-application priority for this request (not across multiple applications); it orders the various ResourceRequests within a given application. resource-requirement is the required capabilities, such as the amount of memory or CPU (currently YARN supports only memory and CPU). number-of-containers is just a multiple of such containers; it limits the total number of containers as specified in the ResourceRequest. Hope this helps!
03-06-2018
05:31 AM
1 Kudo
@shyam gurram Great that it worked! This issue affected only one or a few partitions, which is why LIMIT 10 worked. Can you please accept and upvote the answer?
03-06-2018
03:44 AM
@VENKATESH M You can try escaping the delimiter in your dataset and then populate the table: CREATE TABLE books (author string, title string, genre string, price double, publish_date string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' ESCAPED BY '\\' LINES TERMINATED BY '\n' STORED AS TEXTFILE;
03-06-2018
03:17 AM
@Akshat Mathur
Can you share the complete query and the desc extended output for the table?
03-05-2018
07:13 AM
2 Kudos
@shyam gurram
Can you check the SerdeInfo for the partitions under the table udasv_XXX.udas_dtv_XXXX_XXX using the command below: desc formatted udasv_XXX.udas_dtv_XXXX_XXX partition (name=value); It looks like the SerDe lib is missing from some partitions. Can you compare it with the partitions in another environment where it worked fine?
02-01-2018
03:10 PM
Cool! Please accept the answer. You can execute the command below before your select query to get column headers: set hive.cli.print.header=true;
02-01-2018
06:15 AM
@Carlton Patterson
There is an extra semicolon before TBLPROPERTIES; removing it will solve your problem. Use the script below: DROP TABLE IF EXISTS HiveSampleIn;
CREATE EXTERNAL TABLE HiveSampleIn
(
anonid int,
eprofileclass int,
fueltypes STRING,
acorn_category int,
acorn_group STRING,
acorn_type int,
nuts4 STRING,
lacode STRING,
nuts1 STRING,
gspgroup STRING,
ldz STRING,
gas_elec STRING,
gas_tout STRING
) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '10' STORED AS TEXTFILE
LOCATION 'wasb://adfgetstarted@geogstoreacct.blob.core.windows.net/samplein/'
TBLPROPERTIES ("skip.header.line.count" = "1");

DROP TABLE IF EXISTS HiveSampleOut;
CREATE EXTERNAL TABLE HiveSampleOut
(
acorn_category int,
acorn_categorycount int
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '10' STORED AS TEXTFILE
LOCATION 'wasb://adfgetstarted@geogstoreacct.blob.core.windows.net/sampleout/';

INSERT OVERWRITE TABLE HiveSampleOut
Select
acorn_category,
count(*) as acorn_categorycount
FROM HiveSampleIn Group by acorn_category;
01-31-2018
08:12 AM
@Adithya Sajjanam Can you send the complete command you executed, along with the error message?
01-31-2018
08:09 AM
Create your table as:
CREATE EXTERNAL TABLE HiveSampleOut
(
acorn_category int,
acorn_categorycount int
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '10' STORED AS TEXTFILE
LOCATION 'wasb://adfgetstarted@geogstoreacct.blob.core.windows.net/sampleout/';
Or directly create the table on the fly while inserting data:
Create table HiveSampleOut as
Select acorn_category, count(*) as acorn_categorycount
FROM HiveSampleIn
Group by acorn_category;
01-31-2018
04:19 AM
@Sam Cse
There is an already existing partition (or possibly a sub-directory) at the location hdfs://....../table_name_1/part_col_1=1, and cleanup of the destination directory is failing before the new partition is loaded. Can you try deleting those files manually and then retry the INSERT OVERWRITE? Also, share the listing for this HDFS path.