Member since: 08-18-2017
Posts: 72
Kudos Received: 11
Solutions: 14
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 1801 | 06-24-2019 01:56 PM |
 | 777 | 10-13-2018 04:40 AM |
 | 708 | 10-13-2018 04:24 AM |
 | 1620 | 10-05-2018 02:59 PM |
 | 1592 | 10-04-2018 09:06 AM |
09-12-2018
04:06 PM
You can create an external table with LOCATION and write the text files directly to that path, e.g.:
create external table staging1(id struct<tid:string,action:string,createdts:timestamp>, cid string, anumber string) row format delimited fields terminated by ',' collection items terminated by '|' stored as textfile LOCATION '/tmp/staging/';
Text files can then be written directly to /tmp/staging/ by Kafka or Flume. If Kafka or Flume is able to generate Avro files, you can skip the staging table, create an external Avro table, and write the Avro files directly to that table's location.
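A rough sketch of such an external Avro table (the table name and path are illustrative, and this assumes your Hive version's AvroSerDe supports the timestamp type):
create external table landing_avro(
  id struct<tid:string,action:string,createdts:timestamp>,
  cid string,
  anumber string)
stored as avro
location '/tmp/avro_landing/';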
09-12-2018
03:55 PM
Yes. The file content will be
# hadoop fs -cat /tmp/data1.txt
1|success|2018-09-12 17:45:39.69,3,12345
Then you need to load the content into the staging table using the command below:
load data inpath '/tmp/data1.txt' into table staging;
Then from staging, load it into the actual Avro table using the command below:
insert into testtbl select * from staging;
If my answer helped you to resolve your issue, you can accept it. It will be helpful for others.
09-12-2018
03:09 PM
I assume the raw data is in text and you want to convert and load the data into Avro tables. If so, you can create another identical text table and specify the delimiters in the data, i.e.:
create table staging(id struct<tid:string,action:string,createdts:timestamp>, cid string, anumber string) row format delimited fields terminated by ',' collection items terminated by '|' stored as textfile;
Sample text data can look like this:
1|success|150987428888,3,12345
insert into testtbl select * from staging;
If Kafka or Flume is generating Avro files directly, those files can be written into the table path directly. It's better to create an external table if source files are written directly into the table path.
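For completeness, a minimal sketch of what the Avro-backed target table could look like (the name testtbl comes from the insert above; the column list mirrors the staging table and is otherwise an assumption):
create table testtbl(
  id struct<tid:string,action:string,createdts:timestamp>,
  cid string,
  anumber string)
stored as avro;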
09-12-2018
03:01 PM
It's not possible to use functions in an INSERT INTO TABLE ... VALUES statement.
09-12-2018
01:10 PM
Try the insert statement below:
0: jdbc:hive2://abcd:10000> with t as (select NAMED_STRUCT('tid','1','action','success', 'createdts',current_timestamp) as id ,'1' as cid,'12345' as anumber)
0: jdbc:hive2://abcd:10000> insert into testtbl select * from t;
No rows affected (20.464 seconds)
0: jdbc:hive2://abcd:10000> select * from testtbl;
+-----------------------------------------------------------------------+--------------+------------------+--+
| testtbl.id | testtbl.cid | testtbl.anumber |
+-----------------------------------------------------------------------+--------------+------------------+--+
| {"tid":"1","action":"success","createdts":"2018-09-12 15:06:27.075"} | 1 | 12345 |
+-----------------------------------------------------------------------+--------------+------------------+--+
09-11-2018
07:14 AM
Can you check what the value of tez.runtime.unordered.output.buffer.size-mb is? I think it's configured to a higher value.
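For example, from beeline or the Hive CLI you can print the effective value and override it at the session level (the 100 below is only an illustrative value):
set tez.runtime.unordered.output.buffer.size-mb;
set tez.runtime.unordered.output.buffer.size-mb=100;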
08-17-2018
03:01 PM
I am thinking of a solution for the jira. This needs to be implemented in code; there is no config to do this for now.
08-17-2018
07:09 AM
Concatenation depends on which files are chosen first. The ordering of the files is not deterministic with CombineHiveInputFormat, since grouping happens at the Hadoop layer. Concatenation will split or combine files based on whether the ORC file size is greater or smaller than maxSplitSize. For example, say you have 5 files of 64MB, 64MB, 64MB, 64MB, and 512MB, with mapreduce.input.fileinputformat.split.minsize=256mb: this can result in 2 files (256MB, 512MB), or it may result in 3 files (256MB, 256MB, 256MB). I raised a jira for the same. An easy solution for this would be to add a path filter that skips files larger than maxSplitSize.
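For context, this is the ORC concatenation command the above refers to (the table and partition names here are illustrative):
alter table orders partition (dt='2018-08-17') concatenate;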
08-16-2018
07:10 AM
In beeline or the CLI, after creating the table, you can run either SHOW CREATE TABLE or DESCRIBE FORMATTED to find the table path in HDFS. After exiting beeline or the CLI, you can use the command below to see the table folder and the files inside it:
hadoop fs -ls -R <tablePath>
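For example (the table name is illustrative; the table's Location shows up in the output of either statement):
show create table mytable;
describe formatted mytable;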
08-01-2018
07:18 AM
This validation was intentionally added in Spark with SPARK-15279, as it doesn't make sense to provide delimiters for ORC or Parquet files.
07-30-2018
05:37 AM
After creating the external table with LOCATION, can you run "msck repair table data"? It should automatically add the partition information from the folder paths to the Hive metadata.
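A minimal sketch of that flow, assuming a partitioned external table named data whose partition folders already exist under the location (the columns and path here are hypothetical):
create external table data (id int, name string)
partitioned by (dt string)
row format delimited fields terminated by ','
stored as textfile
location '/tmp/data/';
msck repair table data;
show partitions data;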
05-28-2018
04:26 PM
1 Kudo
@cskbhatt, I assume the external table location is "hdfs://<emr node>:8020/poc/test_table/". This issue is happening because hdfs://<emr node>:8020/poc/test_table/.metadata/descriptor.properties is not a Parquet file but exists inside the table folder. When Hive's ParquetRecordReader tries to read this file, it throws the above exception. Remove all non-Parquet files from the table location and retry your query.
04-18-2018
06:40 AM
I am glad the problem is resolved. Both 1) http://nexus-private.hortonworks.com/nexus/content/groups/public/ and 2) http://repo.hortonworks.com/content/groups/public/ will resolve this issue, as mentioned in my previous link to the Hortonworks public repo. Please accept the answer.
04-17-2018
08:56 AM
@Gaurang Shah
You are using hortonworks.extrepo (http://repo.hortonworks.com/content/repositories/releases, releases=true, snapshots=true). This repo doesn't have the dependent jars; can you please use http://nexus-private.hortonworks.com/nexus/content/groups/public/ instead?
This repo has all the required dependencies, e.g. 1) org.mortbay.jetty:jetty-util:jar:6.1.26.hwx http://nexus-private.hortonworks.com/nexus/content/groups/public/org/mortbay/jetty/jetty-util/6.1.26.hwx/
04-16-2018
07:53 AM
Use the Hortonworks public repo and try the Maven dependency below to compile a custom UDF against a specific HDP version.
<dependency>
<groupId>org.apache.hive</groupId>
<artifactId>hive-exec</artifactId>
<version>1.2.1000.2.6.4.0-91</version>
</dependency>
The Hortonworks Maven public repo can be referred to from this link.
03-05-2018
10:16 AM
This can be related to HIVE-13901. Depending on the filesystem, MSCK and ADD PARTITION can be slow. Can you try setting "hive.fetch.task.conversion=none"?
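For example, at the session level:
set hive.fetch.task.conversion=none;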
02-19-2018
09:00 AM
Was any fat jar added for the UDF? Can you enable verbose mode to see which jar is getting loaded?
02-19-2018
06:55 AM
One possible reason I can think of is that you have a Hive 1.x serde jar in the LLAP classpath. I.e., in Hive 1.x the return type of ColumnProjectionUtils.getReadColumnNames is List<String>, whereas in Hive 2.x the return type is String[]. That method return type mismatch is what throws the NoSuchMethodError.
02-16-2018
08:26 AM
AFAIK, the Phoenix storage handlers are compiled with a Hive 1.x dependency. @Sergey Soldatov, are we supporting Phoenix storage handlers in HS2 Interactive / Hive 2.x?
02-09-2018
07:23 AM
varchar requires the column length to be specified in Hive. Can you try the query below?
create external table default.val_deriv_orc (deriv varchar(10), deriv_descr varchar(15)) stored as orcfile;
If you are not sure about the maximum length of the column data, it's better to use the string datatype: varchar will silently truncate column data longer than the specified length.
0: jdbc:hive2://localhost:10000> create external table default.val_deriv_orc (deriv varchar(4) ,deriv_descr varchar(5))stored as orcfile;
0: jdbc:hive2://localhost:10000> insert into default.val_deriv_orc values('abcdef','abcde');
0: jdbc:hive2://localhost:10000> select * from default.val_deriv_orc;
+----------------------+----------------------------+--+
| val_deriv_orc.deriv | val_deriv_orc.deriv_descr |
+----------------------+----------------------------+--+
| abcd | abcde |
+----------------------+----------------------------+--+
02-01-2018
05:19 AM
You can check for the relevant issue in the Hive JIRA.
01-31-2018
12:23 PM
As @gopal said, HS2 (Interactive) has better SQL support. The Hive CLI always uses Hive 1.x, whereas HS2 (Interactive) uses Hive 2.x. If you are using beeline, you can connect to both HS2 and HS2 (Interactive) depending on the URL provided; no need to reload the data. If the above SQL needs to be supported in HDP 2.6.0 Hive 1.x, then I think you might need HIVE-15211 and HIVE-16885.
01-31-2018
09:09 AM
@Carlton Patterson please try the create statement below for the acorn_category_frequency table:
Create table acorn_category_frequency
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION 'wasb://adfgetstarted@geogstoreacct.blob.core.windows.net/sampleout/'
as
select acorn_category,
count(*) as acorn_categorycount
from geography
group by acorn_category
01-31-2018
08:56 AM
In a secure setup, to view the Active Sessions, Open Queries, and Last Max History Queries tabs, the keytab and principal passed to the HS2 Interactive UI should belong to an admin user. Admin users are configured using "hive.users.in.admin.role"; the SPNEGO principal should be one of the admin users specified in this property. You can refer to the link for HS2 Interactive UI configs.
01-31-2018
07:25 AM
@Hanu V This is related to HIVE-12082. The issue exists in Hive 1.
1) You can use Hive 2 (Interactive HS2)
2) Work around it by explicitly casting to int: select name, greatest(int(sub1),int(sub2),int(sub3)) from students;
3) Use the ORC format
01-30-2018
11:24 AM
Please try the command below for creating the HiveSampleIn table:
CREATE EXTERNAL TABLE HiveSampleIn
(
anonid INT,
eprofileclass INT,
fueltypes STRING,
acorn_category INT,
acorn_group STRING,
acorn_type INT,
nuts4 STRING,
lacode STRING,
nuts1 STRING,
gspgroup STRING,
ldz STRING,
gas_elec STRING,
gas_tout STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION 'wasb://adfgetstarted@geogstoreacct.blob.core.windows.net/samplein/'
tblproperties ("skip.header.line.count"="2");
01-30-2018
09:41 AM
CREATE EXTERNAL TABLE HiveSampleIn ... LOCATION 'wasb://adfgetstarted@geogstoreacct.blob.core.windows.net/samplein/'; -- the statement gets terminated here by the semicolon
tblproperties ("skip.header.line.count"="2"); -- this is then parsed as a separate statement, which is likely what throws the ParseException
12-11-2017
11:10 AM
1 Kudo
You can try the commands below for updating existing array data. If you want to overwrite the existing value of the array column with a new value, you can use this update statement:
update test1 set dept=array('marketing', 'FINANCE') where name='a';
If you want to append a new value, you can use this update statement:
update test1 set dept=split(concat_ws(',',dept,'FINANCE'),',') where name ='a';
12-11-2017
10:17 AM
Glad to know that your problem is resolved. Can you please accept this answer.
12-11-2017
05:25 AM
You can create a data file with delimiters and use the load command to load the data into the table, as in the example below:
create table test1(name string, age int, dept array<string>) row format delimited fields terminated by ':' collection items terminated by ',' stored as textfile;
Content in the text file should be delimited as follows:
a:25:sales,marketing
b:26:dev,qa
...
name(string):age(int):dept(array<string> delimited with ,)
0: jdbc:hive2://abcd:10000> load data inpath 'file_path' into table test1;
0: jdbc:hive2://abcd:10000> select * from test1;
+-------------+------------+--------------------+--+
| test1.name  | test1.age  | test1.dept         |
+-------------+------------+--------------------+--+
| a           | 25         | [sales,marketing]  |
| b           | 26         | [dev,qa]           |
+-------------+------------+--------------------+--+