Member since: 05-02-2017
Posts: 360
Kudos Received: 65
Solutions: 22
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 13351 | 02-20-2018 12:33 PM
 | 1507 | 02-19-2018 05:12 AM
 | 1862 | 12-28-2017 06:13 AM
 | 7141 | 09-28-2017 09:25 AM
 | 12181 | 09-25-2017 11:19 AM
09-15-2017
04:53 PM
@Naveen Dabas It should work, actually. Try removing the '`' and executing it. Otherwise, the other option is to drop the table and re-create it; as it is an external table, this won't affect the data. Hope it helps!!
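A minimal sketch of the drop-and-recreate approach, assuming a hypothetical external table named my_ext_table with placeholder columns and a placeholder HDFS location (adjust these to match your own DDL):
DROP TABLE IF EXISTS my_ext_table;  -- only removes the metadata; the files stay because the table is EXTERNAL
CREATE EXTERNAL TABLE my_ext_table (
  id INT,
  name STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/data/my_ext_table';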
09-15-2017
08:21 AM
1 Kudo
@kenny creed You are using regexp_replace in Spark, which returns a string datatype. In Spark you have to use cast to convert it. Below is an example which might help solve your problem (it assumes the usual import org.apache.spark.sql.functions._, org.apache.spark.sql.types.TimestampType and spark.implicits._ imports). Hope it helps! val res = df.select($"id", $"date", unix_timestamp($"date", "yyyy/MM/dd HH:mm:ss").cast(TimestampType).as("timestamp"), current_timestamp(), current_date())
09-15-2017
06:51 AM
Hope it helps! If so, please accept it as the best answer!
09-15-2017
06:50 AM
1 Kudo
@I1095 Check this blog. It has a detailed comparison between ORC and Parquet. Other than that, there is very little difference in terms of use case, but I believe most future improvements are being developed around ORC:
1. Many of the performance improvements provided in the Stinger initiative depend on features of the ORC format, including a block-level index for each column. This leads to potentially more efficient I/O, allowing Hive to skip reading entire blocks of data if it determines that predicate values are not present there. The Cost Based Optimizer can also consider the column-level metadata present in ORC files in order to generate the most efficient query plan.
2. ACID transactions are only possible when using ORC as the file format.
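To illustrate point 2, a minimal sketch of an ACID-enabled table, which must be stored as ORC; the table name, columns and bucket count are hypothetical, and it assumes ACID support is enabled on the cluster:
CREATE TABLE orders_acid (
  order_id INT,
  amount DOUBLE
)
CLUSTERED BY (order_id) INTO 4 BUCKETS  -- ACID tables must be bucketed on these Hive versions
STORED AS ORC
TBLPROPERTIES ('transactional'='true');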
09-15-2017
03:59 AM
@n c No, we can't get something similar to the DDL of a table for a database. But we can use DESCRIBE DATABASE to see its other properties. Hope it helps!
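For example (the database name is a placeholder):
DESCRIBE DATABASE EXTENDED my_db;  -- shows location, owner and dbproperties, but not a re-runnable CREATE DATABASE statement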
09-14-2017
11:13 AM
Hi @n c You can use the INSERT OVERWRITE LOCAL DIRECTORY command in Hive to export the data in the desired format, and use distcp to copy the files, or even the complete database in Hive (which means all the files created under each table in the database), to the second cluster. Once the files are moved to the new cluster, take the DDL from the previous cluster and create the Hive tables. Once that is done, you can either insert or copy the files into the Hive tables in the new cluster. Hope it helps!!
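A minimal sketch of the export step, assuming a hypothetical table my_db.my_table, a placeholder export path, and placeholder NameNode addresses in the distcp alternative:
INSERT OVERWRITE LOCAL DIRECTORY '/tmp/export/my_table'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
SELECT * FROM my_db.my_table;
-- alternatively, copy the warehouse files directly between clusters with distcp:
-- hadoop distcp hdfs://source-nn:8020/apps/hive/warehouse/my_db.db hdfs://target-nn:8020/apps/hive/warehouse/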
09-14-2017
11:06 AM
1 Kudo
Hi @Harjinder Brar concat('{', u.swid, '}') will concatenate braces with the value from u.swid. For example, if the value of u.swid is TEST, it will be converted to {TEST}, which is then used to join with the o.swid column. Hope it helps!!
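For context, a minimal sketch of how that expression is typically used in the join; apart from the swid columns and the o/u aliases, the table and column names are hypothetical:
SELECT o.url, u.user_name
FROM omniture o
JOIN users u
  ON o.swid = concat('{', u.swid, '}');  -- wraps u.swid in braces so it matches the format stored in o.swid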
09-13-2017
05:43 AM
Are there any certifications available from Hortonworks for a Big Data/Hadoop Architect? If so, could someone help me with links to more information about them?
Labels:
- Hortonworks Data Platform (HDP)
09-12-2017
09:47 AM
1 Kudo
Hi @Vijay Parmar Apart from the concatenate option in Hive mentioned by @Steven O'Neill, try the options below. Which merge property you set first depends on the execution engine, and you can also modify the target file size. With these options the small files are merged as new data is written: they will not rewrite the files already in the target table, but they will solve the problem going forward if small files keep being created.
set hive.merge.tezfiles=true; -- notify Hive that a merge step is required (Tez engine; use hive.merge.mapredfiles for MapReduce)
set hive.merge.smallfiles.avgsize=128000000; -- 128 MB
set hive.merge.size.per.task=128000000; -- 128 MB
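For reference, the concatenate option mentioned above is issued as an ALTER TABLE statement and merges the small files already present in an ORC table; the table name and partition below are hypothetical:
ALTER TABLE sales PARTITION (dt='2017-09-01') CONCATENATE;  -- merge small ORC files within this partition
-- for a non-partitioned table:
-- ALTER TABLE sales CONCATENATE;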