Member since
04-04-2016
166
Posts
168
Kudos Received
29
Solutions
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 2897 | 01-04-2018 01:37 PM
 | 4903 | 08-01-2017 05:06 PM
 | 1569 | 07-26-2017 01:04 AM
 | 8917 | 07-21-2017 08:59 PM
 | 2607 | 07-20-2017 08:59 PM
01-04-2018
01:37 PM
@Zack Riesland You can create a separate table with only the current day's partition, merge/consolidate the small files into it, and then run an exchange partition into the main table. That way you do not need to touch the entire data set in the main table, which gives you a clean way to achieve this. Put the commands into a shell script, add boundary checks (end of day, reprocessing, etc.), and you will have an airtight solution. Thanks, Raj
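The steps above can be sketched in HiveQL roughly as follows (table, column, and partition names here are hypothetical; this assumes both tables share the same schema and partition column, and that the target partition is dropped from the main table before the swap, since EXCHANGE PARTITION fails if the partition already exists in the destination):

```sql
-- Staging table with the same schema and partitioning as the main table (hypothetical names)
CREATE TABLE staging_events LIKE main_events;

-- Consolidate the day's small files into the staging partition;
-- INSERT OVERWRITE rewrites the data, merging small files into fewer, larger ones
INSERT OVERWRITE TABLE staging_events PARTITION (ds = '2018-01-04')
SELECT col1, col2 FROM main_events WHERE ds = '2018-01-04';

-- Remove the old, fragmented partition from the main table,
-- then atomically swap in the consolidated one from staging
ALTER TABLE main_events DROP PARTITION (ds = '2018-01-04');
ALTER TABLE main_events EXCHANGE PARTITION (ds = '2018-01-04') WITH TABLE staging_events;
```

In a production shell script you would wrap these statements with the boundary checks mentioned above (end-of-day cutoff, reprocessing guards) before running them.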
08-01-2017
05:06 PM
@Sonu Sahi and @Greg Keys There is a bug in the Hortonworks ODBC driver v2.1.5. This is fixed and will be part of HDP 3.0. As an interim workaround, you can use ODBC driver v2.1.2 from the archive section under HDP 2.4. Thanks
07-28-2017
09:45 PM
++ non-LLAP works fine with the same setup, just a different port. Will keep you posted.
07-28-2017
09:42 PM
I am trying to use HTTP, not SASL.
07-28-2017
09:40 PM
Yes sir, I am sure.
07-28-2017
05:51 PM
Have you managed to connect to Hive LLAP via ODBC using HTTP and uid/pwd? If yes, can you share the value you set for the HTTP path in the HTTP properties? Using 'cliservice' passes the connection test, but it eventually fails with the error below when using the ODBC driver in Power BI, since LLAP does not support cliservice:

DataSource.Error: ODBC: ERROR [HY000] [Hortonworks][Hardy] (35) Error from server: error code: '0' error message: 'MetaException(message:java.lang.NullPointerException)'. Details: DataSourceKind=Odbc, DataSourcePath=dsn=Hive_DSN, OdbcErrors=Table

Connecting to HiveServer2 works in the above-mentioned way. Thanks
Labels:
- Apache Hive
07-26-2017
01:04 AM
@Manish Kumar Yadav You are probably looking at Hadoop block size vs. split size. Below is a nice read: https://hadoopjournal.wordpress.com/2015/06/30/mapreduce-input-split-versus-hdfs-blocks/
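To illustrate the distinction: the HDFS block size is a storage property, while the input split size is a compute-time setting that can be tuned independently. A sketch of tuning split sizes from the Hive CLI (these are the standard MapReduce input-format properties; the byte values here are only example choices):

```sql
-- Cap each input split at 256 MB, regardless of the HDFS block size
SET mapreduce.input.fileinputformat.split.maxsize=268435456;
-- Force splits to be at least 128 MB, coalescing smaller blocks into one split
SET mapreduce.input.fileinputformat.split.minsize=134217728;
```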
07-26-2017
12:34 AM
@Harish Nerella You can use the usual Hive UPDATE to update bulk records: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-Update MERGE is more flexible and a bit different from a plain UPDATE, but if you are only performing updates, I do not see any reason you would not be able to do it with an UPDATE statement from Hive 0.14 onwards.
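A minimal sketch of such a bulk update (table and column names are hypothetical; note that Hive UPDATE requires an ACID table, i.e. bucketed, stored as ORC, with transactional=true, and cannot modify partitioning or bucketing columns):

```sql
-- Bulk-update all matching rows in one statement on an ACID table
UPDATE customers
SET    status = 'inactive'
WHERE  last_login < '2016-01-01';
```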
07-21-2017
08:59 PM
1 Kudo
@PPR Reddy Here goes the solution (you can do it in other ways if you choose to). The column name in table a is c:

hive> select * from a;
OK
y
y
y
n

Query:

hive> select c, per from (
    >   select x.c c, (x.cc/y.ct)*100 per, (z.cn/y.ct)*100 pern from
    >   (select c, count(*) cc from a group by c) x,
    >   (select count(*) ct from a) y,
    >   (select c, count(*) cn from a where c='n' group by c) z) f
    > where pern > 20;

Output:
OK
n 25.0
y 75.0

Thanks
07-21-2017
04:34 PM
@Laurent lau That equal distribution of replicas is not guaranteed. If you think about it at a high level, it would compromise the speed of writes as well as reads, so it is not recommended even if you plan to do it programmatically.