Created on 07-10-2018 01:46 PM - edited 09-16-2022 06:26 AM
I have a Kudu table with more than a million records, and I have been asked to run some query performance tests through both impala-shell and Java. Through impala-shell I am able to run all the queries, and it reports the time taken for each query. When it comes to running queries with the Kudu Java API, is it possible to perform a join query using the Kudu Java API and benchmark it from the logs?
Created 07-10-2018 02:54 PM
Hi HJ,
It is not possible to do a join using the native Kudu NoSQL API. You will need to use SQL through Impala or Spark SQL, or use the Spark DataFrame API, to do the join.
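For illustration, a minimal sketch of a DataFrame join over two Kudu tables from Spark's Java API might look like the following (it assumes the kudu-spark package is on the classpath; the master address, table names, and join column are placeholders):

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class KuduJoinExample {
  public static void main(String[] args) {
    SparkSession spark = SparkSession.builder()
        .appName("kudu-join-example")
        .getOrCreate();

    // Read two Kudu tables as DataFrames (master address and table names are placeholders).
    Dataset<Row> orders = spark.read()
        .format("org.apache.kudu.spark.kudu")
        .option("kudu.master", "kudu-master-host:7051")
        .option("kudu.table", "impala::default.orders")
        .load();
    Dataset<Row> customers = spark.read()
        .format("org.apache.kudu.spark.kudu")
        .option("kudu.master", "kudu-master-host:7051")
        .option("kudu.table", "impala::default.customers")
        .load();

    // Join on a shared key column and time the action.
    long start = System.currentTimeMillis();
    long rowCount = orders.join(customers,
        orders.col("customer_id").equalTo(customers.col("customer_id")))
        .count();
    System.out.println("Join produced " + rowCount + " rows in "
        + (System.currentTimeMillis() - start) + " ms");

    spark.stop();
  }
}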
Mike
Created 07-13-2018 03:17 PM
The only way I know of to do complex queries through Java is to use the Impala JDBC connector, which you can find here: https://www.cloudera.com/downloads/connectors/impala/jdbc/2-6-3.html
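As a rough sketch of how you could run and time a join from Java through that connector (the driver class name is the one shipped with the Cloudera Impala JDBC 4.1 connector; the host, port, database, and table names below are placeholders):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ImpalaJdbcJoinBenchmark {
  public static void main(String[] args) throws Exception {
    // Load the Cloudera Impala JDBC driver; host, port, and database are placeholders.
    Class.forName("com.cloudera.impala.jdbc41.Driver");
    String url = "jdbc:impala://impalad-host:21050/default";

    String sql = "SELECT o.order_id, c.name "
        + "FROM orders o JOIN customers c ON o.customer_id = c.customer_id";

    try (Connection conn = DriverManager.getConnection(url);
         Statement stmt = conn.createStatement()) {
      long start = System.currentTimeMillis();
      int rows = 0;
      try (ResultSet rs = stmt.executeQuery(sql)) {
        while (rs.next()) {
          rows++;  // iterate the full result set so fetch time is included in the measurement
        }
      }
      long elapsed = System.currentTimeMillis() - start;
      System.out.println("Returned " + rows + " rows in " + elapsed + " ms");
    }
  }
}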
Created 07-16-2018 04:12 PM
If you have the data in Oracle, I would suggest writing it to Parquet on HDFS using Sqoop first. After that, you will be able to transfer the data to Kudu using Impala with a command like: CREATE TABLE kudu_table STORED AS KUDU AS SELECT * FROM parquet_table;
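If you want to drive that last step from Java instead of the shell, a minimal sketch of issuing the CTAS through the Impala JDBC connector could look like this (driver class, host, table names, key column, and partitioning are placeholder assumptions; note that Impala requires a PRIMARY KEY clause when creating a Kudu table):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class ParquetToKudu {
  public static void main(String[] args) throws Exception {
    // Assumes the Parquet table already exists and is visible to Impala.
    Class.forName("com.cloudera.impala.jdbc41.Driver");
    try (Connection conn =
             DriverManager.getConnection("jdbc:impala://impalad-host:21050/default");
         Statement stmt = conn.createStatement()) {
      // "id" and the hash partitioning are illustrative placeholders.
      stmt.execute("CREATE TABLE kudu_table "
          + "PRIMARY KEY (id) "
          + "PARTITION BY HASH (id) PARTITIONS 4 "
          + "STORED AS KUDU "
          + "AS SELECT * FROM parquet_table");
    }
  }
}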
Created 07-16-2018 04:13 PM
However, I see you wrote a separate forum post, which is good; we try to stick to one topic per thread in the forums.
Created 07-17-2018 05:33 AM
Thank you again for the solution. I found some Java code that loads data into a Kudu table from a CSV file. Is it efficient?
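For context, a loader of that kind typically follows a pattern like the sketch below using the native Kudu Java client (this is illustrative only, not the actual code I found; the master address, table name, column names, and CSV layout are placeholders). The flush mode is usually what matters most for bulk-load throughput:

import java.io.BufferedReader;
import java.nio.file.Files;
import java.nio.file.Paths;

import org.apache.kudu.client.Insert;
import org.apache.kudu.client.KuduClient;
import org.apache.kudu.client.KuduSession;
import org.apache.kudu.client.KuduTable;
import org.apache.kudu.client.PartialRow;
import org.apache.kudu.client.SessionConfiguration;

public class CsvToKuduLoader {
  public static void main(String[] args) throws Exception {
    // Master address, table name, and schema below are placeholders.
    KuduClient client =
        new KuduClient.KuduClientBuilder("kudu-master-host:7051").build();
    try {
      KuduTable table = client.openTable("impala::default.my_table");
      KuduSession session = client.newSession();
      // Background flushing batches writes instead of flushing one row at a time,
      // which is usually the key setting for load throughput.
      session.setFlushMode(SessionConfiguration.FlushMode.AUTO_FLUSH_BACKGROUND);

      try (BufferedReader reader = Files.newBufferedReader(Paths.get("data.csv"))) {
        String line;
        while ((line = reader.readLine()) != null) {
          String[] fields = line.split(",");
          Insert insert = table.newInsert();
          PartialRow row = insert.getRow();
          row.addLong("id", Long.parseLong(fields[0]));
          row.addString("name", fields[1]);
          session.apply(insert);
        }
      }
      session.flush();
      if (session.countPendingErrors() > 0) {
        System.err.println("Rows failed to insert: " + session.countPendingErrors());
      }
      session.close();
    } finally {
      client.close();
    }
  }
}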