I've built a database in Impala and I need to sqoop the data into HDFS to continue on in our Oozie workflow. I've done some research and can't find anything definitive on whether this is possible. There is a lot of detail on how to accomplish this for various other databases via links like this one and the Sqoop documentation, but Impala is nowhere to be found. Can someone confirm whether Sqoop supports Impala as a source?
I've made some progress and attempted to sqoop over a Kerberized JDBC connection, but received the following:
Error executing statement: java.sql.SQLException: [Simba][JDBC](11975) Unsupported transaction isolation level: 2.
java.sql.SQLException: [Simba][JDBC](11975) Unsupported transaction isolation level: 2.
I've adjusted my sqoop statement to include the --relaxed-isolation option but am still receiving the same error. Research seems to suggest some kind of Sqoop code customization is required, but there is a dearth of information on the subject. I'm not sure if this is feasible. If anyone has information on this, please let me know.
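For anyone decoding the error above: the "isolation level: 2" is the numeric value of java.sql.Connection.TRANSACTION_READ_COMMITTED, which Sqoop requests on each connection by default. My understanding (an assumption based on the JDBC spec constants, not on the Simba driver's internals) is that --relaxed-isolation merely swaps that request for TRANSACTION_READ_UNCOMMITTED (value 1), which a driver for a non-transactional engine like Impala may reject just the same. A small sketch showing the constants involved (the class name IsolationLevels is just illustrative):

```java
import java.sql.Connection;

// The JDBC spec fixes these constants, so "Unsupported transaction isolation
// level: 2" refers to READ_COMMITTED. --relaxed-isolation asks for
// READ_UNCOMMITTED (1) instead -- still a real isolation level, which an
// engine without transactions can likewise refuse.
public class IsolationLevels {
    public static void main(String[] args) {
        System.out.println("NONE             = " + Connection.TRANSACTION_NONE);             // 0
        System.out.println("READ_UNCOMMITTED = " + Connection.TRANSACTION_READ_UNCOMMITTED); // 1
        System.out.println("READ_COMMITTED   = " + Connection.TRANSACTION_READ_COMMITTED);   // 2
    }
}
```

If that reading is right, no Sqoop flag gets you to TRANSACTION_NONE, which would explain why the research points at customizing Sqoop's connection-setup code rather than its options.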
What I'm trying to do is run regression test cases against an Impala table containing some fabricated data. We'd usually be sqooping tables out in the wild and storing that data in HDFS for our MapReduce jobs, but we have no means of controlling what those tables contain. I'm hoping I can sqoop this Impala table, with its known set of data, to verify that our app is functioning correctly.