This article addresses some of the least documented issues you may encounter when importing data with oozie-sqoop-hcatalog, along with their solutions.
Heart beat, followed by Read timed out from the Thrift server
SYMPTOM
While trying to execute an oozie workflow, on the stdout of the oozie sqoop job you can see:
hive.metastore (HiveMetaStoreClient.java:open(382)) - Trying to connect to metastore with URI thrift://host:port
Heart beat
Heart beat
Heart beat
ERROR [main] hive.log (MetaStoreUtils.java:logAndThrowMetaException(1221)) - Got exception: org.apache.thrift.transport.TTransportException java.net.SocketTimeoutException: Read timed out
org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
ROOT CAUSE
The hive-site.xml file is missing from the Sqoop action.
RESOLUTION
Save hive-site.xml to HDFS and reference it from the file tag of the Sqoop action.
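The resolution above can be sketched as the following workflow.xml fragment. It is a minimal illustration, not the author's original configuration: the action name, the HDFS path of hive-site.xml, and the properties jdbcUrl, sourceTable, hcatDatabase and hcatTable are all hypothetical placeholders, and the schema version should match the one your Oozie installation supports.

```xml
<action name="sqoop-import">
    <sqoop xmlns="uri:oozie:sqoop-action:0.4">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <!-- Placeholder import command; connection and table properties are assumptions -->
        <command>import --connect ${jdbcUrl} --table ${sourceTable} --hcatalog-database ${hcatDatabase} --hcatalog-table ${hcatTable}</command>
        <!-- Assumed HDFS location; the part after # is the local name seen by the action -->
        <file>${nameNode}/user/${wf:user()}/conf/hive-site.xml#hive-site.xml</file>
    </sqoop>
    <ok to="end"/>
    <error to="fail"/>
</action>
```

With the file element in place, Oozie ships hive-site.xml to the working directory of the launcher, where the Sqoop action can read the metastore settings.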
Import failed: Can not create a Path from an empty string
SYMPTOM
While trying to execute an oozie workflow, on the stdout of the oozie sqoop job you can see:
ERROR org.apache.sqoop.tool.ImportTool - Imported Failed: Can not create a Path from an empty string
ERROR [main] tool.ImportTool (ImportTool.java:run(607)) - Imported Failed: Can not create a Path from an empty string
ROOT CAUSE
The Sqoop action is missing the --skip-dist-cache argument.
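The article does not show the corresponding fix, but the root cause suggests adding --skip-dist-cache to the Sqoop command. A minimal sketch of what that might look like in workflow.xml, where the action name and the properties jdbcUrl, sourceTable, hcatDatabase and hcatTable are hypothetical placeholders:

```xml
<action name="sqoop-import">
    <sqoop xmlns="uri:oozie:sqoop-action:0.4">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <!-- The skip-dist-cache flag is added after the tool name; everything else is a placeholder -->
        <command>import --skip-dist-cache --connect ${jdbcUrl} --table ${sourceTable} --hcatalog-database ${hcatDatabase} --hcatalog-table ${hcatTable}</command>
    </sqoop>
    <ok to="end"/>
    <error to="fail"/>
</action>
```

The flag tells Sqoop not to copy its own jars into the distributed cache, which is usually unnecessary under Oozie because the share lib already provides them.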
Got exception running Sqoop: java.lang.NullPointerException
SYMPTOM
While trying to execute an oozie workflow, on the stdout of the oozie sqoop job you can see:
4591 [main] ERROR org.apache.sqoop.Sqoop - Got exception running Sqoop: java.lang.NullPointerException
ERROR [main] sqoop.Sqoop (Sqoop.java:runSqoop(186)) - Got exception running Sqoop: java.lang.NullPointerException
Intercepting System.exit(1)
and on the stderr you can see:
java.lang.NullPointerException
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1012)
at java.lang.Runtime.exec(Runtime.java:620)
at java.lang.Runtime.exec(Runtime.java:528)
at org.apache.sqoop.util.Executor.exec(Executor.java:76)
at org.apache.sqoop.mapreduce.hcat.SqoopHCatUtilities.executeExternalHCatProgram(SqoopHCatUtilities.java:1145)
at org.apache.sqoop.mapreduce.hcat.SqoopHCatUtilities.launchHCatCli(SqoopHCatUtilities.java:109
ROOT CAUSE
The NullPointerException occurs because HCAT_HOME is not set when a Sqoop import for HCatalog runs through Oozie.
RESOLUTION
To fix this issue, pass the --hcatalog-home argument in the Sqoop import command in workflow.xml.
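A minimal sketch of the command element with the argument set. The directory /usr/hdp/current/hive-webhcat is an assumption based on a typical HDP layout; use the HCatalog installation directory of your own cluster. The other properties (jdbcUrl, sourceTable, hcatDatabase, hcatTable) are hypothetical placeholders.

```xml
<action name="sqoop-import">
    <sqoop xmlns="uri:oozie:sqoop-action:0.4">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <!-- The hcatalog-home value is an assumed path; it stands in for the unset HCAT_HOME -->
        <command>import --connect ${jdbcUrl} --table ${sourceTable} --hcatalog-home /usr/hdp/current/hive-webhcat --hcatalog-database ${hcatDatabase} --hcatalog-table ${hcatTable}</command>
    </sqoop>
    <ok to="end"/>
    <error to="fail"/>
</action>
```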