Atlas Sqoop lineage with Hive is not working

Contributor

Hi Team,

We are using HDP 2.6.5 and are configuring Sqoop and Hive lineage by following this tutorial: https://hortonworks.com/tutorial/cross-component-lineage-with-apache-atlas-across-apache-sqoop-hive-...

When we run a sqoop import, the data is imported successfully, but we get the ClassNotFoundException below when Sqoop tries to publish the job data to Atlas:

sqoop import --connect jdbc:mysql://vc-hdp-db001a.hdp.test.com/test --table test_table_sqoop1 --hive-import --hive-table test_hive_table4 --username root -P -m 1 --fetch-size 1
Warning: /usr/hdp/2.6.5.0-292/accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
18/12/17 05:50:21 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.6.5.0-292
Enter password:
18/12/17 05:50:28 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
18/12/17 05:50:28 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
18/12/17 05:50:28 INFO manager.MySQLManager: Argument '--fetch-size 1' will probably get ignored by MySQL JDBC driver.
18/12/17 05:50:28 INFO tool.CodeGenTool: Beginning code generation
18/12/17 05:50:28 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `test_table_sqoop1` AS t LIMIT 1
18/12/17 05:50:28 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `test_table_sqoop1` AS t LIMIT 1
18/12/17 05:50:28 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/hdp/2.6.5.0-292/hadoop-mapreduce
Note: /tmp/sqoop-hdfs/compile/90ee7535be590b2e48c64709e9c0127d/test_table_sqoop1.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
18/12/17 05:50:29 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hdfs/compile/90ee7535be590b2e48c64709e9c0127d/test_table_sqoop1.jar
18/12/17 05:50:29 WARN manager.MySQLManager: It looks like you are importing from mysql.
18/12/17 05:50:29 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
18/12/17 05:50:29 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
18/12/17 05:50:29 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
18/12/17 05:50:29 INFO mapreduce.ImportJobBase: Beginning import of test_table_sqoop1
18/12/17 05:50:30 INFO client.AHSProxy: Connecting to Application History server at p-hdp-m-r08-02.hdp.test.com/10.10.33.22:10200
18/12/17 05:50:30 INFO client.RequestHedgingRMFailoverProxyProvider: Looking for the active RM in [rm1, rm2]...
18/12/17 05:50:30 INFO client.RequestHedgingRMFailoverProxyProvider: Found active RM [rm1]
18/12/17 05:50:31 INFO db.DBInputFormat: Using read commited transaction isolation
18/12/17 05:50:31 INFO mapreduce.JobSubmitter: number of splits:1
18/12/17 05:50:31 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1544603908449_0008
18/12/17 05:50:32 INFO impl.YarnClientImpl: Submitted application application_1544603908449_0008
18/12/17 05:50:32 INFO mapreduce.Job: The url to track the job: http://p-hdp-m-r09-01.hdp.test.com:8088/proxy/application_1544603908449_0008/
18/12/17 05:50:32 INFO mapreduce.Job: Running job: job_1544603908449_0008
18/12/17 05:50:40 INFO mapreduce.Job: Job job_1544603908449_0008 running in uber mode : false
18/12/17 05:50:40 INFO mapreduce.Job:  map 0% reduce 0%
18/12/17 05:50:48 INFO mapreduce.Job:  map 100% reduce 0%
18/12/17 05:50:48 INFO mapreduce.Job: Job job_1544603908449_0008 completed successfully
18/12/17 05:50:48 INFO mapreduce.Job: Counters: 30
        File System Counters
                FILE: Number of bytes read=0
                FILE: Number of bytes written=172085
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=87
                HDFS: Number of bytes written=172
                HDFS: Number of read operations=4
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=2
        Job Counters
                Launched map tasks=1
                Other local map tasks=1
                Total time spent by all maps in occupied slots (ms)=6151
                Total time spent by all reduces in occupied slots (ms)=0
                Total time spent by all map tasks (ms)=6151
                Total vcore-milliseconds taken by all map tasks=6151
                Total megabyte-milliseconds taken by all map tasks=25194496
        Map-Reduce Framework
                Map input records=6
                Map output records=6
                Input split bytes=87
                Spilled Records=0
                Failed Shuffles=0
                Merged Map outputs=0
                GC time elapsed (ms)=68
                CPU time spent (ms)=1220
                Physical memory (bytes) snapshot=392228864
                Virtual memory (bytes) snapshot=6079295488
                Total committed heap usage (bytes)=610795520
        File Input Format Counters
                Bytes Read=0
        File Output Format Counters
                Bytes Written=172
18/12/17 05:50:48 INFO mapreduce.ImportJobBase: Transferred 172 bytes in 18.2966 seconds (9.4006 bytes/sec)
18/12/17 05:50:48 INFO mapreduce.ImportJobBase: Retrieved 6 records.
18/12/17 05:50:48 INFO mapreduce.ImportJobBase: Publishing Hive/Hcat import job data to Listeners
18/12/17 05:50:48 WARN mapreduce.PublishJobData: Unable to publish import data to publisher org.apache.atlas.sqoop.hook.SqoopHook
java.lang.ClassNotFoundException: org.apache.atlas.sqoop.hook.SqoopHook
        at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:264)
        at org.apache.sqoop.mapreduce.PublishJobData.publishJobData(PublishJobData.java:46)
        at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:284)
        at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:692)
        at org.apache.sqoop.manager.MySQLManager.importTable(MySQLManager.java:127)
        at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:507)
        at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:615)
        at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
        at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:225)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
        at org.apache.sqoop.Sqoop.main(Sqoop.java:243)
18/12/17 05:50:48 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `test_table_sqoop1` AS t LIMIT 1
18/12/17 05:50:48 INFO hive.HiveImport: Loading uploaded data into Hive




Logging initialized using configuration in jar:file:/usr/hdp/2.6.5.0-292/hive/lib/hive-common-1.2.1000.2.6.5.0-292.jar!/hive-log4j.properties
OK
Time taken: 4.355 seconds
Loading data to table default.test_hive_table4
Table default.test_hive_table4 stats: [numFiles=1, numRows=0, totalSize=172, rawDataSize=0]
OK
Time taken: 3.085 seconds




How can we resolve this?

Please advise. Thanks in advance.

Thanks,

Bhushan

1 ACCEPTED SOLUTION


Re: Atlas Sqoop lineage with Hive is not working

Contributor

Thanks @Geoffrey Shelton Okot for researching this. I resolved the issue by following the instructions at this link: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.3/bk_command-line-installation/content/config...
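
A ClassNotFoundException like the one in the question generally means that no jar on the Sqoop client classpath contains org.apache.atlas.sqoop.hook.SqoopHook. The Apache Atlas Sqoop hook documentation describes linking the jars from the Atlas hook/sqoop directory into Sqoop's lib directory. The following is a minimal sketch for checking whether that has been done. The default SQOOP_LIB path and the function name find_hook_jar are assumptions made for illustration, not taken from the linked instructions, and the script assumes unzip is installed.

```shell
#!/usr/bin/env bash
# Sketch: report whether any jar in a directory contains the Atlas Sqoop hook class.
# The default path below is an assumption; adjust it to your installation.
SQOOP_LIB="${SQOOP_LIB:-/usr/hdp/current/sqoop-client/lib}"
HOOK_CLASS="org/apache/atlas/sqoop/hook/SqoopHook.class"

# Prints the first jar in directory $1 that contains $HOOK_CLASS.
# Returns 1 if no jar in that directory contains it.
find_hook_jar() {
  local dir="$1" jar
  for jar in "$dir"/*.jar; do
    [ -e "$jar" ] || continue   # skips an unmatched glob and dangling symlinks
    if unzip -l "$jar" 2>/dev/null | grep -qF "$HOOK_CLASS"; then
      echo "$jar"
      return 0
    fi
  done
  return 1
}

if jar_path="$(find_hook_jar "$SQOOP_LIB")"; then
  echo "SqoopHook found in: $jar_path"
else
  echo "SqoopHook not found in $SQOOP_LIB"
fi
```

If the class is not found, the Atlas hook jars are probably not linked into the Sqoop lib directory, which would be consistent with the stack trace in the question.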

2 REPLIES 2

Re: Atlas Sqoop lineage with Hive is not working

Mentor

@Bhushan Kandalkar

I have just validated the process and it works, including the sqoop import; please see the attached PDF. I suspect that you either don't have Kafka installed or, if it is installed, it isn't started. My test environment:

  • HDP 2.6.5.0-292
  • All Ranger plugins enabled except Kafka (no Kerberos)
  • Kafka running

96384-ranger-plugins.jpg

Validate that you have Kafka running. I didn't see the output below in your log:

18/12/17 13:37:08 INFO kafka.KafkaNotification: ==> KafkaNotification()
18/12/17 13:37:08 INFO kafka.KafkaNotification: <== KafkaNotification()
18/12/17 13:37:08 INFO hook.AtlasHook: Created Atlas Hook
18/12/17 13:37:12 INFO kafka.KafkaNotification: ==> KafkaNotification.createProducer()
18/12/17 13:37:12 INFO producer.ProducerConfig: ProducerConfig values:
acks = 1  
batch.size = 16384  
bootstrap.servers = [nanyuki.kenya.ke:6667]
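
The two checks suggested above can be sketched as a pair of small helpers. This is only an illustration: the function names port_open and hook_status are hypothetical, the broker host and port must come from your own bootstrap.servers setting, and the log classification relies solely on the messages quoted in this thread.

```shell
#!/usr/bin/env bash
# Sketch: quick checks for "is the Kafka broker reachable?" and
# "did the Atlas hook load during the Sqoop run?".

# Returns 0 if a TCP connection to host $1, port $2 succeeds within 3 seconds.
# Relies on bash's /dev/tcp pseudo-device, so it needs bash rather than plain sh.
port_open() {
  timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Classifies a saved Sqoop console log ($1) using the messages quoted in this thread.
hook_status() {
  if grep -qF 'ClassNotFoundException: org.apache.atlas.sqoop.hook.SqoopHook' "$1"; then
    echo "hook-class-missing"
  elif grep -qF 'hook.AtlasHook: Created Atlas Hook' "$1"; then
    echo "hook-loaded"
  else
    echo "no-hook-messages"
  fi
}
```

For example, `port_open nanyuki.kenya.ke 6667 && echo "broker reachable"` tests the broker from the log above, and `hook_status sqoop-import.log` classifies a console log saved to a file (the file name is a placeholder). A reachable port only shows that something is listening there; it does not prove the broker is healthy.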

Hope that helps; please report back.


test-lineage2.jpg