
Unable to publish import data to publisher org.apache.atlas.sqoop.hook.SqoopHook

New Contributor

I am trying to import a table from MySQL into a Hive table using Sqoop. While importing the table, I am encountering the above error.

Looking at the log below, I can see that 6 rows were retrieved, but not published to Hive.

Unable to publish import data to publisher org.apache.atlas.sqoop.hook.SqoopHook                           
[root@sandbox ~]# sqoop import --hive-import --connect jdbc:mysql://localhost/employees --table testing --username root --password hadoop --num-mappers 1 --driver com.mysql.jdbc.Driver
Warning: /usr/hdp/2.6.0.3-8/accumulo does not exist! Accumulo imports will fail.                                                                                        
Please set $ACCUMULO_HOME to the root of your Accumulo installation.                                                                                                    
17/07/09 22:47:00 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.6.0.3-8                                                                                              
17/07/09 22:47:00 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.                                            
17/07/09 22:47:00 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override                                                                  
17/07/09 22:47:00 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.                                                                                 
17/07/09 22:47:00 WARN sqoop.ConnFactory: Parameter --driver is set to an explicit driver however appropriate connection manager is not being set (via --connection-manager). Sqoop is going to fall back to org.apache.sqoop.manager.GenericJdbcManager. Please specify explicitly which connection manager should be used next time.
17/07/09 22:47:00 INFO manager.SqlManager: Using default fetchSize of 1000                                                                                              
17/07/09 22:47:00 INFO tool.CodeGenTool: Beginning code generation                                                                                                      
17/07/09 22:47:00 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM testing AS t WHERE 1=0                                                              
17/07/09 22:47:00 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM testing AS t WHERE 1=0                                                              
17/07/09 22:47:00 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/hdp/2.6.0.3-8/hadoop-mapreduce                                                                
Note: /tmp/sqoop-root/compile/81c471cd6ea2a0f3d1e1b406587e5a62/testing.java uses or overrides a deprecated API.                                                         
Note: Recompile with -Xlint:deprecation for details.                                                                                                                    
17/07/09 22:47:02 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/81c471cd6ea2a0f3d1e1b406587e5a62/testing.jar                                   
17/07/09 22:47:02 INFO mapreduce.ImportJobBase: Beginning import of testing                                                                                             
17/07/09 22:47:02 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM testing AS t WHERE 1=0                                                              
17/07/09 22:47:03 INFO client.RMProxy: Connecting to ResourceManager at sandbox.hortonworks.com/172.17.0.2:8032                                                         
17/07/09 22:47:03 INFO client.AHSProxy: Connecting to Application History server at sandbox.hortonworks.com/172.17.0.2:10200                                            
17/07/09 22:47:08 INFO db.DBInputFormat: Using read commited transaction isolation                                                                                      
17/07/09 22:47:09 INFO mapreduce.JobSubmitter: number of splits:1                                                                                                       
17/07/09 22:47:09 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1499640282113_0001                                                                        
17/07/09 22:47:10 INFO impl.YarnClientImpl: Submitted application application_1499640282113_0001                                                                        
17/07/09 22:47:10 INFO mapreduce.Job: The url to track the job: http://sandbox.hortonworks.com:8088/proxy/application_1499640282113_0001/ 
17/07/09 22:47:10 INFO mapreduce.Job: Running job: job_1499640282113_0001                                                                                               
17/07/09 22:47:20 INFO mapreduce.Job: Job job_1499640282113_0001 running in uber mode : false                                                                           
17/07/09 22:47:20 INFO mapreduce.Job:  map 0% reduce 0%                                                                                                                 
17/07/09 22:47:27 INFO mapreduce.Job:  map 100% reduce 0%                                                                                                               
17/07/09 22:47:27 INFO mapreduce.Job: Job job_1499640282113_0001 completed successfully                                                                                 
17/07/09 22:47:27 INFO mapreduce.Job: Counters: 30                                                                                                                      
        File System Counters                                                                                                                                            
                FILE: Number of bytes read=0                                                                                                                            
                FILE: Number of bytes written=166193                                                                                                                    
                FILE: Number of read operations=0                                                                                                                       
                FILE: Number of large read operations=0                                                                                                                 
                FILE: Number of write operations=0                                                                                                                      
                HDFS: Number of bytes read=87                                                                                                                           
                HDFS: Number of bytes written=93                                                                                                                        
                HDFS: Number of read operations=4                                                                                                                       
                HDFS: Number of large read operations=0                                                                                                                 
                HDFS: Number of write operations=2                                                                                                                      
        Job Counters                                                                                                                                                    
                Launched map tasks=1                                                                                                                                    
                Other local map tasks=1                                                                                                                                 
                Total time spent by all maps in occupied slots (ms)=3081                                                                                                
                Total time spent by all reduces in occupied slots (ms)=0                                                                                                
                Total time spent by all map tasks (ms)=3081                                                                                                             
                Total vcore-milliseconds taken by all map tasks=3081                                                                                                    
                Total megabyte-milliseconds taken by all map tasks=770250                                                                                               
        Map-Reduce Framework                                                                                                                                            
                Map input records=6                                                                                                                                     
                Map output records=6                                                                                                                                    
                Input split bytes=87                                                                                                                                    
                Spilled Records=0                                                                                                                                       
                Failed Shuffles=0                                                                                                                                       
                Merged Map outputs=0                                                                                                                                    
                GC time elapsed (ms)=61                                                                                                                                 
                CPU time spent (ms)=780                                                                                                                                 
                Physical memory (bytes) snapshot=132890624                                                                                                              
                Virtual memory (bytes) snapshot=2134142976                                                                                                              
                Total committed heap usage (bytes)=40370176                                                                                                             
        File Input Format Counters                                                                                                                                      
                Bytes Read=0                                                                                                                                            
        File Output Format Counters                                                                                                                                     
                Bytes Written=93                                                                                                                                        
17/07/09 22:47:27 INFO mapreduce.ImportJobBase: Transferred 93 bytes in 24.0499 seconds (3.867 bytes/sec)                                                               
17/07/09 22:47:27 INFO mapreduce.ImportJobBase: Retrieved 6 records.                                                                                                    
17/07/09 22:47:27 INFO mapreduce.ImportJobBase: Publishing Hive/Hcat import job data to Listeners                                                                       
17/07/09 22:47:27 WARN mapreduce.PublishJobData: Unable to publish import data to publisher org.apache.atlas.sqoop.hook.SqoopHook                                       
java.lang.ClassNotFoundException: org.apache.atlas.sqoop.hook.SqoopHook                                                                                                 
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)                                                                                                   
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)                                                                                                        
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)                                                                                                
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)                                                                                                        
        at java.lang.Class.forName0(Native Method)                                                                                                                      
        at java.lang.Class.forName(Class.java:264)                                                                                                                      
        at org.apache.sqoop.mapreduce.PublishJobData.publishJobData(PublishJobData.java:46)                                                                             
        at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:284)                                                                                   
        at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:692)                                                                                         
        at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:507)                                                                                            
        at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:615)                                                                                                    
        at org.apache.sqoop.Sqoop.run(Sqoop.java:147)                                                                                                                   
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)                                                                                                    
        at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)                                                                                                              
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:225)                                                                                                               
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)                                                                                                               
        at org.apache.sqoop.Sqoop.main(Sqoop.java:243)                                                                                                                  
17/07/09 22:47:27 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM testing AS t WHERE 1=0                                                              
17/07/09 22:47:27 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM testing AS t WHERE 1=0                                                              
17/07/09 22:47:27 INFO hive.HiveImport: Loading uploaded data into Hive                                                                                                 

Logging initialized using configuration in jar:file:/usr/hdp/2.6.0.3-8/hive/lib/hive-common-1.2.1000.2.6.0.3-8.jar!/hive-log4j.properties                               
OK                                                                                                                                                                      
Time taken: 2.054 seconds                                                                                                                                               
Loading data to table default.testing                                                                                                                                   
Table default.testing stats: [numFiles=3, numRows=0, totalSize=279, rawDataSize=0]                                                                                      
OK                                                                                                                                                                      
Time taken: 1.644 seconds    
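
A quick way to check whether the six retrieved rows actually reached the default.testing table named in the log; a minimal sketch using the Hive CLI (the table name is taken from the output above):

# Count and preview the rows Sqoop loaded into Hive.
hive -e "SELECT COUNT(*) FROM default.testing;"
hive -e "SELECT * FROM default.testing LIMIT 10;"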
3 Replies

Re: Unable to publish import data to publisher org.apache.atlas.sqoop.hook.SqoopHook

@Ramkumar Rajamani

The issue is due to missing configuration for the Sqoop-Atlas hook. Refer to the link for details on the configuration.
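
For reference, the Atlas Sqoop hook documentation (linked in the next reply) boils down to three steps. Here is a minimal sketch of that setup for an HDP sandbox; the /usr/hdp/current/... paths are assumptions and may differ on your cluster:

# 1. Make the Atlas client configuration visible to Sqoop.
sudo cp /usr/hdp/current/atlas-server/conf/atlas-application.properties /usr/hdp/current/sqoop-client/conf/
# 2. Put the Atlas sqoop hook jars on the Sqoop classpath.
sudo ln -s /usr/hdp/current/atlas-server/hook/sqoop/*.jar /usr/hdp/current/sqoop-client/lib/
# 3. Register the hook as the job data publisher in sqoop-site.xml:
#      <property>
#        <name>sqoop.job.data.publish.class</name>
#        <value>org.apache.atlas.sqoop.hook.SqoopHook</value>
#      </property>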

Re: Unable to publish import data to publisher org.apache.atlas.sqoop.hook.SqoopHook

Cloudera Employee

Reference URL:

https://atlas.apache.org/Hook-Sqoop.html

  • Link <atlas package>/hook/sqoop/*.jar in sqoop lib

In HDP, you can find all of the hook jar files in the Atlas hook folder.

hook folder: /usr/hdp/current/atlas-server/hook/sqoop/atlas-sqoop-plugin-impl
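
You can check which jars are there before linking them; a quick listing, assuming the layout shown above (the exact jar versions come from the commands below and will differ per HDP build):

ls /usr/hdp/current/atlas-server/hook/sqoop/atlas-sqoop-plugin-impl/
# Expect to see the Atlas client jars referenced below, e.g.
#   atlas-client-common-1.1.0.3.1.0.0-78.jar
#   atlas-client-v1-1.1.0.3.1.0.0-78.jar
#   atlas-client-v2-1.1.0.3.1.0.0-78.jar
# plus the other hook dependencies.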



After creating symbolic links for these jar files in the Sqoop lib folder, the problem was resolved.

https://gist.github.com/zz22394/6c004731423fb11095aa41ac19de3393


sudo ln -s /usr/hdp/current/atlas-server/hook/sqoop/atlas-sqoop-plugin-impl/atlas-client-common-1.1.0.3.1.0.0-78.jar /usr/hdp/current/sqoop-server/lib/zz-atlas-client-common-1.1.0.3.1.0.0-78.jar
sudo ln -s /usr/hdp/current/atlas-server/hook/sqoop/atlas-sqoop-plugin-impl/atlas-client-v1-1.1.0.3.1.0.0-78.jar /usr/hdp/current/sqoop-server/lib/zz-atlas-client-v1-1.1.0.3.1.0.0-78.jar
sudo ln -s /usr/hdp/current/atlas-server/hook/sqoop/atlas-sqoop-plugin-impl/atlas-client-v2-1.1.0.3.1.0.0-78.jar /usr/hdp/current/sqoop-server/lib/zz-atlas-client-v2-1.1.0.3.1.0.0-78.jar
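
The same links can also be created in one pass for every jar in the hook folder; a sketch assuming the same paths as above (adjust the HDP version and the sqoop-server lib directory for your install):

for jar in /usr/hdp/current/atlas-server/hook/sqoop/atlas-sqoop-plugin-impl/*.jar; do
  # The zz- prefix keeps the links sorting after Sqoop's own jars, matching the commands above.
  sudo ln -s "$jar" "/usr/hdp/current/sqoop-server/lib/zz-$(basename "$jar")"
done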

Re: Unable to publish import data to publisher org.apache.atlas.sqoop.hook.SqoopHook

Expert Contributor

Unfortunately, linking the jar files doesn't help. I still get the same error:

java.lang.ClassNotFoundException: org.apache.atlas.sqoop.hook.SqoopHook
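
A couple of things worth checking in that case; a rough sketch, assuming the default HDP client paths (adjust for your layout):

# Are the Atlas hook jars visible in the Sqoop client lib directory that the
# 'sqoop' command actually uses, not only in sqoop-server/lib?
ls -l /usr/hdp/current/sqoop-client/lib | grep -i atlas
# Is atlas-application.properties present in the Sqoop conf directory?
ls /usr/hdp/current/sqoop-client/conf | grep -i atlas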