Created 04-25-2018 04:34 PM
I ran the Sqoop import command below:
sqoop import --connect jdbc:mysql://sandbox.hortonworks.com:3306/retail_db --username retail_dba --password hadoop --table categories --hive-import --hive-overwrite --hive-table categories_hive
Error Log:
Warning: /usr/hdp/2.5.0.0-1245/accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
18/04/25 16:01:12 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.5.0.0-1245
18/04/25 16:01:12 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
18/04/25 16:01:12 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
18/04/25 16:01:12 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
18/04/25 16:01:12 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
18/04/25 16:01:13 INFO tool.CodeGenTool: Beginning code generation
18/04/25 16:01:14 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
18/04/25 16:01:14 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
18/04/25 16:01:14 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/hdp/2.5.0.0-1245/hadoop-mapreduce
Note: /tmp/sqoop-root/compile/b2f2e1c4f3c2e0ed133a03c460fa67ac/categories.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
18/04/25 16:01:19 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/b2f2e1c4f3c2e0ed133a03c460fa67ac/categories.jar
18/04/25 16:01:19 WARN manager.MySQLManager: It looks like you are importing from mysql.
18/04/25 16:01:19 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
18/04/25 16:01:19 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
18/04/25 16:01:19 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
18/04/25 16:01:19 INFO mapreduce.ImportJobBase: Beginning import of categories
18/04/25 16:01:23 INFO impl.TimelineClientImpl: Timeline service address: http://sandbox.hortonworks.com:8188/ws/v1/timeline/
18/04/25 16:01:23 INFO client.RMProxy: Connecting to ResourceManager at sandbox.hortonworks.com/172.17.0.2:8050
18/04/25 16:01:23 INFO client.AHSProxy: Connecting to Application History server at sandbox.hortonworks.com/172.17.0.2:10200
18/04/25 16:01:25 INFO ipc.Client: Retrying connect to server: sandbox.hortonworks.com/172.17.0.2:8050. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
18/04/25 16:01:26 INFO ipc.Client: Retrying connect to server: sandbox.hortonworks.com/172.17.0.2:8050. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
18/04/25 16:01:27 INFO ipc.Client: Retrying connect to server: sandbox.hortonworks.com/172.17.0.2:8050. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
18/04/25 16:01:28 INFO ipc.Client: Retrying connect to server: sandbox.hortonworks.com/172.17.0.2:8050. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
18/04/25 16:01:29 INFO ipc.Client: Retrying connect to server: sandbox.hortonworks.com/172.17.0.2:8050. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
18/04/25 16:01:30 INFO ipc.Client: Retrying connect to server: sandbox.hortonworks.com/172.17.0.2:8050. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
18/04/25 16:01:31 INFO ipc.Client: Retrying connect to server: sandbox.hortonworks.com/172.17.0.2:8050. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
18/04/25 16:01:32 INFO ipc.Client: Retrying connect to server: sandbox.hortonworks.com/172.17.0.2:8050. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
18/04/25 16:01:33 INFO ipc.Client: Retrying connect to server: sandbox.hortonworks.com/172.17.0.2:8050. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
18/04/25 16:01:34 INFO ipc.Client: Retrying connect to server: sandbox.hortonworks.com/172.17.0.2:8050. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
18/04/25 16:01:35 INFO ipc.Client: Retrying connect to server: sandbox.hortonworks.com/172.17.0.2:8050. Already tried 10 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
18/04/25 16:01:36 INFO ipc.Client: Retrying connect to server: sandbox.hortonworks.com/172.17.0.2:8050. Already tried 11 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
18/04/25 16:01:37 INFO ipc.Client: Retrying connect to server: sandbox.hortonworks.com/172.17.0.2:8050. Already tried 12 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
18/04/25 16:01:38 INFO ipc.Client: Retrying connect to server: sandbox.hortonworks.com/172.17.0.2:8050. Already tried 13 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
18/04/25 16:01:39 INFO ipc.Client: Retrying connect to server: sandbox.hortonworks.com/172.17.0.2:8050. Already tried 14 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
18/04/25 16:01:40 INFO ipc.Client: Retrying connect to server: sandbox.hortonworks.com/172.17.0.2:8050. Already tried 15 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
18/04/25 16:01:41 INFO ipc.Client: Retrying connect to server: sandbox.hortonworks.com/172.17.0.2:8050. Already tried 16 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
18/04/25 16:01:42 INFO ipc.Client: Retrying connect to server: sandbox.hortonworks.com/172.17.0.2:8050. Already tried 17 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
18/04/25 16:01:43 INFO ipc.Client: Retrying connect to server: sandbox.hortonworks.com/172.17.0.2:8050. Already tried 18 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
18/04/25 16:02:06 INFO db.DBInputFormat: Using read commited transaction isolation
18/04/25 16:02:06 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(`category_id`), MAX(`category_id`) FROM `categories`
18/04/25 16:02:06 INFO db.IntegerSplitter: Split size: 14; Num splits: 4 from: 1 to: 58
18/04/25 16:02:07 INFO mapreduce.JobSubmitter: number of splits:4
18/04/25 16:02:07 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1524672099436_0001
18/04/25 16:02:08 INFO impl.YarnClientImpl: Submitted application application_1524672099436_0001
18/04/25 16:02:08 INFO mapreduce.Job: The url to track the job: http://sandbox.hortonworks.com:8088/proxy/application_1524672099436_0001/
18/04/25 16:02:08 INFO mapreduce.Job: Running job: job_1524672099436_0001
18/04/25 16:02:18 INFO mapreduce.Job: Job job_1524672099436_0001 running in uber mode : false
18/04/25 16:02:18 INFO mapreduce.Job: map 0% reduce 0%
18/04/25 16:02:28 INFO mapreduce.Job: map 25% reduce 0%
18/04/25 16:02:30 INFO mapreduce.Job: map 50% reduce 0%
18/04/25 16:02:31 INFO mapreduce.Job: map 100% reduce 0%
18/04/25 16:02:32 INFO mapreduce.Job: Job job_1524672099436_0001 completed successfully
18/04/25 16:02:32 INFO mapreduce.Job: Counters: 30
    File System Counters
        FILE: Number of bytes read=0
        FILE: Number of bytes written=651760
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=472
        HDFS: Number of bytes written=1029
        HDFS: Number of read operations=16
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=8
    Job Counters
        Launched map tasks=4
        Other local map tasks=4
        Total time spent by all maps in occupied slots (ms)=28839
        Total time spent by all reduces in occupied slots (ms)=0
        Total time spent by all map tasks (ms)=28839
        Total vcore-milliseconds taken by all map tasks=28839
        Total megabyte-milliseconds taken by all map tasks=7209750
    Map-Reduce Framework
        Map input records=58
        Map output records=58
        Input split bytes=472
        Spilled Records=0
        Failed Shuffles=0
        Merged Map outputs=0
        GC time elapsed (ms)=1520
        CPU time spent (ms)=4770
        Physical memory (bytes) snapshot=563298304
        Virtual memory (bytes) snapshot=7735148544
        Total committed heap usage (bytes)=176160768
    File Input Format Counters
        Bytes Read=0
    File Output Format Counters
        Bytes Written=1029
18/04/25 16:02:32 INFO mapreduce.ImportJobBase: Transferred 1.0049 KB in 69.6062 seconds (14.7832 bytes/sec)
18/04/25 16:02:32 INFO mapreduce.ImportJobBase: Retrieved 58 records.
18/04/25 16:02:32 INFO mapreduce.ImportJobBase: Publishing Hive/Hcat import job data to Listeners
18/04/25 16:02:32 INFO atlas.ApplicationProperties: Looking for atlas-application.properties in classpath
18/04/25 16:02:32 INFO atlas.ApplicationProperties: Loading atlas-application.properties from file:/etc/sqoop/2.5.0.0-1245/0/atlas-application.properties
18/04/25 16:02:32 ERROR security.InMemoryJAASConfiguration: Unable to add JAAS configuration for client [KafkaClient] as it is missing param [atlas.jaas.KafkaClient.loginModuleName]. Skipping JAAS config for [KafkaClient]
18/04/25 16:02:32 INFO hook.AtlasHook: Created Atlas Hook
18/04/25 16:02:33 INFO producer.ProducerConfig: ProducerConfig values:
    metric.reporters = []
    metadata.max.age.ms = 300000
    reconnect.backoff.ms = 50
    sasl.kerberos.ticket.renew.window.factor = 0.8
    bootstrap.servers = [sandbox.hortonworks.com:6667]
    ssl.keystore.type = JKS
    sasl.mechanism = GSSAPI
    max.block.ms = 60000
    interceptor.classes = null
    ssl.truststore.password = null
    client.id =
    ssl.endpoint.identification.algorithm = null
    request.timeout.ms = 30000
    acks = 1
    receive.buffer.bytes = 32768
    ssl.truststore.type = JKS
    retries = 0
    ssl.truststore.location = null
    ssl.keystore.password = null
    send.buffer.bytes = 131072
    compression.type = none
    metadata.fetch.timeout.ms = 60000
    retry.backoff.ms = 100
    sasl.kerberos.kinit.cmd = /usr/bin/kinit
    buffer.memory = 33554432
    timeout.ms = 30000
    key.serializer = class org.apache.kafka.common.serialization.StringSerializer
    sasl.kerberos.service.name = null
    sasl.kerberos.ticket.renew.jitter = 0.05
    ssl.trustmanager.algorithm = PKIX
    block.on.buffer.full = false
    ssl.key.password = null
    sasl.kerberos.min.time.before.relogin = 60000
    connections.max.idle.ms = 540000
    max.in.flight.requests.per.connection = 5
    metrics.num.samples = 2
    ssl.protocol = TLS
    ssl.provider = null
    ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
    batch.size = 16384
    ssl.keystore.location = null
    ssl.cipher.suites = null
    security.protocol = PLAINTEXT
    max.request.size = 1048576
    value.serializer = class org.apache.kafka.common.serialization.StringSerializer
    ssl.keymanager.algorithm = SunX509
    metrics.sample.window.ms = 30000
    partitioner.class = class org.apache.kafka.clients.producer.internals.DefaultPartitioner
    linger.ms = 0
18/04/25 16:02:33 INFO producer.ProducerConfig: ProducerConfig values:
    metric.reporters = []
    metadata.max.age.ms = 300000
    reconnect.backoff.ms = 50
    sasl.kerberos.ticket.renew.window.factor = 0.8
    bootstrap.servers = [sandbox.hortonworks.com:6667]
    ssl.keystore.type = JKS
    sasl.mechanism = GSSAPI
    max.block.ms = 60000
    interceptor.classes = null
    ssl.truststore.password = null
    client.id = producer-1
    ssl.endpoint.identification.algorithm = null
    request.timeout.ms = 30000
    acks = 1
    receive.buffer.bytes = 32768
    ssl.truststore.type = JKS
    retries = 0
    ssl.truststore.location = null
    ssl.keystore.password = null
    send.buffer.bytes = 131072
    compression.type = none
    metadata.fetch.timeout.ms = 60000
    retry.backoff.ms = 100
    sasl.kerberos.kinit.cmd = /usr/bin/kinit
    buffer.memory = 33554432
    timeout.ms = 30000
    key.serializer = class org.apache.kafka.common.serialization.StringSerializer
    sasl.kerberos.service.name = null
    sasl.kerberos.ticket.renew.jitter = 0.05
    ssl.trustmanager.algorithm = PKIX
    block.on.buffer.full = false
    ssl.key.password = null
    sasl.kerberos.min.time.before.relogin = 60000
    connections.max.idle.ms = 540000
    max.in.flight.requests.per.connection = 5
    metrics.num.samples = 2
    ssl.protocol = TLS
    ssl.provider = null
    ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
    batch.size = 16384
    ssl.keystore.location = null
    ssl.cipher.suites = null
    security.protocol = PLAINTEXT
    max.request.size = 1048576
    value.serializer = class org.apache.kafka.common.serialization.StringSerializer
    ssl.keymanager.algorithm = SunX509
    metrics.sample.window.ms = 30000
    partitioner.class = class org.apache.kafka.clients.producer.internals.DefaultPartitioner
    linger.ms = 0
18/04/25 16:02:33 WARN producer.ProducerConfig: The configuration key.deserializer = org.apache.kafka.common.serialization.StringDeserializer was supplied but isn't a known config.
18/04/25 16:02:33 WARN producer.ProducerConfig: The configuration value.deserializer = org.apache.kafka.common.serialization.StringDeserializer was supplied but isn't a known config.
18/04/25 16:02:33 WARN producer.ProducerConfig: The configuration hook.group.id = atlas was supplied but isn't a known config.
18/04/25 16:02:33 WARN producer.ProducerConfig: The configuration partition.assignment.strategy = roundrobin was supplied but isn't a known config.
18/04/25 16:02:33 WARN producer.ProducerConfig: The configuration zookeeper.connection.timeout.ms = 200 was supplied but isn't a known config.
18/04/25 16:02:33 WARN producer.ProducerConfig: The configuration zookeeper.session.timeout.ms = 400 was supplied but isn't a known config.
18/04/25 16:02:33 WARN producer.ProducerConfig: The configuration zookeeper.connect = sandbox.hortonworks.com:2181 was supplied but isn't a known config.
18/04/25 16:02:33 WARN producer.ProducerConfig: The configuration zookeeper.sync.time.ms = 20 was supplied but isn't a known config.
18/04/25 16:02:33 WARN producer.ProducerConfig: The configuration auto.offset.reset = smallest was supplied but isn't a known config.
18/04/25 16:02:33 INFO utils.AppInfoParser: Kafka version : 0.10.0.2.5.0.0-1245
18/04/25 16:02:33 INFO utils.AppInfoParser: Kafka commitId : dae559f56f07e2cd
18/04/25 16:03:33 ERROR hook.AtlasHook: Failed to send notification - attempt #1; error=java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
Created 04-28-2018 04:38 PM
I do see a lot of warnings, but the MapReduce job itself completed successfully. Below is an extract of your log:
18/04/25 16:02:18 INFO mapreduce.Job: map 0% reduce 0%
18/04/25 16:02:28 INFO mapreduce.Job: map 25% reduce 0%
18/04/25 16:02:30 INFO mapreduce.Job: map 50% reduce 0%
18/04/25 16:02:31 INFO mapreduce.Job: map 100% reduce 0%
18/04/25 16:02:32 INFO mapreduce.Job: Job job_1524672099436_0001 completed successfully
Can you run the following query in Hive to confirm that the table was populated?
SELECT * FROM categories_hive;
Please reply with the result.
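To separate the job result from the surrounding noise, the same check can be scripted. The following is only a rough sketch: it assumes the Sqoop output was saved to a file with one log entry per line, and the function name sqoop_import_summary is made up for illustration. It looks for the "completed successfully" and "Retrieved N records" lines that appear in the log above and counts ERROR-level entries separately.

```shell
#!/bin/sh
# Summarize a saved Sqoop import log (hypothetical helper, not part of Sqoop).
# Assumes one log entry per line, as Sqoop writes it to the console.
sqoop_import_summary() {
    log_file="$1"

    # mapreduce.Job prints this line when the MapReduce job finishes cleanly.
    if grep -q 'completed successfully' "$log_file"; then
        status="SUCCESS"
    else
        status="NOT_CONFIRMED"
    fi

    # ImportJobBase reports the row count as "Retrieved N records."
    records=$(sed -n 's/.*Retrieved \([0-9][0-9]*\) records.*/\1/p' "$log_file" | tail -n 1)

    # Count ERROR-level lines so they can be reviewed apart from the job result.
    # grep -c exits non-zero when the count is 0, hence the "|| true".
    errors=$(grep -c ' ERROR ' "$log_file" || true)

    echo "job=$status records=${records:-unknown} errors=$errors"
}
```

One way to capture the log is to append `2>&1 | tee sqoop.log` to the import command (the file name is a placeholder) and then call `sqoop_import_summary sqoop.log`. A SUCCESS status with a non-zero error count matches the situation in this thread: the data was imported, and the errors come from a later step.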
Created 04-28-2018 01:03 PM
Could anyone please help? I am stuck.
Created 04-30-2018 05:05 AM
@Geoffrey Shelton Okot Yep, you are right. The table was created successfully. I was facing many errors and fixed a few of them by starting the Atlas components, Kafka, HBase, and Ambari Infra services, but I was unable to resolve the timeout error above. Thank you so much for your help. It means a lot to someone who is new to this world like me.
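For anyone who hits the same Atlas hook timeout: one thing that may be worth checking, as an assumption rather than a confirmed cause, is whether the Kafka broker listed under bootstrap.servers in the log (sandbox.hortonworks.com:6667) accepts TCP connections from the machine running Sqoop. A minimal sketch follows; port_open and report_port are hypothetical helper names, and the check relies on bash's /dev/tcp feature and the coreutils timeout command being available.

```shell
#!/bin/sh
# Return 0 if a TCP connection to host:port succeeds within 3 seconds.
# Uses bash's /dev/tcp pseudo-device; this only tests reachability,
# it does not verify that the service behind the port is a healthy Kafka broker.
port_open() {
    host="$1"
    port="$2"
    timeout 3 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}

# Print a one-line, human-readable result for host:port.
report_port() {
    if port_open "$1" "$2"; then
        echo "$1:$2 is reachable"
    else
        echo "$1:$2 is NOT reachable"
    fi
}
```

Running `report_port sandbox.hortonworks.com 6667` on the sandbox shows whether the broker port answers at all. If it does and the timeout persists, the cause lies elsewhere, so treat this as a first diagnostic step only.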