SQL Server table to Hive table using Sqoop



sqoop import --connect 'jdbc:sqlserver://data-dev.dev.eso.local;database=KUDU_40M' \
--username xxxx --password xxxx --driver com.microsoft.sqlserver.jdbc.SQLServerDriver \
--table dim_PatientEncounter --hive-table kudu_40m.dim_PatientEncounter \
--create-hive-table --hive-import --warehouse-dir /user/hive/warehouse/testdb -m 4

 

The table has 30 million+ rows.

 

Running the above command always seems to hang at mapper 3 of 4. There is not much in the logs besides:

2018-05-01 08:22:53,364 INFO [IPC Server handler 7 on 45249] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1525122694968_0001_m_000003_0 is : 0.0

 

Changes I made:

mapreduce.map.memory.mb = 5 GB
mapreduce.reduce.memory.mb = 5 GB
yarn.nodemanager.resource.memory-mb = 20 GB (NodeManager Default Group)
yarn.nodemanager.resource.memory-mb = 8 GB (NodeManager Group 1)

 

Cluster Metrics:
Apps Submitted: 1, Apps Pending: 0, Apps Running: 1, Apps Completed: 0, Containers Running: 3, Memory Used: 11 GB, Memory Total: 68 GB, Memory Reserved: 0 B, VCores Used: 3, VCores Total: 32, VCores Reserved: 0

Cluster Nodes Metrics:
Active Nodes: 4, Decommissioning Nodes: 0, Decommissioned Nodes: 0, Lost Nodes: 0, Unhealthy Nodes: 0, Rebooted Nodes: 0

User Metrics for dr.who:
Apps Submitted: 0, Apps Pending: 0, Apps Running: 0, Apps Completed: 0, Containers Running: 0, Containers Pending: 0, Containers Reserved: 0, Memory Used: 0 B, Memory Pending: 0 B, Memory Reserved: 0 B, VCores Used: 0, VCores Pending: 0, VCores Reserved: 0
 
Running application:
ID: application_1525122694968_0001, User: root, Name: dim_PatientEncounter.jar, Application Type: MAPREDUCE, Queue: root.users.root, StartTime: Tue May 1 08:11:49 -0500 2018, FinishTime: N/A, State: RUNNING, FinalStatus: UNDEFINED, Running Containers: 3, Allocated CPU VCores: 3, Allocated Memory MB: 11264, Reserved CPU VCores: 0, Reserved Memory MB: 0, Tracking UI: ApplicationMaster
 
Any help is appreciated. Also, this is CDH 5.14, so Sqoop 1 is being used by default, I believe. Where are the Sqoop logs written for this? I cannot seem to find any logs besides the YARN logs.
 
1 ACCEPTED SOLUTION

Accepted Solutions

Re: SQL Server table to Hive table using Sqoop


You are using Sqoop 1.

Sqoop 1 is not a service; it is a client tool that submits the job to YARN.

So, apart from your stdout and the YARN logs, there are no separate Sqoop logs.
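To inspect those YARN logs for the stuck mapper, you can fetch the aggregated logs for the application ID shown in the ResourceManager UI. A sketch (the application ID below is the one from your cluster dump; the container ID is a placeholder you would read off the ApplicationMaster page for attempt m_000003):

```shell
# Fetch all aggregated container logs for the Sqoop MapReduce job
yarn logs -applicationId application_1525122694968_0001

# Narrow to the container running the stuck map attempt
# (<container_id> is a placeholder taken from the AM's attempt page)
yarn logs -applicationId application_1525122694968_0001 -containerId <container_id>
```

Note that log aggregation must be enabled for `yarn logs` to return anything while you can otherwise only browse logs through the NodeManager web UI.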

 

The number of mappers (-m 4) means that your job will open 4 concurrent connections to your database. If there is no indication in the YARN logs of an out-of-memory error or an illegal value in a column, then you should check that your database can accept 4 concurrent connections.
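One way to isolate the problem, as a sketch and not a definitive fix: rerun the import with a single mapper, which uses one database connection and no splitting, or keep 4 mappers but name an explicit split column so the splits are balanced (the `EncounterID` column below is a placeholder; pick an indexed, evenly distributed numeric key from your table):

```shell
# Same import with one mapper: a single DB connection, no split query
sqoop import --connect 'jdbc:sqlserver://data-dev.dev.eso.local;database=KUDU_40M' \
  --username xxxx --password xxxx \
  --driver com.microsoft.sqlserver.jdbc.SQLServerDriver \
  --table dim_PatientEncounter --hive-table kudu_40m.dim_PatientEncounter \
  --create-hive-table --hive-import -m 1

# Or keep 4 mappers but control the split column explicitly
# (--split-by column name is a placeholder -- use an indexed numeric key)
sqoop import ... --split-by EncounterID -m 4
```

If the single-mapper run completes, the hang is likely in how the default split column distributes rows (a skewed split can leave one mapper scanning most of the 30 million rows), rather than in the connection itself.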
