Member since: 12-18-2013
Posts: 16
Kudos Received: 0
Solutions: 0
09-30-2015
08:28 AM
Hello - We're using Sqoop 1.4.5 with Oracle 12c and find that Sqoop --direct exports to Oracle are consuming drastically more space than equivalent sqlldr (SQL*Loader) loads. Specifics:

Example data set: 24,087,140 records. The destination Oracle table is partitioned on one column and uses basic compression.

Results for the Sqoop export (command below): 2,818,572,288 bytes, 86,016 blocks
Results for sqlldr: 872,415,232 bytes, 26,624 blocks

Obviously a huge difference for Oracle storage resources. We would appreciate any advice or insight regarding these findings. Please let me know if I can provide any additional information.

Sqoop command:

$ sqoop-1.4.5/bin/sqoop export \
  -D mapred.child.java.opts='-Xmx4g' \
  -D sqoop.export.records.per.statement=5000 \
  -D sqoop.export.statements.per.transaction=1000 \
  -D mapred.task.timeout=0 \
  --connect jdbc:oracle:thin:@test:1521/TEST \
  --username test \
  --password test \
  --direct \
  --table TEST \
  --input-null-string '\\N' \
  --input-null-non-string '\\N' \
  --export-dir test_test \
  --fields-terminated-by '|' \
  -m 18

Thanks
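For anyone comparing these numbers on their own system, a minimal sketch of how segment figures like the ones above can be read from the Oracle data dictionary. It assumes sqlplus access and DBA_SEGMENTS privileges; the owner/table names are the placeholders from the example, and TEST is assumed to be a TNS alias for the same database.

# Sum space across all partitions of the destination table
sqlplus -s test/test@TEST <<'SQL'
SELECT segment_name, SUM(bytes) AS bytes, SUM(blocks) AS blocks
FROM   dba_segments
WHERE  owner = 'TEST'
AND    segment_name = 'TEST'
GROUP  BY segment_name;
SQL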
09-04-2014
11:22 AM
OK, I started over, creating the external table in Hue as shown in the linked video. I next executed the INSERT OVERWRITE command (with the jars added) and ended up with the following error output:

Loading data to table default.destination_table
Failed with exception Unable to move sourcehdfs://som-dmsandbox01.server.net:8020/tmp/hive-beeswax-admin/hive_2014-09-04_11-12-14_958_4333053320296609322-1/-ext-10000 to destination /user/hive/warehouse/destination_table
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
MapReduce Jobs Launched:
Job 0: Map: 2 Reduce: 1 Cumulative CPU: 5.17 sec HDFS Read: 892 HDFS Write: 3858 SUCCESS
Total MapReduce CPU Time Spent: 5 seconds 170 msec
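For what it's worth, MoveTask failures like this often come down to HDFS ownership/permissions on the warehouse target versus the Beeswax staging directory. A quick diagnostic sketch, using the paths from the error message above:

# Compare ownership of the staging output and the warehouse target
hadoop fs -ls /tmp/hive-beeswax-admin
hadoop fs -ls /user/hive/warehouse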
09-04-2014
09:59 AM
Thanks for your continued help. I have the jars from your suggestion and am still getting the same result. Hive has no problem with the INSERT OVERWRITE statement (writing to a Hive table, not HBase) when I execute it through the CLI:

hive --auxpath /opt/cloudera/parcels/CDH/lib/hive/lib/hbase.jar,/opt/cloudera/parcels/CDH/lib/hive/lib/hive-hbase-handler-0.10.0-cdh4.7.0.jar,/opt/cloudera/parcels/CDH/lib/hive/lib/zookeeper.jar -f hivequery.hql

where hivequery.hql is:

insert overwrite table destination_table
select t1.*
from hive_external_hbase_table t1
left outer join hive_table t2
  on t1.field = t2.field
where t2.field is null;

Frustrating!
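One common way to give an Oozie Hive action these same jars is to place them in the workflow's lib/ directory in HDFS, which Oozie adds to the action's classpath automatically. A sketch, where the workflow path is a placeholder:

# Copy the handler jars into the deployed workflow's lib/ directory
hadoop fs -mkdir /user/admin/workflows/hive-hbase/lib
hadoop fs -put /opt/cloudera/parcels/CDH/lib/hive/lib/hbase.jar \
  /opt/cloudera/parcels/CDH/lib/hive/lib/hive-hbase-handler-0.10.0-cdh4.7.0.jar \
  /opt/cloudera/parcels/CDH/lib/hive/lib/zookeeper.jar \
  /user/admin/workflows/hive-hbase/lib/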
09-03-2014
04:04 PM
...and it ends with a NoServerForRegionException HBase error:

org.apache.hadoop.hbase.client.NoServerForRegionException: Unable to find region for f_meta_data,,99999999999999 after 14 tries.
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:1092)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:997)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:1099)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:1001)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:958)
at org.apache.hadoop.hbase.client.HTable.finishSetup(HTable.java:288)
at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:192)
at org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat.getSplits(HiveHBaseTableInputFormat.java:435)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:292)
at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:292)
at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:1134)
at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1126)
at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:178)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:1023)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:976)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:976)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:950)
at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:448)
at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:138)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:66)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1383)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1169)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:982)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:902)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:412)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:347)
at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:445)
at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:455)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:713)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:613)
at org.apache.oozie.action.hadoop.HiveMain.runHive(HiveMain.java:302)
at org.apache.oozie.action.hadoop.HiveMain.run(HiveMain.java:260)
at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:37)
at org.apache.oozie.action.hadoop.HiveMain.main(HiveMain.java:64)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:495)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Job Submission failed with exception 'org.apache.hadoop.hbase.client.NoServerForRegionException(Unable to find region for f_meta_data,,99999999999999 after 14 tries.)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask
Log file: /mapred/local/taskTracker/admin/jobcache/job_201409031436_0017/attempt_201409031436_0017_m_000000_0/work/hive-oozie-job_201409031436_0017.log not present. Therefore no Hadoop jobids found
Intercepting System.exit(1)
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.HiveMain], exit code [1]
log4j:WARN No appenders could be found for logger (org.apache.zookeeper.ClientCnxn).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
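A NoServerForRegionException like this usually means the client could not locate the HBase regions at all (often because hbase-site.xml is not on the launcher's classpath, so the client looks for ZooKeeper on localhost) rather than a problem with the table itself. A sanity-check sketch, run from a node where HBase is configured; the table name is taken from the error above:

# Confirm the region servers are up and the table is reachable
echo "status 'simple'" | hbase shell
echo "scan 'f_meta_data', {LIMIT => 1}" | hbase shell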
09-03-2014
03:39 PM
Thanks, that got me moving forward a bit. Now my job seems to get hung up on the following:

>>> Invoking Hive command line now >>>
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
09-02-2014
03:45 PM
Hi - I apologize if this has been asked and answered, but I could not find anything. In order to execute low-complexity queries (joins etc.) on a Hive external table backed by an HBase table using the CLI, I set the auxpath to the following:

hive --auxpath /opt/cloudera/parcels/CDH/lib/hive/lib/hbase.jar,/opt/cloudera/parcels/CDH/lib/hive/lib/hive-hbase-handler-0.10.0-cdh4.7.0.jar

Ultimately I want to use Hue's Oozie GUI to set up workflows using one of these Hive/HBase tables, but I don't know how to go about getting the auxpath jars involved there. Trying without those jars, I get the typical error output from either Beeswax or Oozie:

Cannot create an instance of InputSplit class = org.apache.hadoop.hive.hbase.HBaseSplit:Class org.apache.hadoop.hive.hbase.HBaseSplit not found

I'd appreciate any pointers.
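As an aside, the same jars can be supplied to every CLI session via the HIVE_AUX_JARS_PATH environment variable instead of typing --auxpath each time. A sketch using the paths above; the query is just a placeholder against the table named in the post:

export HIVE_AUX_JARS_PATH=/opt/cloudera/parcels/CDH/lib/hive/lib/hbase.jar,/opt/cloudera/parcels/CDH/lib/hive/lib/hive-hbase-handler-0.10.0-cdh4.7.0.jar
hive -e 'select count(*) from hive_external_hbase_table'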
02-04-2014
08:22 AM
Thanks for your input, Clint. It turns out that all I needed to do was:

1. Point sqoop-env.sh at hadoop-0.20-mapreduce
2. Capitalize both the owner and the table name: SQOOP.TEST
3. Add -m 1 at the end to make up for the lack of a primary key

Thanks, BC
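Putting those three fixes together, a sketch of the working setup. The HADOOP_MAPRED_HOME line is the standard way to point Sqoop at the MRv1 libraries (the post only says the env file was pointed at hadoop-0.20-mapreduce), and the connection details are the placeholders from the original question:

# In sqoop-env.sh, point Sqoop at the MRv1 (hadoop-0.20-mapreduce) libraries:
export HADOOP_MAPRED_HOME=/usr/lib/hadoop-0.20-mapreduce

# Run the import with the owner/table upper-cased and a single mapper:
sqoop import \
  --connect jdbc:oracle:thin:@localhost:1521/DB11G \
  --username sqoop --password xx \
  --table SQOOP.TEST \
  -m 1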
01-30-2014
09:41 AM
Hello - I'm trying to Sqoop data from Oracle to HDFS but I'm getting the following error:

$ sqoop import --connect jdbc:oracle:thin:@localhost:1521/DB11G --username sqoop --password xx --table sqoop.test
…
14/01/30 10:58:10 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-oracle/compile/fa0ce9acd6ac6d0c349389a6dbfee62b/sqoop.test.jar
14/01/30 10:58:10 INFO mapreduce.ImportJobBase: Beginning import of sqoop.test
14/01/30 10:58:10 WARN conf.Configuration: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
14/01/30 10:58:10 WARN conf.Configuration: mapred.jar is deprecated. Instead, use mapreduce.job.jar
14/01/30 10:58:10 INFO manager.SqlManager: Executing SQL statement: SELECT FIRST,LAST,EMAIL FROM sqoop.test WHERE 1=0
14/01/30 10:58:11 WARN conf.Configuration: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
14/01/30 10:58:11 ERROR security.UserGroupInformation: PriviledgedActionException as:oracle (auth:SIMPLE) cause:java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
14/01/30 10:58:11 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
    at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:122)
    at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:84)
    at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:77)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1239)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1235)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
    at org.apache.hadoop.mapreduce.Job.connect(Job.java:1234)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1263)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1287)
    at org.apache.sqoop.mapreduce.ImportJobBase.doSubmitJob(ImportJobBase.java:186)
    at org.apache.sqoop.mapreduce.ImportJobBase.runJob(ImportJobBase.java:159)
    at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:247)
    at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:606)
    at com.quest.oraoop.OraOopConnManager.importTable(OraOopConnManager.java:260)
    at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:413)
    at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:502)
    at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:222)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:231)
    at org.apache.sqoop.Sqoop.main(Sqoop.java:240)

Checking just the database side works OK:

$ sqoop list-tables --connect jdbc:oracle:thin:@localhost:1521:DB11G --username sqoop --password xx
Warning: /usr/lib/hcatalog does not exist! HCatalog jobs will fail. Please set $HCAT_HOME to the root of your HCatalog installation.
14/01/30 12:12:20 INFO sqoop.Sqoop: Running Sqoop version: 1.4.3-cdh4.5.0
14/01/30 12:12:20 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
14/01/30 12:12:20 INFO manager.SqlManager: Using default fetchSize of 1000
14/01/30 12:12:21 INFO manager.OracleManager: Time zone has been set to GMT
TEST

Any thoughts? Thanks, BC
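For the record, the "Cannot initialize Cluster" error above generally means the Sqoop client could not find a usable MapReduce client configuration (the 02-04-2014 post above describes the eventual fix: pointing sqoop-env.sh at hadoop-0.20-mapreduce). A couple of quick diagnostic checks, with paths assuming a standard CDH layout:

# Is a MapReduce client configuration deployed on this node?
ls /etc/hadoop/conf/
grep -A2 'mapred.job.tracker' /etc/hadoop/conf/mapred-site.xml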
01-27-2014
06:46 AM
In essence, yes. However, I'm working with the latest version of Cloudera (HBase), and GORM seems to be based on much earlier versions of HBase. Do you know if GORM still works? Thanks, BC
01-24-2014
11:52 AM
Hello - Is there a Hibernate plugin for using Grails on top of HBase? Thanks very much, BC
12-26-2013
03:07 PM
Well, that seems complicated. :) Are you sure this is not a configuration problem with Cloudera Manager rather than certs?
12-23-2013
08:44 AM
I'm trying to get the Oozie web UI up through Cloudera Manager. I followed the instructions here, with no luck: http://www.cloudera.com/content/cloudera-content/cloudera-docs/CM4Ent/4.6.3/Cloudera-Manager-Managing-Clusters/cmmc_oozie_service.html Any suggestions appreciated. Thanks, BC
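In case it is relevant: on CDH of this vintage the Oozie web console also requires the ExtJS 2.2 library to be installed on the Oozie server before the UI will come up. A sketch of that step; the download URL and paths are assumed from Cloudera's documentation of the era, so verify them against your release:

cd /var/lib/oozie
sudo wget http://archive.cloudera.com/gplextras/misc/ext-2.2.zip
sudo unzip ext-2.2.zip
# Restart the Oozie service from Cloudera Manager afterwards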
12-18-2013
12:52 PM
Ohhhh, the search box on the left! I missed that somehow. Thanks very much!
12-18-2013
12:12 PM
Thanks, but none of those bring up the heap space parameter. The only config option I get for the Secondary NameNode is "HDFS Checkpoint Directory". Weird. Is there perhaps a link I can use within CM to get to these settings? Thanks
12-18-2013
10:40 AM
Hi - when I had a config warning for Java heap on the NameNode, I was taken to a settings page (within Cloudera Manager) to change the settings. I forgot to change the Secondary NameNode heap size to match, and now I can't seem to get back to that settings page. I'd appreciate any help. Thanks, BC