Member since: 08-08-2013
Posts: 109
Kudos Received: 18
Solutions: 10
My Accepted Solutions
Title | Views | Posted
---|---|---
| 4856 | 07-28-2015 02:13 PM
| 63711 | 12-12-2014 07:28 AM
| 3892 | 11-19-2014 04:14 PM
| 4776 | 11-17-2014 10:35 AM
| 7441 | 08-05-2014 10:56 AM
07-28-2015
02:23 PM
Whatever is available in your CDH distribution. If you're using parcels, take a look at /opt/cloudera/parcels/CDH/jars. If you're using packages, take a look at /usr/lib/kite. You can upload that jar to any directory in HDFS and make it available to your workflow by including it with the "file" element (https://oozie.apache.org/docs/4.0.0/WorkflowFunctionalSpec.html#a3.2.2.1_Adding_Files_and_Archives_for_the_Job). That would probably be the easiest way to test.
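As a minimal sketch, a Sqoop action that ships the jar with the job might look like this (the HDFS path, connection string, and table name are placeholders for your own values):

<action name="sqoop-import">
    <sqoop xmlns="uri:oozie:sqoop-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <command>import --connect jdbc:mysql://db.example.com/mydb --table mytable --target-dir /user/example/mytable</command>
        <!-- Ships the jar from HDFS into the action's working directory
             so it ends up on the job's classpath. Path is a placeholder. -->
        <file>/user/example/lib/kite-data-mapreduce.jar</file>
    </sqoop>
    <ok to="end"/>
    <error to="fail"/>
</action>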
07-28-2015
02:13 PM
It looks like "kite-data-mapreduce.jar" is missing from the Oozie sharelib. We're tracking this internally. Until then, could you upload this to the sharelib or add it to the Sqoop action as a "file"?
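A sketch of the sharelib route, assuming a CDH-style sharelib layout (the lib_<timestamp> directory and the Oozie URL are placeholders; "oozie admin -shareliblist" shows your actual paths):

# Copy the jar into the Sqoop sharelib directory in HDFS, then tell Oozie
# to reload it (if your Oozie version supports -sharelibupdate; otherwise
# restart the Oozie server).
hadoop fs -put kite-data-mapreduce.jar /user/oozie/share/lib/lib_20150101000000/sqoop/
oozie admin -oozie http://oozie-host.example.com:11000/oozie -sharelibupdate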
06-23-2015
03:06 PM
1 Kudo
Using the command element solely or arg elements solely should work. The schema (an XSD, not a DTD) for the Sqoop action looks like this:

<xs:complexType name="ACTION">
<xs:sequence>
<xs:element name="job-tracker" type="xs:string" minOccurs="1" maxOccurs="1"/>
<xs:element name="name-node" type="xs:string" minOccurs="1" maxOccurs="1"/>
<xs:element name="prepare" type="sqoop:PREPARE" minOccurs="0" maxOccurs="1"/>
<xs:element name="job-xml" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
<xs:element name="configuration" type="sqoop:CONFIGURATION" minOccurs="0" maxOccurs="1"/>
<xs:choice>
<xs:element name="command" type="xs:string" minOccurs="1" maxOccurs="1"/>
<xs:element name="arg" type="xs:string" minOccurs="1" maxOccurs="unbounded"/>
</xs:choice>
<xs:element name="file" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
<xs:element name="archive" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
</xs:sequence>
</xs:complexType>

The <xs:choice> means it's either one or the other: a single <command> element, or one or more <arg> elements, never both.
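For instance, inside the <sqoop> action body the two equivalent forms look like this (connection string and table name are placeholders):

<!-- Form 1: the whole Sqoop command line in a single <command> element -->
<command>import --connect jdbc:mysql://db.example.com/mydb --table mytable</command>

<!-- Form 2: one <arg> element per token; safer when an argument contains spaces -->
<arg>import</arg>
<arg>--connect</arg>
<arg>jdbc:mysql://db.example.com/mydb</arg>
<arg>--table</arg>
<arg>mytable</arg>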
06-10-2015
03:36 AM
I believe this option controls where the data is first written in HDFS before it's loaded into Hive; once the Hive load runs, the data is moved again. For details on that staging step, see http://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html#_connecting_to_a_database_server
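A sketch, assuming the option in question is --target-dir (connection string, table, and paths are placeholders):

# Rows land in --target-dir first; the Hive load step then moves them
# into the Hive warehouse (e.g. under /user/hive/warehouse/).
sqoop import \
  --connect jdbc:mysql://db.example.com/mydb \
  --table mytable \
  --hive-import \
  --target-dir /tmp/sqoop-staging/mytable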
05-07-2015
11:09 AM
Hey there, what do your task logs say?
04-07-2015
01:28 PM
Ah yes, I'm sorry about that. Have you tried using the Sqoop codegen tool to see the resulting HQL? You could then run the HQL yourself and dig deeper into the root of the problem.
04-02-2015
12:36 PM
There should be more information in the task/attempt logs, and even more if you provide the "--verbose" option. My best guess is that it's a permissions issue.
04-02-2015
12:33 PM
We're still working on better facilities for looking at Sqoop2 logs, but there are currently a few locations you can look in (depending on your error). I'm assuming Sqoop2 starts fine but a job you're running is failing. If that's the case, then your logs should be under /var/log/sqoop2/ and in the job's associated task/attempt logs. You can view those logs through the YARN UI (usually on port 8088, e.g. http://example.com:8088) or through Hue (http://gethue.com/).
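One more option, not mentioned above: if log aggregation is enabled, you can also pull a finished job's task logs from the command line (the application ID is a placeholder; grab the real one from the YARN UI):

yarn logs -applicationId application_1420000000000_0001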
04-02-2015
11:11 AM
1 Kudo
It should be. The Generic JDBC Connector should be able to connect to MS SQL Server. It's important to make sure the SQL Server JDBC jar is included in the Sqoop2 server's classpath via the /var/lib/sqoop2 directory.
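A sketch of the usual steps, assuming a package install with the sqoop2-server service (the jar name is a placeholder for whichever driver version you downloaded):

# Drop the driver where the Sqoop2 server picks it up, then restart
# so the classpath change takes effect.
cp sqljdbc4.jar /var/lib/sqoop2/
sudo service sqoop2-server restart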
03-05-2015
11:49 AM
Hmm, I'd ask this question in the "Batch" forum. This seems like a YARN misconfiguration or resourcing issue.
03-04-2015
11:42 AM
Hey man, I'd take a look at the resource manager UI, either directly from the resource manager or via Hue. The resource manager UI is usually available on port 8088; via Hue, it's the JobTracker app. If the job stays in the "SUBMITTED" or "ACCEPTED" state (and consequently never reaches the "RUNNING" state), then there's likely a utilization or configuration issue. Could you check the resource manager through one of the above UIs while running your Sqoop job and provide screenshots?
02-11-2015
03:21 PM
There is an open Jira for getting Sqoop2 to output data in a format that Hive will accept: https://issues.apache.org/jira/browse/SQOOP-1579. In reality, this shouldn't stop you, but the Jira does have an example that you might be able to use. It will affect you if you have content that needs to be escaped, though. I'm on it!

Here's an example of "Table SQL Statement": "SELECT * FROM example WHERE ${CONDITIONS}".
- "Table column names" provides the list of columns that should be extracted from a table. It should only be used together with "Table name", not with "Table SQL Statement".
- "Partition column name" defines which column is used to partition the data transfer. By default, the table's primary key is used; if your table doesn't have a primary key, this must be set.
- "Null in partition column" is a boolean that defines whether the partition column should accept NULL values.

Hope this helps...
02-06-2015
05:56 PM
Hey folks,

A few notes on getting Sqoop2 to work and getting more detail if there is an error:
1. The Derby jar is required for Sqoop2 to work. It comes packaged with Sqoop2, so do not put it into /var/lib/sqoop2.
2. You can use "set option --name verbose --value true" to see more error information. You may be having different errors for all we know.
02-04-2015
05:27 PM
From the CLI, Hadoop normally shows: "ls: Permission denied: user=admin..." if you don't have permission. What is your --target-dir set to?
02-04-2015
03:38 PM
It seems like the job didn't run. Are you sure it finished? There should be task logs associated with your job. Could you check them for errors? Also, make sure --verbose is in your command for the best possible debug information.
02-04-2015
02:07 PM
Sqoop2 is going through a lot of changes and isn't supported in Hue in CDH 5.3.0. It should be in CDH 5.3.2, AFAIK. The Sqoop1 command you are running has two different problems:
1. The PostgreSQL JDBC driver jar isn't available on one of your nodes. This can be fixed by downloading it and putting it in /var/lib/sqoop.
2. There seems to be some kind of SQL error. Could you add --verbose to your sqoop command? sqoop import --verbose ...
02-04-2015
10:39 AM
Assuming you're using CDH 4? I'd upgrade to CDH 5 and use the Sqoop2 app. If you prefer to use the shell, then make sure the user you're logged in as has a Linux account with an ID larger than 500 on the Hue server.
02-02-2015
11:14 AM
This seems like a connectivity error between the Sqoop MR job and the database. A few questions for you:
1. Do you have your database credentials exactly right?
2. Is your database service running?
12-12-2014
07:28 AM
1 Kudo
Yeah, it's only a warning. To make the warning go away, you need to install ACCUMULO.
12-09-2014
07:28 AM
Could you run your command with --verbose right after "import", then copy and paste the output into this thread?
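For example (connection string and table name are placeholders):

sqoop import --verbose \
  --connect jdbc:mysql://db.example.com/mydb \
  --table mytable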
12-05-2014
01:46 PM
1 Kudo
Check your connection URL as well:

sqoop import --connect jdbc:sqlserver://server;database=db --table table
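One detail worth noting when you run that from a shell: the semicolon has to be quoted or the shell will split the command (server, database, and table names are placeholders):

sqoop import \
  --connect 'jdbc:sqlserver://dbserver.example.com;database=mydb' \
  --table mytable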
12-03-2014
04:08 PM
Glad it worked. Does that mean your job is running now?
12-03-2014
01:59 PM
1 Kudo
You shouldn't need to provide '--driver'. Try removing it from your sqoop command.
12-02-2014
03:37 PM
This will be possible in CDH 5.3.0.
11-28-2014
05:19 PM
1 Kudo
Sqoop2 doesn't support Kerberos currently. This will be available in CDH 5.3.0. Check out https://issues.apache.org/jira/browse/SQOOP-1527 for more info on Kerberos support.
11-19-2014
04:14 PM
Hey there,

Have you fiddled with the code generation options? I think you're looking for "--class-name": http://sqoop.apache.org/docs/1.4.5/SqoopUserGuide.html#idp6782480 (a quick sketch follows below)

-Abe
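A quick sketch of how that option is used (connection string, table, and class name are placeholders):

sqoop import \
  --connect jdbc:mysql://db.example.com/mydb \
  --table mytable \
  --class-name com.example.MyTableRecord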
11-17-2014
10:35 AM
It seems the issue is that your partition column can't be found. Sqoop should, by default, use the primary key as your partition column. If your table doesn't have a primary key, then you'll have to set it yourself in "Partition column name".
11-15-2014
10:29 AM
Sqoop2 actually only has one connector, but it can use different JDBC drivers to connect to a relational database. There's work to be done on optimizing data transfers, but the generic JDBC connector should work. Could you try running your job with "set option --name verbose --value true"? Also, please provide the output of "show connection --all" and "show job --all".
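For example, in the Sqoop2 shell (sqoop:000> is the shell's default prompt):

sqoop:000> set option --name verbose --value true
sqoop:000> show connection --all
sqoop:000> show job --all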
11-14-2014
02:56 PM
Also, please remove "sqljdbc.jar" as it could be taking precedence.
11-14-2014
02:54 PM
What is your exact error message, including the stack trace? You can turn on verbose logging via "set option --name verbose --value true".