Created on 09-09-2014 02:38 PM - edited 09-16-2022 02:07 AM
I am attempting to run a Sqoop command with a free-form query because I need to perform an aggregation. The following is a scaled-down version of the command and query. When the command is processed, each portion of the "--query" statement (enclosed in quotes) is interpreted as an unrecognized argument, as shown in the error following the command. In addition, the target directory is being misinterpreted. What is preventing this from running, and what can be done to resolve it? The ${env} and ${shard} variables are being parsed properly, as reflected in the last error message.
Edit: This command is being submitted via the Hue interface, as an Oozie workflow.
Thank you!
Michael Reynolds
import --connect jdbc:mysql://irbasedw-${shard}.db.impactradius.net:3417/irbasedw_${shard}?dontTrackOpenResources=true&defaultFetchSize=10000&useCursorFetch=true --username iretl --password-file /irdw/${env}/lib/.passwordBaseDw --table agg_daily_activity_performance_stage -m 1 --query "SELECT SUM(click_count) FROM agg_daily_activity_performance_stage WHERE \$CONDITIONS GROUP BY 1" --target-dir /irdw/${env}/legacy/agg/activity_performance/text/shard_${shard}
------------
3881 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Error parsing arguments for import:
3881 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: SUM(click_count)
3881 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: FROM
3882 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: agg_daily_activity_performance_stage
3882 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: WHERE
3882 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: \$CONDITIONS
3882 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: GROUP
3882 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: BY
3882 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: 1"
3882 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: --target-dir
3882 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: /irdw/test/legacy/agg/activity_performance/text/shard_0
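The error list above is consistent with the command string being split on every space, with the quotes carrying no special meaning. The Oozie Sqoop action documentation describes this behavior for its command element; that the Hue "Command" window feeds that element is an assumption here. A minimal Python sketch of whitespace-only splitting, using a shortened form of the command (connection options omitted, variables already substituted):

```python
# Shortened form of the command from the post; connection options are omitted
# and ${env}/${shard} are shown already substituted, as in the error log.
command = (
    r'import -m 1 --query "SELECT SUM(click_count) '
    r'FROM agg_daily_activity_performance_stage '
    r'WHERE \$CONDITIONS GROUP BY 1" '
    r'--target-dir /irdw/test/legacy/agg/activity_performance/text/shard_0'
)


def split_on_spaces(cmd):
    """Split on whitespace only; quote characters get no special treatment."""
    return cmd.split()


tokens = split_on_spaces(command)
query_index = tokens.index("--query")

# The only token that ends up as the value of --query:
print(tokens[query_index + 1])  # prints: "SELECT

# Everything after it is left over, matching the "Unrecognized argument" lines:
for leftover in tokens[query_index + 2:]:
    print("leftover:", leftover)
```

Under this assumption, "--query" receives only the first word of the statement, and every remaining word, including "--target-dir" and its path, is reported as unrecognized.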
Created 09-15-2014 04:04 PM
I was able to get this working. The solution is to submit every element of the command as a separate argument. Nothing should be in the "Command" window. Instead, starting with "import" as the first argument, enter each part of the command as its own argument. Each option name and its value are entered as separate arguments, and the entire query is entered as a single argument. For example:
arg: import
arg: --connect
arg: jdbc:mysql....
arg: --username
arg: [username]
arg: --password-file
arg: [password file]
arg: --query
arg: select .....
arg: --target-dir
arg: [target]
The workflow performs as expected.
Michael Reynolds
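For anyone editing the workflow XML directly rather than through Hue, the same idea can be sketched in Python: keep the arguments as a list and render one arg element per entry. The helper name render_arg_elements, the connection string, the user name, and the paths below are placeholders invented for illustration, not values from the original workflow. Writing $CONDITIONS without the backslash assumes no shell is involved in this path, which should be verified in your environment.

```python
import xml.etree.ElementTree as ET


def render_arg_elements(args):
    """Render each Sqoop argument as its own <arg> element (hypothetical helper)."""
    lines = []
    for value in args:
        element = ET.Element("arg")
        element.text = value  # ElementTree escapes &, < and > when serializing
        lines.append(ET.tostring(element, encoding="unicode"))
    return "\n".join(lines)


# Placeholder values; only the query text is taken from the post.
sqoop_args = [
    "import",
    "--connect",
    "jdbc:mysql://db.example.invalid:3306/exampledb?useCursorFetch=true&defaultFetchSize=10000",
    "--username", "example_user",
    "--password-file", "/path/to/password-file",
    "-m", "1",
    # The whole query is ONE list entry, so it stays one argument.
    "--query",
    "SELECT SUM(click_count) FROM agg_daily_activity_performance_stage "
    "WHERE $CONDITIONS GROUP BY 1",
    "--target-dir", "/path/to/target",
]

print(render_arg_elements(sqoop_args))
```

Because the query is a single list entry, it is emitted as a single arg element and is never split on spaces. The serializer also escapes the ampersands in the connection string as &amp;, which a hand-written workflow file would need as well.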
Created 12-14-2017 12:13 AM
MReynolds: Thanks a lot! This solved my big problem!