Created on 09-03-2022 11:39 AM - last edited on 09-03-2022 03:00 PM by ask_bill_brooks
Hello all,
I am a student taking a course in SQL. The final module of this course has us installing Oracle VMWare and Cloudera to utilize Hadoop.
I went through a detailed instruction manual for the VMWare installation, including an expansion pack install and guest account install within the VMWare environment. I ran into no problems with this installation process (the manual is linked below for reference).
Installation_ClouderaQuickstartVirtualMachine
Afterwards, we were to follow a Cloudera basic tutorial guide to get started (linked below).
Cloudera Quickstart Beginners Tutorial
However, when attempting to input the first script into Terminal as provided, I began running into errors. The script and errors received are shown below.
[cloudera@quickstart ~]$ sqoop import-all-tables \
> -m 1 \
> --connect jdbc:mysql://quickstart:3306/retail_db \
> --username=retail_dba \
> --password=cloudera \
> --compression-codec=snappy \
> --as-parquetfile \
> --warehouse-dir=/user/hive/warehouse \
> --hive-import
Initial errors received:
Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
22/09/02 17:02:00 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.13.0
22/09/02 17:02:00 ERROR tool.BaseSqoopTool: Error parsing arguments for import-all-tables:
22/09/02 17:02:00 ERROR tool.BaseSqoopTool: Unrecognized argument:
22/09/02 17:02:00 ERROR tool.BaseSqoopTool: Unrecognized argument: -m
22/09/02 17:02:00 ERROR tool.BaseSqoopTool: Unrecognized argument: 1
22/09/02 17:02:00 ERROR tool.BaseSqoopTool: Unrecognized argument:
22/09/02 17:02:00 ERROR tool.BaseSqoopTool: Unrecognized argument: --connect
22/09/02 17:02:00 ERROR tool.BaseSqoopTool: Unrecognized argument: jdbc:mysql://quickstart:3306/retail_db
22/09/02 17:02:00 ERROR tool.BaseSqoopTool: Unrecognized argument:
22/09/02 17:02:00 ERROR tool.BaseSqoopTool: Unrecognized argument: --username
22/09/02 17:02:00 ERROR tool.BaseSqoopTool: Unrecognized argument: retail_dba
22/09/02 17:02:00 ERROR tool.BaseSqoopTool: Unrecognized argument:
22/09/02 17:02:00 ERROR tool.BaseSqoopTool: Unrecognized argument: --password
22/09/02 17:02:00 ERROR tool.BaseSqoopTool: Unrecognized argument: cloudera
22/09/02 17:02:00 ERROR tool.BaseSqoopTool: Unrecognized argument:
22/09/02 17:02:00 ERROR tool.BaseSqoopTool: Unrecognized argument: --compression-codec
22/09/02 17:02:00 ERROR tool.BaseSqoopTool: Unrecognized argument: snappy
22/09/02 17:02:00 ERROR tool.BaseSqoopTool: Unrecognized argument:
22/09/02 17:02:00 ERROR tool.BaseSqoopTool: Unrecognized argument: --as-parquetfile
22/09/02 17:02:00 ERROR tool.BaseSqoopTool: Unrecognized argument:
22/09/02 17:02:00 ERROR tool.BaseSqoopTool: Unrecognized argument: --warehouse-dir
22/09/02 17:02:00 ERROR tool.BaseSqoopTool: Unrecognized argument: /user/hive/warehouse
22/09/02 17:02:00 ERROR tool.BaseSqoopTool: Unrecognized argument:
22/09/02 17:02:00 ERROR tool.BaseSqoopTool: Unrecognized argument: --hive-import
I found a post on this forum indicating I need to set the Accumulo default directory for the script to use. I ran the following script as directed.
sudo mkdir /var/lib/accumulo
ACCUMULO_HOME='/var/lib/accumulo'
export ACCUMULO_HOME
Running this script appeared to remove the initial warning about the Accumulo directory, but I am still receiving errors, shown below.
[cloudera@quickstart ~]$ sqoop import-all-tables \
> m 1 \
> connect jdbc:mysql://quickstart:3306/retail_db \
> username=retail_dba \
> password=cloudera \
> compression-codec=snappy \
> as-parquetfile \
> warehouse-dir=/user/hive/warehouse \
> hive-import
22/09/03 09:54:35 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.13.0
22/09/03 09:54:35 ERROR tool.BaseSqoopTool: Error parsing arguments for import-all-tables:
22/09/03 09:54:35 ERROR tool.BaseSqoopTool: Unrecognized argument: m
22/09/03 09:54:35 ERROR tool.BaseSqoopTool: Unrecognized argument: 1
22/09/03 09:54:35 ERROR tool.BaseSqoopTool: Unrecognized argument: connect
22/09/03 09:54:35 ERROR tool.BaseSqoopTool: Unrecognized argument: jdbc:mysql://quickstart:3306/retail_db
22/09/03 09:54:35 ERROR tool.BaseSqoopTool: Unrecognized argument: username=retail_dba
22/09/03 09:54:35 ERROR tool.BaseSqoopTool: Unrecognized argument: password=cloudera
22/09/03 09:54:35 ERROR tool.BaseSqoopTool: Unrecognized argument: compression-codec=snappy
22/09/03 09:54:35 ERROR tool.BaseSqoopTool: Unrecognized argument: as-parquetfile
22/09/03 09:54:35 ERROR tool.BaseSqoopTool: Unrecognized argument: warehouse-dir=/user/hive/warehouse
22/09/03 09:54:35 ERROR tool.BaseSqoopTool: Unrecognized argument: hive-import
Try --help for usage instructions.
I tried reaching out to my professor regarding these issues, but his only response was to follow the guide and scripting provided. I'm hoping someone here can help.
Best regards!
Created 09-03-2022 12:05 PM
Apologies all, I think I figured it out. I had been pasting the script directly as copied from the PDF, which didn't work. Then I tried typing it in manually without the leading dashes (the -- or - in front of each option), but apparently those are required. I entered each line separately, with the dashes included, and the script ran and Sqoop started.
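For anyone who lands here with the same error, here is a sketch of the command with the leading dashes restored. The option names are the ones Sqoop echoed back in the first error log above (`-m`, `--connect`, `--username`, and so on). The `import_retail_db` wrapper and its dry-run argument are hypothetical, added only so the argument list can be printed and checked before the real run.

```shell
# Hypothetical wrapper: the first argument is the program to run
# ("sqoop" for the real import, "echo" for a dry run that only prints
# the arguments).
import_retail_db() {
  "$1" import-all-tables \
    -m 1 \
    --connect jdbc:mysql://quickstart:3306/retail_db \
    --username=retail_dba \
    --password=cloudera \
    --compression-codec=snappy \
    --as-parquetfile \
    --warehouse-dir=/user/hive/warehouse \
    --hive-import
}

# Dry run: prints the argument list without contacting the database.
import_retail_db echo

# Real run, inside the QuickStart VM where Sqoop is installed:
#   import_retail_db sqoop
```

Each backslash has to be the very last character on its line. If anything follows it, even a space, the shell no longer treats it as a line continuation.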
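As for why the version pasted from the PDF failed, the thread doesn't confirm a cause. A common problem with text copied from a PDF is invisible damage: typographic dashes in place of plain hyphens, or whitespace left after a trailing backslash. The empty `Unrecognized argument:` entries in the first log would be consistent with that, but this is an assumption. Below is a minimal sketch of a check, assuming the pasted text has been saved to a file. The function name `check_pasted_command` and the file name in the usage comment are hypothetical.

```shell
# Hypothetical helper: report lines of a saved command file that contain
# bytes outside printable ASCII, or whitespace after a trailing backslash.
# Returns 0 if the file looks clean, 1 if anything suspicious was found.
check_pasted_command() {
  file="$1"
  status=0
  # Anything other than space..tilde: curly dashes, non-breaking spaces, tabs.
  if LC_ALL=C grep -n '[^ -~]' "$file"; then
    echo "^ lines with non-ASCII or control characters"
    status=1
  fi
  # A backslash followed by whitespace at end of line does not continue the line.
  if LC_ALL=C grep -n '\\[[:space:]]\{1,\}$' "$file"; then
    echo "^ lines with whitespace after the trailing backslash"
    status=1
  fi
  return "$status"
}

# Usage: save the pasted text to a file first, then run
#   check_pasted_command pasted_command.txt
```

If the check flags a line, retyping that line by hand in the terminal, as described above, avoids the problem.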