Member since: 11-22-2015
Posts: 40
Kudos Received: 18
Solutions: 0
06-01-2016
04:41 AM
Thanks @Harsh J, I appreciate your help. Just as feedback: the Cloudera documentation should also include examples covering both Sqoop1 and Sqoop2; that would be very helpful for all users who deploy Sqoop.
05-27-2016
03:55 AM
Thanks Michalis, I am quite new to Sqoop. I have installed the Sqoop1 client and Sqoop2 using Cloudera Manager. I am able to use the "sqoop2" command and it enters the Sqoop shell as shown below: [hduser@node1 ~]$ sqoop2
Sqoop home directory: /opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/lib/sqoop2
Sqoop Shell: Type 'help' or '\h' for help.
sqoop:000> But when I use the "sqoop" command, I get the following message and it does not enter a Sqoop prompt: [hduser@node1 ~]$ sqoop
Warning: /opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/bin/../lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Try 'sqoop help' for usage.
[hduser@node1 ~]$ I checked the Sqoop version and it reports 1.4.5: [hduser@node1 ~]$ sqoop version
Warning: /opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/bin/../lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
16/05/27 16:20:13 INFO sqoop.Sqoop: Running Sqoop version: 1.4.5-cdh5.4.1
Sqoop 1.4.5-cdh5.4.1
git commit id
Compiled by jenkins on Thu May 7 22:45:52 PDT 2015
[hduser@node1 ~]$ I want to use Sqoop1 to import data from Oracle into HDFS/Hive; our architects advised configuring Sqoop1 because they consider Sqoop2 not production ready. Here I am a little confused: which version of Sqoop am I actually using? Does the sqoop2 command also refer to Sqoop1? How do I use Sqoop1 specifically?
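For context, this is roughly the kind of Sqoop1 import I intend to run against Oracle once the client question is sorted out. It is only a sketch: the connection string, username, password file, table name and target directory below are placeholders, not my real environment values.
# Hypothetical Sqoop1 import from Oracle into HDFS (all values are placeholders)
sqoop import \
  --connect jdbc:oracle:thin:@//oradb.example.com:1521/ORCL \
  --username SCOTT \
  --password-file /user/hduser/.oracle_password \
  --table EMP \
  --target-dir /user/hduser/emp_import \
  --num-mappers 2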
05-24-2016
08:10 AM
I have installed the Sqoop 1 client as per the document you provided, but I have an issue: the Sqoop 1 client is not starting. I have raised a separate post for it. Can you please look into it as well and help? http://community.cloudera.com/t5/Data-Ingestion-Integration/Sqoop-1-Client-not-started/m-p/41188
05-23-2016
10:37 AM
I have a 2-node YARN cluster managed by Cloudera Manager 5.4.2, and by default Sqoop 2 is installed. I wanted to use Sqoop 1, so I installed the Sqoop 1 client by following the Cloudera documentation at the links below: http://www.cloudera.com/documentation/enterprise/5-4-x/topics/cm_mc_sqoop1_client.html http://www.cloudera.com/documentation/enterprise/5-4-x/topics/cm_mc_add_service.html After installing the Sqoop 1 client I deployed the client configuration, restarted all services, and in fact restarted the entire cluster, but the Sqoop 1 client still shows as being in the "None" state. Kindly help. For reference, I have attached a screen capture of the Sqoop 1 service from the Cloudera Manager console.
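In case it helps with diagnosis, these are the kinds of quick checks I can run on the gateway host to see whether the Sqoop 1 client binaries and configuration were actually deployed (only a sketch; the exact paths depend on the CDH parcel version in use):
which sqoop            # should resolve to the parcel's sqoop wrapper script
sqoop version          # should report a 1.4.x CDH build if the client is usable
ls -l /etc/sqoop/conf  # deployed Sqoop 1 client configuration, if any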
Labels:
- Apache Sqoop
- Apache YARN
- Cloudera Manager
05-23-2016
07:45 AM
Yes, the link was helpful. The property "dfs.datanode.du.reserved" was configured to 4.25 GB, so I now understand that 4.25 GB is reserved for each data directory on a given node. Since I have two data directory partitions, the combined reserved space is 8.5 GB per node, which brings the configured capacity on each node to roughly 23.5 GB (32 GB - 8.5 GB). I arrived at the following formula: Configured Capacity = total disk space allocated for data directories (dfs.data.dir) - reserved space for non-DFS use (dfs.datanode.du.reserved)
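For anyone else who lands on this thread, one way to double-check the reserved value from the command line is shown below. This is only a sketch; it assumes the property is present in the client configuration deployed on that host, otherwise check the DataNode role configuration in Cloudera Manager.
# Print the reserved space per data directory, in bytes
hdfs getconf -confKey dfs.datanode.du.reserved
# Here this corresponds to about 4.25 GB (4563402752 bytes) per data directory,
# i.e. 2 x 4.25 GB = 8.5 GB reserved per node with two data directories.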
05-16-2016
02:45 AM
Hi Vina, As you can see from the output, sdb2 and sdc2 are allocated for non-HDFS storage (e.g. intermediate data). sdb1 and sdc1 are the partitions mounted for HDFS storage, and they are 16 GB each, as you can see in the "df -h" output. [hduser@node1 ~]$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sdb1 16G 283M 15G 2% /disks/disk1/hdfsstorage/dfs
/dev/sdc1 16G 428M 15G 3% /disks/disk2/hdfsstorage/dfs
Can you please help?
05-15-2016
06:05 AM
I have implemented a 2-node cluster using Cloudera Manager 5.4.1 in VMware Workstation, and it includes components such as HBase, Impala, Hive, Sqoop2, Oozie, ZooKeeper, NameNode, SecondaryNameNode and YARN. I have simulated 3 disk drives per node: sda for the OS, and sdb and sdc for Hadoop storage. I allocated sdb1 (16 GB) and sdc1 (16 GB) dedicated to Hadoop storage on each of the nodes, so I assume that my total HDFS storage capacity across both nodes should be 64 GB. But when I check the output of the dfsadmin command, and also the NameNode UI, I see that the Configured Capacity is less than the disk space I allocated for HDFS. The output of the dfsadmin command and of df -h is shown below. Kindly help me understand why the Configured Capacity is lower than my allocated disk size. [hduser@node1 ~]$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vg_node1-LogVol00 40G 15G 23G 39% /
tmpfs 3.9G 76K 3.9G 1% /dev/shm
/dev/sda1 388M 39M 329M 11% /boot
/dev/sdb1 16G 283M 15G 2% /disks/disk1/hdfsstorage/dfs
/dev/sdc1 16G 428M 15G 3% /disks/disk2/hdfsstorage/dfs
/dev/sdb2 8.1G 147M 7.9G 2% /disks/disk1/nonhdfsstorage
/dev/sdc2 8.1G 147M 7.9G 2% /disks/disk2/nonhdfsstorage
cm_processes 3.9G 5.8M 3.9G 1% /var/run/cloudera-scm-agent/process
[hduser@node1 ~]$ [hduser@node1 zookeeper]$ sudo -u hdfs hdfs dfsadmin -report
[sudo] password for hduser:
Configured Capacity: 47518140008 (44.25 GB)
Present Capacity: 47518140008 (44.25 GB)
DFS Remaining: 46728742571 (43.52 GB)
DFS Used: 789397437 (752.83 MB)
DFS Used%: 1.66%
Under replicated blocks: 385
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
-------------------------------------------------
Live datanodes (2):
Name: 192.168.52.111:50010 (node1.example.com)
Hostname: node1.example.com
Rack: /default
Decommission Status : Normal
Configured Capacity: 23759070004 (22.13 GB)
DFS Used: 394702781 (376.42 MB)
Non DFS Used: 0 (0 B)
DFS Remaining: 23364367223 (21.76 GB)
DFS Used%: 1.66%
DFS Remaining%: 98.34%
Configured Cache Capacity: 121634816 (116 MB)
Cache Used: 0 (0 B)
Cache Remaining: 121634816 (116 MB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 2
Last contact: Sun May 15 18:15:33 IST 2016
Name: 192.168.52.112:50010 (node2.example.com)
Hostname: node2.example.com
Rack: /default
Decommission Status : Normal
Configured Capacity: 23759070004 (22.13 GB)
DFS Used: 394694656 (376.41 MB)
Non DFS Used: 0 (0 B)
DFS Remaining: 23364375348 (21.76 GB)
DFS Used%: 1.66%
DFS Remaining%: 98.34%
Configured Cache Capacity: 523239424 (499 MB)
Cache Used: 0 (0 B)
Cache Remaining: 523239424 (499 MB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 2
Last contact: Sun May 15 18:15:32 IST 2016
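To make the gap concrete, here is the arithmetic behind my question, using the numbers from the df and dfsadmin output above:
Expected HDFS capacity: 2 nodes x 2 partitions x 16 GB = 64 GB
Reported Configured Capacity (dfsadmin): 44.25 GB total, i.e. 22.13 GB per node
Gap per node: 32 GB allocated - 22.13 GB configured = roughly 9.9 GB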
05-14-2016
11:43 AM
Thanks Michalis.
05-10-2016
03:25 AM
I have installed Cloudera Manager 5.4.1 on my 4-node cluster, and while installing the Express Edition (free edition) I see only Sqoop 2 offered for installation. Sqoop 1 is not listed; please explain how I can install Sqoop 1 on my CDH 5.4.1 cluster.