Member since: 02-12-2016
Posts: 102
Kudos Received: 117
Solutions: 8
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 13370 | 03-15-2016 06:36 AM |
| | 15457 | 03-12-2016 10:04 AM |
| | 3271 | 03-12-2016 08:14 AM |
| | 984 | 03-04-2016 02:36 PM |
| | 1852 | 02-19-2016 10:59 AM |
01-27-2020
12:16 PM
Thanks for the information. Using this command caused some serious performance degradation when writing to HDFS: every 128 MB block took about 20-30 seconds to write. The issue was the attempt to compress the tar file, so it's better to remove the "z" flag from tar and not compress. To give some numbers: writing almost 1 TB of data from local disk to HDFS took 13+ hours with compression ("z"), and it would eventually fail due to Kerberos ticket expiration. After removing the "z" flag, the copy of the same 1 TB to HDFS took less than an hour!
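For reference, a minimal sketch of the two variants (the source directory and HDFS destination paths here are hypothetical):

```
# compressed variant: gzip (the "z" flag) bottlenecks the stream to HDFS
tar czf - /data/source | hdfs dfs -put - /user/me/archive.tar.gz

# uncompressed variant: plain tar streams far faster
tar cf - /data/source | hdfs dfs -put - /user/me/archive.tar
```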
10-27-2016
07:21 AM
Hi @Neeraj Sabharwal @Rushikesh Deshmukh, these are the steps I followed for an incremental Sqoop import into an HBase table.

Step 1: Import a table into HBase

```
sqoop import --connect "jdbc:sqlserver://x.x.x.x:1433;database=test" --username sa -P --table employee --hbase-table employee --hbase-create-table --column-family cf --hbase-row-key id -m 1
```

Step 2: Sqoop HBase incremental import

```
sqoop import --connect "jdbc:sqlserver://x.x.x.x:1433;database=test" --username sa -P --table employee --incremental append --check-column id --last-value 71 -m 1
```

Step 3: Create a Sqoop job for the HBase increment

```
sqoop job --create incjobsnew -- import --connect "jdbc:sqlserver://x.x.x.x:1433;database=test" --username sa -P --table employee --incremental append --check-column id --last-value 71 -m 1
```

When I execute the job with sqoop job --exec incjobsnew, the Sqoop command runs successfully and shows the exact number of records retrieved. But when I check HBase for the records, it doesn't show the retrieved results. Could you tell me where the mistake is? I need to automate this Sqoop job in Oozie to run at a particular time interval daily.
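For reference, this is roughly how the check for the rows in HBase can be done via the HBase shell (assuming the target table is named employee, as in the steps above):

```
# count the rows in the target table
echo "count 'employee'" | hbase shell
# show a few rows to inspect the row keys and column family
echo "scan 'employee', {LIMIT => 5}" | hbase shell
```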
09-21-2016
05:01 PM
3 Kudos
@Rushikesh Deshmukh FLATTEN un-nests tuples as well as bags. Consider a relation that has a tuple of the form (a, (b, c)). The expression GENERATE $0, FLATTEN($1) will cause that tuple to become (a, b, c). You can refer to the link below to learn more and get a better understanding of the other operators, in case you need them. https://www.qubole.com/resources/cheatsheet/pig-function-cheat-sheet/
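Here is a minimal runnable sketch of that behavior (the file name, schema, and relation names are hypothetical):

```
cat > flatten_example.pig <<'EOF'
-- each input line looks like: a<TAB>(b,c)
A = LOAD 'input.txt' AS (x:chararray, t:(y:chararray, z:chararray));
-- FLATTEN un-nests the tuple: (a, (b, c)) becomes (a, b, c)
B = FOREACH A GENERATE $0, FLATTEN($1);
DUMP B;
EOF
pig -x local flatten_example.pig
```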
06-09-2017
03:11 PM
Do the configurations mentioned on this page work on the Tez engine? I could only see SMB working on MR.
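For context, a hedged sketch of typical SMB (sort-merge-bucket) join settings one might test on Tez — these may not be the exact settings from the page in question:

```
hive -e "
set hive.execution.engine=tez;
set hive.auto.convert.sortmerge.join=true;
set hive.optimize.bucketmapjoin=true;
set hive.optimize.bucketmapjoin.sortedmerge=true;
-- run the bucketed, sorted join query here and inspect the plan with EXPLAIN
"
```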
03-12-2016
09:05 AM
Sure, please accept the answer if satisfied.
03-12-2016
09:04 AM
1 Kudo
@Artem Ervits, thanks for sharing this link.
03-07-2016
05:58 AM
@Neeraj Sabharwal, I got the required answer, so I'm closing this thread.
03-04-2016
11:31 AM
Here's an example; the file type doesn't matter, since everything is bytes. You can then ingest the CSV with Hive, Pig, or Spark. http://www.lampdev.org/programming/hadoop/apache-flume-spooldir-sink-tutorial.html
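A minimal spooldir-to-HDFS agent along those lines might look like this (the agent name, spool directory, and HDFS path are hypothetical):

```
cat > spooldir-hdfs.conf <<'EOF'
# spooling-directory source -> memory channel -> HDFS sink
agent.sources = src1
agent.channels = ch1
agent.sinks = sink1

agent.sources.src1.type = spooldir
agent.sources.src1.spoolDir = /var/spool/flume
agent.sources.src1.channels = ch1

agent.channels.ch1.type = memory

agent.sinks.sink1.type = hdfs
agent.sinks.sink1.hdfs.path = /user/flume/csv
# DataStream keeps the raw bytes, so CSV lands in HDFS unmodified
agent.sinks.sink1.hdfs.fileType = DataStream
agent.sinks.sink1.channel = ch1
EOF
flume-ng agent --conf conf --conf-file spooldir-hdfs.conf --name agent
```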
03-04-2016
02:36 PM
2 Kudos
I have received the answer below: control the start action by using a decision control node as the default start action. Using a case in the decision control node, it is possible to divert to the needed action based on your parameter. I'd like to know if it works.
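A sketch of what that decision node could look like (the action names and the startAction parameter are hypothetical):

```
cat > workflow-snippet.xml <<'EOF'
<!-- decision node as the first node after start -->
<start to="route"/>
<decision name="route">
  <switch>
    <!-- divert to the needed action based on a workflow parameter -->
    <case to="actionA">${startAction eq 'A'}</case>
    <case to="actionB">${startAction eq 'B'}</case>
    <default to="actionA"/>
  </switch>
</decision>
EOF
```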
02-23-2016
12:45 AM
1 Kudo
I was successful in executing the MapReduce job. Because the call job.setJarByClass(WordCount.class) was missing, Hadoop was unable to find the Mapper class. Thanks!!!