Member since: 10-01-2015
Posts: 3933
Kudos Received: 1150
Solutions: 374
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 3565 | 05-03-2017 05:13 PM |
| | 2939 | 05-02-2017 08:38 AM |
| | 3189 | 05-02-2017 08:13 AM |
| | 3155 | 04-10-2017 10:51 PM |
| | 1627 | 03-28-2017 02:27 AM |
08-11-2016
12:52 PM
Hi Biswajit, are you aware that MRUnit is pretty much a dead project? There is no more development being done on it. That said, I found an example here: http://stackoverflow.com/questions/15674229/mapreduce-unit-test-fails-to-mock-distributedcache-getlocalcachefiles
08-11-2016
12:01 AM
Do you have support access? It seems like something is really broken on your end. In the absence of support, I'd try to figure out why your Python is broken. Try entering the "python" command; does that work? If all else fails, I'd try reinstalling ambari-agent, and if that fails, I'd upgrade the Ambari server and agents.
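As a rough sketch of the checks I have in mind (assuming a yum-based host; package names and log locations can differ in your environment):

```
# Does the Python interpreter the agent relies on even start?
python --version

# If Python itself is fine, try reinstalling the agent package
yum reinstall -y ambari-agent

# Restart the agent and check its log for the actual error
ambari-agent restart
tail -n 50 /var/log/ambari-agent/ambari-agent.log
```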
08-10-2016
12:12 PM
1 Kudo
Take a look at this article, it has ways of setting compression, including zlib, in Hive: http://hortonworks.com/blog/orcfile-in-hdp-2-better-compression-better-performance/ It would help if you specified which product you're trying to enable zlib for. Since you categorized the question under data ingestion, I will assume it's for Sqoop; here's an example of a Sqoop import with compression, just replace the Snappy codec class with the zlib one: https://community.hortonworks.com/questions/29648/sqoop-import-to-hive-with-compression.html
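In case it helps, a rough sketch of what that import could look like with zlib instead of Snappy (the JDBC URL, credentials, table and target directory are made-up placeholders; DefaultCodec is the Hadoop codec backed by zlib/deflate):

```
# Hypothetical import; substitute your own connection details and table
sqoop import \
  --connect jdbc:mysql://dbhost/sales \
  --username etl \
  --password-file /user/etl/.sqoop.pw \
  --table orders \
  --compress \
  --compression-codec org.apache.hadoop.io.compress.DefaultCodec \
  --target-dir /data/orders_zlib
```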
08-09-2016
03:52 PM
@Mourad Chahri can you run the following command:
repoquery --requires --resolve ambari-agent | grep rpm-python
and also provide the versions of ambari-agent and server you're running:
rpm -qa | grep ambari
If you're missing the rpm-python module, you can install it with yum install rpm-python. I am curious, though, how you have the agent running while missing this module?
08-09-2016
02:55 PM
@Alexander Feldman take a look at this article by IBM on the topic, https://developer.ibm.com/hadoop/2016/08/08/enterprise-alerting-big-sql-using-ambari-custom-dispatcher/
08-08-2016
04:54 PM
2 Kudos
UPDATE: I'm happy to report that my patch for PIG-4931 was accepted and merged to trunk.

I was browsing through Apache Pig Jiras and stumbled on https://issues.apache.org/jira/browse/PIG-4931, which asks for documentation of Pig's "IN" operator. It turns out Pig has had the IN operator since the days of 0.12, and no one had had a chance to document it yet. The associated JIRA is https://issues.apache.org/jira/browse/PIG-3269. In this short article I will go over the IN operator; until I'm able to submit a patch to close out the ticket, this should serve as its documentation.

The IN operator in Pig works like it does in SQL: you provide a list of values and it returns only the rows whose field matches one of them. It is a lot more concise than chaining OR conditions, for example:

a = LOAD '1.txt' USING PigStorage(',') AS (i:int);
b = FILTER a BY
(i == 1) OR
(i == 22) OR
(i == 333) OR
(i == 4444) OR
(i == 55555);

You can rewrite the same statement as:

a = LOAD '1.txt' USING PigStorage(',') AS (i:int);
b = FILTER a BY i IN (1, 22, 333, 4444, 55555);

The best thing about it is that it accepts more than just integers: you can pass float, double, BigDecimal, BigInteger, bytearray and String values.
Let's review each one in detail. First, the sample data:

grunt> fs -cat data;
1,Christine,Romero,Female
2,Sara,Hansen,Female
3,Albert,Rogers,Male
4,Kimberly,Morrison,Female
5,Eugene,Baker,Male
6,Ann,Alexander,Female
7,Kathleen,Reed,Female
8,Todd,Scott,Male
9,Sharon,Mccoy,Female
10,Evelyn,Rice,Female

Passing an integer to the IN clause:

A = load 'data' using PigStorage(',') AS (id:int, first:chararray, last:chararray, gender:chararray);
X = FILTER A BY id IN (4, 6);
dump X;
(4,Kimberly,Morrison,Female)
(6,Ann,Alexander,Female)

Passing a String:

A = load 'data' using PigStorage(',') AS (id:chararray, first:chararray, last:chararray, gender:chararray);
X = FILTER A BY id IN ('2', '4', '8');
dump X;
(2,Sara,Hansen,Female)
(4,Kimberly,Morrison,Female)
(8,Todd,Scott,Male)

Passing a ByteArray:

A = load 'data' using PigStorage(',') AS (id:bytearray, first:chararray, last:chararray, gender:chararray);
X = FILTER A BY id IN ('1', '9');
dump X;
(1,Christine,Romero,Female)
(9,Sharon,Mccoy,Female)

Passing a BigInteger and using the NOT operator, thereby negating the list of values in the IN clause:

A = load 'data' using PigStorage(',') AS (id:biginteger, first:chararray, last:chararray, gender:chararray);
X = FILTER A BY NOT id IN (1, 3, 5, 7, 9);
dump X;
(2,Sara,Hansen,Female)
(4,Kimberly,Morrison,Female)
(6,Ann,Alexander,Female)
(8,Todd,Scott,Male)
(10,Evelyn,Rice,Female)

Now, I understand that most cool kids these days are using Spark, but I strongly believe Pig has a place in any Big Data stack, and its livelihood depends on comprehensive and complete documentation. Happy learning!
08-06-2016
03:04 PM
Please post logs.
08-04-2016
07:04 PM
@pankaj chaturvedi @Sunile Manjee apparently the command is incorrect; set commands typically accept a key and a value, with no equals sign. So if this were to work, it would be SET exectype 'tez'; but it does not work in 2.4 or 2.5. I am following up internally on whether this is a bug or a feature.
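As a side note, a sketch of the workaround I'd use in the meantime, assuming your Pig installation was built with Tez support (myscript.pig is just a placeholder):

```
# Run a whole script in Tez mode from the command line
pig -x tez myscript.pig

# Or start an interactive grunt shell in Tez mode
pig -x tez
```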
08-03-2016
07:14 PM
@Jaime the database flavor has nothing to do with it; it's a matter of Sqoop having that functionality, which at the moment it does not.