Member since: 10-01-2015
Posts: 3933
Kudos Received: 1150
Solutions: 374
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 3485 | 05-03-2017 05:13 PM |
| | 2873 | 05-02-2017 08:38 AM |
| | 3123 | 05-02-2017 08:13 AM |
| | 3089 | 04-10-2017 10:51 PM |
| | 1578 | 03-28-2017 02:27 AM |
09-24-2016
02:20 PM
Zeppelin support was only added in HDP 2.5; please remove the service and consider upgrading Ambari to 2.4.1 and HDP to 2.5.
06-05-2017
07:57 PM
Hi, I am currently using the Eclipse JEE Neon IDE and have configured Apache Pig with it. Based on the instructions, I created a Hadoop file system and uploaded some data files into it. I have also written a Pig script, but I am unable to run it: it fails with an 'unable to launch' exception. Please suggest a workaround to fix this issue. Below is a snapshot of the same: pig-eclipse.png
09-06-2016
04:28 PM
The metrics in https://issues.apache.org/jira/browse/HDFS-3170 satisfy the use case.
09-06-2016
06:14 PM
1 Kudo
Kafka MirrorMaker is designed for the sole purpose of replicating Kafka topic data from one data center to another.

Pros:
1. Simple to set up.
2. Uses Kafka's producer and consumer APIs, which makes it easier to enable wire encryption (SSL) and Kerberos (NiFi can offer the same, as both use the same API).
3. Designed to replicate all topics from the source to the target data center. Users can also pick and choose specific topics if they desire.

Cons:
1. Hard to monitor. Because MirrorMaker is just a JVM process, provisioning and monitoring it can be hard. One needs to watch the metrics coming from MirrorMaker to see whether there is lag or no data being produced into the target cluster.
2. MirrorMaker won't preserve the origin Kafka topic offsets in the target cluster (NiFi or any other solution will run into the same limitation), since writing a new message into the target data center creates a new offset.
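As an illustration, a minimal MirrorMaker setup needs two config files: a consumer config pointing at the source cluster and a producer config pointing at the target. The hostnames and group id below are assumptions, not values from this thread:

```properties
# consumer.properties — where MirrorMaker reads from (source data center; host is an example)
bootstrap.servers=source-dc-broker1:9092
group.id=mirror-maker-group

# producer.properties — where MirrorMaker writes to (target data center; host is an example)
bootstrap.servers=target-dc-broker1:9092
```

You would then launch it with `bin/kafka-mirror-maker.sh --consumer.config consumer.properties --producer.config producer.properties --whitelist '.*'`, where `--whitelist '.*'` replicates every topic; a narrower regex picks specific topics.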
01-16-2018
03:14 PM
With Ambari 2.6 I was able to do this. Find the steps here: http://www.v3bigdata.in/2018/01/hdp-multi-os-type-and-version-support.html
09-07-2016
07:35 PM
@Artem Ervits: Release Eng has copied the 2.3.6 Companion files to the correct location.
01-07-2017
05:38 AM
Thank you very much
08-20-2016
02:20 PM
Refer to the manual installation doc for hdp-select to fix your symlink issues: https://docs.hortonworks.com/HDPDocuments/Ambari-2.1.2.0/bk_upgrading_Ambari/content/_Run_HDP_Select_mamiu.html. When you hit a specific error, open a question; generally you shouldn't get these errors.
08-24-2016
08:21 PM
2 Kudos
For maintenance mode, you could always turn off maintenance mode and enable the service manually. Some components are best left off when you don't really need them, because some of them are real resource hogs. Generally, HDFS, MapReduce2, YARN, and Hive should be green; that keeps most things working.
08-08-2016
04:54 PM
2 Kudos
UPDATE: I'm happy to report that my patch for PIG-4931 was accepted and merged to trunk.

I was browsing through Apache Pig JIRAs and stumbled on https://issues.apache.org/jira/browse/PIG-4931, which asks for the Pig "IN" operator to be documented. It turns out Pig has had an IN operator since the days of 0.12, and no one had had a chance to document it yet. The associated JIRA is https://issues.apache.org/jira/browse/PIG-3269. In this short article I will go over the IN operator, and until I'm able to submit a patch to close out the ticket, this should serve as its documentation.

The IN operator in Pig works like it does in SQL: you provide a list of values, and the filter returns just the matching rows. It is a lot more concise than chaining OR conditions, for example:

```
a = LOAD '1.txt' USING PigStorage(',') AS (i:int);
b = FILTER a BY
    (i == 1) OR
    (i == 22) OR
    (i == 333) OR
    (i == 4444) OR
    (i == 55555);
```

You can rewrite the same statement as:

```
a = LOAD '1.txt' USING PigStorage(',') AS (i:int);
b = FILTER a BY i IN (1, 22, 333, 4444, 55555);
```

The best thing about it is that it accepts more than just integers: you can pass float, double, BigDecimal, BigInteger, bytearray, and String values. Let's review each one in detail, using this sample data:

```
grunt> fs -cat data;
1,Christine,Romero,Female
2,Sara,Hansen,Female
3,Albert,Rogers,Male
4,Kimberly,Morrison,Female
5,Eugene,Baker,Male
6,Ann,Alexander,Female
7,Kathleen,Reed,Female
8,Todd,Scott,Male
9,Sharon,Mccoy,Female
10,Evelyn,Rice,Female
```

Passing an integer to the IN clause:

```
A = load 'data' using PigStorage(',') AS (id:int, first:chararray, last:chararray, gender:chararray);
X = FILTER A BY id IN (4, 6);
dump X;
(4,Kimberly,Morrison,Female)
(6,Ann,Alexander,Female)
```

Passing a String:

```
A = load 'data' using PigStorage(',') AS (id:chararray, first:chararray, last:chararray, gender:chararray);
X = FILTER A BY id IN ('2', '4', '8');
dump X;
(2,Sara,Hansen,Female)
(4,Kimberly,Morrison,Female)
(8,Todd,Scott,Male)
```

Passing a ByteArray:

```
A = load 'data' using PigStorage(',') AS (id:bytearray, first:chararray, last:chararray, gender:chararray);
X = FILTER A BY id IN ('1', '9');
dump X;
(1,Christine,Romero,Female)
(9,Sharon,Mccoy,Female)
```

Passing a BigInteger, and using the NOT operator to negate the list of values in the IN clause:

```
A = load 'data' using PigStorage(',') AS (id:biginteger, first:chararray, last:chararray, gender:chararray);
X = FILTER A BY NOT id IN (1, 3, 5, 7, 9);
dump X;
(2,Sara,Hansen,Female)
(4,Kimberly,Morrison,Female)
(6,Ann,Alexander,Female)
(8,Todd,Scott,Male)
(10,Evelyn,Rice,Female)
```

Now, I understand that most cool kids these days are using Spark, but I strongly believe Pig has a place in any Big Data stack, and its livelihood depends on comprehensive and complete documentation. Happy learning!
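For readers who do reach for Spark or plain Python instead, the same IN-style filter is a membership test; a minimal Python sketch over the sample rows above (the tuple layout mirrors the `data` file, and the variable names are assumptions):

```python
# Rows matching the 'data' file used in the Pig examples above (subset shown).
rows = [
    (1, "Christine", "Romero", "Female"),
    (2, "Sara", "Hansen", "Female"),
    (4, "Kimberly", "Morrison", "Female"),
    (6, "Ann", "Alexander", "Female"),
]

# Equivalent of: X = FILTER A BY id IN (4, 6);
wanted = {4, 6}
filtered = [r for r in rows if r[0] in wanted]

# Equivalent of: X = FILTER A BY NOT id IN (1, 3, 5, 7, 9);
negated = [r for r in rows if r[0] not in {1, 3, 5, 7, 9}]

print(filtered)  # rows with id 4 and 6
print(negated)   # rows with even ids (2, 4, 6 in this subset)
```

Spark's DataFrame API offers the same idea via its `isin` column method, so the concept carries over directly.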