Member since: 07-19-2016
Posts: 88
Kudos Received: 13
Solutions: 2
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 1871 | 11-03-2016 05:31 PM |
 | 5764 | 08-22-2016 06:53 PM |
08-25-2018
06:02 AM
@Dan Chaffelson The myid file is already present in the required path inside the Docker container. I doubt there is a need to give the Docker IPs in the connection string or in the host address properties of the ZooKeeper and NiFi properties files. As of now I have given my host IP with the two ports opened for the two Docker containers (8051, 8052), so I think it is trying to find ZooKeeper on the host only and not in the Docker containers. NiFi has an embedded ZK running individually on each Docker instance. A quorum only comes into the picture if individual ZK nodes are set up and we need to specify which one is the primary.
08-09-2018
01:38 PM
Thanks @Shu
05-24-2018
04:37 PM
@Shu I tried almost all the things mentioned above, but still no luck.
03-14-2018
08:47 AM
@Shu Appreciate your efforts.
07-06-2017
06:25 PM
You might consider using the HDP Sandbox on Azure: https://hortonworks.com/hadoop-tutorial/deploying-hortonworks-sandbox-on-microsoft-azure/
01-26-2017
02:38 PM
Hi @Vaibhav Kumar, if you want to create a bag matching the target table's structure, you can do the following:
a = load 'file.csv' using PigStorage(',') as (x:int, y:int, w:int);
b = foreach a generate x, y, (int)null as z, w;
describe b;
-- b: {x: int,y: int,z: int,w: int}
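For anyone who wants to check the expected row shape outside Pig, here is a minimal plain-Python sketch of the same idea: insert an empty placeholder in the position of the missing column. This is only an analogue, not Pig code; the helper name `add_null_column` and the sample tuples are hypothetical.

```python
def add_null_column(rows, position):
    """Return copies of the rows with None inserted at `position`.

    Mirrors the intent of `(int)null as z` in the Pig snippet: the new
    column exists in every row but carries no value.
    """
    return [row[:position] + (None,) + row[position:] for row in rows]


# Hypothetical (x, y, w) tuples standing in for the rows of file.csv.
a = [(1, 2, 3), (4, 5, 6)]

# Place the placeholder between y and w, giving rows shaped (x, y, z, w).
b = add_null_column(a, 2)
print(b)
```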
02-01-2017
03:21 AM
1 Kudo
@Vaibhav Kumar
The recommendations from my colleagues are valid: you have strings in the header row of your CSV documents. You can certainly filter by some known entity, but there's a more advanced version of the CSV Pig loader called CSVExcelStorage. It is part of the Piggybank library that comes bundled with HDP, hence the register command. You can pass different control parameters to it. The Mortar blog is an excellent source of information on working with Pig: http://help.mortardata.com/technologies/pig/csv
grunt> register /usr/hdp/current/pig-client/piggybank.jar;
grunt> a = load 'BJsales.csv' using org.apache.pig.piggybank.storage.CSVExcelStorage(',', 'NO_MULTILINE', 'NOCHANGE', 'SKIP_INPUT_HEADER') as (Num:int,time:int,BJsales:float);
grunt> describe a;
a: {Num: int,time: int,BJsales: float}
grunt> b = limit a 5;
grunt> dump b;
output:
(1,1,200.1)
(2,2,199.5)
(3,3,199.4)
(4,4,198.9)
(5,5,199.0)
Notice that I am not filtering any relation; I'm telling the loader to skip the header outright. It saves a few keystrokes and doesn't waste any cycles processing anything extra.
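To make the "skip the header at load time" idea concrete outside Pig, here is a minimal plain-Python sketch. It is an analogue of what the SKIP_INPUT_HEADER option achieves, not how CSVExcelStorage is implemented. The function name `load_skip_header` and the sample header text are assumptions; only the three-column (int, int, float) layout and the data values come from the post.

```python
import csv
import io

# Hypothetical sample shaped like BJsales.csv; the header text is assumed.
SAMPLE = '"","time","BJsales"\n1,1,200.1\n2,2,199.5\n3,3,199.4\n'


def load_skip_header(stream):
    """Read CSV rows, drop the first line, and cast to (int, int, float)."""
    reader = csv.reader(stream)
    # Discard the header row without inspecting it, so no filter step
    # is needed later on.
    next(reader, None)
    return [(int(num), int(time), float(sales)) for num, time, sales in reader]


rows = load_skip_header(io.StringIO(SAMPLE))
print(rows)
```

The header is dropped once while reading, rather than by comparing every row against a known header value afterwards.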
11-03-2016
06:09 PM
Glad that it got resolved.
10-08-2016
07:34 PM
I'm not sure about my data, because here even composite keys can produce duplicates, so going with an analytical function is not a good choice for me. Anyhow, the query is not taking that much time for me. I voted up your solution. Thanks for your response. 🙂 @Constantin Stanca
10-04-2016
07:34 AM
1 Kudo
Hi Vaibhav, please go through this link: https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation,+Cube,+Grouping+and+Rollup