Member since: 09-24-2015
Posts: 816
Kudos Received: 488
Solutions: 189
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 3124 | 12-25-2018 10:42 PM |
| | 14041 | 10-09-2018 03:52 AM |
| | 4701 | 02-23-2018 11:46 PM |
| | 2420 | 09-02-2017 01:49 AM |
| | 2838 | 06-21-2017 12:06 AM |
09-02-2016
06:39 AM
Okay, after "res" insert this:
res1 = foreach (group res BY word) {
  tweets = foreach res generate id, user_id, created_time, created_date, text;
  generate group as pattern, tweets;
}
The inner foreach is there to get rid of the "word" attached to each output record. Try "res2 = group res BY word" to see the difference. And please accept & up-vote the answer.
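If it helps to picture the difference, here is a rough Python sketch of the two groupings. The sample rows are made up and carry only three fields (word, id, text) instead of the full tweet record, and the function names are just for illustration; it models the idea, not how Pig actually executes it.

```python
from collections import defaultdict

# Hypothetical rows of "res": (word, id, text). The real relation has more fields.
res = [
    ("hadoop", 1, "Learning hadoop today"),
    ("hadoop", 2, "pig and hadoop together"),
    ("pig", 2, "pig and hadoop together"),
]

def group_plain(rows):
    """Like 'group res BY word': each bag keeps the full rows, word included."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[0]].append(row)
    return dict(grouped)

def group_without_word(rows):
    """Like the nested foreach: the word becomes the key only, not part of each row."""
    grouped = defaultdict(list)
    for word, tweet_id, text in rows:
        grouped[word].append((tweet_id, text))
    return dict(grouped)

print(group_plain(res))
print(group_without_word(res))
```

In the first result every tuple in the bag still repeats the word; in the second the word appears once, as the group key.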
09-02-2016
05:27 AM
Can you insert "describe twitter; describe c;" after the CROSS statement and check the output? If you loaded "twitter" as in your post, twitter::text should be there...
09-02-2016
04:27 AM
For command line input, as Michael said, check your LANG setting. It
should end in "UTF-8", but it doesn't have to be "en_US"; for example, for
Japanese it's "ja_JP.UTF-8". As for the Hive view, CJKV languages don't
work, and I'm afraid European umlauts don't work either. The fix is
coming in Ambari-2.4.
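For a quick check of the LANG value, something like the following Python sketch could be used. The helper name lang_is_utf8 is made up for this example, and it only inspects the string; it does not verify that the locale is actually installed on the machine.

```python
import os

def lang_is_utf8(lang):
    """Return True if a LANG value such as 'ja_JP.UTF-8' names a UTF-8 codeset."""
    # Drop an optional modifier such as '@euro' before looking at the codeset
    base = lang.partition("@")[0]
    # The codeset is whatever follows the dot; no dot means no codeset given
    codeset = base.rpartition(".")[2] if "." in base else ""
    # Accept both the 'UTF-8' and 'utf8' spellings
    return codeset.replace("-", "").lower() == "utf8"

# Check the current environment (an unset LANG counts as not UTF-8)
print(lang_is_utf8(os.environ.get("LANG", "")))
```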
09-01-2016
03:09 PM
You omitted $0. Use: aa = foreach avg_rate generate $0 as var1, ... and then try to store using HCatalog, per your original question.
09-01-2016
12:31 PM
1 Kudo
Okay, this may not be optimal, but it should work: upload words.txt to a directory on HDFS and do this:
twitter = LOAD 'Twitter.json' .... -- Like in your post
words = LOAD '/user/john/words' as word:chararray;
c = CROSS words, twitter;
res = FILTER c BY (twitter::text MATCHES CONCAT(CONCAT('.*',words::word),'.*'));
And finally dump or store "res" somewhere.
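To make the logic concrete, here is a small Python sketch of the same CROSS plus FILTER idea on made-up data. The sample words and tweets and the function name are hypothetical; the sketch only mirrors what the Pig statements compute and says nothing about how Pig runs them on a cluster.

```python
import re
from itertools import product

# Hypothetical sample data standing in for words.txt and Twitter.json
words = ["hadoop", "pig"]
tweets = [
    {"id": 1, "text": "Learning hadoop today"},
    {"id": 2, "text": "pig and hadoop together"},
    {"id": 3, "text": "nothing relevant"},
]

def cross_and_filter(words, tweets):
    """Pair every word with every tweet (CROSS), then keep the pairs whose
    tweet text matches '.*<word>.*' (FILTER ... MATCHES)."""
    res = []
    for word, tweet in product(words, tweets):
        # MATCHES has to match the whole string, hence fullmatch and the '.*' padding
        if re.fullmatch(".*" + word + ".*", tweet["text"]):
            res.append((word, tweet["id"]))
    return res

print(cross_and_filter(words, tweets))
```

Note that the cross product grows as (number of words) x (number of tweets), which is why this approach may not be optimal for large inputs.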
09-01-2016
07:44 AM
3 Kudos
You can try a direct import using com.ibm.spss.hive.serde2.xml.XmlSerDe. Check https://community.hortonworks.com/content/kbentry/972/hive-and-xml-pasring.html and https://community.hortonworks.com/questions/40979/hive-xml-parising-null-value-returned.html for examples.
09-01-2016
07:02 AM
Give names to the fields of aa, for example (with "as var1" and "as var2" added):
aa = foreach avg_rate generate $0 as var1, (case when detik_rating > 0 then 'positive' when detik_rating < 0 then 'negative' when detik_rating == 0 then 'neutral' else 'null' end) as var2;
08-31-2016
11:33 AM
It seems something is wrong with HDFS. Can you try "hdfs dfs -ls /"? If it doesn't work, go to HDFS->Configs and check "Name node directories" and "Data node directories". Remove any "unwelcome" entries from there, like "/tmp" (Ambari suggests volumes on all mount points). Then make sure the remaining directories are owned by user "hdfs". [NN dirs must exist on the NN and Secondary NN nodes; DN dirs must exist on all data nodes.] Finally, restart HDFS and check the NN and DN logs in /var/log/hadoop/hdfs.
08-31-2016
08:49 AM
The built-in JsonLoader has somewhat limited functionality and expects all entries (tweets) to have their elements in the same order as given in the Pig schema. So, first make sure this condition is satisfied. For example, your schema has "id:int", but the record reported in the warnings doesn't have an integer element at that position. Also, element names are not preserved: Pig takes them one by one in the order given in the input, so you might as well name them a, b, c, ... You may also wish to try the Elephant Bird JsonLoader, which has more advanced features.
08-29-2016
01:00 PM
You are getting a 404 (Not Found) error from yum. Log in to a node where ambari.repo is available and try "yum clean all", followed by "yum repolist" and "yum install ambari-agent". If any of these commands produces an error, check the base URL in ambari.repo and adjust it if needed, and also make sure that the node (and all other nodes) can access your web server running on lexbz1187.