Member since 04-11-2016
471 Posts
325 Kudos Received
118 Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2070 | 03-09-2018 05:31 PM |
| | 2632 | 03-07-2018 09:45 AM |
| | 2529 | 03-07-2018 09:31 AM |
| | 4388 | 03-03-2018 01:37 PM |
| | 2468 | 10-17-2017 02:15 PM |
05-17-2016
06:38 AM
1 Kudo
Hi @Michael Strasser, I am not sure the approach you are proposing is the best one: if consecutive flow files enter your process group and each file is split into individual rows, you may have difficulty keeping things clear, since you will have flow files representing rows from different files. It is certainly doable, though. What I would recommend, at first glance, is to use the ExecuteScript processor and code something in Groovy (for example). This way you don't need to split the file: you keep the entire file and can easily reject the whole file if the value is not equal to the row count. You will find a useful post on how to use this processor here: http://funnifi.blogspot.fr/2016/02/executescript-processor-hello-world.html Let me know if you need additional details.
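To make the whole-file check concrete, here is a minimal sketch in plain Python (not Groovy, and without the NiFi scripting API). It assumes, purely for illustration, that the first line of the file holds the expected row count and that every following non-empty line is a data row; the real layout of your file may differ. The names `row_count_matches` and `route` are hypothetical, and the returned strings only mimic the idea of routing to success or failure.

```python
def row_count_matches(text):
    """Return True when the declared count equals the number of data rows.

    Assumption (not from the original question): line 1 holds the expected
    row count as an integer, and each later non-empty line is one data row.
    """
    lines = text.splitlines()
    if not lines:
        return False  # an empty file cannot declare a count
    try:
        expected = int(lines[0].strip())
    except ValueError:
        return False  # first line is not a number: treat as invalid
    actual = sum(1 for line in lines[1:] if line.strip())
    return actual == expected


def route(text):
    """Hypothetical helper: name the outcome for the whole, unsplit file."""
    return "success" if row_count_matches(text) else "failure"


if __name__ == "__main__":
    print(route("2\na,b\nc,d\n"))  # declared 2 rows, found 2
    print(route("3\na,b\nc,d\n"))  # declared 3 rows, found 2
```

Inside ExecuteScript, the same comparison would run on the content of the flow file, and the whole flow file would be transferred to failure when the counts differ, so nothing has to be split or reassembled.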
05-04-2016
05:04 PM
Honestly... I don't know 🙂 I'd suggest digging into the logs (Ambari, MapReduce, YARN, Hive, etc.) to find an explanation. But maybe someone else on HCC will have an idea 🙂
05-04-2016
04:55 PM
1 Kudo
Hi Henry, you may be interested in this article: http://www.wdong.org/wordpress/blog/2015/01/08/spark-on-yarn-where-have-all-my-memory-gone/ The link seems to be dead at the moment, but here is a cached version: http://m.blog.csdn.net/article/details?id=50387104
05-04-2016
11:38 AM
@Muhammed Yetginbal One option would be to test launching the Pig script from the console (https://wiki.apache.org/pig/RunPig). Using the client (grunt) you could try to execute the same commands. Maybe it will give you more details about what is happening when "running".
05-04-2016
08:48 AM
Are you running the Pig script from the Pig view in Ambari, as described in the tutorial? If so, are you able to get the logs from the Pig view?
05-04-2016
07:27 AM
1 Kudo
@Muhammed Yetginbal Considering your last comment and the information provided, I had a look at: https://cwiki.apache.org/confluence/display/Hive/HCatalog+LoadStore#HCatalogLoadStore-HCatStorer Can you confirm that the prerequisites are met in your case? Does the table 'riskFactor' exist (with the correct schema)? Also, is Hive up and running? Are you running your script where Hive is installed? Are you in a clustered environment?
05-04-2016
06:20 AM
6 Kudos
Hi Andrew, the recommendation is to start with an unsecured cluster and add layers of protection one by one, so that each can be benchmarked. The overhead will depend on how the cluster is used. The numbers I have are the following:
- Wire encryption inside the cluster: 2x overhead.
- Data encryption (Ranger KMS): 15%-20% overhead (but I guess it highly depends on what you are encrypting; I am not sure every single file must be encrypted).
- Kerberos, Knox and Ranger: not significant, and it depends on the installation and the use (network performance to the KDC, number of Knox gateways, etc.). Regarding Ranger, since rules are "copied" locally for each service, the overhead is not significant.

Hope that helps.
05-03-2016
07:59 PM
@Muhammed Yetginbal The logs look OK, so it is difficult to help. As suggested, I'd recommend having a look at the /var/log/... directories/files for error messages from the time your Pig job was running.
05-02-2016
03:08 PM
As I said, GetTwitter does not accept incoming relationships, so you won't be able to connect ListenHTTP to GetTwitter. Your best option is to "build" your own flow to request Twitter data (example: https://pierrevillard.com/2016/04/12/oauth-1-0a-with-apache-nifi-twitter-api-example/), or to write your own processor using the existing one as an example.
05-02-2016
09:11 AM
Yes. What do you see in stderr, for example? You could also access the logs directly on your nodes in /var/log/...