Member since: 09-15-2015
Posts: 116
Kudos Received: 141
Solutions: 40
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1751 | 02-05-2018 04:53 PM |
| | 2264 | 10-16-2017 09:46 AM |
| | 1983 | 07-04-2017 05:52 PM |
| | 2957 | 04-17-2017 06:44 PM |
| | 2171 | 12-30-2016 11:32 AM |
09-10-2016
04:51 PM
What is a task in this context? Reading a line?
07-27-2016
07:13 AM
@Sindhu I tried using the sqoop import command and it gives me the error below.
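For reference, a generic sqoop import invocation looks roughly like the sketch below; the JDBC URL, credentials, table name and target directory are placeholders, not the values from the original post.

```sh
# Minimal sketch of a Sqoop import; all connection details are placeholders.
sqoop import \
  --connect jdbc:mysql://dbhost:3306/mydb \
  --username myuser -P \
  --table mytable \
  --target-dir /user/someuser/mytable
```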
06-25-2018
02:29 PM
If I create a new template, NiFi creates a flow.xml.gz and everything works fine. However, if I replace that flow.xml.gz with an older flow.xml.gz (a backup taken earlier on the same HDF 2.x cluster node), the NiFi UI does not come up after login and fails with the error "com.sun.jersey.api.client.ClientHandlerException: java.net.SocketTimeoutException: Read timed out". I have tried all the parameter tuning described by "Matt Clarke" in another post, but with no results. If I move the old file out and put the new flow.xml.gz back, NiFi works fine again. Please let me know if anyone has faced this issue, and the probable reason and workaround. Thanks, Suman
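For context, the file swap being described amounts to something like the sketch below; the HDF install path and the backup location are assumptions, not taken from the post.

```sh
# Rough sketch of swapping in an older flow.xml.gz; all paths are assumptions.
NIFI_HOME=/usr/hdf/current/nifi

"$NIFI_HOME/bin/nifi.sh" stop
cp "$NIFI_HOME/conf/flow.xml.gz" "$NIFI_HOME/conf/flow.xml.gz.current"   # keep the working flow aside
cp /path/to/backup/flow.xml.gz "$NIFI_HOME/conf/flow.xml.gz"             # restore the older backup
"$NIFI_HOME/bin/nifi.sh" start
```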
07-25-2016
01:45 AM
Thanks for your reply. In your scenario, the flow file needs to be split and then merged.
The Admin Guide says NiFi keeps FlowFile information in memory (the JVM), but during surges of incoming data NiFi "swaps" the FlowFile information to disk temporarily. I wonder whether the split and merge procedure will cost additional performance. If so, I think it is better to route or update each line of content within one flow file.
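For reference, the point at which NiFi starts swapping queued FlowFiles to disk is controlled per connection in nifi.properties; a minimal sketch of checking it is below (the install path is an assumption, and 20000 is the commonly cited default, so verify against your own configuration).

```sh
# Sketch only: the install path is an assumption; check your own nifi.properties.
grep "nifi.queue.swap.threshold" /usr/hdf/current/nifi/conf/nifi.properties
# typical output:
# nifi.queue.swap.threshold=20000   (FlowFiles queued on a connection before swapping to disk)
```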
06-24-2016
01:37 PM
Thanks for putting this out, Simon! We definitely need to make this something people don't need to think about. Ideas welcome.
05-26-2016
02:34 PM
Wonderful! Now it's OK, Simon.
05-12-2016
04:16 PM
Thanks @Massimiliano Nigrelli for the information, it is helpful.
02-20-2017
01:26 PM
Well,couldn't make it work... I tried out several options : I successfully got drivers logs in dedicated log file when using following option with my "spark-submit" command line : --driver-java-options "-Dlog4j.configuration=file:///local/home/.../log4j.properties" Couldn't obtain the same with your suggestion : --conf "spark.driver.extraJavaOptions=... For executors' logs, I gave a try with your suggestion as well : --conf "spark.executor.extraJavaOptions=... but failed to notice any change to logging mechanism. I guess this is a classpath issue, but couldn't find any relevant example in the documentation 😞 If I use --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j.properties", where should I put this log4j.properties file ? in the "root" folder of the fat jar that I pass to spark-submit command ? somewhere else ? Note that I also tried with : --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=file:///local/home/.../log4j.properties" to point to an external file (not in the jar file) but it failed too... Any idea about something wrong in my configuration ? Thanks for your help
11-10-2015
11:27 PM
1 Kudo
An older version of JPMML is BSD-3 licensed. It supports PMML 3.0, 3.1, 3.2, 4.0 and 4.1.
11-05-2015
11:27 PM
1 Kudo
@Simon Elliston Ball is right, there's a huge variety of options for NLP because there are many niches for natural language processing. Keep in mind that NLP libraries rarely solve business problems directly. Rather, they give you the tools to build a solution. Often this means segmenting free text into chunks suitable for analysis (e.g. sentence disambiguation), annotating free text (e.g. part-of-speech tagging), or converting free text to a more structured form (e.g. vectorization). All of these are useful tools for processing text, but they are insufficient by themselves. They help you convert free, unstructured text into a form suitable as input to a normal machine learning or analysis pipeline (i.e. classification, etc.). The one exception I can think of is sentiment analysis... that is a properly valuable analytic in and of itself. Also, keep in mind that the licenses for some of these libraries are not as permissive as Apache's (e.g. CoreNLP is GPL, with the option to purchase a license for commercial use).
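As a concrete illustration of the "tools, not solutions" point, CoreNLP's stock command-line entry point segments and tags free text; the sketch below assumes it is run from an unpacked CoreNLP distribution and that input.txt is your own file (and remember the GPL license noted above).

```sh
# Sketch only: run from the unpacked CoreNLP directory; input.txt is a placeholder.
java -cp "*" -Xmx2g edu.stanford.nlp.pipeline.StanfordCoreNLP \
  -annotators tokenize,ssplit,pos \
  -file input.txt
# Produces an annotated output file (e.g. input.txt.xml) with sentence splits and POS tags,
# which is still only the input to a downstream classification or analysis pipeline.
```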