Member since: 09-16-2017
Posts: 20
Kudos Received: 1
Solutions: 2
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 2263 | 12-01-2017 05:05 PM
 | 4238 | 12-01-2017 04:59 PM
12-01-2017
04:59 PM
All - just an update. I was able to get help resolving this on StackOverflow. See the post here: https://stackoverflow.com/questions/47399391/using-nifi-to-pull-elasticsearch-indexes?noredirect=1#comment82139433_47399391
11-03-2017
06:20 AM
@Charles Bradbury, glad that the issue is resolved. Could you kindly accept the answer so that community users can quickly find it?
10-31-2017
09:56 PM
Hi @Charles Bradbury, for your information, Spark 2.2 is supported in HDP 2.6.3, announced today: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.3/bk_release-notes/content/comp_versions.html
12-01-2017
05:05 PM
All - just an update. The ES-Hadoop connector, as you might expect, mostly benefits Elasticsearch, not so much Spark or Hadoop. It lets me connect to the Elasticsearch cluster with spark-shell or PySpark, which is great for ad-hoc queries; for long-term data movement, however, use Apache NiFi. The setup, if you are interested, can be found on StackOverflow, where I got some great help: https://stackoverflow.com/questions/47399391/using-nifi-to-pull-elasticsearch-indexes?noredirect=1#comment82139433_47399391

One issue I ran into: we have SSL set up on Elasticsearch, and even though I was referencing that cert (I had to convert it from PEM to JKS, since Hadoop/Spark only understand JKS), it wasn't working. After working with Elasticsearch support, they had me add the cert to the cacerts file in my Java installation, and everything worked after that. I had to do this on each box in the cluster for Spark/Hadoop jobs that ran across the cluster; in stand-alone mode, the single box was fine. Either way, this can save you a lot of trouble: just add your Elasticsearch cert to cacerts using keytool, as in the sketch below.
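For reference, a minimal sketch of that keytool import. The cert path and alias are hypothetical, and the cacerts location and "changeit" store password are the Java 8 defaults; adjust all of these for your environment:

```
# Hypothetical cert path and alias; cacerts location/password are Java 8 defaults.
sudo keytool -importcert \
  -alias elasticsearch-ca \
  -file /path/to/elasticsearch-ca.pem \
  -keystore "$JAVA_HOME/jre/lib/security/cacerts" \
  -storepass changeit \
  -noprompt

# Repeat on every node if jobs run across the cluster;
# a single node is enough for stand-alone runs.
```

keytool accepts the PEM file directly here, so no separate PEM-to-JKS conversion is needed for this step.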
10-17-2017
08:47 AM
That seems more reasonable. But if you want to reduce spark.port.maxRetries to 250, then you should keep a spacing of 250 ports. And I think there was a typo: 40000-40031 covers only 32 ports, so if you are using maxRetries of 32 the range should extend to 40032, since the base port plus 32 retries needs 33 ports. And again, the executor ports will depend on which mode you are running Spark in (standalone vs. cluster vs. client). See the sketch below.
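As an illustration, a hedged spark-submit sketch. The port values and the application file are hypothetical; the relevant properties, spark.port.maxRetries and spark.blockManager.port, are standard Spark configuration:

```
# With maxRetries=32, Spark may probe ports 40000 through 40032,
# i.e. 33 ports in total (the base port plus 32 retries).
spark-submit \
  --conf spark.port.maxRetries=32 \
  --conf spark.blockManager.port=40000 \
  your_app.py   # hypothetical application
```

Whatever values you pick, make sure the firewall range is at least maxRetries + 1 ports wide from each base port.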
09-17-2017
07:32 AM
@Charles Bradbury It must be frustrating; a simple diagnostic isn't easy. From a quick look I saw an incompatibility:

2017-09-15 16:58:19,901 - Stack Feature Version Info: Cluster Stack=2.5, Cluster Current Version=None, Command Stack=None, Command Version=None -> 2.5
2017-09-15 16:58:19,933 - Using hadoop conf dir: /usr/hdp/current/hadoop-client/conf
2017-09-15 16:58:19,953 - checked_call['rpm -q --queryformat '%{version}-%{release}' hdp-select | sed -e 's/\.el[0-9]//g''] {'stderr': -1}
2017-09-15 16:58:19,985 - checked_call returned (0, '2.6.0.3-8', '')

There is a conflict between stack 2.5 and 2.6.0.3-8. Can you validate your hdp.repo in /etc/yum.repos.d/*? Make sure you have only the repo you intend to install, in this case I think 2.6, then run:

yum clean all
yum repolist

Please revert.
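A minimal sketch consolidating those diagnostic steps (the grep filter and repo file names are assumptions; they vary by install):

```
# Which HDP build is actually installed? (matches the Ambari log above)
rpm -q --queryformat '%{version}-%{release}\n' hdp-select

# List the repo files; there should be only one HDP repo,
# matching the version you intend to install (2.6 here).
ls /etc/yum.repos.d/ | grep -i hdp   # hypothetical filter; names vary

# After removing or fixing stray repo files, refresh yum's metadata:
yum clean all
yum repolist
```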