Member since
07-11-2018
2
Posts
0
Kudos Received
0
Solutions
07-11-2018
11:22 PM
Thanks - I've asked on Streamsets and they've pointed me at a couple of items that have been implemented that will help (preventing the pipelines all starting at service start, for example). Unfortunately, the most promising fix is in Streamsets 3.0.0.0. Given we're running CDH 5.9, we are on version 2.3.0.0 currently. Do you know if version 3.0.0.0 is included in a later version of CDH? The real reason for the failures appears to be that Streamsets is starting before the HDFS / Hive services that we're using as data targets for the pipelines. If we could change the order of service startup somewhere, that would probably help...
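Until the startup order can be changed, one possible workaround is to hold off starting the pipelines until the target services accept connections. The following is only a rough sketch of that idea in Python, not something Streamsets or Cloudera Manager provides: `wait_for_port` and `wait_for_targets` are hypothetical helper names, and it assumes a plain TCP connect to the NameNode and HiveServer2 ports is a good enough readiness signal, which may not hold for every cluster.

```python
import socket
import time


def wait_for_port(host, port, timeout_s=300.0, interval_s=5.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # A successful connect only shows the port is open,
            # not that the service behind it is fully initialised.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)


def wait_for_targets(targets, timeout_s=300.0, interval_s=5.0):
    """Check each (host, port) pair in turn; return the pairs that never came up."""
    return [
        (host, port)
        for host, port in targets
        if not wait_for_port(host, port, timeout_s, interval_s)
    ]
```

Usage would be something like `wait_for_targets([("namenode.example.com", 8020), ("hiveserver2.example.com", 10000)])`, where the hostnames are placeholders and 8020 / 10000 are assumed to be the NameNode and HiveServer2 ports on the cluster. An empty list back means every target answered, so the pipelines could then be started one at a time.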
07-11-2018
09:07 AM
Hello! Not sure if this is the right place, but...

We use Streamsets to load data into a series of databases within our HDFS cluster. However, each time the cluster is restarted, the pipelines all drop into the "START_ERROR" state when Streamsets starts. I assume this is because it's trying to start multiple pipelines on a single Streamsets host at the same time.

Is there a way of getting Cloudera to run a script before it stops the Streamsets service? We already have the script, as we use it to stop the pipelines ahead of doing any batch processing on the data. Currently we run it manually (it is just a series of curl calls into the Streamsets API).

We are running CDH 5.9.0 with Cloudera Manager 5.9 currently. Any advice would be gratefully received.

Thanks,
Ben
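For anyone wondering what the stop script does, here is a rough Python equivalent of the curl calls. It is a sketch rather than our actual script: the function names are made up for illustration, and it assumes the Data Collector exposes `POST /rest/v1/pipeline/<pipelineId>/stop`, accepts HTTP basic authentication, and expects an `X-Requested-By` header. Please check the RESTful API page of your own Streamsets version before relying on any of that.

```python
import base64
import urllib.error
import urllib.parse
import urllib.request


def build_stop_request(base_url, pipeline_id, user, password):
    """Build (but do not send) the POST request that asks SDC to stop one pipeline."""
    # Endpoint path is an assumption about the SDC REST API; verify it for your version.
    quoted_id = urllib.parse.quote(pipeline_id, safe="")
    url = f"{base_url.rstrip('/')}/rest/v1/pipeline/{quoted_id}/stop"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    headers = {
        "Authorization": f"Basic {token}",  # assumes basic auth is enabled
        "X-Requested-By": "sdc",            # assumed to be required for POST calls
    }
    return urllib.request.Request(url, data=b"", headers=headers, method="POST")


def stop_all(base_url, pipeline_ids, user, password, opener=urllib.request.urlopen):
    """Send a stop request per pipeline; return {pipeline_id: HTTP status code}."""
    results = {}
    for pipeline_id in pipeline_ids:
        request = build_stop_request(base_url, pipeline_id, user, password)
        try:
            with opener(request) as response:
                results[pipeline_id] = response.status
        except urllib.error.HTTPError as err:
            # Record the failure and carry on with the remaining pipelines.
            results[pipeline_id] = err.code
    return results
```

The `opener` parameter is only there so the request building can be exercised without a live Data Collector. Something would still need to call `stop_all` before the service stops, which is the part I am asking about.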
Labels:
- Cloudera Manager