Thanks. I've asked on StreamSets and they've pointed me at a couple of items that have since been implemented and will help (preventing all the pipelines from starting at service start, for example). Unfortunately, the most promising fix is in StreamSets 22.214.171.124. Given that we're running CDH 5.9, we are currently on version 126.96.36.199. Do you know if version 188.8.131.52 is in a later version of CDH? The real reason for the failures appears to be that StreamSets is starting before the HDFS / Hive services that we're using as data targets for the pipelines. If we could change the order of service startup somewhere, that would probably help...
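One way to work around the startup ordering, without changing anything in Cloudera Manager, might be to hold off starting the pipelines until the target services accept connections. The following is only a minimal sketch under stated assumptions: the function names are hypothetical, the hosts in the usage comment are placeholders, and the ports (8020 for the HDFS NameNode, 10000 for HiveServer2) are common defaults that may not match your cluster. An open TCP port also does not guarantee the service is fully ready (HDFS may still be in safe mode, for instance).

```python
import socket
import time


def wait_for_port(host, port, timeout_s=300.0, interval_s=5.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # A successful connect only shows that something is listening.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            pass  # refused, unreachable or timed out: retry until the deadline
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


def wait_for_all(targets, timeout_s=300.0, interval_s=5.0):
    """Check each (host, port) in turn; each target gets its own timeout.

    Stops at the first target that fails to come up.
    """
    return all(wait_for_port(h, p, timeout_s, interval_s) for h, p in targets)


# Hypothetical usage (host names and ports are assumptions):
# targets = [("namenode.example.com", 8020), ("hiveserver2.example.com", 10000)]
# if wait_for_all(targets):
#     ...start the pipelines via the StreamSets API...
```

A wrapper like this could run before whatever starts the pipelines, so that they are only started once their data targets are reachable.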
Hello! Not sure if this is the right place, but... We use StreamSets to load data into a series of databases within our HDFS cluster. However, each time the cluster is restarted, the pipelines all drop into the "START_ERROR" state when StreamSets starts, I assume because it's trying to start multiple pipelines on a single StreamSets host at the same time. Is there a way of getting Cloudera to run a script before it stops the StreamSets service? We already have the script, as we use it to stop the pipelines ahead of doing any batch processing on the data. Currently we run it manually (it is just a series of curl calls into the StreamSets API). We are currently running CDH 5.9.0 with Cloudera Manager 5.9. Any advice would be gratefully received. Thanks, Ben
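For readers without such a script, the series of curl calls described above could be sketched in Python roughly as follows. This is not the poster's script. It assumes the Data Collector REST endpoint `POST /rest/v1/pipeline/{id}/stop`, the `X-Requested-By: sdc` header, and HTTP basic authentication. The base URL, pipeline IDs and credentials are placeholders, and the function names are hypothetical. Check the endpoint against the REST API documentation for your StreamSets version before relying on it.

```python
import base64
import urllib.parse
import urllib.request


def build_stop_request(base_url, pipeline_id, user=None, password=None):
    """Build (but do not send) a POST request that asks SDC to stop one pipeline."""
    # Assumed endpoint layout: /rest/v1/pipeline/<id>/stop
    path = "/rest/v1/pipeline/" + urllib.parse.quote(pipeline_id, safe="") + "/stop"
    req = urllib.request.Request(base_url.rstrip("/") + path, data=b"", method="POST")
    # Assumption: SDC expects this header on state-changing REST calls.
    req.add_header("X-Requested-By", "sdc")
    if user is not None:
        token = base64.b64encode(f"{user}:{password}".encode()).decode()
        req.add_header("Authorization", "Basic " + token)
    return req


def stop_pipelines(base_url, pipeline_ids, user=None, password=None):
    """Send a stop request per pipeline; map each ID to an HTTP status or an error."""
    results = {}
    for pid in pipeline_ids:
        req = build_stop_request(base_url, pid, user, password)
        try:
            with urllib.request.urlopen(req, timeout=30) as resp:
                results[pid] = resp.status
        except OSError as exc:  # URLError and HTTPError are OSError subclasses
            results[pid] = exc
    return results


# Hypothetical usage (URL, IDs and credentials are placeholders):
# stop_pipelines("http://sdc-host.example.com:18630", ["pipelineA", "pipelineB"],
#                user="admin", password="changeme")
```

Collecting a result per pipeline instead of aborting on the first failure means one unreachable or already-stopped pipeline does not prevent the rest from being stopped.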