Member since 11-17-2021
1117 Posts
253 Kudos Received
28 Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 224 | 10-16-2025 02:45 PM |
| | 477 | 10-06-2025 01:01 PM |
| | 442 | 09-24-2025 01:51 PM |
| | 400 | 08-04-2025 04:17 PM |
| | 480 | 06-03-2025 11:02 AM |
06-16-2023
05:09 AM
1 Kudo
Here are some highlights from the month of May
COMING NEXT!
Streaming Data Pipeline Development
Check out the FY24 Cloudera Meetup Events Calendar for upcoming & past event details!
182 new support questions
14 new community articles
472 new members
| Rank | Community Article | Author | Components/Labels |
|---|---|---|---|
| #1 | Get data from Oracle by Apache NiFi, then save to Hive Parquet External Table | Zhen Zeng @zzeng | Apache NiFi |
| #2 | Spark3 Integration Examples | Ranga Reddy @RangaReddy | Apache HBase, Apache Hive, Apache Iceberg, Apache Kafka, Apache Kudu, Apache Ozone, Apache Phoenix, Apache Spark, Cloudera Data Platform (CDP) |
| #3 | Operationalize NiFi data flows with Cloudera DataFlow | Steven Matison @steven-matison | Apache Spark, Cloudera Data Platform (CDP) |
| #4 | CDW - RAZ Enabled : How to add S3 Bucket in Two Steps (That was EASY) | Ryan Cicak @RyanCicak | Apache Ranger, Cloudera Data Warehouse (CDW) |
| #5 | Reading/Writing using R to an external table within CML (Two steps - THAT WAS EASY) | Ryan Cicak @RyanCicak | Cloudera Data Science Workbench (CDSW), Cloudera Machine Learning (CML) |
We would like to recognize the below community members and employees for their efforts over the last month to provide community solutions.
See all our top participants on the Top Solution Authors leaderboard, and all the other leaderboards on our Leaderboards and Badges page.
@steven-matison @MattWho @rki_ @Bharati @smdas @cotopaul @SAMSAL @SandyClouds @BrianChan @banshidhar_saho
Share your expertise and answer some of the below open questions. Also, be sure to bookmark the unanswered question page to find additional open questions.
| Unanswered Community Post | Components/Labels |
|---|---|
| where can I download the source code or jars of HDP3.1.5 | Hortonworks Data Platform (HDP) |
| YARN Resource Manager API is failing post kerberization | Cloudera Manager, Kerberos |
| Hue Query Processor - Change Log Location | Cloudera Data Platform Private Cloud (CDP-Private), Cloudera Hue |
| Error HDFS Service Check | Hortonworks Data Platform (HDP) |
| Count of success and failed records | Apache NiFi |
06-14-2023
10:00 PM
What are you using Vault for specifically? Retrieving secrets for some other purpose? Any time the built-in processors don't or can't do what I need, I've found scripted processors to be ideal.
06-14-2023
02:01 PM
Hi Yuexin, you have been very helpful. Unfortunately, if I used "Dynamic Queue Scheduling" in CDP 717 at the moment, I would have no guarantee that Cloudera support could resolve any problems. In fact, it is not recommended for production use. Thank you very much.
06-14-2023
12:33 PM
I would do this in a single step with an InvokeScriptedProcessor and the following Groovy code:

```groovy
import groovy.json.JsonOutput
import groovy.json.JsonSlurper
import java.nio.charset.StandardCharsets
import org.apache.commons.io.IOUtils

class GroovyProcessor implements Processor {
    // Number of FlowFiles pulled from the incoming queue per execution
    PropertyDescriptor BATCH_SIZE = new PropertyDescriptor.Builder()
        .name("BATCH_SIZE")
        .displayName("Batch Size")
        .description("The number of incoming FlowFiles to process in a single execution of this processor.")
        .required(true)
        .defaultValue("1000")
        .addValidator(StandardValidators.POSITIVE_INTEGER_VALIDATOR)
        .build()

    Relationship REL_SUCCESS = new Relationship.Builder()
        .name("success")
        .description('FlowFiles that were successfully processed are routed here')
        .build()

    Relationship REL_FAILURE = new Relationship.Builder()
        .name("failure")
        .description('FlowFiles that were not successfully processed are routed here')
        .build()

    ComponentLog log

    void initialize(ProcessorInitializationContext context) { log = context.logger }
    Set<Relationship> getRelationships() { return [REL_FAILURE, REL_SUCCESS] as Set }
    Collection<ValidationResult> validate(ValidationContext context) { null }
    PropertyDescriptor getPropertyDescriptor(String name) { null }
    void onPropertyModified(PropertyDescriptor descriptor, String oldValue, String newValue) { }
    List<PropertyDescriptor> getPropertyDescriptors() { Collections.unmodifiableList([BATCH_SIZE]) as List<PropertyDescriptor> }
    String getIdentifier() { null }

    JsonSlurper jsonSlurper = new JsonSlurper()
    JsonOutput jsonOutput = new JsonOutput()

    void onTrigger(ProcessContext context, ProcessSessionFactory sessionFactory) throws ProcessException {
        ProcessSession session = sessionFactory.createSession()
        try {
            List<FlowFile> flowFiles = session.get(context.getProperty(BATCH_SIZE).asInteger())
            if (!flowFiles) return

            flowFiles.each { flowFile ->
                Map customAttributes = [ "mime.type": "application/json" ]
                List data = null

                // Parse the incoming FlowFile content as a JSON array
                session.read(flowFile, {
                    inputStream -> data = jsonSlurper.parseText(IOUtils.toString(inputStream, StandardCharsets.UTF_8))
                } as InputStreamCallback)

                // Emit one new FlowFile per visit, copying the parent fields
                data.each { entry ->
                    entry.VisitList.each { visit ->
                        Map newData = [:]
                        newData.put("employer", entry.employer)
                        newData.put("loc_id", entry.loc_id)
                        newData.put("topId", entry.topId)
                        newData.put("VisitList", [visit])

                        FlowFile newFlowFile = session.create()
                        newFlowFile = session.write(newFlowFile, { outputStream -> outputStream.write(jsonOutput.toJson([newData]).getBytes(StandardCharsets.UTF_8)) } as OutputStreamCallback)
                        newFlowFile = session.putAllAttributes(newFlowFile, customAttributes)
                        session.transfer(newFlowFile, REL_SUCCESS)
                    }
                }

                // The original FlowFile is dropped once it has been split
                session.remove(flowFile)
            }
            session.commit()
        } catch (final Throwable t) {
            log.error('{} failed to process due to {}; rolling back session', [this, t] as Object[])
            session.rollback(true)
            throw t
        }
    }
}

processor = new GroovyProcessor()
```
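To sanity-check the splitting logic outside NiFi, here is a minimal sketch in plain Python that mirrors what the Groovy processor does to the JSON content: one output document per visit, each wrapped in a one-element array with the parent's employer, loc_id, and topId copied over. This is not NiFi code, the function name split_visits is made up for illustration, and the sample values are hypothetical.

```python
import json


def split_visits(payload: str) -> list[str]:
    """Return one JSON string per visit, mirroring the Groovy processor's output."""
    records = json.loads(payload)
    outputs = []
    for entry in records:
        # A missing or null VisitList yields no output for that entry
        for visit in entry.get("VisitList") or []:
            new_data = {
                "employer": entry.get("employer"),
                "loc_id": entry.get("loc_id"),
                "topId": entry.get("topId"),
                "VisitList": [visit],
            }
            # Each output is a one-element JSON array, as in the Groovy code
            outputs.append(json.dumps([new_data]))
    return outputs


# Hypothetical sample input: one entry with two visits
sample = json.dumps([
    {
        "employer": "Acme",
        "loc_id": 7,
        "topId": "T1",
        "VisitList": [{"visit_id": 1}, {"visit_id": 2}],
    }
])

pieces = split_visits(sample)
for piece in pieces:
    print(piece)
```

Each string in pieces corresponds to the content of one FlowFile that the processor would route to success.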
06-14-2023
07:10 AM
@Fredb This is a very difficult one to solve.

"Does anyone know what would cause the execution of the sample_Import_Load.bat to run correctly from the windows command prompt, but fail when executed via the ExecuteStreamCommand processor with these errors?"

This is most likely caused by permission issues. NiFi requires specific permissions on the files and scripts it touches or executes from within processors. As such, the error is saying the processor does not know where any of the resources exist to run that .bat file. I do not have any experience with NiFi on Windows, other than to avoid it, but the solution is likely the same as on other operating systems: make sure the nifi user has full ownership of the file(s). Additionally, it is sometimes possible to find deeper errors by looking at the nifi-app.log file while testing and/or by setting the log level of the processor to be more verbose.
06-14-2023
01:06 AM
Fun fact for those interested: to have 8 cores running in this example, you need at least 64m for the Xss option. If you choose 32m, it will not give a stack overflow error, but only 4 cores will be running 😲
06-12-2023
05:06 PM
@Motimot Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future. Thanks.
06-09-2023
04:11 PM
1 Kudo
Hey @steven-matison and @Edenwheeler, thank you so much for your help. It worked with the StandardProxyConfigurationService controller service; however, I still have issues with the StandardRestrictedSSLContextService controller service. Anyway, thank you so much for the help and the detailed steps, which helped me a lot. Thank you!!
06-09-2023
11:59 AM
@Josh2023 Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future. Thanks.
06-08-2023
08:08 AM
@JeffB Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future. Thanks.