Posts: 2213
Kudos Received: 231
Solutions: 82
About
My expertise is not in Hadoop but rather in online communities, support, and social media. Interests include photography, travel, movies, and watching sports.
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 781 | 05-07-2025 11:41 AM |
| | 1642 | 02-27-2025 12:49 PM |
| | 3454 | 06-29-2023 05:42 AM |
| | 3007 | 05-22-2023 07:03 AM |
| | 2118 | 05-22-2023 05:42 AM |
06-29-2023
06:03 AM
1 Kudo
Welcome to the community @chenjun. While you wait for someone more experienced than me, I wanted to add a quick Google translation of your post's subject in case it increases your ability to find help.
Submitting the sparksql task through Livy actually reports a kafka kerberos error, please help!
06-29-2023
05:30 AM
Welcome to the community @harry_su. As this question is several years old, I would suggest starting a new one. That would allow you to add details specific to your situation.
06-27-2023
01:54 PM
Thanks, it solved the problem.
06-26-2023
06:20 AM
Welcome to the community @Arui. I see you are facing an error copying installation files when adding a new host to a cluster (if Google Translate is correct). I'll refer you to this post, which may explain the reason.
06-23-2023
07:20 AM
Welcome to the community @sencae. While you wait for a more knowledgeable person to respond, I did find this older post that hopefully gets you closer to where you need to be.
https://community.cloudera.com/t5/Support-Questions/How-to-retrieve-Latest-Uploaded-records-from-Hive-In-my/td-p/187070
06-23-2023
06:04 AM
1 Kudo
Welcome to the community @samans. I'm not an expert, but I did some searching and may have found something for you. I would review the Cloudera ODBC Connector for Apache Impala documentation, as I see some references to transaction statements not being supported and how to work around them. Not sure if that is the issue here, but I hope it helps.
06-22-2023
07:44 AM
Hi @rki_ / @cjervis, I forgot to ask: the cluster already has more than 23 million blocks in HDFS. After configuring the racks in the cluster, will HDFS recognize the racks and start moving blocks across them to increase block availability, or will I have to rebalance HDFS?
06-15-2023
11:41 AM
This simple InvokeScriptedProcessor looks for a FlowFile attribute called "ip_address", attempts a reverse DNS lookup, and writes the resolved value to a new attribute called "host_name".

```groovy
import java.net.InetAddress
import java.net.UnknownHostException

import org.apache.nifi.components.PropertyDescriptor
import org.apache.nifi.components.ValidationContext
import org.apache.nifi.components.ValidationResult
import org.apache.nifi.flowfile.FlowFile
import org.apache.nifi.logging.ComponentLog
import org.apache.nifi.processor.ProcessContext
import org.apache.nifi.processor.ProcessSession
import org.apache.nifi.processor.ProcessSessionFactory
import org.apache.nifi.processor.Processor
import org.apache.nifi.processor.ProcessorInitializationContext
import org.apache.nifi.processor.Relationship
import org.apache.nifi.processor.exception.ProcessException
import org.apache.nifi.processor.util.StandardValidators

class GroovyProcessor implements Processor {

    // Maximum number of FlowFiles pulled from the queue per onTrigger invocation
    PropertyDescriptor BATCH_SIZE = new PropertyDescriptor.Builder()
        .name("BATCH_SIZE")
        .displayName("Batch Size")
        .description("The number of incoming FlowFiles to process in a single execution of this processor.")
        .required(true)
        .defaultValue("1000")
        .addValidator(StandardValidators.POSITIVE_INTEGER_VALIDATOR)
        .build()

    Relationship REL_SUCCESS = new Relationship.Builder()
        .name("success")
        .description("FlowFiles that were successfully processed are routed here")
        .build()

    Relationship REL_FAILURE = new Relationship.Builder()
        .name("failure")
        .description("FlowFiles that were not successfully processed are routed here")
        .build()

    ComponentLog log

    void initialize(ProcessorInitializationContext context) { log = context.logger }

    Set<Relationship> getRelationships() { return [REL_FAILURE, REL_SUCCESS] as Set }

    Collection<ValidationResult> validate(ValidationContext context) { null }

    PropertyDescriptor getPropertyDescriptor(String name) { null }

    void onPropertyModified(PropertyDescriptor descriptor, String oldValue, String newValue) { }

    List<PropertyDescriptor> getPropertyDescriptors() { Collections.unmodifiableList([BATCH_SIZE]) as List<PropertyDescriptor> }

    String getIdentifier() { null }

    // Resolve an IP address to its canonical host name; return "Unknown" if the lookup fails
    String reverseDnsLookup(String ipAddress) {
        try {
            return InetAddress.getByName(ipAddress).getCanonicalHostName()
        } catch (UnknownHostException e) {
            return "Unknown"
        }
    }

    void onTrigger(ProcessContext context, ProcessSessionFactory sessionFactory) throws ProcessException {
        ProcessSession session = sessionFactory.createSession()
        try {
            List<FlowFile> flowFiles = session.get(context.getProperty(BATCH_SIZE).asInteger())
            if (!flowFiles) return
            flowFiles.each { flowFile ->
                String ipAddress = flowFile.getAttribute("ip_address")
                if (ipAddress) {
                    // Write the resolved name back onto the FlowFile as "host_name"
                    flowFile = session.putAttribute(flowFile, "host_name", reverseDnsLookup(ipAddress))
                }
                session.transfer(flowFile, REL_SUCCESS)
            }
            session.commit()
        } catch (final Throwable t) {
            log.error('{} failed to process due to {}; rolling back session', [this, t] as Object[])
            session.rollback(true)
            throw t
        }
    }
}

processor = new GroovyProcessor()
```
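For a quick sanity check outside NiFi, the same lookup logic can be run as a plain Groovy script (a minimal sketch; "8.8.8.8" is just an example address, and the resolved name depends entirely on your DNS resolver):

```groovy
// Standalone check of the reverse-lookup logic used by the processor above.
// The address and the output are illustrative; results depend on your resolver.
String ip = "8.8.8.8"
try {
    println InetAddress.getByName(ip).canonicalHostName   // e.g. "dns.google"
} catch (UnknownHostException e) {
    println "Unknown"
}
```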
06-14-2023
02:33 PM
Does ExecuteSQL erase some of the attributes that could be used to associate the FlowFiles further downstream?