About Azhar_Shaikh

anooptiwari24 · ‎01-20-2023

You can create a custom NAR file, put it into the lib folder of the $NIFI_HOME directory, and restart your NiFi server. Add the dependencies below to the processor module, write the Java code, then build the project to create your NAR file.

    <dependency>
        <groupId>org.apache.poi</groupId>
        <artifactId>poi</artifactId>
        <version>4.1.2</version>
    </dependency>
    <dependency>
        <groupId>org.apache.poi</groupId>
        <artifactId>poi-ooxml</artifactId>
        <version>4.1.2</version>
    </dependency>
    <dependency>
        <groupId>com.opencsv</groupId>
        <artifactId>opencsv</artifactId>
        <version>5.1</version>
        <exclusions>
            <exclusion>
                <artifactId>commons-logging</artifactId>
                <groupId>commons-logging</groupId>
            </exclusion>
        </exclusions>
    </dependency>

    /*
     * Licensed to the Apache Software Foundation (ASF) under one or more
     * contributor license agreements. See the NOTICE file distributed with
     * this work for additional information regarding copyright ownership.
     * The ASF licenses this file to You under the Apache License, Version 2.0
     * (the "License"); you may not use this file except in compliance with
     * the License. You may obtain a copy of the License at
     *
     * http://www.apache.org/licenses/LICENSE-2.0
     *
     * Unless required by applicable law or agreed to in writing, software
     * distributed under the License is distributed on an "AS IS" BASIS,
     * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
     * See the License for the specific language governing permissions and
     * limitations under the License.
     */
    package com.anoop.converter;

    import com.opencsv.CSVReader;
    import com.opencsv.exceptions.CsvValidationException;
    import org.apache.nifi.annotation.behavior.*;
    import org.apache.nifi.components.PropertyDescriptor;
    import org.apache.nifi.flowfile.FlowFile;
    import org.apache.nifi.annotation.lifecycle.OnScheduled;
    import org.apache.nifi.annotation.documentation.CapabilityDescription;
    import org.apache.nifi.annotation.documentation.SeeAlso;
    import org.apache.nifi.annotation.documentation.Tags;
    import org.apache.nifi.processor.AbstractProcessor;
    import org.apache.nifi.processor.ProcessContext;
    import org.apache.nifi.processor.ProcessSession;
    import org.apache.nifi.processor.ProcessorInitializationContext;
    import org.apache.nifi.processor.Relationship;
    import org.apache.nifi.processor.io.StreamCallback;
    import org.apache.nifi.processor.util.StandardValidators;
    import org.apache.poi.ss.usermodel.Cell;
    import org.apache.poi.ss.usermodel.Row;
    import org.apache.poi.xssf.usermodel.XSSFSheet;
    import org.apache.poi.xssf.usermodel.XSSFWorkbook;

    import java.io.IOException;
    import java.io.InputStream;
    import java.io.InputStreamReader;
    import java.io.OutputStream;
    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Set;

    @Tags({"csvToExcel"})
    @CapabilityDescription("This processor can convert CSV flow files into Excel flow file")
    @SeeAlso({})
    @ReadsAttributes({@ReadsAttribute(attribute="", description="")})
    @WritesAttributes({@WritesAttribute(attribute="", description="")})
    @InputRequirement(InputRequirement.Requirement.INPUT_REQUIRED)
    public class CsvToExcel extends AbstractProcessor {

        // The only relationship; note that its name in the NiFi UI is "original".
        public static final Relationship REL_SUCCESS = new Relationship.Builder()
                .name("original")
                .description("The original file")
                .build();

        private List<PropertyDescriptor> descriptors;
        private Set<Relationship> relationships;

        @Override
        protected void init(final ProcessorInitializationContext context) {
            // This processor has no configurable properties.
            descriptors = Collections.emptyList();
            relationships = new HashSet<>();
            relationships.add(REL_SUCCESS);
            relationships = Collections.unmodifiableSet(relationships);
        }

        @Override
        public Set<Relationship> getRelationships() {
            return this.relationships;
        }

        @Override
        public final List<PropertyDescriptor> getSupportedPropertyDescriptors() {
            return descriptors;
        }

        @OnScheduled
        public void onScheduled(final ProcessContext context) {}

        @Override
        public void onTrigger(final ProcessContext context, final ProcessSession session) {
            FlowFile flowFile = session.get();
            if ( flowFile == null ) {
                return;
            }
            // Replace the CSV content of the flow file with the generated workbook.
            session.write(flowFile, new Converter());
            session.putAttribute(flowFile,"convertedIntoExcel","true");
            session.transfer(flowFile,REL_SUCCESS);
        }
    }

    class Converter implements StreamCallback {

        @Override
        public void process(InputStream in, OutputStream out) throws IOException {
            try {
                streamConversion(in,out);
            } catch (CsvValidationException e) {
                throw new RuntimeException(e);
            }
        }

        // Reads every CSV record and writes it as one row of string cells in "Sheet1".
        private void streamConversion(InputStream in, OutputStream out) throws IOException, CsvValidationException {
            CSVReader csvReader = new CSVReader(new InputStreamReader(in));
            XSSFWorkbook workbook = new XSSFWorkbook();
            XSSFSheet sheet = workbook.createSheet("Sheet1");
            String[] rowData = null;
            int rowNum = 0;
            while ((rowData = csvReader.readNext()) != null) {
                Row row = sheet.createRow(rowNum++);
                int colNum = 0;
                for (String cellData : rowData) {
                    Cell cell = row.createCell(colNum++);
                    cell.setCellValue(cellData);
                }
            }
            workbook.write(out);
            workbook.close();
        }
    }

Ninja · ‎10-31-2022

hahhaa sure

jacektrocinski · ‎05-17-2022

The linked thread is a walkthrough on how to secure a NiFi Registry instance locally. I’m looking for instructions on how to connect to a secure NiFi Registry deployed on CDP Data Hub. I’m running on AWS infrastructure. The Data Hub is deployed using default settings and resides in a private subnet.

Azhar_Shaikh · ‎04-13-2022

Hello, please refer to https://community.cloudera.com/t5/Community-Articles/Using-RStudio-as-an-Editor-with-ML-Runtimes/ta-p/325166 Was your question answered on the Cloudera Community portal? Make sure to mark the answer as the accepted solution. If you find a reply useful, say thanks by clicking on the thumbs up button.

shehbazk · ‎03-17-2022

Hello @Koffi, the balancer will do the job for you. Please refer to the official docs below before configuring it:
1. Overview of the HDFS Balancer
2. Configuring the Balancer
Was your question answered? Make sure to mark the answer as the accepted solution. If you find a reply useful, say thanks by clicking on the thumbs up button.

Azhar_Shaikh · ‎03-17-2022

Hello @Soa, Hive partitioning divides a table into a number of partitions, and these partitions can be further subdivided into more manageable parts known as buckets (or clusters). Bucketing is based on a hash function, which depends on the type of the bucketing column. Records with the same value in the bucketing column are always saved in the same bucket. The CLUSTERED BY clause is used to divide the table into buckets. Each partition is created as a directory, whereas each bucket is created as a file. Bucketing can also be done on Hive tables without partitioning.

Bucketed tables allow much more efficient sampling than non-bucketed tables, which lets you run queries on a section of the data for testing and debugging when the original data sets are very large. The user can fix the number of buckets according to need. Bucketing also gives the flexibility to keep the records in each bucket sorted by one or more columns. Since the data files are roughly equal-sized parts, map-side joins are faster on bucketed tables.

Was your question answered? Make sure to mark the answer as the accepted solution. If you find a reply useful, say thanks by clicking on the thumbs up button.
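To make the hash-based assignment above concrete, here is a minimal Java sketch of how a record could be mapped to a bucket. It is only an illustration: the class name BucketSketch, the helper names, and the sample ids are hypothetical, and it uses Java's own hashCode() as a stand-in, which is an assumption and not necessarily the hash function your Hive version applies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class BucketSketch {

    // Simplified stand-in for a bucketing hash (assumption, not Hive's exact code):
    // clear the sign bit so the result is non-negative, then take the modulo.
    public static int bucketFor(Object value, int numBuckets) {
        return (value.hashCode() & Integer.MAX_VALUE) % numBuckets;
    }

    // Groups the given ids by bucket number; each list plays the role of one bucket file.
    public static Map<Integer, List<Integer>> assign(int[] userIds, int numBuckets) {
        Map<Integer, List<Integer>> buckets = new TreeMap<>();
        for (int id : userIds) {
            buckets.computeIfAbsent(bucketFor(id, numBuckets), b -> new ArrayList<>()).add(id);
        }
        return buckets;
    }

    public static void main(String[] args) {
        int[] userIds = {1, 2, 5, 6, 9, 2, 13};
        // Prints {1=[1, 5, 9, 13], 2=[2, 6, 2]}
        System.out.println(assign(userIds, 4));
    }
}
```

Note that both records with id 2 land in the same bucket. That is the property a bucketed map-side join relies on: if two tables are bucketed on the join column with compatible bucket counts, matching keys can only be in corresponding buckets.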

Griggsy · ‎03-15-2022

Hello @Azhar_Shaikh Thanks for the reply. As it turns out, it wasn't a service account problem. We found that the ListS3 output included a 'key' field, and this is what the FetchS3Object processor required for 'Object Key'. So the fix I applied was to split the JSON into individual records (SplitJson), pull the keys out as attributes (EvaluateJsonPath), then enter ${key} in the FetchS3Object processor. Worked a treat.

Azhar_Shaikh · ‎03-14-2022

@RajeshReddy For tag-based policies, you can refer to https://docs.cloudera.com/runtime/7.2.10/security-ranger-authorization/topics/security-ranger-tag-based-policies.html Was your question answered? Make sure to mark the answer as the accepted solution. If you find a reply useful, say thanks by clicking on the thumbs up button.

Azhar_Shaikh · ‎03-10-2022

@mehmetersoy CM does not depend on Samba and does not use any Samba packages. Was your question answered? Make sure to mark the answer as the accepted solution. If you find a reply useful, say thanks by clicking on the thumbs up button.

Azhar_Shaikh · ‎03-10-2022

Hello @vishal_ Yes, you are right. Machine users in CDP have programmatic access. If you have IDP integration with CDP, you can create a user in Azure, add that user to the Azure AD group that is mapped to CDP, and ask the user to log in from the Azure end to access the CDP application. If you are using CDP local users (users created directly in CDP), you can reach out to your accounts team or open an administrative case from the support portal to add the user to CDP, and then manage access accordingly. I hope I have answered your question. If you found this response assisted with your query, please take a moment to log in and click on "Accept as Solution" below this post.

Last Visited	‎07-28-2025 02:26 AM

Member Since	‎07-21-2021 11:42 PM
Posts	628
Kudos received	4
