Member since: 01-19-2017
Posts: 3676
Kudos Received: 632
Solutions: 371
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 858 | 03-23-2025 05:23 AM
 | 475 | 03-17-2025 10:18 AM
 | 1567 | 03-05-2025 01:34 PM
 | 1064 | 03-03-2025 01:09 PM
 | 1268 | 03-02-2025 07:19 AM
03-21-2025
12:16 PM
@PriyankaMondal Looking at your error log, you're experiencing authentication timeouts with the ConsumeIMAP and ConsumePOP3 processors when connecting to Microsoft Office 365.

Possible blockers
- Timeout issue: the primary error is "Read timed out" during authentication, which suggests the connection to Office 365 is established but then times out during the OAUTH2 handshake.
- Microsoft 365 specifics: Microsoft has specific requirements for modern authentication with mail services and has been deprecating basic authentication methods.
- Processor configuration: the OAUTH2 authentication mode is correct for Office 365, but there may be issues with token acquisition or timeout settings.

Possible solutions

1. Check timeout settings. Add these properties to your processor configuration:
mail.imap.connectiontimeout=60000
mail.imap.timeout=60000
mail.pop3.connectiontimeout=60000
mail.pop3.timeout=60000

2. Verify Modern Authentication settings:
- Ensure the account has Modern Authentication enabled in the Microsoft 365 Admin Center.
- Verify the application registration in Azure AD has the correct permissions: IMAP.AccessAsUser.All for IMAP, POP.AccessAsUser.All for POP3, and the offline_access scope for refresh tokens.

3. Update NiFi mail libraries. NiFi's default JavaMail implementation might have compatibility issues with Office 365. Try updating to the latest version of NiFi (if possible), or add Microsoft's MSAL (Microsoft Authentication Library) JAR to NiFi's lib directory.

4. Use a custom SSL Context Service. Microsoft servers might require specific TLS settings: create a Standard SSL Context Service with Protocol set to TLS and reference it in the processor's advanced client settings.

5. Alternative approach: use the Microsoft Graph API. Since Microsoft is moving away from direct IMAP/POP3 access, consider using an InvokeHTTP processor to authenticate against the Microsoft Graph API, then use the Graph API endpoints to retrieve email content.

6. Check proxy settings. If your environment uses proxies, add these properties:
mail.imap.proxy.host=your-proxy-host
mail.imap.proxy.port=your-proxy-port
mail.pop3.proxy.host=your-proxy-host
mail.pop3.proxy.port=your-proxy-port

7. Implementation steps:
- Update the processor configuration with extended timeout values.
- Verify that the OAuth2 settings in the processor match your Azure application registration exactly.
- Check the Microsoft 365 account settings to ensure IMAP/POP3 is enabled with Modern Authentication.
- Consider implementing a token debugging flow (using InvokeHTTP, or the sketch below) to validate token acquisition separately.

Happy hadooping
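As an addendum to the token-debugging suggestion in step 7, here is a minimal Python sketch for validating token acquisition outside NiFi. It assumes a client-credentials app registration in Azure AD and the requests library; TENANT_ID, CLIENT_ID, and CLIENT_SECRET are placeholders for your own values, and delegated flows would use a different grant type.

```python
# Minimal sketch: request an OAuth2 token from the Microsoft identity platform
# to confirm token acquisition works before troubleshooting the NiFi processor.
import requests

TENANT_ID = "your-tenant-id"        # placeholder
CLIENT_ID = "your-client-id"        # placeholder
CLIENT_SECRET = "your-client-secret"  # placeholder

token_url = f"https://login.microsoftonline.com/{TENANT_ID}/oauth2/v2.0/token"
payload = {
    "grant_type": "client_credentials",
    "client_id": CLIENT_ID,
    "client_secret": CLIENT_SECRET,
    # Scope for Office 365 mail protocols; adjust to match your app registration.
    "scope": "https://outlook.office365.com/.default",
}

# If this times out, the "Read timed out" error in NiFi is likely a
# network/proxy problem rather than a processor configuration issue.
resp = requests.post(token_url, data=payload, timeout=60)
resp.raise_for_status()
print(resp.json().get("access_token", "")[:40] + "...")
```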
03-18-2025
07:47 AM
The issue may be due to NiFi's PutDatabaseRecord processor applying a local time-zone conversion during ingestion, causing the 5-hour shift. To fix this, ensure NiFi is explicitly set to handle UTC for both reading and writing (for example, by running the NiFi JVM in UTC). Additionally, using ODBC for PostgreSQL could help by providing better control over time-zone handling during the data transfer, ensuring consistency between MSSQL and PostgreSQL.
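To illustrate why a naive timestamp shifts by exactly the UTC offset, here is a minimal, self-contained Python sketch (assuming a 5-hour offset such as America/New_York in winter); it is not NiFi code, just the underlying conversion that produces the shift.

```python
# Minimal illustration of the shift: a timestamp read without time-zone info,
# interpreted as local time, then written back as UTC, moves by the offset.
from datetime import datetime
from zoneinfo import ZoneInfo  # Python 3.9+

source_value = datetime(2025, 3, 1, 12, 0, 0)  # naive value read from MSSQL
local = source_value.replace(tzinfo=ZoneInfo("America/New_York"))  # interpreted as local
as_utc = local.astimezone(ZoneInfo("UTC"))  # what lands in PostgreSQL

print(source_value.isoformat())  # 2025-03-01T12:00:00
print(as_utc.isoformat())        # 2025-03-01T17:00:00+00:00  (5-hour shift)
```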
03-17-2025
10:18 AM
1 Kudo
@AllIsWell Somehow I feel you have some stale data. Before deleting the process group, fetch its current state using the API to confirm the correct version number:

curl -k -X GET "https://localhost:28443/nifi-api/process-groups/836e216e-0195-1000-d3b8-771b257f1fe6" \
  -H "Authorization: Bearer Token"

Look for the revision object in the response. Its version field should match what you include in your DELETE request.

Update the DELETE request: if the version in the response is not 0, update your DELETE request with the correct version. For example, if the current version is 5, your request should look like this:

curl -k -X DELETE "https://localhost:28443/nifi-api/process-groups/836e216e-0195-1000-d3b8-771b257f1fe6" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer Token" \
  --data '{ "revision": { "version": <value from above> }, "disconnectedNodeAcknowledged": false }'

- Validate the JSON: ensure the payload is valid; you can use a tool like JSONLint to check the structure.
- Check for trailing slashes: use https://localhost:28443/nifi-api/process-groups/836e216e-0195-1000-d3b8-771b257f1fe6 rather than the same URL with a trailing slash.
- Disconnected node acknowledgment: if your NiFi cluster has disconnected nodes, you may need to set disconnectedNodeAcknowledged to true.

Final notes:
- If the issue persists, double-check the API documentation for any changes or additional requirements.
- Ensure the Authorization token is valid and has the necessary permissions to delete the process group.
- If you are using a NiFi version older than 1.12.0, the API behavior might differ slightly, so consult the documentation for your specific version.

A scripted version of the fetch-then-delete flow is sketched below.

Happy hadooping
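If you prefer to script the check, here is a minimal Python sketch using the requests library, the same process-group ID, and a placeholder bearer token. It reads the current revision version and then issues the DELETE. Depending on your NiFi version, the revision may be accepted as query parameters (used here) or as a JSON body as in the curl example above, so adjust accordingly.

```python
# Minimal sketch: fetch the process group's current revision, then delete it
# using that version so the request is not rejected as stale.
import requests

BASE = "https://localhost:28443/nifi-api"
PG_ID = "836e216e-0195-1000-d3b8-771b257f1fe6"
TOKEN = "your-bearer-token"  # placeholder
HEADERS = {"Authorization": f"Bearer {TOKEN}"}

# verify=False mirrors curl's -k flag; prefer a proper truststore in production.
resp = requests.get(f"{BASE}/process-groups/{PG_ID}", headers=HEADERS, verify=False)
resp.raise_for_status()
version = resp.json()["revision"]["version"]

resp = requests.delete(
    f"{BASE}/process-groups/{PG_ID}",
    headers=HEADERS,
    params={"version": version, "disconnectedNodeAcknowledged": "false"},
    verify=False,
)
resp.raise_for_status()
print(f"Deleted process group {PG_ID} at revision {version}")
```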
03-09-2025
05:25 AM
@zeeshanmcs It seems you're having an issue with unavailable tablets in your Kudu table, which is preventing Spark from inserting data. The output from kudu cluster ksck clearly shows the problem: the leader replicas for all tablets in the impala::mrs.NumberofSubs table are on a tablet server that's unavailable. The key issue is that the tablet server with ID 24483fcd36ce45d78d80beb04b3b0cf4 is not running, and it happens to be the leader for all 7 tablets in your table.

Here's how to resolve this:

1. First, check the status of all Kudu tablet servers:
sudo systemctl status kudu-tserver

2. Look specifically for the tablet server with ID 24483fcd36ce45d78d80beb04b3b0cf4:
sudo -u kudu kudu tserver list tstewputil1

3. If the tablet server is down, start it:
sudo systemctl start kudu-tserver

4. If the tablet server is running but not responding, restart it:
sudo systemctl restart kudu-tserver

5. After restarting the tablet server, wait a few minutes for it to rejoin the cluster and for leadership transitions to occur, then check the status again (a scripted health check is sketched below):
sudo -u kudu kudu cluster ksck tstewputil1

If the tablet server is permanently lost or damaged, you'll need to recover the tablets:

a. Check whether you have enough replicas (you should have at least 3 for production):
sudo -u kudu kudu table describe impala::mrs.NumberofSubs tstewputil1

b. If you have other healthy replicas, you can delete the failed server from the cluster and Kudu will automatically recover:
sudo -u kudu kudu tserver delete tstewputil1 <tablet_server_uuid>

c. If this is the only replica and you don't have backups, you may need to: create a new table with the same schema, load data from your source systems, or restore from a backup if available.

If you still have issues after restarting, the problem might be disk-space issues on the tablet server, configuration problems, or network connectivity problems between servers. Check the Kudu tablet server logs for more details:
less /var/log/kudu/kudu-tserver.log

Once the tablet server is back online and healthy, your Spark job should be able to insert data into the table successfully.

Happy hadooping
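For step 5, while waiting for the server to rejoin, a small wrapper around the same ksck command can save repeated manual checks. A minimal Python sketch, assuming the kudu CLI is on the PATH and tstewputil1 is your master address; it simply reruns the check until it passes or gives up.

```python
# Minimal sketch: poll `kudu cluster ksck` until the cluster reports healthy.
import subprocess
import time

MASTER = "tstewputil1"  # Kudu master address from the post
MAX_ATTEMPTS = 10

for attempt in range(1, MAX_ATTEMPTS + 1):
    result = subprocess.run(
        ["sudo", "-u", "kudu", "kudu", "cluster", "ksck", MASTER],
        capture_output=True,
        text=True,
    )
    if result.returncode == 0:
        print(f"Cluster healthy after {attempt} check(s).")
        break
    print(f"Attempt {attempt}: ksck still reporting problems, retrying in 30s...")
    time.sleep(30)
else:
    print("Cluster still unhealthy; inspect /var/log/kudu/kudu-tserver.log.")
```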
03-07-2025
03:24 PM
@Maulz Connecting Python to Cloudera, Hive, and Hue involves using libraries and drivers that interface with HiveServer2, the service that allows remote clients to execute Hive queries. There are several methods to connect Python to Cloudera's ecosystem, particularly to access Hive tables through Hue. I'll detail the most common approaches.

1. Prerequisites
- Cloudera/Hadoop cluster: ensure HiveServer2 is running on your cluster. Default HiveServer2 port: 10000 (verify via Cloudera Manager).
- Python environment: Python 3.6+ installed.
- Authentication: know your authentication method: username/password (non-secure), Kerberos (common in enterprise clusters), or LDAP.

2. Install the required Python libraries

pip install pyhive        # Python interface for Hive
pip install thrift        # Thrift protocol support
pip install sasl          # SASL authentication (for Kerberos)
pip install thrift-sasl   # SASL wrapper for Thrift
pip install pykerberos    # Kerberos support (if needed)

For JDBC-based connections (alternative method):

pip install JayDeBeApi    # JDBC bridge

3. Configure Cloudera/Hive
Via Cloudera Manager: enable HiveServer2 and ensure it's running, and check the HiveServer2 port (default: 10000).
If using Kerberos: ensure Kerberos is configured in Cloudera and obtain a ticket from your keytab:
kinit -kt <keytab_file> <principal>

4. Connecting Python to Cloudera/Hue/Hive

a. Using PyHive, a Python library specifically designed to work with Hive:

from pyhive import hive

# Connect to HiveServer2
conn = hive.Connection(
    host='cloudera_host_name',
    port=10000,              # Default HiveServer2 port
    username='your_username',
    password='your_password',
    database='default',      # Your database name
    auth='LDAP'              # Or 'NONE', 'KERBEROS', 'CUSTOM' depending on your setup
)

# Create a cursor and execute a query
cursor = conn.cursor()
cursor.execute('SELECT * FROM your_table LIMIT 10')

# Fetch results
results = cursor.fetchall()
print(results)

# Close connections
cursor.close()
conn.close()

b. Using an Impala connection (requires the impyla package), if your Cloudera cluster uses Impala:

from impala.dbapi import connect

conn = connect(
    host='cloudera_host_name',
    port=21050,              # Default Impala port
    user='your_username',
    password='your_password',
    database='default'       # Your database name
)
cursor = conn.cursor()
cursor.execute('SELECT * FROM your_table LIMIT 10')
results = cursor.fetchall()
print(results)
cursor.close()
conn.close()

c. Integration with Hue. Hue is a web UI for Hadoop, but you can programmatically interact with Hive via its APIs (limited). For direct Python-Hue integration, use Hue's REST API to execute queries:

import requests

# Hue API endpoint (replace with your Hue server URL)
url = "http://<hue_server>:8888/hue/notebook/api/execute/hive"
headers = {"Content-Type": "application/json"}
data = {
    "script": "SELECT * FROM my_table",
    "dialect": "hive"
}
response = requests.post(
    url,
    auth=('<hue_username>', '<hue_password>'),
    headers=headers,
    json=data
)
print(response.json())

5. Troubleshooting common issues
- Connection refused: verify HiveServer2 is running (netstat -tuln | grep 10000) and check firewall rules.
- Authentication failures: for Kerberos, ensure kinit succeeded; for LDAP, validate credentials.
- Thrift version mismatch: use Thrift v0.13.0 with Hive 3.x.
- Logs: check the HiveServer2 logs in Cloudera Manager (/var/log/hive).

6. Best practices
- Use connection pooling for high-frequency queries.
- For Kerberos, automate ticket renewal with kinit cron jobs.
- Secure credentials using environment variables or Vault (see the sketch below).

Happy hadooping
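Following up on the last best-practice point, a minimal sketch of pulling the Hive credentials from environment variables instead of hard-coding them; HIVE_HOST, HIVE_USER, and the other variable names are hypothetical and can be whatever your deployment uses.

```python
# Minimal sketch: read Hive connection settings from the environment
# rather than embedding credentials in source code.
import os
from pyhive import hive

conn = hive.Connection(
    host=os.environ.get("HIVE_HOST", "cloudera_host_name"),
    port=int(os.environ.get("HIVE_PORT", "10000")),
    username=os.environ["HIVE_USER"],      # fails fast if not set
    password=os.environ["HIVE_PASSWORD"],
    database=os.environ.get("HIVE_DATABASE", "default"),
    auth="LDAP",
)
```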
03-07-2025
06:36 AM
Hi @Shelton, thanks so much for the detailed information. @MattWho, thanks very much for the reply, and apologies for the short info. Based on the information above I was able to create the SSL certificates and generate the keystore and truststore in JKS format. Two things tripped me up: 1. initially I had not added the CA file to the truststore, so I ran into some issues; 2. I had not added the NiFi node entries as initial identities in the authorizers.xml file, so the issue above occurred. I followed the Cloudera posts you had pointed to: https://community.cloudera.com/t5/Support-Questions/insufficient-permissions-untrusted-proxy/m-p/366443#M239582 Based on these I was able to resolve it, and the 3-node cluster with external ZooKeeper came up. I appreciate your kind help and your time here. Much thanks to both 🙂
03-04-2025
12:22 AM
Hi @Shelton, thanks a lot. I went through the troubleshooting steps mentioned above, restarted all the ZooKeeper nodes, and made some configuration changes.

In nifi.properties:
1. nifi.zookeeper.connect.string=zookeepernode1.x.x.x.net:2181,zookeepernode2.x.x.x.net:2181,zookeepernode3.x.x.x.net:2181
2. nifi.zookeeper.ssl.client.auth=none
3. nifi.state.management.embedded.zookeeper.start=false

On ZooKeeper nodes [1-3]:
1. dataDir=/var/lib/zookeeper: the myid file could not be read from this location at first; after restarting the VM it was picked up.
2. Added the server entries, which are mandatory for a clustered setup:
server.1=zookeepernode1.x.x.x.net:2888:3888
server.2=zookeepernode2.x.x.x.net:2888:3888
server.3=zookeepernode3.x.x.x.net:2888:3888

Network configuration:
1. In the network manager, opened the required ports (2181, 2888, and 3888) on all servers.

With the above changes ZooKeeper came up and NiFi is able to connect to the external ZooKeeper cluster. (A quick connectivity check is sketched below.)
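For anyone verifying the same setup, a minimal Python sketch that checks each ZooKeeper node is reachable on its client port and answers the built-in ruok four-letter command. The hostnames are the placeholders from the post, and on newer ZooKeeper versions ruok must be allowed via 4lw.commands.whitelist for this to return "imok".

```python
# Minimal sketch: confirm each ZooKeeper node accepts connections on 2181
# and answers the 'ruok' health probe with 'imok'.
import socket

NODES = [
    "zookeepernode1.x.x.x.net",
    "zookeepernode2.x.x.x.net",
    "zookeepernode3.x.x.x.net",
]

for host in NODES:
    try:
        with socket.create_connection((host, 2181), timeout=5) as sock:
            sock.sendall(b"ruok")
            reply = sock.recv(16)
        status = "OK" if reply == b"imok" else f"unexpected reply: {reply!r}"
    except OSError as exc:
        status = f"unreachable ({exc})"
    print(f"{host}:2181 -> {status}")
```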
03-02-2025
07:19 AM
@drewski7 The error message says there's no EntityManager with an actual transaction available, which suggests that the code trying to persist the user isn't running within a transactional context. In Spring applications, methods that modify the database usually need to be annotated with @Transactional to ensure they run within a transaction.

Looking at the stack trace, the error occurs in XUserMgr$ExternalUserCreator.createExternalUser, which calls UserMgr.createUser, which in turn uses BaseDao.create. The create method in BaseDao is trying to persist an entity but there's no active transaction, so the createUser method (or the code calling it) may not be properly transactional. This worked in version 2.4.0, so something must have changed in 2.5.0: perhaps the upgrade changed how transactions are managed, a method that was previously transactional no longer is, or the transaction boundaries have shifted.

Step 1: Verify database schema compatibility
Ranger 2.5.0 may require schema updates. Ensure the database schema is compatible with the new version:
- Check the upgrade documentation: review the Ranger 2.5.0 release notes for required schema changes. For example, when migrating from 2.4.0 to 2.5.0 you may need to run SQL scripts like x_portal_user_DDL.sql or apache-ranger-2.5.0-schema-upgrade.sql.
- Run the schema upgrade scripts: locate them in the Ranger installation directory (ranger-admin/db/mysql/patches) and apply them:
mysql -u root -p ranger < apache-ranger-2.5.0-schema-upgrade.sql
- Validate the schema: confirm that the x_portal_user table exists and has the expected columns (e.g., login_id, user_role).

Step 2: Check the transaction management configuration
The error suggests a missing @Transactional annotation or a misconfigured transaction manager in Ranger 2.5.0:
- Review code/configuration changes: compare the transaction management configuration between Ranger 2.4.0 and 2.5.0. Key files:
ranger-admin/ews/webapp/WEB-INF/classes/conf/application.properties
ranger-admin/ews/webapp/WEB-INF/classes/spring-beans.xml
- Ensure transactional annotations: in Ranger 2.5.0, the createUser method in UserMgr.java (or its caller) must be annotated with @Transactional so the database operations run in a transaction:
@Transactional
public void createUser(...) { ... }
- Debug transaction boundaries: enable transaction logging in log4j.properties to trace transaction activity:
log4j.logger.org.springframework.transaction=DEBUG
log4j.logger.org.springframework.orm.jpa=DEBUG

Step 3: Manually create the user (temporary workaround)
If the user drew.nicolette is missing from x_portal_user, manually insert it into the database:
INSERT INTO x_portal_user (login_id, password, user_role, status)
VALUES ('drew.nicolette', 'LDAP_USER_PASSWORD_HASH_IF_APPLICABLE', 'ROLE_USER', 1);
Note: this bypasses the transaction error but is not a permanent fix.

Step 4: Verify the LDAP configuration
Ensure the LDAP settings in ranger-admin/ews/webapp/WEB-INF/classes/conf/ranger-admin-site.xml are correct for Ranger 2.5.0:
<property>
  <name>ranger.authentication.method</name>
  <value>LDAP</value>
</property>
<property>
  <name>ranger.ldap.url</name>
  <value>ldap://your-ldap-server:389</value>
</property>

Step 5: Check for known issues
- Apache Ranger JIRA: search for issues like RANGER-XXXX related to transaction management in Ranger 2.5.0.
- Apply patches: if a patch exists (e.g., for missing @Transactional annotations), apply it to the Ranger 2.5.0 codebase.

Step 6: Test with a new user
Attempt to log in with a different LDAP user to see whether the issue is specific to drew.nicolette or systemic. If the error persists for all users, focus on transaction configuration or schema issues. If only drew.nicolette fails, check for conflicts in the x_portal_user table (e.g., duplicate entries). A quick way to compare two users against the Ranger REST API is sketched below.

Final checks
- Logs: monitor ranger-admin.log and catalina.out for transaction-related errors after applying fixes.
- Permissions: ensure the database user has write access to the x_portal_user table.
- Dependencies: confirm that the Spring and JPA library versions match the Ranger 2.5.0 requirements.

Happy hadooping
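To support Step 6, a minimal Python sketch that attempts the same basic-auth call against the Ranger Admin REST API as two different LDAP users, so you can see whether authentication (and the user-creation path) fails for everyone or just one account. The base URL, port, and the /service/xusers/users endpoint are assumptions based on a typical Ranger Admin setup; adjust them to your installation, and never commit real passwords.

```python
# Minimal sketch: try the same Ranger Admin REST call as two different LDAP
# users to distinguish a systemic failure from a single-account problem.
import requests

RANGER_URL = "https://ranger-admin.example.com:6182"  # placeholder
ENDPOINT = "/service/xusers/users"                    # assumed Ranger Admin endpoint

for user, password in [("drew.nicolette", "password1"), ("another.ldap.user", "password2")]:
    resp = requests.get(
        RANGER_URL + ENDPOINT,
        auth=(user, password),
        verify=False,   # only for testing; use the proper CA bundle in production
        timeout=30,
    )
    print(f"{user}: HTTP {resp.status_code}")
```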
02-12-2025
01:42 AM
@0tto Could you please share your NiFi logs? Happy hadooping
02-04-2025
06:35 AM
Check the Beeline console output and the HiveServer2 logs to identify where it gets stuck, and act accordingly.