Member since
01-19-2017
3647
Posts
621
Kudos Received
364
Solutions
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 150 | 12-22-2024 07:33 AM |
 | 98 | 12-18-2024 12:21 PM |
 | 384 | 12-17-2024 07:48 AM |
 | 290 | 08-02-2024 08:15 AM |
 | 3566 | 04-06-2023 12:49 PM |
12-18-2024
12:21 PM
1 Kudo
@allen_chu Maybe I didn't understand the question well, but here are the differences and explanation to help you understand and configure the 2 options correctly.

1. Difference Between spark_shuffle and spark2_shuffle

spark_shuffle
- Used for Apache Spark 1.x versions.
- Refers to the shuffle service for older Spark releases that rely on the original shuffle mechanism.
- Declared in the YARN NodeManager's configuration (yarn-site.xml):

<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle,spark_shuffle</value>
</property>

spark2_shuffle
- Introduced for Apache Spark 2.x and later versions.
- Handles shuffle operations for newer Spark versions, which have an updated shuffle mechanism with better performance and scalability.
- Declared similarly in yarn-site.xml:

<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle,spark2_shuffle</value>
</property>

2. Why Two Shuffle Services?
- Backward compatibility: spark_shuffle is retained so Spark 1.x jobs continue running without modifications.
- Separate service: spark2_shuffle ensures that jobs running on Spark 2.x+ use an optimized and compatible shuffle service without interfering with Spark 1.x jobs.
- Upgrade path: in clusters supporting multiple Spark versions, both shuffle services may coexist to support jobs submitted using Spark 1.x and Spark 2.x simultaneously.

3. Configuration in YARN
To enable the shuffle service for both versions, configure the NodeManager to start both services:

<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle,spark_shuffle,spark2_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
  <value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.spark2_shuffle.class</name>
  <value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>

4. Key Points
- Use spark_shuffle for jobs running with Spark 1.x.
- Use spark2_shuffle for jobs running with Spark 2.x or later.
- In modern setups, spark2_shuffle is the primary shuffle service since Spark 1.x is largely deprecated.

Happy hadooping
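For reference, each Spark job also has to opt in to the external shuffle service on the client side. A minimal sketch of the submit-time settings, assuming a YARN deployment; the application class and jar names are placeholders:

```bash
# Client-side settings that make a Spark-on-YARN job use the external (NodeManager-hosted)
# shuffle service, typically together with dynamic allocation.
# com.example.MyApp and my-app.jar are placeholders for your own application.
spark-submit \
  --master yarn \
  --conf spark.shuffle.service.enabled=true \
  --conf spark.dynamicAllocation.enabled=true \
  --class com.example.MyApp \
  my-app.jar
```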
12-18-2024
08:50 AM
@drewski7 I have just picked your ticket. I hope I can help you resolve this issue if it's still unresolved. There are a couple of configuration changes and implementations that have to be done.

1. Overview
OAuth allows Kafka clients to obtain access tokens from an external authentication provider (an OAuth provider) to authenticate with the Kafka broker. This process involves configuring the Kafka broker, the OAuth provider, and the Kafka clients.

2. Prerequisites
- Kafka cluster with SASL/OAUTHBEARER enabled.
- An OAuth provider set up to issue access tokens.
- Kafka clients that support SASL/OAUTHBEARER.
- Required libraries for OAuth integration (e.g. kafka-clients, oauth2-client, or Keycloak adapters).

3. Procedure

Step 1: Configure the OAuth Provider
- Set up an OAuth provider (e.g., Keycloak, Okta, etc.) to act as the identity provider (IdP).
- Register a new client application for Kafka in the OAuth provider:
  - Set up a client ID and client secret for Kafka clients.
  - Configure the scopes, roles, or claims required for authorization.
  - Enable grant types like Client Credentials or Password (depending on your use case).
- Note down the following details:
  - Authorization Server URL (e.g. https://authlogin.northwind.com/token).
  - Client ID and Client Secret.

Step 2: Configure the Kafka Broker
Enable SASL/OAUTHBEARER authentication by editing the Kafka broker configuration (/config/server.properties):

sasl.enabled.mechanisms=OAUTHBEARER
listener.name.<listener-name>.oauthbearer.sasl.jaas.config=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required \
  oauth.token.endpoint.uri="https://auth.example.com/token" \
  oauth.client.id="kafka-broker-client-id" \
  oauth.client.secret="kafka-broker-client-secret" \
  oauth.scope="kafka-scope";

Replace <listener-name> with your listener (SASL_PLAINTEXT, SASL_SSL) as appropriate.

Configure ACLs (optional): if using authorization, configure ACLs to grant specific permissions to authenticated users.

Restart the Kafka broker to apply the changes:

sudo systemctl restart kafka

Step 3: Configure the Kafka Client
1. Add the required dependencies to your Kafka client application. For Java applications, add the Kafka and OAuth dependencies to your pom.xml or build.gradle. pom.xml example:

<dependency>
  <groupId>org.apache.kafka</groupId>
  <artifactId>kafka-clients</artifactId>
  <version>3.0.0</version>
</dependency>
<dependency>
  <groupId>com.nimbusds</groupId>
  <artifactId>oauth2-oidc-sdk</artifactId>
  <version>9.4</version>
</dependency>

2. Configure OAuth in the Kafka client: specify the SASL mechanism and the OAuth token endpoint in the client configuration.

Properties props = new Properties();
props.put("bootstrap.servers", "broker1:9092,broker2:9092");
props.put("security.protocol", "SASL_SSL");
props.put("sasl.mechanism", "OAUTHBEARER");
props.put("sasl.jaas.config",
    "org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required " +
    "oauth.token.endpoint.uri=\"https://auth.example.com/token\" " +
    "oauth.client.id=\"kafka-client-id\" " +
    "oauth.client.secret=\"kafka-client-secret\";");

3. Implement token retrieval (optional): use an external tool or library to retrieve and manage tokens if you need a custom implementation.

curl -X POST -d "grant_type=client_credentials&client_id=kafka-client-id&client_secret=kafka-client-secret" \
  https://auth.example.com/token

4. Create the Kafka producer/consumer: use the above configuration to initialize a Kafka producer or consumer.

KafkaProducer<String, String> producer = new KafkaProducer<>(props);

Step 4: Test the Authentication
Produce and consume messages to verify OAuth-based authentication:

kafka-console-producer.sh --broker-list <broker-address> --topic <topic-name> --producer.config <client-config>
kafka-console-consumer.sh --bootstrap-server <broker-address> --topic <topic-name> --consumer.config <client-config>

Ensure the logs indicate successful authentication using SASL/OAUTHBEARER.

Step 5: Monitor and Debug
- Check the Kafka broker logs for errors related to OAuth authentication.
- Verify token expiration and renewal mechanisms.
- Ensure the OAuth provider is reachable from the Kafka brokers and clients.

Happy Hadooping. I hope the above steps help in the diagnosis and resolution of your Kafka OAuth issue.
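One extra sanity check that often saves time: verify the token endpoint and the broker's SASL listener independently before debugging the full client. A sketch; the endpoint URL, credentials, broker address, and client.properties file are placeholders for your own values:

```bash
# 1) Can this host obtain a token from the OAuth provider?
#    (URL, client id and secret are placeholders.)
curl -s -X POST \
  -d "grant_type=client_credentials&client_id=kafka-client-id&client_secret=kafka-client-secret" \
  https://auth.example.com/token

# 2) Does the broker's SASL listener accept the same client settings the producer/consumer will use?
#    client.properties holds the security.protocol / sasl.mechanism / sasl.jaas.config entries shown above.
kafka-broker-api-versions.sh \
  --bootstrap-server broker1:9093 \
  --command-config client.properties
```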
12-17-2024
12:41 PM
1 Kudo
@JSSSS The error is this: "java.io.IOException: File /user/JS/input/DIC.txt._COPYING_ could only be written to 0 of the 1 minReplication nodes. There are 3 datanode(s) running and 3 node(s) are excluded in this operation."

According to the log, all 3 DataNodes are excluded (excludeNodes=[192.168.1.81:9866, 192.168.1.125:9866, 192.168.1.8>). With a replication factor of 3, writes should succeed to all 3 DataNodes, otherwise the write fails. The cluster may have under-replicated or unavailable blocks because of the excluded nodes. HDFS cannot use these nodes, possibly due to:
- Disk space issues.
- Write errors or disk failures.
- Network connectivity problems between the NameNode and DataNodes.

1. Verify that the DataNodes are live and connected to the NameNode:

hdfs dfsadmin -report

Look at the "Live nodes" and "Dead nodes" sections. If all 3 DataNodes are excluded, they might show up as dead or decommissioned.

2. Ensure the DataNodes have sufficient disk space for the write operation:

df -h

Look at the HDFS data directories (/hadoop/hdfs/data). If disk space is full, clear unnecessary files or increase disk capacity:

hdfs dfs -rm -r /path/to/old/unused/files

3. View the list of excluded nodes:

cat $HADOOP_HOME/etc/hadoop/datanodes.exclude

If nodes are wrongly excluded, remove their entries from datanodes.exclude and refresh the NameNode to apply the changes:

hdfs dfsadmin -refreshNodes

4. Block placement policy: if the cluster has DataNodes with specific restrictions (e.g., rack awareness), verify the block placement policy:

grep dfs.block.replicator.classname $HADOOP_HOME/etc/hadoop/hdfs-site.xml

Default: org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault

Happy hadooping
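If the checks above all look clean, a block-level report usually narrows it down further; a short sketch using the path from your error message:

```bash
# Block-level health of the directory that failed to write:
# shows under-replicated or missing blocks and which DataNodes hold replicas.
hdfs fsck /user/JS/input -files -blocks -locations

# How the NameNode currently maps DataNodes to racks (relevant if rack awareness is configured).
hdfs dfsadmin -printTopology
```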
12-17-2024
11:53 AM
@Sid17 Can you try this JOLT:

[
  {
    "operation": "shift",
    "spec": {
      "data": {
        "getCarListing": {
          "edges": {
            "*": {
              "node": {
                "carClusterIds": {
                  "*": {
                    "element": {
                      "name": "[].element"
                    },
                    "businessRelations": {
                      "*": {
                        "countries": {
                          "*": {
                            "countryCode": "[].businessRelations[].countryCode"
                          }
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  },
  {
    "operation": "cardinality",
    "spec": {
      "[]": "ONE"
    }
  }
]

Hope it works
12-17-2024
10:31 AM
@denysobukhov If this issue hasn't been resolved, I suspect the HS2 idle timeout and thread pool size. Can you please do the below and share the outcome?

1. Address Server-Side Resource or Timeout Issues

Increase the HiveServer2 idle timeout. By default, HiveServer2 may close idle connections after a certain period. Increase this timeout in the HiveServer2 config:
- hive.server2.idle.session.timeout (default: 600000 ms / 10 minutes). Set it to a larger value, e.g., 3600000 (1 hour).
- hive.server2.idle.operation.timeout (default: 5 minutes for operations). Increase it to match your app's use case.

SET hive.server2.idle.session.timeout=3600000;
SET hive.server2.idle.operation.timeout=3600000;

Adjust the thread pool size. If HiveServer2 runs out of threads to handle requests, it can drop connections:
- Increase hive.server2.threads to a higher value in the HiveServer2 configuration.
- Restart HiveServer2 after the changes.

First check the current number of worker threads against the default hive.server2.thrift.max.worker.threads:

jstack -l <HiveServer2_ProcessId> | grep ".Thread.Stat" | wc -l

Happy hadooping
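To confirm what the running HiveServer2 actually uses (rather than what the config files say), you can print the live values from a session, assuming the properties are not on the restricted list; the JDBC URL below is a placeholder, add your authentication parameters as needed:

```bash
# Print the effective timeout values from a live HS2 session (JDBC URL is a placeholder).
beeline -u "jdbc:hive2://hs2-host:10000/default" \
  -e "SET hive.server2.idle.session.timeout; SET hive.server2.idle.operation.timeout;"
```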
12-17-2024
10:14 AM
@Viki_Nodejs If you haven't resolved this issue, could you try the steps below and revert?

1. Install the Required NPM Packages
Use the hive-driver package for Node.js, which supports HiveServer2 over HTTP/HTTPS.

npm install hive-driver

2. Prerequisites
Ensure you have:
- HiveServer2 URL: includes the hostname and port.
- SSL configuration: paths to your .jks trust store and its password.
- Hive httppath: set to cliservice.
- Authentication details (if required): username/password or Kerberos configuration.

3. Configure the Connection
Here's an example of how to set up the connection using the hive-driver:

const { HiveClient, TCLIServiceTypes } = require('hive-driver');

async function connectToHive() {
  const client = new HiveClient(TCLIServiceTypes);

  // Configure the Hive connection
  const connection = client.connect({
    host: '<HIVE_SERVER_HOSTNAME>', // e.g., hive.example.com
    port: 10001,                    // HiveServer2 port, typically 10001 for HTTPS
    options: {
      path: '/cliservice',          // HTTP path to HiveServer2
      ssl: true,                    // Enable SSL
      sslOptions: {
        rejectUnauthorized: true,   // Ensure certificates are verified
        ca: '<path/to/truststore.pem>' // Convert your JKS truststore to PEM format
      },
      // Authentication
      username: '<YOUR_USERNAME>',
      password: '<YOUR_PASSWORD>',
      // You can add session configurations here
    }
  });

  try {
    // Open the connection
    await connection.openSession();
    console.log('Connected to Hive');

    // Example query
    const result = await connection.executeStatement('SELECT * FROM your_table LIMIT 10');
    console.log(result);

    // Close the session
    await connection.closeSession();
  } catch (error) {
    console.error('Error connecting to Hive:', error);
  } finally {
    // Ensure the connection is closed
    await connection.close();
  }
}

connectToHive();

4. Key Point to Note: SSL Truststore [Very Important]
Hive uses .jks files for its truststore, but hive-driver requires a .pem file for SSL. Convert your .jks file to .pem using the following commands:

keytool -importkeystore -srckeystore truststore.jks -destkeystore truststore.p12 -deststoretype PKCS12
openssl pkcs12 -in truststore.p12 -out truststore.pem -nokeys

I also saw an EAI_FAIL error in the screenshot; this is related to not being able to resolve the DNS. Hope this helps
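Since EAI_FAIL points at name resolution, it is worth verifying DNS and TLS reachability from the Node.js host before digging into the driver itself; a sketch with a placeholder hostname and port:

```bash
# 1) Can this host resolve the HiveServer2 hostname? (hostname is a placeholder)
getent hosts hive.example.com

# 2) Does the HTTPS listener answer and present the expected certificate chain?
openssl s_client -connect hive.example.com:10001 -servername hive.example.com </dev/null | head -n 20
```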
12-17-2024
07:48 AM
@tono425 The error messages you are encountering in NiFi are related to Java's native method access restrictions introduced in newer Java versions (likely Java 17 or higher). These warnings indicate that NiFi (or a dependency like Apache Lucene) is calling restricted native methods that require explicit permission to access low-level operating system functions. The warning mentions java.lang.foreign.Linker::downcallHandle, which is a native method used for low-level interactions with the operating system.

Here are the 3 options you should try to resolve the issue:

Option 1: Add the Java Option to Enable Native Access
Update the NiFi startup configuration to allow unrestricted native access for unnamed modules.
1. Edit the NiFi Java options in the bootstrap.conf file:

nano /opt/nifi/conf/bootstrap.conf

2. Add the following option to the java.arg properties (replace X with the next unused argument number):

java.arg.X=--enable-native-access=ALL-UNNAMED

3. Restart NiFi:

sudo systemctl restart nifi

Option 2: Use a Lower Java Version (if possible)
If NiFi was previously running fine with an earlier Java version (e.g., Java 8 or 11), you can revert to it until you're ready to address the native access changes:
- Update the JAVA_HOME path in NiFi's bootstrap.conf to point to the older Java version.

Option 3: Verify Dependencies
Check whether you are using the latest versions of NiFi and Apache Lucene:
- Some of these warnings may be addressed in recent releases of NiFi or its libraries.
- Consider upgrading NiFi to the latest stable version to ensure compatibility with newer Java versions.

Happy hadooping
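To confirm the change took effect after the restart, you can check which JVM NiFi points at and whether the running process carries the new flag; a sketch, paths and environment depend on your install:

```bash
# Which JVM does the configured JAVA_HOME resolve to?
"$JAVA_HOME/bin/java" -version

# Does the running NiFi process actually carry the native-access flag?
ps -ef | grep -i nifi | grep -o -- '--enable-native-access=[^ ]*'
```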
12-16-2024
02:19 PM
1 Kudo
@sayebogbon Upgrading Cloudera Manager or CDP can sometimes alter TLS/SSL settings. Can you please verify whether TLS/SSL is enabled for the affected services:
- Navigate to Cloudera Manager > Administration > Security.
- Confirm that the keystores and truststores for SSL are correctly configured.
- Validate the keystore and truststore paths in the HDFS configuration:
  - dfs.http.policy
  - dfs.https.enable

Please do the above and revert. Happy hadooping!
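If the Cloudera Manager settings look right but the service still fails, validating the stores and the TLS endpoint directly on the NameNode host can help; a sketch with placeholder paths, password, and port (9871 is the usual Hadoop 3 NameNode HTTPS port):

```bash
# Inspect the truststore the HDFS roles reference (path and password are placeholders;
# take the real values from the role's SSL settings).
keytool -list -v -keystore /opt/cloudera/security/jks/truststore.jks -storepass changeit | head -n 40

# Check that the NameNode HTTPS endpoint presents the expected certificate.
openssl s_client -connect namenode.example.com:9871 -servername namenode.example.com </dev/null | head -n 20
```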
12-16-2024
02:10 PM
1 Kudo
@divyank Have you resolved this issue? If not: the issue you're encountering is common when Kerberos is enabled for HDFS, as it introduces authentication requirements that need to be properly configured. Here's how to diagnose and resolve the problem.

1. Root Cause Analysis
When Kerberos is enabled:
- Authentication: every interaction with HDFS now requires a Kerberos ticket.
- Misconfiguration: the HDFS service or client-side configurations may not be aligned with Kerberos requirements.
- Keytabs: missing or improperly configured keytab files for the HDFS service or for users accessing the service.
- Browser access: the HDFS Web UI may not support unauthenticated access unless explicitly configured.

2. Steps to Resolve

Step 1: Verify Kerberos Configuration
Check the Kerberos principal and keytab file paths for HDFS in Cloudera Manager:
- Navigate to HDFS Service > Configuration.
- Look for settings like:
  - hadoop.security.authentication → should be set to kerberos.
  - dfs.namenode.kerberos.principal → should match the principal defined in the KDC.
  - dfs.namenode.keytab.file → ensure the file exists on the NameNode and has correct permissions.

Step 2: Validate the Kerberos Ticket
Check whether the HDFS service has a valid Kerberos ticket:

klist -kte /path/to/hdfs.keytab

If missing, reinitialize the ticket:

kinit -kt /path/to/hdfs.keytab hdfs/<hostname>@<REALM>

Test HDFS access from the command line:

hdfs dfs -ls /

If you get authentication errors, the Kerberos ticket might be invalid.

Step 3: Validate HDFS Web UI Access
Post-Kerberos, accessing the HDFS Web UI (e.g., http://namenode-host:50070) often requires authentication. By default:
- Unauthenticated access may be blocked.
- Browser integration: ensure your browser is configured for Kerberos authentication, or the UI is set to allow unauthenticated users.

Enable unauthenticated access in Cloudera Manager (if needed):
- Go to HDFS Service > Configuration.
- Search for hadoop.http.authentication.type and set it to simple.

Step 4: Review Logs for Errors
Check the NameNode logs for Kerberos-related errors:

less /var/log/hadoop/hdfs/hadoop-hdfs-namenode.log

Look for errors like:
- "GSSException: No valid credentials provided"
- "Principal not found in the keytab"

Step 5: Synchronize Clocks
Kerberos is sensitive to time discrepancies. Ensure all nodes in the cluster have synchronized clocks:

ntpdate <NTP-server>

Step 6: Restart Services
Restart the affected HDFS services via Cloudera Manager after making changes:
- Restart the NameNode, DataNode, and HDFS services.
- Test the status of HDFS:

hdfs dfsadmin -report

3. Confirm Resolution
Verify HDFS functionality:
- Test browsing HDFS via the CLI:

hdfs dfs -ls /

- Access the Web UI to confirm functionality: http://<namenode-host>:50070

If HDFS is working via the CLI but not in the Web UI, revisit the Web UI settings in Cloudera Manager to allow browser access or configure browser Kerberos support.

4. Troubleshooting Tips
If the issue persists:
- Check the Kerberos ticket validity with:

klist

- Use the following commands to troubleshoot connectivity:

hdfs dfs -mkdir /test
hdfs dfs -put <local-file> /test

Let me know how it goes or if further guidance is needed!
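If the CLI works but the browser does not, hitting the Web UI's WebHDFS endpoint with SPNEGO from the command line separates a Kerberos problem from a browser-configuration problem; a sketch assuming curl was built with GSS support, with a placeholder principal:

```bash
# Obtain a user ticket first (the principal is a placeholder).
kinit someuser@EXAMPLE.COM

# Exercise the Kerberized HTTP endpoint the same way a browser would (SPNEGO negotiation).
curl --negotiate -u : "http://namenode-host:50070/webhdfs/v1/?op=LISTSTATUS"
```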
12-16-2024
01:50 PM
1 Kudo
@rizalt Can you share your layout of the 18 hosts to better understand where the issue could be emanating from?

The issue you are experiencing, where shutting down 8 DataNodes causes both NameNodes in your high availability (HA) configuration to go down, likely points to quorum loss in the JournalNodes or insufficient replicas for critical metadata blocks.

The NameNodes in HA mode rely on JournalNodes for shared edits. For the HA setup to function correctly, the JournalNodes need a quorum (more than half) to be available. With 5 JournalNodes, at least 3 must be operational. If shutting down 8 DataNodes impacted the connectivity or availability of more than 2 JournalNodes, the quorum would be lost, causing both NameNodes to stop functioning.

Similarly, if shutting down 8 DataNodes reduces the number of replicas below the replication factor (typically 3), the metadata might not be available, causing the NameNodes to fail.

Please revert.
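While you gather the host layout, a quick check of whether the JournalNode quorum (rather than the DataNodes themselves) is what gives way; a sketch where nn1/nn2 are placeholders for the NameNode service IDs in your hdfs-site.xml:

```bash
# Current state of each NameNode in the HA pair (service IDs are placeholders).
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2

# On each JournalNode host, confirm the JournalNode process is still running.
jps | grep -i JournalNode
```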