Member since: 01-19-2017
Posts: 3676
Kudos Received: 632
Solutions: 371
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 858 | 03-23-2025 05:23 AM
 | 475 | 03-17-2025 10:18 AM
 | 1567 | 03-05-2025 01:34 PM
 | 1064 | 03-03-2025 01:09 PM
 | 1268 | 03-02-2025 07:19 AM
03-21-2025
12:16 PM
@PriyankaMondal Looking at your error log, you're experiencing authentication timeouts with the ConsumeIMAP and ConsumePOP3 processors when connecting to Microsoft Office 365.

Possible blockers
- Timeout issue: the primary error is "Read timed out" during authentication, which suggests the connection to Office 365 is established but then times out during the OAUTH2 handshake.
- Microsoft 365 specifics: Microsoft has specific requirements for modern authentication with mail services and has been deprecating basic authentication methods.
- Processor configuration: the OAUTH2 authentication mode is correct for Office 365, but there may be issues with token acquisition or timeout settings.

Possible solutions

1. Check timeout settings. Add these properties to your processor configuration:
mail.imap.connectiontimeout=60000
mail.imap.timeout=60000
mail.pop3.connectiontimeout=60000
mail.pop3.timeout=60000

2. Verify Modern Authentication settings:
- Ensure the account has Modern Authentication enabled in the Microsoft 365 Admin Center.
- Verify the application registration in Azure AD has the correct permissions: IMAP.AccessAsUser.All for IMAP, POP.AccessAsUser.All for POP3, and the offline_access scope for refresh tokens.

3. Update NiFi mail libraries. NiFi's default JavaMail implementation might have compatibility issues with Office 365. Try updating to the latest version of NiFi (if possible), or add Microsoft's MSAL (Microsoft Authentication Library) JAR to NiFi's lib directory.

4. Use a custom SSL Context Service. Microsoft servers might require specific TLS settings: create a Standard SSL Context Service with Protocol set to TLS and reference it in the processor's advanced client settings.

5. Alternative approach: use the Microsoft Graph API. Since Microsoft is moving away from direct IMAP/POP3 access, consider using an InvokeHTTP processor to authenticate against the Microsoft Graph API, then use the Graph API endpoints to retrieve email content.

6. Check proxy settings. If your environment uses proxies, add these properties:
mail.imap.proxy.host=your-proxy-host
mail.imap.proxy.port=your-proxy-port
mail.pop3.proxy.host=your-proxy-host
mail.pop3.proxy.port=your-proxy-port

7. Implementation steps:
- Update the processor configuration with extended timeout values.
- Verify that the OAuth2 settings in the processor match your Azure application registration exactly.
- Check the Microsoft 365 account settings to ensure IMAP/POP3 is enabled with Modern Authentication.
- Consider implementing a token debugging flow (using InvokeHTTP, or the sketch below) to validate token acquisition separately.

Happy hadooping
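As an addendum to the token-debugging suggestion in step 7, here is a minimal Python sketch for validating token acquisition outside NiFi. It assumes a client-credentials app registration in Azure AD and the requests library; TENANT_ID, CLIENT_ID, and CLIENT_SECRET are placeholders for your own values, and delegated flows would use a different grant type.

```python
# Minimal sketch: request an OAuth2 token from the Microsoft identity platform
# to confirm token acquisition works before troubleshooting the NiFi processor.
import requests

TENANT_ID = "your-tenant-id"        # placeholder
CLIENT_ID = "your-client-id"        # placeholder
CLIENT_SECRET = "your-client-secret"  # placeholder

token_url = f"https://login.microsoftonline.com/{TENANT_ID}/oauth2/v2.0/token"
payload = {
    "grant_type": "client_credentials",
    "client_id": CLIENT_ID,
    "client_secret": CLIENT_SECRET,
    # Scope for Office 365 mail protocols; adjust to match your app registration.
    "scope": "https://outlook.office365.com/.default",
}

# If this times out, the "Read timed out" error in NiFi is likely a
# network/proxy problem rather than a processor configuration issue.
resp = requests.post(token_url, data=payload, timeout=60)
resp.raise_for_status()
print(resp.json().get("access_token", "")[:40] + "...")
```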
03-18-2025
07:47 AM
The issue may be due to NiFi's PutDatabaseRecord processor applying a local time-zone conversion during ingestion, causing the 5-hour shift. To fix this, ensure NiFi is explicitly set to handle UTC for both reading and writing (for example, by running the NiFi JVM in UTC). Additionally, using ODBC for PostgreSQL could help by providing better control over time-zone handling during the data transfer, ensuring consistency between MSSQL and PostgreSQL.
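To illustrate why a naive timestamp shifts by exactly the UTC offset, here is a minimal, self-contained Python sketch (assuming a 5-hour offset such as America/New_York in winter); it is not NiFi code, just the underlying conversion that produces the shift.

```python
# Minimal illustration of the shift: a timestamp read without time-zone info,
# interpreted as local time, then written back as UTC, moves by the offset.
from datetime import datetime
from zoneinfo import ZoneInfo  # Python 3.9+

source_value = datetime(2025, 3, 1, 12, 0, 0)  # naive value read from MSSQL
local = source_value.replace(tzinfo=ZoneInfo("America/New_York"))  # interpreted as local
as_utc = local.astimezone(ZoneInfo("UTC"))  # what lands in PostgreSQL

print(source_value.isoformat())  # 2025-03-01T12:00:00
print(as_utc.isoformat())        # 2025-03-01T17:00:00+00:00  (5-hour shift)
```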
03-17-2025
10:18 AM
1 Kudo
@AllIsWell Somehow I feel you have some stale data. Before deleting the process group, fetch its current state using the API to confirm the correct version number:

curl -k -X GET "https://localhost:28443/nifi-api/process-groups/836e216e-0195-1000-d3b8-771b257f1fe6" \
  -H "Authorization: Bearer Token"

Look for the revision object in the response. Its version field should match what you include in your DELETE request.

Update the DELETE request: if the version in the response is not 0, update your DELETE request with the correct version. For example, if the current version is 5, your request should look like this:

curl -k -X DELETE "https://localhost:28443/nifi-api/process-groups/836e216e-0195-1000-d3b8-771b257f1fe6" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer Token" \
  --data '{ "revision": { "version": <value from above> }, "disconnectedNodeAcknowledged": false }'

- Validate the JSON: ensure the payload is valid; you can use a tool like JSONLint to check the structure.
- Check for trailing slashes: use https://localhost:28443/nifi-api/process-groups/836e216e-0195-1000-d3b8-771b257f1fe6 rather than the same URL with a trailing slash.
- Disconnected node acknowledgment: if your NiFi cluster has disconnected nodes, you may need to set disconnectedNodeAcknowledged to true.

Final notes:
- If the issue persists, double-check the API documentation for any changes or additional requirements.
- Ensure the Authorization token is valid and has the necessary permissions to delete the process group.
- If you are using a NiFi version older than 1.12.0, the API behavior might differ slightly, so consult the documentation for your specific version.

A scripted version of the fetch-then-delete flow is sketched below.

Happy hadooping
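If you prefer to script the check, here is a minimal Python sketch using the requests library, the same process-group ID, and a placeholder bearer token. It reads the current revision version and then issues the DELETE. Depending on your NiFi version, the revision may be accepted as query parameters (used here) or as a JSON body as in the curl example above, so adjust accordingly.

```python
# Minimal sketch: fetch the process group's current revision, then delete it
# using that version so the request is not rejected as stale.
import requests

BASE = "https://localhost:28443/nifi-api"
PG_ID = "836e216e-0195-1000-d3b8-771b257f1fe6"
TOKEN = "your-bearer-token"  # placeholder
HEADERS = {"Authorization": f"Bearer {TOKEN}"}

# verify=False mirrors curl's -k flag; prefer a proper truststore in production.
resp = requests.get(f"{BASE}/process-groups/{PG_ID}", headers=HEADERS, verify=False)
resp.raise_for_status()
version = resp.json()["revision"]["version"]

resp = requests.delete(
    f"{BASE}/process-groups/{PG_ID}",
    headers=HEADERS,
    params={"version": version, "disconnectedNodeAcknowledged": "false"},
    verify=False,
)
resp.raise_for_status()
print(f"Deleted process group {PG_ID} at revision {version}")
```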
03-09-2025
05:25 AM
@zeeshanmcs It seems you're having an issue with unavailable tablets in your Kudu table, which is preventing Spark from inserting data. The output from kudu cluster ksck clearly shows the problem: the leader replicas for all tablets in the impala::mrs.NumberofSubs table are on a tablet server that's unavailable. The key issue is that the tablet server with ID 24483fcd36ce45d78d80beb04b3b0cf4 is not running, and it happens to be the leader for all 7 tablets in your table.

Here's how to resolve this:

1. First, check the status of all Kudu tablet servers:
sudo systemctl status kudu-tserver

2. Look specifically for the tablet server with ID 24483fcd36ce45d78d80beb04b3b0cf4:
sudo -u kudu kudu tserver list tstewputil1

3. If the tablet server is down, start it:
sudo systemctl start kudu-tserver

4. If the tablet server is running but not responding, restart it:
sudo systemctl restart kudu-tserver

5. After restarting the tablet server, wait a few minutes for it to rejoin the cluster and for leadership transitions to occur, then check the status again (a scripted health check is sketched below):
sudo -u kudu kudu cluster ksck tstewputil1

If the tablet server is permanently lost or damaged, you'll need to recover the tablets:

a. Check whether you have enough replicas (you should have at least 3 for production):
sudo -u kudu kudu table describe impala::mrs.NumberofSubs tstewputil1

b. If you have other healthy replicas, you can delete the failed server from the cluster and Kudu will automatically recover:
sudo -u kudu kudu tserver delete tstewputil1 <tablet_server_uuid>

c. If this is the only replica and you don't have backups, you may need to: create a new table with the same schema, load data from your source systems, or restore from a backup if available.

If you still have issues after restarting, the problem might be disk-space issues on the tablet server, configuration problems, or network connectivity problems between servers. Check the Kudu tablet server logs for more details:
less /var/log/kudu/kudu-tserver.log

Once the tablet server is back online and healthy, your Spark job should be able to insert data into the table successfully.

Happy hadooping
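For step 5, while waiting for the server to rejoin, a small wrapper around the same ksck command can save repeated manual checks. A minimal Python sketch, assuming the kudu CLI is on the PATH and tstewputil1 is your master address; it simply reruns the check until it passes or gives up.

```python
# Minimal sketch: poll `kudu cluster ksck` until the cluster reports healthy.
import subprocess
import time

MASTER = "tstewputil1"  # Kudu master address from the post
MAX_ATTEMPTS = 10

for attempt in range(1, MAX_ATTEMPTS + 1):
    result = subprocess.run(
        ["sudo", "-u", "kudu", "kudu", "cluster", "ksck", MASTER],
        capture_output=True,
        text=True,
    )
    if result.returncode == 0:
        print(f"Cluster healthy after {attempt} check(s).")
        break
    print(f"Attempt {attempt}: ksck still reporting problems, retrying in 30s...")
    time.sleep(30)
else:
    print("Cluster still unhealthy; inspect /var/log/kudu/kudu-tserver.log.")
```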
03-07-2025
03:24 PM
@Maulz Connecting Python to Cloudera, Hive, and Hue involves using libraries and drivers that interface with HiveServer2, the service that allows remote clients to execute Hive queries. There are several methods to connect Python to Cloudera's ecosystem, particularly to access Hive tables through Hue. I'll detail the most common approaches.

1. Prerequisites
- Cloudera/Hadoop cluster: ensure HiveServer2 is running on your cluster. Default HiveServer2 port: 10000 (verify via Cloudera Manager).
- Python environment: Python 3.6+ installed.
- Authentication: know your authentication method: username/password (non-secure), Kerberos (common in enterprise clusters), or LDAP.

2. Install the required Python libraries

pip install pyhive        # Python interface for Hive
pip install thrift        # Thrift protocol support
pip install sasl          # SASL authentication (for Kerberos)
pip install thrift-sasl   # SASL wrapper for Thrift
pip install pykerberos    # Kerberos support (if needed)

For JDBC-based connections (alternative method):

pip install JayDeBeApi    # JDBC bridge

3. Configure Cloudera/Hive
Via Cloudera Manager: enable HiveServer2 and ensure it's running, and check the HiveServer2 port (default: 10000).
If using Kerberos: ensure Kerberos is configured in Cloudera and obtain a ticket from your keytab:
kinit -kt <keytab_file> <principal>

4. Connecting Python to Cloudera/Hue/Hive

a. Using PyHive, a Python library specifically designed to work with Hive:

from pyhive import hive

# Connect to HiveServer2
conn = hive.Connection(
    host='cloudera_host_name',
    port=10000,              # Default HiveServer2 port
    username='your_username',
    password='your_password',
    database='default',      # Your database name
    auth='LDAP'              # Or 'NONE', 'KERBEROS', 'CUSTOM' depending on your setup
)

# Create a cursor and execute a query
cursor = conn.cursor()
cursor.execute('SELECT * FROM your_table LIMIT 10')

# Fetch results
results = cursor.fetchall()
print(results)

# Close connections
cursor.close()
conn.close()

b. Using an Impala connection (requires the impyla package), if your Cloudera cluster uses Impala:

from impala.dbapi import connect

conn = connect(
    host='cloudera_host_name',
    port=21050,              # Default Impala port
    user='your_username',
    password='your_password',
    database='default'       # Your database name
)
cursor = conn.cursor()
cursor.execute('SELECT * FROM your_table LIMIT 10')
results = cursor.fetchall()
print(results)
cursor.close()
conn.close()

c. Integration with Hue. Hue is a web UI for Hadoop, but you can programmatically interact with Hive via its APIs (limited). For direct Python-Hue integration, use Hue's REST API to execute queries:

import requests

# Hue API endpoint (replace with your Hue server URL)
url = "http://<hue_server>:8888/hue/notebook/api/execute/hive"
headers = {"Content-Type": "application/json"}
data = {
    "script": "SELECT * FROM my_table",
    "dialect": "hive"
}
response = requests.post(
    url,
    auth=('<hue_username>', '<hue_password>'),
    headers=headers,
    json=data
)
print(response.json())

5. Troubleshooting common issues
- Connection refused: verify HiveServer2 is running (netstat -tuln | grep 10000) and check firewall rules.
- Authentication failures: for Kerberos, ensure kinit succeeded; for LDAP, validate credentials.
- Thrift version mismatch: use Thrift v0.13.0 with Hive 3.x.
- Logs: check the HiveServer2 logs in Cloudera Manager (/var/log/hive).

6. Best practices
- Use connection pooling for high-frequency queries.
- For Kerberos, automate ticket renewal with kinit cron jobs.
- Secure credentials using environment variables or Vault (see the sketch below).

Happy hadooping
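Following up on the last best-practice point, a minimal sketch of pulling the Hive credentials from environment variables instead of hard-coding them; HIVE_HOST, HIVE_USER, and the other variable names are hypothetical and can be whatever your deployment uses.

```python
# Minimal sketch: read Hive connection settings from the environment
# rather than embedding credentials in source code.
import os
from pyhive import hive

conn = hive.Connection(
    host=os.environ.get("HIVE_HOST", "cloudera_host_name"),
    port=int(os.environ.get("HIVE_PORT", "10000")),
    username=os.environ["HIVE_USER"],      # fails fast if not set
    password=os.environ["HIVE_PASSWORD"],
    database=os.environ.get("HIVE_DATABASE", "default"),
    auth="LDAP",
)
```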
03-07-2025
06:36 AM
Hi @Shelton, thanks so much for the detailed information. @MattWho, thanks very much for the reply, and apologies for the short info. Based on the information above I was able to create the SSL certificates and generate the keystore and truststore in JKS format. Two things tripped me up: 1. initially I had not added the CA file to the truststore, so I ran into some issues; 2. I had not added the NiFi node entries as initial identities in the authorizers.xml file, so the issue above occurred. I followed the Cloudera posts you had pointed to: https://community.cloudera.com/t5/Support-Questions/insufficient-permissions-untrusted-proxy/m-p/366443#M239582 Based on these I was able to resolve it, and the 3-node cluster with external ZooKeeper came up. I appreciate your kind help and your time here. Much thanks to both 🙂
03-04-2025
12:22 AM
Hi @Shelton, thanks a lot. I went through the troubleshooting steps mentioned above, restarted all the ZooKeeper nodes, and made some configuration changes.

In nifi.properties:
1. nifi.zookeeper.connect.string=zookeepernode1.x.x.x.net:2181,zookeepernode2.x.x.x.net:2181,zookeepernode3.x.x.x.net:2181
2. nifi.zookeeper.ssl.client.auth=none
3. nifi.state.management.embedded.zookeeper.start=false

On ZooKeeper nodes [1-3]:
1. dataDir=/var/lib/zookeeper: the myid file could not be read from this location at first; after restarting the VM it was picked up.
2. Added the server entries, which are mandatory for a clustered setup:
server.1=zookeepernode1.x.x.x.net:2888:3888
server.2=zookeepernode2.x.x.x.net:2888:3888
server.3=zookeepernode3.x.x.x.net:2888:3888

Network configuration:
1. In the network manager, opened the required ports (2181, 2888, and 3888) on all servers.

With the above changes ZooKeeper came up and NiFi is able to connect to the external ZooKeeper cluster. (A quick connectivity check is sketched below.)
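For anyone verifying the same setup, a minimal Python sketch that checks each ZooKeeper node is reachable on its client port and answers the built-in ruok four-letter command. The hostnames are the placeholders from the post, and on newer ZooKeeper versions ruok must be allowed via 4lw.commands.whitelist for this to return "imok".

```python
# Minimal sketch: confirm each ZooKeeper node accepts connections on 2181
# and answers the 'ruok' health probe with 'imok'.
import socket

NODES = [
    "zookeepernode1.x.x.x.net",
    "zookeepernode2.x.x.x.net",
    "zookeepernode3.x.x.x.net",
]

for host in NODES:
    try:
        with socket.create_connection((host, 2181), timeout=5) as sock:
            sock.sendall(b"ruok")
            reply = sock.recv(16)
        status = "OK" if reply == b"imok" else f"unexpected reply: {reply!r}"
    except OSError as exc:
        status = f"unreachable ({exc})"
    print(f"{host}:2181 -> {status}")
```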
03-02-2025
07:19 AM
@drewski7 The error message says there's no EntityManager with an actual transaction available, which suggests that the code trying to persist the user isn't running within a transactional context. In Spring applications, methods that modify the database usually need to be annotated with @Transactional to ensure they run within a transaction.

Looking at the stack trace, the error occurs in XUserMgr$ExternalUserCreator.createExternalUser, which calls UserMgr.createUser, which in turn uses BaseDao.create. The create method in BaseDao is trying to persist an entity but there's no active transaction, so the createUser method (or the code calling it) may not be properly transactional. This worked in version 2.4.0, so something must have changed in 2.5.0: perhaps the upgrade changed how transactions are managed, a method that was previously transactional no longer is, or the transaction boundaries have shifted.

Step 1: Verify database schema compatibility
Ranger 2.5.0 may require schema updates. Ensure the database schema is compatible with the new version:
- Check the upgrade documentation: review the Ranger 2.5.0 release notes for required schema changes. For example, when migrating from 2.4.0 to 2.5.0 you may need to run SQL scripts like x_portal_user_DDL.sql or apache-ranger-2.5.0-schema-upgrade.sql.
- Run the schema upgrade scripts: locate them in the Ranger installation directory (ranger-admin/db/mysql/patches) and apply them:
mysql -u root -p ranger < apache-ranger-2.5.0-schema-upgrade.sql
- Validate the schema: confirm that the x_portal_user table exists and has the expected columns (e.g., login_id, user_role).

Step 2: Check the transaction management configuration
The error suggests a missing @Transactional annotation or a misconfigured transaction manager in Ranger 2.5.0:
- Review code/configuration changes: compare the transaction management configuration between Ranger 2.4.0 and 2.5.0. Key files:
ranger-admin/ews/webapp/WEB-INF/classes/conf/application.properties
ranger-admin/ews/webapp/WEB-INF/classes/spring-beans.xml
- Ensure transactional annotations: in Ranger 2.5.0, the createUser method in UserMgr.java (or its caller) must be annotated with @Transactional so the database operations run in a transaction:
@Transactional
public void createUser(...) { ... }
- Debug transaction boundaries: enable transaction logging in log4j.properties to trace transaction activity:
log4j.logger.org.springframework.transaction=DEBUG
log4j.logger.org.springframework.orm.jpa=DEBUG

Step 3: Manually create the user (temporary workaround)
If the user drew.nicolette is missing from x_portal_user, manually insert it into the database:
INSERT INTO x_portal_user (login_id, password, user_role, status)
VALUES ('drew.nicolette', 'LDAP_USER_PASSWORD_HASH_IF_APPLICABLE', 'ROLE_USER', 1);
Note: this bypasses the transaction error but is not a permanent fix.

Step 4: Verify the LDAP configuration
Ensure the LDAP settings in ranger-admin/ews/webapp/WEB-INF/classes/conf/ranger-admin-site.xml are correct for Ranger 2.5.0:
<property>
  <name>ranger.authentication.method</name>
  <value>LDAP</value>
</property>
<property>
  <name>ranger.ldap.url</name>
  <value>ldap://your-ldap-server:389</value>
</property>

Step 5: Check for known issues
- Apache Ranger JIRA: search for issues like RANGER-XXXX related to transaction management in Ranger 2.5.0.
- Apply patches: if a patch exists (e.g., for missing @Transactional annotations), apply it to the Ranger 2.5.0 codebase.

Step 6: Test with a new user
Attempt to log in with a different LDAP user to see whether the issue is specific to drew.nicolette or systemic. If the error persists for all users, focus on transaction configuration or schema issues. If only drew.nicolette fails, check for conflicts in the x_portal_user table (e.g., duplicate entries). A quick way to compare two users against the Ranger REST API is sketched below.

Final checks
- Logs: monitor ranger-admin.log and catalina.out for transaction-related errors after applying fixes.
- Permissions: ensure the database user has write access to the x_portal_user table.
- Dependencies: confirm that the Spring and JPA library versions match the Ranger 2.5.0 requirements.

Happy hadooping
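To support Step 6, a minimal Python sketch that attempts the same basic-auth call against the Ranger Admin REST API as two different LDAP users, so you can see whether authentication (and the user-creation path) fails for everyone or just one account. The base URL, port, and the /service/xusers/users endpoint are assumptions based on a typical Ranger Admin setup; adjust them to your installation, and never commit real passwords.

```python
# Minimal sketch: try the same Ranger Admin REST call as two different LDAP
# users to distinguish a systemic failure from a single-account problem.
import requests

RANGER_URL = "https://ranger-admin.example.com:6182"  # placeholder
ENDPOINT = "/service/xusers/users"                    # assumed Ranger Admin endpoint

for user, password in [("drew.nicolette", "password1"), ("another.ldap.user", "password2")]:
    resp = requests.get(
        RANGER_URL + ENDPOINT,
        auth=(user, password),
        verify=False,   # only for testing; use the proper CA bundle in production
        timeout=30,
    )
    print(f"{user}: HTTP {resp.status_code}")
```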
02-12-2025
01:42 AM
@0tto Could you please share your NiFi logs? Happy hadooping
02-04-2025
06:35 AM
Check the Beeline console output and the HiveServer2 logs to identify where it gets stuck, and act accordingly.