Member since: 10-11-2022
Posts: 108
Kudos Received: 12
Solutions: 5
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 59 | 05-15-2024 01:32 AM |
| | 48 | 05-12-2024 01:41 AM |
| | 50 | 05-12-2024 01:37 AM |
| | 91 | 05-10-2024 07:31 AM |
| | 106 | 04-29-2024 10:48 PM |
05-18-2024
12:02 AM
1 Kudo
@jpconver2

Challenges:

- NiFi version: While recent versions (1.10.0+) offer improved cluster management, rolling updates can still be challenging if your custom processors introduce flow configuration changes. Nodes still running the old processors won't recognize components from the updated NAR, preventing them from rejoining the cluster until all nodes are in sync.
- Flow compatibility: NiFi requires consistent flow definitions (flow.xml.gz) across all nodes, so updates that alter the flow can disrupt cluster operations during rolling updates.

Solutions:

Scenario a: single NAR version

- Backward compatibility: Prioritize backward compatibility in your custom processors. This minimizes changes to the flow definition and makes rolling updates smoother.
- Full cluster upgrade: If backward compatibility isn't feasible, consider a full cluster upgrade to the new NiFi version and custom processor NAR.

Scenario b: multiple NAR versions

- Manual version management: Update processors manually through the NiFi UI or API after deploying the new NARs. This offers control but requires intervention.
- Custom automation scripts: Develop scripts leveraging NiFi's REST API to automate processor version updates. These scripts can identify custom processor instances, update each processor to the latest available version, then update controller services and restart the affected processors (see the sketch at the end of this post).
- Custom NiFi extensions: Implement custom logic to handle version upgrades, for example a Reporting Task or Controller Service that checks for new versions and updates processors automatically.

Recommendations:

- Upgrade NiFi: If possible, upgrade to NiFi 1.10.0 or later for improved rolling update support.
- Script the updates: Explore scripting with the NiFi REST API to automate processor version updates, especially if you manage multiple NAR versions.

Remember: stay current with NiFi releases to benefit from improvements and fixes, and weigh your specific needs when choosing the approach that best balances downtime and manageability.

Please accept this as a solution if it helps.
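A minimal sketch of the scripted approach, assuming a token-secured NiFi; the host, token, and processor class below are placeholders, not values from your environment:

```bash
#!/usr/bin/env bash
# Hedged sketch: list custom-processor instances and their NAR versions via
# the NiFi REST API. Host, token, and processor type are hypothetical.
NIFI="https://nifi.example.com:8443/nifi-api"   # hypothetical host
TOKEN="..."                                     # bearer token from POST /access/token
TYPE="com.example.MyCustomProcessor"            # hypothetical processor class

# Walk the root process group; "root" is an accepted alias for its id.
curl -sk -H "Authorization: Bearer $TOKEN" \
  "$NIFI/process-groups/root/processors" |
jq -r --arg t "$TYPE" \
  '.processors[] | select(.component.type == $t) |
   "\(.component.id) \(.component.bundle.version)"'

# Each id printed above can then be updated with a PUT to
# $NIFI/processors/<id>, supplying the current revision and the new
# bundle version in the component body.
```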
05-15-2024
01:32 AM
2 Kudos
@galt Altering the ID of a connection in Apache NiFi isn't directly endorsed or recommended, because the ID serves as a unique identifier used internally by NiFi to manage its components. If you absolutely must change it, you could employ a workaround, though it isn't advisable given the potential risks and complications:

1. Backup: Before making any alterations, create a backup of your NiFi flow. This is crucial in case something goes awry and you need to revert to the previous state.
2. Export and modify the flow configuration: Export the NiFi flow configuration, typically in XML format, via the NiFi UI or NiFi's REST API. Then manually adjust the XML to change the connection's ID to the desired value (a shell sketch follows this post).
3. Stop NiFi: Halt the NiFi instance to prevent conflicts or corruption while modifying the configuration files.
4. Replace the configuration: Substitute the existing flow configuration file with the modified one.
5. Restart NiFi: Restart NiFi and confirm that the changes took effect.

Keep the following in mind:

- Risks: Altering the ID directly in the configuration files could result in unexpected behavior or even corruption of your flow. Proceed with caution and ensure you have a backup.
- Dependencies: Any processors or components that rely on this connection ID may break or behave unexpectedly after the change.
- Unsupported: This method isn't officially supported by Apache NiFi, and there's no guarantee it will work without issues.
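A minimal sketch of those steps on a tarball install, assuming default paths; the install path and both IDs are placeholders you must supply:

```bash
# Hedged sketch of the workaround above; NIFI_HOME and the IDs are hypothetical.
NIFI_HOME=/opt/nifi                      # hypothetical install path
OLD_ID="<old-connection-uuid>"
NEW_ID="<new-connection-uuid>"

"$NIFI_HOME/bin/nifi.sh" stop                          # stop NiFi first
cp "$NIFI_HOME/conf/flow.xml.gz" /tmp/flow.xml.gz.bak  # backup before editing
gunzip -c "$NIFI_HOME/conf/flow.xml.gz" > /tmp/flow.xml
sed -i "s/$OLD_ID/$NEW_ID/g" /tmp/flow.xml             # swap the connection id
gzip -c /tmp/flow.xml > "$NIFI_HOME/conf/flow.xml.gz"
"$NIFI_HOME/bin/nifi.sh" start
```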
05-12-2024
01:41 AM
1 Kudo
@ChineduLB The result of a CASE expression has to be a single scalar value, so a THEN branch can't return a whole table. Instead, compute each table's count for the date once and gate the final SELECT on all of them. Something like this (replace 'your_date' with the specific date you're interested in; the query returns table1's rows only when all six tables have data for that date, and no rows otherwise):

```sql
WITH
  c1 AS (SELECT COUNT(*) AS n FROM table1 WHERE date_partition = 'your_date'),
  c2 AS (SELECT COUNT(*) AS n FROM table2 WHERE date_partition = 'your_date'),
  c3 AS (SELECT COUNT(*) AS n FROM table3 WHERE date_partition = 'your_date'),
  c4 AS (SELECT COUNT(*) AS n FROM table4 WHERE date_partition = 'your_date'),
  c5 AS (SELECT COUNT(*) AS n FROM table5 WHERE date_partition = 'your_date'),
  c6 AS (SELECT COUNT(*) AS n FROM table6 WHERE date_partition = 'your_date')
SELECT t1.*
FROM table1 t1
CROSS JOIN c1 CROSS JOIN c2 CROSS JOIN c3
CROSS JOIN c4 CROSS JOIN c5 CROSS JOIN c6
WHERE t1.date_partition = 'your_date'
  AND c1.n > 0 AND c2.n > 0 AND c3.n > 0
  AND c4.n > 0 AND c5.n > 0 AND c6.n > 0;
```
05-12-2024
01:37 AM
@ChineduLB Impala doesn't support returning a result set from a CASE branch: each WHEN/THEN must produce a single scalar value, so `THEN (SELECT * FROM table1)` won't work. You can achieve the same logic by gating each SELECT on row counts and combining the branches with UNION ALL (this assumes table1 and table3 have compatible schemas):

```sql
WITH
  c1 AS (SELECT COUNT(*) AS n FROM table1),
  c2 AS (SELECT COUNT(*) AS n FROM table2),
  c3 AS (SELECT COUNT(*) AS n FROM table3)
SELECT t1.*
FROM table1 t1 CROSS JOIN c1
WHERE c1.n > 0
UNION ALL
SELECT t3.*
FROM table3 t3 CROSS JOIN c1 CROSS JOIN c2 CROSS JOIN c3
WHERE c1.n = 0 AND c2.n > 0 AND c3.n > 0;
```

This checks whether table1 has any rows; if so, it returns all of table1. Otherwise, if both table2 and table3 have rows, it returns all of table3. If neither condition is met, the query returns no rows (the original's NULL case).
05-12-2024
01:32 AM
1 Kudo
@Marks_08

1. Verify that no firewall is blocking incoming connections on ports 10000 (HiveServer2 Thrift) and 10002 (HiveServer2 Web UI). Use tools like `netstat -atup` or `lsof -i :10000` to check whether any process is listening on these ports. If a firewall is restricting access, configure it to allow connections on these ports from the machine where you run Beeline.

2. Double-check the HiveServer2 configuration files (hive-site.xml and hive-env.sh) in Cloudera Manager. Ensure that the hive.server2.thrift.port property is set to 10000 in hive-site.xml, verify that the HIVESERVER2_THRIFT_BIND_HOST environment variable (if set) in hive-env.sh allows connections from your Beeline machine, and make sure the HiveServer2 service has the permissions needed to bind to these ports.

3. On a Kerberized cluster, include the HiveServer2 service principal in the JDBC URL, e.g. `beeline -u "jdbc:hive2://<HOST>:10000/default;principal=hive/<HOST>@<REALM>"` (see the sketch after this list).

4. Try restarting the Hive and HiveServer2 services in Cloudera Manager; this can sometimes resolve conflicts or configuration issues.

5. Check the HiveServer2 log files (usually in /var/log/hive-server2/hive-server2.log) for any error messages that might indicate why it isn't listening on the expected ports.
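Putting the first and third checks together, a minimal sketch; the hostname and realm are placeholders for your environment:

```bash
# Hypothetical host/realm; substitute your HiveServer2 host and Kerberos realm.
sudo lsof -i :10000 -i :10002                 # anything listening on the HS2 ports?
sudo netstat -tlnp | grep -E ':10000|:10002'  # same check via netstat

kinit myuser@EXAMPLE.COM                      # obtain a ticket before connecting
beeline -u "jdbc:hive2://hs2-host.example.com:10000/default;principal=hive/hs2-host.example.com@EXAMPLE.COM"
```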
05-10-2024
07:31 AM
1 Kudo
@snm1523 Check whether this doc helps: https://docs.cloudera.com/cdp-private-cloud-upgrade/latest/upgrade-cdp/topics/ug_cdh_upgrade_cdp2cdp_post.html
05-08-2024
09:43 PM
@Lorenzo The error message "identity[myaduser], groups[] does not have permission to access the requested resource" indicates that Kerberos authentication succeeded, but your user myaduser lacks the permissions needed to access the specific NiFi flow you're targeting in API call N.2.

1. Verify user permissions in NiFi: In the NiFi UI, navigate to the flow you're trying to modify, open the "Policies" tab, and ensure "myaduser" has the appropriate read/write permissions on the flow or specific process group. You may need to add the user to a group with the required permissions.

2. Check Ranger policies (if applicable): If you use Apache Ranger for authorization in your Cloudera cluster, Ranger policies may be restricting access to the NiFi flow. Review the Ranger policies for NiFi resources and verify whether any policy specifically denies access to the flow or process group for "myaduser" or its groups.

3. Kerberos service principal configuration: Double-check the Kerberos service principal configured for NiFi and ensure the principal used for authentication has the necessary permissions in Ranger or the NiFi authorization policies.

4. Test with a more privileged user: Try performing API call N.2 as a user with known administrative privileges in NiFi. If the call succeeds, that confirms the issue lies with "myaduser" permissions (a way to inspect the resolved identity is sketched after this list).
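Note the empty groups[] in the error. A quick way to see what identity NiFi resolves for your Kerberos ticket is the current-user endpoint; a minimal sketch, assuming SPNEGO login is enabled, with the hostname and realm as placeholders:

```bash
kinit myaduser@EXAMPLE.COM   # hypothetical realm; use your AD realm
# --negotiate authenticates with the Kerberos ticket from the cache above
curl -sk --negotiate -u : \
  "https://nifi-host.example.com:8443/nifi-api/flow/current-user"
# Compare the resolved identity in the JSON response against the users and
# groups granted on the flow's policies (or in Ranger).
```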
05-01-2024
10:28 PM
1 Kudo
@VenkataAvinash The error you're encountering (java.lang.RuntimeException: org.apache.storm.thrift.TApplicationException: Internal error processing submitTopologyWithOpts) indicates a problem submitting the Storm topology, but it doesn't directly point to the specific cause. Based on your configuration and the error message, it's likely an issue with the Kerberos authentication setup or configuration for the Storm Nimbus service.

- Review the Kerberos configuration: Double-check the Kerberos configuration for Storm Nimbus and ensure it matches the settings in your storm.yaml file. Verify that the Kerberos principal (hdfs/hari-cluster-test1-master0.avinash.ceje-5ray.a5.cloudera.site@AVINASH.CEJE-5RAY.A5.CLOUDERA.SITE) and keytab file (/root/hdfs.keytab) are correctly specified.
- Check keytab permissions: Ensure that /root/hdfs.keytab has the correct permissions and is accessible by the Storm Nimbus service.
- Verify service principals: Confirm that the principal above is correctly configured for the Storm Nimbus service and has the necessary permissions to access HDFS.
- Check Nimbus logs: Look in nimbus.log for additional error messages or stack traces that might provide more insight into the issue.
- Verify version compatibility: Confirm that the versions of Storm, HDFS, and Kerberos libraries on your cluster are compatible with each other; refer to each component's documentation for known compatibility issues.
- Isolate the problem: Try submitting a simpler topology without the HDFS bolt initially to see if the basic Kerberos configuration works.
- Use a tool like klist to verify that your user has successfully obtained a Kerberos ticket before submitting the topology (see the sketch below).
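For that last check, a minimal sketch using the principal and keytab from your storm.yaml:

```bash
klist -kt /root/hdfs.keytab   # list the principals stored in the keytab
kinit -kt /root/hdfs.keytab \
  hdfs/hari-cluster-test1-master0.avinash.ceje-5ray.a5.cloudera.site@AVINASH.CEJE-5RAY.A5.CLOUDERA.SITE
klist                         # confirm a valid ticket was obtained
```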
05-01-2024
10:23 PM
1 Kudo
@wallacei sqlline-thin.py uses Protobuf serialization to communicate with the Phoenix Query Server (PQS), and Protobuf relies on pre-defined class names to parse responses from the server. The error message suggests sqlline-thin.py is unable to find the class name for a specific response message from PQS.

- Check the PQS configuration: Ensure PQS is configured to use Protobuf serialization as well. This might involve checking configuration files or options during PQS startup (see the sketch after this list).
- Verify library versions: Make sure the versions of sqlline-thin.py and the Phoenix libraries (including PQS) are compatible; inconsistent versions can lead to class-name mismatch issues. Check the sqlline-thin.py documentation for specific version compatibility information.
- Double-check the connection URL in sqlline-thin.py and ensure it points to the correct PQS endpoint (http://localhost:8765 by default).
- Consider sqlline.py (regular JDBC): Since your sqlline.py script works, the basic Phoenix connection is functional; you might use it while troubleshooting the Protobuf issue with sqlline-thin.py.
- Alternative tools: If sqlline-thin.py continues to cause problems, explore other ways to connect to Phoenix, such as the Phoenix JDBC thin client or a GUI client like SQuirreL SQL.
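A minimal sketch of the first and third checks; the config path below is a typical default and the property name is the usual PQS serialization setting, so adjust both to your install:

```bash
# Does the server side agree on Protobuf? (path and property per a typical install)
grep -A1 "phoenix.queryserver.serialization" /etc/hbase/conf/hbase-site.xml

# Point the thin client explicitly at the PQS endpoint
sqlline-thin.py http://localhost:8765
```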
05-01-2024
10:18 PM
1 Kudo
@VTHive Assuming you have a table named your_table with a column named condition, you can extract the variable names using SQL like this (note the third branch splits on ' ne ' rather than '='):

```sql
SELECT TRIM(SUBSTRING_INDEX(SUBSTRING_INDEX(condition, '=', 1), ' ', -1)) AS variable_name
FROM your_table
WHERE condition LIKE '%=%'
UNION
SELECT TRIM(SUBSTRING_INDEX(SUBSTRING_INDEX(condition, ' in ', 1), ' ', -1)) AS variable_name
FROM your_table
WHERE condition LIKE '% in %'
UNION
SELECT TRIM(SUBSTRING_INDEX(SUBSTRING_INDEX(condition, ' ne ', 1), ' ', -1)) AS variable_name
FROM your_table
WHERE condition LIKE '% ne %';
```

The query extracts the variable names from the conditions in the condition column of your table, handling the =, in, and ne operators. Note that SUBSTRING_INDEX is a MySQL function; in Hive or Impala you would use regexp_extract for the same job. Adjust the table and column names accordingly to fit your actual schema.