Member since
03-06-2020
292
Posts
26
Kudos Received
20
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 162 | 10-25-2022 04:37 AM |
| | 227 | 09-24-2022 10:12 PM |
| | 354 | 09-07-2022 11:08 PM |
| | 235 | 09-07-2022 07:18 AM |
| | 196 | 09-05-2022 05:05 AM |
12-26-2022
04:51 AM
Hi,

1. Can you check whether the user has sudo/root permissions on these hosts?
2. Can you do an agent hard restart on these hosts and try again?
3. Recheck that the private key is the valid and correct one.

Regards, Chethan YM
12-18-2022
02:39 AM
Hi,

Can you create a separate user in the database for each service and retry? You should NOT use the root user for all databases. Refer to the doc below:

https://docs.cloudera.com/documentation/enterprise/latest/topics/cm_ig_extrnl_pstgrs.html#cmig_topic_5_6_2

Regards, Chethan YM
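A rough sketch of what per-service PostgreSQL users could look like (the user, password, and database names here are invented for illustration; the linked Cloudera doc has the authoritative commands):

```sql
-- One dedicated user and database per service, instead of reusing root everywhere.
CREATE USER hive_user WITH PASSWORD 'hive_password';
CREATE DATABASE metastore OWNER hive_user;
GRANT ALL PRIVILEGES ON DATABASE metastore TO hive_user;
```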
12-18-2022
02:23 AM
Hi,

Where are you redirecting the output to? A CSV file? What is the output from the impala-shell terminal if you do not redirect the output?

It looks like this is caused by the csv module when impala-shell uses it to export the data:

```python
# csv.writer expects a file handle to the input.
# cStringIO is used as the temporary buffer.
temp_buffer = StringIO()
writer = csv.writer(temp_buffer, delimiter=self.field_delim,
                    lineterminator='\n', quoting=csv.QUOTE_MINIMAL)
writer.writerows(rows)
```

It seems we cannot change this, since it would need to be modified at the code level:

https://github.com/apache/impala/blob/014c455aaaa38010ae706228f7b439c080c0bc7d/shell/shell_output.py...

Regards, Chethan YM
12-18-2022
02:02 AM
Hi,

1. Can you try with the -Doraoop.timestamp.string=false property?
2. Can you try the Sqoop type-mapping parameters as per the doc below and see?

https://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html#_java_sql_timestamp:~:text=on%20all%20databases.-,7.2.8.%C2%A0Controlling%20type%20mapping,-Sqoop%20is%20preconfigured

Regards, Chethan YM
12-18-2022
01:57 AM
Hi,

1. Add the below configuration to HDFS from the CM UI (CM -> HDFS -> Configuration):

hadoop.proxyuser.hive.hosts = *
hadoop.proxyuser.hive.groups = *

2. If the above is already set, disable the DB notification API auth in both the Hive and Hive_on_Tez services, via the Hive Service Advanced Configuration Snippet (Safety Valve) for hive-site:

Name: hive.metastore.event.db.notification.api.auth
Value: false

Restart the stale configs and check whether it works.

Regards, Chethan YM
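For reference, the safety-valve entry in step 2 corresponds to a hive-site.xml property of roughly this shape (the name and value are from the post above; the XML wrapper itself is illustrative):

```xml
<property>
  <name>hive.metastore.event.db.notification.api.auth</name>
  <value>false</value>
</property>
```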
11-18-2022
04:43 AM
1 Kudo
Hi,

It looks like the known unresolved issue OPSAPS-60161. Can you disable the canary health check?

https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/cm_ht_hive_metastore_server.html#concept_6qo_fpn_yk

Regards, Chethan YM
11-02-2022
07:28 AM
Hi,

1. As per the document it is service downtime, so I think it is complete Impala service downtime. (However, I haven't seen the issue live.)
2. There is no metric/graph to check "inc_stats_size".
3. If 1 GB is insufficient, try "COMPUTE STATS" instead of "COMPUTE INCREMENTAL STATS".

Regards, Chethan YM
10-25-2022
04:47 AM
Hi @yassan,

I would like to let you know that the default value of the flag inc_stats_size_limit_bytes is set to 200 MB as a safety check, to prevent Impala from hitting the maximum limit for the table metadata. The error reported usually serves as an indication that 'COMPUTE INCREMENTAL STATS' should not be used on that particular table; consider splitting the table and using the regular 'COMPUTE STATS' statement if possible.

However, if you are not able to use the 'COMPUTE STATS' statement, you can try to increase the default limit via the flag inc_stats_size_limit_bytes. It should be set to less than the 1 GB limit, and the value is measured in bytes. Below are the steps:

1. CM > Impala Service > Configuration > search for "Impala Command Line Argument Advanced Configuration Snippet (Safety Valve)".
2. Add --inc_stats_size_limit_bytes= with the value in bytes. For example, to set 400 MB, enter 419430400 (400*1024*1024).
3. Save and restart the Impala service.

Note: If I answered your question please give a thumbs up and accept it as a solution.

Regards, Chethan YM
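As a quick sanity check on the byte arithmetic in step 2 above, a minimal sketch (the helper name is mine, not anything from Impala):

```python
def mb_to_bytes(mb: int) -> int:
    """Convert megabytes to bytes, e.g. for flags like --inc_stats_size_limit_bytes."""
    return mb * 1024 * 1024

# 400 MB expressed in bytes, the example value from the steps above.
print(mb_to_bytes(400))  # 419430400
```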
10-25-2022
04:37 AM
1 Kudo
Hi,

There is a KB article related to this issue; please review it below:

https://community.cloudera.com/t5/Customer/Permission-denied-when-accessing-Hive-tables-from-Spark-in/ta-p/310498

Regards, Chethan YM
10-25-2022
04:31 AM
Hi,

The alert message does not give much information to check. Review the HMS and CM Service Monitor logs related to this issue and provide the stack traces.

Regards, Chethan YM
10-25-2022
04:26 AM
Hi,

As per the Git source below, PauseTransitRunnable is the runnable which is scheduled to run at the configured interval; it checks all bundles to see if they should be paused, un-paused or started:

https://github.com/apache/oozie/blob/master/core/src/main/java/org/apache/oozie/service/PauseTransitService.java#:~:text=PauseTransitRunnable%20is%20the%20runnable%20which%20is%20scheduled%20to%20run%20at%20the%20configured%20interval%2C%20it%20checks%20all%20bundles

It also releases the lock. Are you running this sqoop-import from Oozie? If yes, try to rerun it outside of Oozie and check whether it still gets stuck, and review the corresponding NM and RM logs to see if anything is interrupting it.

Regards, Chethan YM
09-24-2022
10:12 PM
Hi,

There seems to be a UDF for surrogate keys (SK) in Hive. Have you tried it? Is it working?

https://docs.cloudera.com/cdp-private-cloud-base/7.1.6/using-hiveql/topics/hive_surrogate_keys.html

Regards, Chethan YM
09-24-2022
10:03 PM
1 Kudo
Hi,

It looks like it is waiting to insert the data; it may finish after a few minutes. Did it work, or does it still hang for hours? Creating a table while uploading a CSV file usually takes more time.

Regards, Chethan YM
09-16-2022
03:16 AM
Hi,

Below are the suspected causes for this issue:

https://issues.apache.org/jira/browse/YARN-3055
https://issues.apache.org/jira/browse/YARN-2964

Yes, you can set that parameter at the workflow level and test.

Regards, Chethan YM
09-15-2022
02:53 AM
Hi @coco,

Can you follow the below steps in Hue, if you are running the job from Hue?

1. Log in to Hue.
2. Go to Workflows -> Editors -> Workflows.
3. Open the workflow to edit it.
4. On the left-hand pane, click 'Properties'.
5. Under the section 'Hadoop Job Properties', enter 'mapreduce.job.complete.cancel.delegation.tokens' in the Name box and 'true' in the Value box.
6. Save the workflow and submit it.

If you are running from the terminal, add the above property in the configuration section, then rerun the workflow and see if it helps. If this works, please accept it as a solution.

Regards, Chethan YM
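If you maintain the workflow XML directly rather than through Hue, the property from step 5 would live in the workflow's global configuration block, roughly like this (an illustrative fragment, not taken from the original post):

```xml
<global>
  <configuration>
    <property>
      <name>mapreduce.job.complete.cancel.delegation.tokens</name>
      <value>true</value>
    </property>
  </configuration>
</global>
```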
09-07-2022
11:20 PM
Hi @vz,

If both columns are strings, I think you can use concat or concat_ws. Can you check the articles below and see if they help?

https://community.cloudera.com/t5/Support-Questions/HIVE-Concatenates-all-columns-easily/td-p/180208
https://blog.fearcat.in/a?ID=01600-32e80587-5a71-411e-835b-ed905cb1b61a
https://stackoverflow.com/questions/51211278/concatenate-multiple-columns-into-one-in-hive

Note: If my reply answers your question please give a thumbs up and accept it as a solution.

Regards, Chethan YM
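A minimal sketch of both functions (the table and column names are invented for illustration):

```sql
-- concat() joins the values directly; concat_ws() takes a separator as its first argument.
SELECT concat(first_name, last_name)         AS joined,
       concat_ws('-', first_name, last_name) AS joined_with_dash
FROM   customers;
```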
09-07-2022
11:08 PM
1 Kudo
Hi @gocham,

In CDP 7.1.7 only Capacity Scheduler is supported; Fair Scheduler is not. Capacity Scheduler is the default and only supported scheduler, and you must transition from Fair Scheduler to Capacity Scheduler when upgrading your cluster to CDP Private Cloud Base. This is the related Jira from Cloudera: CLR-106983.

Note: If I answered your question please give a thumbs up and accept it as a solution.

Regards, Chethan YM
09-07-2022
07:18 AM
1 Kudo
Hi @Ramesh_hdp,

I do not think we have an option to check the number of users logged into Hive, but we can check how many connections are made to Hive.

1. You can refer to the article below:
https://community.cloudera.com/t5/Support-Questions/How-to-get-number-of-live-connections-with-HiveServer2-HDP-2/td-p/106284
2. If you enable the HiveServer2 UI, then under Active Sessions you can see which user connected from which IP.

Regards, Chethan YM
09-05-2022
05:16 AM
Hi,

Please review the documentation below:

https://docs.cloudera.com/HDPDocuments/DAS/DAS-1.4.5/index.html

As per this, it looks like it only works with Hive and needs a PostgreSQL DB.

Regards, Chethan YM
09-05-2022
05:05 AM
1 Kudo
Hi @Iga21207,

The way this works in catalogd is: when you run refresh commands, they are executed sequentially, and only once one completes does it move on to the next. They do not run in parallel, because this part of catalogd is single-threaded. The catalogd thread takes a lock in getCatalogObjects(). So while a refresh is still in progress and a new request comes in, the catalog throws the error on that table because it cannot get the lock; the lock is still held for the previous table on which the refresh command was running.

I am not sure of your CDH version; this may be resolved in a higher version of CDP/CDH.

Note: If I answered your question please give a thumbs up and accept it as a solution.

Regards, Chethan YM
09-05-2022
04:49 AM
Hi,

1. "Running the workflow for more than 7 days" means, does it run for the entire 7 days each time and then fail? Can you provide the script you are running?
2. Does the shell script work outside of Oozie without issues?
3. Please provide the complete error stack trace you are seeing.

Regards, Chethan YM
08-29-2022
05:21 AM
2 Kudos
Hi,

After you run the query you need to look at the query profile to analyse the complete memory usage. Look for "Per Node Peak Memory Usage" in the profile to understand how much memory each host or Impala daemon used to run this query.

From the snippet on your side, it looks like this query has a 3 GB max limit to run; this can be set at the session level or in the Impala admission control pool. If you provide the complete query profile, I think we can get more details.

Regards, Chethan YM
08-29-2022
05:07 AM
1 Kudo
Hi,

Yes, the Hive metastore is a component that stores all the structural information (metadata) of objects like tables and partitions in the warehouse, including column and column type information, etc.

Regards, Chethan YM

Note: If this answered your question please accept the reply as a solution.
08-29-2022
03:52 AM
Hi,

I do not think we have such a configuration to validate the data; we need to ensure that the data matches the table that we have created.

Regards, Chethan YM
08-29-2022
03:43 AM
1 Kudo
Hi,

There is no limit on the number of databases in the Hive metastore. As a good practice, we do not recommend creating tables with more than 100,000 partitions. In my opinion, you shouldn't have a problem with 2,000 tables. You can expect some performance issues if your total object count is greater than 500,000, but there is no hard limit on the number of Hive/Impala databases/tables that you can have in the cluster.

Regards, Chethan YM
08-16-2022
08:38 AM
1. Below is the document, which has some more details on the same:
https://impala.apache.org/docs/build/html/topics/impala_upsert.html
2. Please let us know what your concerns are.
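The linked page documents Impala's UPSERT statement for Kudu tables; a minimal illustration (the table and column names here are invented):

```sql
-- UPSERT inserts the row if the primary key is new,
-- and updates the non-key columns if the key already exists.
UPSERT INTO kudu_users (id, name) VALUES (1, 'alice');
```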
08-16-2022
08:25 AM
Hi @PNW,

If you want to disable the automatic restart of Cloudera services, visit the configuration page of the service you'd like to manage, then search for "Automatically Restart Process". You should see this option for each role within the service; uncheck it.

Regards, Chethan YM
07-06-2022
07:44 AM
Hello Syed,

Can you try the alternatives command to change the Python version? Check the link below:

https://medium.com/coderlist/how-to-change-default-python-version-on-linux-fedora-28-c22da18bdd6

Regards, Chethan YM
06-30-2022
07:31 AM
Hi,

The attached screenshot does not give many details. Check the application logs, HS2 logs, and metastore logs to find more information. Check whether you have the required resources in the cluster to run the query, whether there are any JVM pauses due to out-of-memory conditions, and whether the tables involved in this job are being written to by anyone else. Then try restarting Hive, rerun the job, and check again.

Regards, Chethan YM
06-23-2022
09:04 AM
Hello,

You need to enable trace-level logging for the ODBC driver and check the logs after reproducing the issue. Check whether there are any socket timeout errors in the trace logs; if yes, refer to the doc below to set the socket timeout:

https://docs.cloudera.com/documentation/other/connectors/impala-odbc/2-6-5/Cloudera-ODBC-Driver-for-Impala-Install-Guide.pdf

Regards, Chethan YM