Pyodbc Impala not able to find external hive tables created.

Explorer

Hello community,

 

I'm writing a Python script that connects to Hive and Impala through the 64-bit ODBC driver. I want to create an external table using the Hive connection and then run some faster-than-Hive queries using an Impala connection. After creating the tables, I can see and query them in both the Hive and Impala editors in Hue. However, the newly created tables are neither visible nor available through the Impala ODBC connection.

 

I also tried INVALIDATE METADATA <table_name>, but that did not help either.

 

Any pointers would be greatly appreciated.

 

Luis

1 ACCEPTED SOLUTION

Expert Contributor

Hi Luis,

There are a couple of suggestions I can make to help narrow down what the cause may be.

  • Are you able to run queries against these tables in Hue using both Impala and Hive? Beyond just seeing the tables in the navigation on the left, are you actually able to SELECT from them using both engines?
  • Are you using the same user in both Hue and the ODBC connection? I ask because if a user is not allowed to access a table by Sentry, that table will not appear when you run SHOW TABLES.
  • Have you run USE <db_name> in your ODBC SQL session prior to running SHOW TABLES or running a SQL query?
  • Have you tried a SELECT statement on the table even if the metadata is not showing when you run SHOW TABLES? Try prefixing the database name to the table, e.g. db_name.table_name.

As always, you should also check the logs for clues as to why it is not working, and confirm that the version of the ODBC driver is compatible with the version of Impala you are using.
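The checks in the list above can be sketched as a small helper that runs against any DB-API cursor, such as one obtained from pyodbc. This is only a rough sketch: the function name diagnose_table and the report layout are made up for illustration, and the DSN name in the usage comment is hypothetical.

```python
def diagnose_table(cursor, db_name, table_name):
    """Run USE, SHOW TABLES and a qualified SELECT, and report what was found."""
    qualified = f"{db_name}.{table_name}"

    # Switch to the database explicitly rather than relying on the DSN default.
    cursor.execute(f"USE {db_name}")

    # Is the table listed at all in this session?
    cursor.execute("SHOW TABLES")
    visible = {row[0].lower() for row in cursor.fetchall()}
    report = {
        "listed": table_name.lower() in visible,
        "selectable": False,
        "error": None,
    }

    # Try a SELECT with the database prefix, even if the table is not listed.
    try:
        cursor.execute(f"SELECT 1 FROM {qualified} LIMIT 1")
        cursor.fetchall()
        report["selectable"] = True
    except Exception as exc:  # the driver raises its own error types
        report["error"] = str(exc)
    return report


# Example use with pyodbc (the DSN name is hypothetical):
#   import pyodbc
#   conn = pyodbc.connect("DSN=ImpalaDSN", autocommit=True)
#   print(diagnose_table(conn.cursor(), "rpm_usawt",
#                        "repl_ionprogress_patient_assignment"))
```

If "listed" is False and the SELECT also fails, the session most likely cannot see the table at all, which points at metadata, permissions, or the connection target rather than at the query itself.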

 

Kind regards,

Jim

 

 

Explorer

Hi Jim,

 

Thank you for the quick response.

 

  • Are you able to run queries against these tables in Hue using both Impala and Hive? Beyond just seeing the tables in the navigation on the left, are you actually able to SELECT from them using both engines?
    • YES
  • Are you using the same user in both Hue and the ODBC connection? I ask because if a user is not allowed to access a table by Sentry, that table will not appear when you run SHOW TABLES.
    • Using a service account that has full access for both Impala and Hive. It is the same service account that creates the table in Hive.
  • Have you run USE <db_name> in your ODBC SQL session prior to running SHOW TABLES or running a SQL query?
    • Everything I do is within a single DB, specified in the connection string. I have tried querying with the FROM <db>.<table> notation and with the bare table name, without luck.
  • Have you tried a SELECT statement on the table even if the metadata is not showing when you run SHOW TABLES? Try prefixing the database name to the table, e.g. db_name.table_name.
    • ^^

Explorer

@Jim Halfpenny 

 

This is the message that I get:

 

ProgrammingError: ('42000', "[42000] [Cloudera][ImpalaODBC] (360) Syntax error occurred during query execution: [HY000] : AnalysisException: Could not resolve table reference: 'rpm_usawt.repl_ionprogress_patient_assignment'\n (360) (SQLExecDirectW)")

Expert Contributor

Hi Luis,

If you run SHOW TABLES in the Impala ODBC session, do you see the list of tables? Are only the external tables missing? Can you try running SHOW CURRENT ROLES to see whether the list of Sentry roles matches what you expect? I would also recommend checking the Impala logs, which should shed more light on whether the table is not visible or you are being denied access.
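One way to answer the "which tables are missing" question is to collect the SHOW TABLES output from the Hive session and from the Impala session and compare the two. A minimal sketch, assuming both lists have already been fetched; the helper name missing_tables and the "patients" table are invented for illustration:

```python
def missing_tables(hive_tables, impala_tables):
    """Return tables that the Hive session lists but the Impala session does not."""
    hive = {name.lower() for name in hive_tables}
    impala = {name.lower() for name in impala_tables}
    return sorted(hive - impala)


# "patients" is a made-up name; the second table comes from the error message above.
print(missing_tables(
    ["patients", "repl_ionprogress_patient_assignment"],
    ["patients"],
))
# prints ['repl_ionprogress_patient_assignment']
```

If every recently created table shows up in the difference, regardless of whether it is external, it is worth checking that both sessions really talk to the same cluster.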

 

Kind regards,

Jim

Explorer

Yes, only the recently created tables are missing. When I run SHOW CURRENT ROLES I get: 

 

s_cfyu_aim_admin

 

That role should have all the necessary permissions in place.

This is what I get in the Logs:

 

I0302 17:10:54.746717 45191 webserver.cc:361] Webserver: error reading: Resource temporarily unavailable
I0302 17:10:54.994223  9563 thrift-util.cc:123] TAcceptQueueServer: Caught TException: No more data to read.

 

They are not very telling to me...

 

Thank you. 

Explorer

Sorry for the late response. After giving up for a few days, I realized my mistake: in the .odbc.ini file where I keep my Hive and Impala DSNs, the Hive host was pointing to dev and the Impala host to prod. The tables were being created on one cluster and queried on the other.
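For anyone hitting the same thing, a quick way to spot this is to print the host configured for each DSN. Here is a minimal sketch using Python's configparser. The DSN names, host names, and ports in the sample are hypothetical, and it assumes each DSN section carries a HOST key. Note that the Hive and Impala hosts can legitimately differ within one cluster, so this only lists them for a manual check.

```python
import configparser


def dsn_hosts(ini_text):
    """Map each DSN section of an odbc.ini-style text to its HOST value."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(ini_text)
    hosts = {}
    for section in parser.sections():
        # configparser lowercases keys, so HOST, Host and host all match.
        if parser.has_option(section, "host"):
            hosts[section] = parser.get(section, "host").strip()
    return hosts


# Hypothetical sample that reproduces the dev/prod mix-up.
SAMPLE_INI = """
[ODBC Data Sources]
HiveDSN=Hypothetical Hive driver
ImpalaDSN=Hypothetical Impala driver

[HiveDSN]
HOST=dev-cluster.example.com
PORT=10000

[ImpalaDSN]
HOST=prod-cluster.example.com
PORT=21050
"""

for name, host in dsn_hosts(SAMPLE_INI).items():
    print(name, "->", host)
```

To check a real file, read its contents and pass the text to dsn_hosts instead of SAMPLE_INI.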

 

Silly me.

 

Thank you for the quick response community.