Databricks Error Inquiry: org.apache.spark.SparkException: Unable to fetch tables of db default

New Contributor

Hello. I am trying to run exploratory PySpark code in a Databricks Apache Spark environment. I am fairly sure this syntax is correct, but the error referenced in the subject (org.apache.spark.SparkException: Unable to fetch tables of db default) keeps being thrown. Any insights, please?

databaseName = "database"
desiredColumn = "variable"

# List all tables registered in the target database
database = spark.sql(f"show tables in {databaseName}").collect()
display(database)

# Collect the names of tables that contain the desired column
tablenames = []
for row in database:
    listColumns = spark.table(row.tableName).columns
    if desiredColumn in listColumns:
        tablenames.append(row.tableName)
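
For reference, the same scan can also be expressed with the spark.catalog API, which reads column metadata from the catalog instead of loading each table as a DataFrame. This is only a sketch and untested in this environment:

databaseName = "database"
desiredColumn = "variable"

tablenames = []
# listTables/listColumns query the catalog metadata directly
for t in spark.catalog.listTables(databaseName):
    cols = [c.name for c in spark.catalog.listColumns(t.name, t.database)]
    if desiredColumn in cols:
        tablenames.append(t.name)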

3 Replies

Community Manager

@JN_000 Welcome to the Cloudera Community!

To help you get the best possible solution, I have tagged our Spark expert @Bharati, who may be able to assist you further.

Please keep us updated on your post, and we hope you find a satisfactory solution to your query.


Regards,

Diana Torres,
Community Moderator



New Contributor

Thank you so much!

Super Collaborator

We verified the same logic in a CDP environment, as we are not certain about the Databricks Spark environment.

Since we have a mix of managed and external tables, we extracted the necessary information through HWC (Hive Warehouse Connector).
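
For reference, the HWC session setup looks roughly like this (a sketch; the exact import and configuration depend on the HWC jar and pyspark_llap zip shipped with your CDP version):

from pyspark_llap import HiveWarehouseSession

# Build an HWC session on top of the existing SparkSession
hive = HiveWarehouseSession.session(spark).build()

# List tables through HWC rather than the Spark catalog
hive.setDatabase("default")
hive.showTables().show()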

>>> database=spark.sql("show tables in default").collect()
23/07/20 10:04:45 INFO rule.HWCSwitchRule: Registering Listeners
23/07/20 10:04:47 WARN conf.HiveConf: HiveConf of name hive.masking.algo does not exist
Hive Session ID = e6f70006-0c2e-4237-9a9e-e1d19901af54
>>> desiredColumn="name"
>>> tablenames = []
>>> for row in database:
...  cols = spark.table(row.tableName).columns
...  listColumns= spark.table(row.tableName).columns
...  if desiredColumn in listColumns:
...   tablenames.append(row.tableName)
...
>>>
>>> print("\n".join(tablenames))
movies
tv_series_abc
cdp1
tv_series
spark_array_string_example
>>>
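
If the original loop still fails on Databricks, one way to narrow it down (a sketch, not verified there) is to qualify each table with its database and catch per-table failures, so the scan reports the offending table instead of aborting:

tablenames = []
for row in database:
    # On newer Spark versions SHOW TABLES returns "namespace" instead of "database"
    fullName = f"{row.database}.{row.tableName}"
    try:
        if desiredColumn in spark.table(fullName).columns:
            tablenames.append(row.tableName)
    except Exception as err:
        # A broken or inaccessible table surfaces here instead of killing the scan
        print(f"Could not read {fullName}: {err}")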