
How to merge or transpose spark data frames to get the output in expected schema?


Hi All,

 

We have a few data frames, as shown below:

 

dataframe 1:
+----------+-----------------+----------------+-----------+--------------+
|id_minimum|firstname_minimum|lastname_minimum|dob_minimum|salary_minimum|
+----------+-----------------+----------------+-----------+--------------+
|        16|           satish|            null| 2014-12-11|5.6788990123E7|
+----------+-----------------+----------------+-----------+--------------+

dataframe 2:

+---------------+----------------------+---------------------+----------------+-------------------+
|id_unique_count|firstname_unique_count|lastname_unique_count|dob_unique_count|salary_unique_count|
+---------------+----------------------+---------------------+----------------+-------------------+
|              3|                     3|                    2|               3|                  3|
+---------------+----------------------+---------------------+----------------+-------------------+

dataframe 3:

+----------+-----------------+----------------+-----------+--------------+
|id_maximum|firstname_maximum|lastname_maximum|dob_maximum|salary_maximum|
+----------+-----------------+----------------+-----------+--------------+
|        16|           satish|            null| 2014-12-11|5.6788990123E7|
+----------+-----------------+----------------+-----------+--------------+

dataframe 4:

+--------------+---------------------+--------------------+---------------+------------------+
|id_nulls_count|firstname_nulls_count|lastname_nulls_count|dob_nulls_count|salary_nulls_count|
+--------------+---------------------+--------------------+---------------+------------------+
|             3|                    3|                   3|              3|                 3|
+--------------+---------------------+--------------------+---------------+------------------+

 

Is there any way to merge or transpose these data frames into a single data frame and produce the output in the expected schema format?

Expected schema format:
column name,null_count,blank_count,min,max,unique_count,duplicate_count

Sample data:
id,3,45,56,65,67,2
first_name,3,45,arjun,zebra,56,3
last_name,3,45,henrick,xerox,67,5

Thanks in advance.
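
One possible approach, sketched here under assumptions rather than as a definitive answer: since each stats frame above holds a single row, its column names and values (for example from `df.columns` and `df.first()`) can be brought to the driver and pivoted in plain Python. The function name `transpose_stats`, the suffix mapping, and the output field names below are hypothetical. The frames shown have no source for blank_count or duplicate_count, so those fields are left as None.

```python
# Suffix on the input column names -> field name in the expected output.
# The suffixes come from the frames in the question; the mapping itself is an assumption.
SUFFIX_TO_FIELD = {
    "_nulls_count": "null_count",
    "_unique_count": "unique_count",
    "_minimum": "min",
    "_maximum": "max",
}
OUTPUT_FIELDS = ["column_name", "null_count", "blank_count",
                 "min", "max", "unique_count", "duplicate_count"]


def transpose_stats(frames):
    """Pivot single-row stats frames into one row per source column.

    frames: iterable of (column_names, row_values) pairs, one pair per frame.
    Returns a list of tuples ordered like OUTPUT_FIELDS.
    """
    result = {}
    for columns, values in frames:
        for name, value in zip(columns, values):
            for suffix, field in SUFFIX_TO_FIELD.items():
                if name.endswith(suffix):
                    base = name[: -len(suffix)]  # e.g. "id_minimum" -> "id"
                    row = result.setdefault(base, dict.fromkeys(OUTPUT_FIELDS))
                    row["column_name"] = base
                    row[field] = value
                    break
    return [tuple(row[f] for f in OUTPUT_FIELDS) for row in result.values()]


# The four frames from the question, written as (columns, values) pairs.
cols = ["id", "firstname", "lastname", "dob", "salary"]
extremes = [16, "satish", None, "2014-12-11", 5.6788990123E7]
frames = [
    ([c + "_minimum" for c in cols], extremes),
    ([c + "_unique_count" for c in cols], [3, 3, 2, 3, 3]),
    ([c + "_maximum" for c in cols], extremes),
    ([c + "_nulls_count" for c in cols], [3, 3, 3, 3, 3]),
]
rows = transpose_stats(frames)
for r in rows:
    print(r)
```

The resulting tuples could then be passed to `spark.createDataFrame` together with an explicit schema. Because min and max mix numbers, strings and dates across rows, they would probably need to be cast to strings first, since a Spark column holds a single type.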