Created 03-27-2019 02:50 PM
Hi Team,
I am trying following SQL:-
var work__store_level_vend_pack_loc_final_data =
sparksession.read.format("csv") .option("header", "true") .option("delimiter", "|") .option("inferSchema", "true") .load("C:\\Users\\jabin\\Desktop\\project_files\\work__store_level_vend_pack_loc_final_data.txt"); work__store_level_vend_pack_loc_final_data.registerTempTable("work__store_level_vend_pack_loc_final_data_table");
var r1 = sparksession.sqlContext.sql( "SELECT shc_item_id ,'K' as source_owner_cd,item_purchase_status_cd, vendor_package_id,vendor_package_purchase_status_cd,flow_type_cd as vendor_package_flow_type_cd,vendor_carton_qty,vendor_stock_nbr,ksn_package_id,ksn_purchase_status_cd,import_ind,sears_divission_nbr,sears_item_nbr,sears_sku_nbr,scan_based_trading_ind,cross_merchandising_cd,retail_carton_vendor_package_id,vendor_package_owner_cd,can_carry_model_id,'' AS days_to_check_begin_day_qty,'' AS days_to_check_end_day_qty ,dotcom_allocation_ind ,retail_carton_internal_package_qty,allocation_replenishment_cd,shc_item_type_cd,idrp_order_method_cd,source_package_qty as store_source_package_qty,order_duns_nbr FROM work__store_level_vend_pack_loc_final_data_table WHERE flow_type_cd = 'JIT' OR servicing_dc_nbr > '0' ")
// .collect.foreach(println)
now i want to distinct all column of the r1 using sparksession.sqlContext.sql("")
how to do above thing?
Created 04-19-2019 01:11 AM
You can achieve different ways.
Using DataframeAPI:
var work__store_level_vend_pack_loc_final_data = sparksession.read.format("csv") .option("header", "true") .option("delimiter", "|") .option("inferSchema", "true") .load("C:\\Users\\jabin\\Desktop\\project_files\\work__store_level_vend_pack_loc_final_data.txt"); work__store_level_vend_pack_loc_final_data .select("*") .distinct() //distinct on all columns .registerTempTable("work__store_level_vend_pack_loc_final_data_table") //register temp table
(or)
Using sparkSQL:
var work__store_level_vend_pack_loc_final_data = sparksession.read.format("csv") .option("header", "true") .option("delimiter", "|") .option("inferSchema", "true") .load("C:\\Users\\jabin\\Desktop\\project_files\\work__store_level_vend_pack_loc_final_data.txt"); work__store_level_vend_pack_loc_final_data.registerTempTable("work__store_level_vend_pack_loc_final_data_table"); sparksession.sqlContext.sql("select distinct * from work__store_level_vend_pack_loc_final_data_table").show(20,false) //distinct of all columns