Hello,
here is a way to do it using pyspark, it may be not optimal.
I used this csv to test my code
column1,column2,column3
row1-1|row1-2|row1-3
row2-1|row2-2|row2-3
row3-1|row3-2|row3-3
Load the header only, giving the dataframe structure
header_dataframe = spark.read.format("csv").option("header", "true").load('/tmp/test.csv').limit(0)
+-------+-------+-------+
|column1|column2|column3|
+-------+-------+-------+
+-------+-------+-------+
Load the data as RDD, remove the first line and convert it to dataframe
data_rdd = sc.textFile('/tmp/test.csv')
header_row =data_rdd.first()
data_rdd = data_rdd.filter(lambda row:row != header_row)
data_dataframe = data_rdd.map(lambda x: x.split("|")).toDF()
+------+------+------+
| _1| _2| _3|
+------+------+------+
|row1-1|row1-2|row1-3|
|row2-1|row2-2|row2-3|
|row3-1|row3-2|row3-3|
+------+------+------+
Append the dataframe containing the data to the dataframe holding the structure
dataframe = header_dataframe.union(data_dataframe)
+-------+-------+-------+
|column1|column2|column3|
+-------+-------+-------+
| row1-1| row1-2| row1-3|
| row2-1| row2-2| row2-3|
| row3-1| row3-2| row3-3|
+-------+-------+-------+