I have a process that runs Spark, converts the required data into different DataFrames based on the input, and stores it to 8-10 different tables (a single input file has data for multiple tables).
Now I am trying to run an UPDATE statement on the table that Spark inserts data into, and it fails with an ArrayIndexOutOfBoundsException.
So I would like to understand whether there can be a difference between a Hive insert and a Spark insert (Spark version 1.6.3) that is causing this issue.
I have also tried inserting the same data into a different table created and populated from Hive. When I ran the UPDATE against that table, it finished without any issues (the bucketing and partitioning are all the same as on the table Spark inserts into).
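For context, here is a minimal HiveQL sketch of the kind of setup involved; the table name, columns, and bucket count are hypothetical, and I am assuming an ORC transactional table since Hive requires that layout for UPDATE:

```sql
-- Hypothetical table; Hive UPDATE requires a bucketed, transactional ORC table
CREATE TABLE target_tbl (id INT, val STRING)
CLUSTERED BY (id) INTO 4 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional'='true');

-- This UPDATE succeeds when the rows were inserted from Hive,
-- but fails with ArrayIndexOutOfBoundsException when the same
-- rows were inserted by Spark 1.6.3
UPDATE target_tbl SET val = 'updated' WHERE id = 1;
```

The Hive-side insert that works is a plain `INSERT INTO target_tbl SELECT ...` run from the Hive CLI, against a table with identical bucketing and partitioning.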