Java Spark issues casting/converting struct to map from JSON data before insert to HIVE

I am loading a JSON file with Spark in order to insert it into Hive, and this works very well.

Dataset<Row> testjson = spark.read().json("file:///root/test.json").withColumn("timestamp", new Column("timestamp").cast("timestamp"));

My JSON looks like this (I simplified it for readability):

"name": "joe", "age": 30, "hair": "brown", "knowledge": {"java": "average", "php": "good"},

and so on...

As you can see, this JSON contains a nested object that is inserted as a struct by default:

"knowledge": {"java": "average", "php": "good"} 

Now to the problem: I want the knowledge part of my JSON to be inserted into Hive as map<string,string> instead of what I get now, struct<java:string,php:string>. I thought I could do it like this:

.withColumn("knowledge", new Column("knowledge").cast("Map")); // also tried Map<String, String> and similar

but this does not work, because a struct cannot be cast to a map.

This has been bothering me for a while now and I cannot find a solution. I would really appreciate any help!

Here is the whole code:

  public static void main(String[] args) {
    SparkSession spark = SparkSession
            .builder()
            .appName("Test Spark")
            .config("hive.metastore.uris", "thrift://localhost:9083")
            .enableHiveSupport()
            .getOrCreate();

    // Here I cast timestamp in my JSON to timestamp in Hive; this works fine
    Dataset<Row> testjson = spark.read().json("file:///root/test.json").withColumn("timestamp", new Column("timestamp").cast("timestamp"));

    // Assumption: the dataset is registered under this name before it is queried
    testjson.createOrReplaceTempView("testjson");
    Dataset<Row> showAll = spark.sql("SELECT * FROM testjson");
  }

New Contributor

I really have the same problem. Does anyone know the solution?
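
One possible workaround, since a struct cannot be cast to a map directly, is to rebuild the column with functions.map from the struct's own field names and values. The following is only a minimal sketch, not a tested production solution: the class name StructToMapExample, the helper structToMap, the local[1] master and the inline JSON sample are all assumptions made for illustration, and every value is cast to string so that the result is map<string,string>.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import org.apache.spark.sql.Column;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.functions;
import org.apache.spark.sql.types.StructType;

public class StructToMapExample {

    // Hypothetical helper: replaces a struct column with a map<string,string>
    // built from the struct's field names (keys) and field values (values).
    static Dataset<Row> structToMap(Dataset<Row> df, String colName) {
        // Fails with a ClassCastException if the column is not a struct.
        StructType struct = (StructType) df.schema().apply(colName).dataType();

        // functions.map expects alternating key and value columns.
        List<Column> keysAndValues = new ArrayList<>();
        for (String field : struct.fieldNames()) {
            keysAndValues.add(functions.lit(field));
            keysAndValues.add(functions.col(colName).getField(field).cast("string"));
        }
        return df.withColumn(colName, functions.map(keysAndValues.toArray(new Column[0])));
    }

    public static void main(String[] args) {
        // Assumption: a local session for demonstration, without Hive support.
        SparkSession spark = SparkSession.builder()
                .appName("Struct to map sketch")
                .master("local[1]")
                .getOrCreate();

        // Inline sample mirroring the simplified JSON from the question.
        Dataset<String> lines = spark.createDataset(Arrays.asList(
                "{\"name\": \"joe\", \"age\": 30, \"hair\": \"brown\", "
                        + "\"knowledge\": {\"java\": \"average\", \"php\": \"good\"}}"),
                Encoders.STRING());
        Dataset<Row> testjson = spark.read().json(lines);

        // The knowledge column should now be reported as a map, not a struct.
        Dataset<Row> converted = structToMap(testjson, "knowledge");
        converted.printSchema();

        spark.stop();
    }
}
```

In the original program the helper would be applied to testjson before the insert into Hive. Note that a key missing from one record still appears in the map with a null value, because the struct schema is the union of all keys Spark saw. If the set of keys is open-ended, another option that may fit better is to pass an explicit schema to spark.read().schema(...) that declares knowledge as DataTypes.createMapType(DataTypes.StringType, DataTypes.StringType), so the column is read as a map from the start.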
