Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

BigQuery schema issue

avatar
Explorer

HI

I am using the nifi processor PutBigQueryBatch to put data into a Google Cloud BigQuery table.

However, I am getting the below error with regards to the schema I have defined in the properties

BigQuery Schema Error.PNG

 

Below is the schema definition I have in the properties of the process

[
{
"name": "checknumber",
"type": "INTEGER",
"mode": "NULLABLE"
},
{
"name": "Altertions",
"type": "RECORD",
"mode": "REPEATED",
"fields": [
{
"name": "Alteration_Product_Qty",
"type": "FLOAT",
"mode": "NULLABLE"
},
{
"name": "Alteration_Product_Name",
"type": "STRING",
"mode": "NULLABLE"
},
{
"name": "Altered_Product_Name",
"type": "STRING",
"mode": "NULLABLE"
},
{
"name": "Alteration_Product_Code",
"type": "INTEGER",
"mode": "NULLABLE"
},
{
"name": "Altered_Product_Code",
"type": "INTEGER",
"mode": "NULLABLE"
},
{
"name": "Alteration_Product_Net_Amount",
"type": "FLOAT",
"mode": "NULLABLE"
},
{
"name": "Alteration_Product_Amount",
"type": "FLOAT",
"mode": "NULLABLE"
}
]
},
{
"name": "dob",
"type": "INTEGER",
"mode": "NULLABLE"
},
{
"name": "StoreId",
"type": "STRING",
"mode": "NULLABLE"
},
{
"name": "Order_Mode_Id",
"type": "INTEGER",
"mode": "NULLABLE"
},
{
"name": "checkClosetime",
"type": "TIMESTAMP",
"mode": "NULLABLE"
},
{
"name": "Revenue_Id",
"type": "INTEGER",
"mode": "NULLABLE"
},
{
"name": "Alteration_Flag",
"type": "STRING",
"mode": "NULLABLE"
},
{
"name": "Revenue_Label",
"type": "STRING",
"mode": "NULLABLE"
},
{
"name": "Order_Mode_Label",
"type": "STRING",
"mode": "NULLABLE"
},
{
"name": "StoreName",
"type": "STRING",
"mode": "NULLABLE"
}
]

 

and here is the definition of the table in BigQuery

BigQuery Schema Definition.PNG

 

I think it has something to do with how I have defined the Alterations array in the properties but can't figure out what.

Does anyone know how to define the array correctly in the schema definition in the properties of the PutBigQueryBatch process?

 

Thanks

Tim

2 REPLIES 2

avatar
Explorer

Hi 

Sadly I have still not solved this issue so hopefully some more information and a list of what I have been trying without success will help.

 

Below is the json file in my flow file

 

{    
"Revenue_Label": "Dining Room",     
"StoreName": "STORE A",     
"Revenue_Id": "1",     
"Alteration_Flag": "False",     
"Order_Mode_Label": "DriveThru",     
"checkClosetime": "2019-10-24T13:43:19+13:00",     
"Alterations": [        {            
"Alteration_Product_Code": "211136",             
"Alteration_Product_Net_Amount": 0.0,             
"Altered_Product_Code": "211135",             
"Alteration_Product_Amount": 0.0,             
"Altered_Product_Name": "Burger",             
"Alteration_Product_Name": "Add Sauce",             
"Alteration_Product_Qty": 1.0        }    ],     
"StoreId": "1234",    
 "dob": "20191024",     
 "Order_Mode_Id": "3",     
 "checknumber": "54321"}

 

Below is the schema I am now using 

[
  {"mode": "NULLABLE","name": "Revenue_Label","type": "STRING"},
  {"mode": "NULLABLE","name": "StoreName","type": "STRING"},
  {"mode": "NULLABLE","name": "Revenue_Id", "type": "STRING" },
  {"mode": "NULLABLE","name": "Alteration_Flag","type": "STRING"},
  {"mode": "NULLABLE","name": "Order_Mode_Label","type": "STRING"},
  {"mode": "NULLABLE","name": "checkClosetime","type": "TIMESTAMP" },
  {"mode": "REPEATED",    
   "name": "Altertions",    
   "type": "RECORD",    
     "fields": [      
                {"mode": "NULLABLE","name": "Alteration_Product_Code", "type": "STRING" },
                {"mode": "NULLABLE", "name": "Alteration_Product_Net_Amount", "type": "FLOAT" },
                {"mode": "NULLABLE", "name": "Altered_Product_Code","type": "STRING" },
                {"mode": "NULLABLE", "name": "Alteration_Product_Amount",  "type": "FLOAT" },
                {"mode": "NULLABLE", "name": "Altered_Product_Name", "type": "STRING" },
                {"mode": "NULLABLE", "name": "Alteration_Product_Name",  "type": "STRING" },
                {"mode": "NULLABLE","name": "Alteration_Product_Qty",   "type": "FLOAT"      } 
                ]  
    },    
  {"mode": "NULLABLE","name": "StoreId","type": "STRING"},
  {"mode": "NULLABLE", "name": "dob","type": "STRING"},
  {"mode": "NULLABLE","name": "Order_Mode_Id","type": "STRING"},
  {"mode": "NULLABLE","name": "checknumber","type": "STRING" }
 ]

 

What I have tried:

  1. Removed all the white spaces and carriage return characters from the json flowflie
  2. Tried the array at the start and end of the json file and matched the BigQuery table to this.
  3. Changed the Names of the items to not include underscores.
  4. Built the insert one column at a time and only fails once the array is present.
  5. Used the schema definition generated from Google Cloud BigQuery directly when using the utility "bq show --schema --format=prettyjson.
  6. Defined the mode for each of the array items to REQUIRED instead of NULLABLE.
  7. Removed all the white spaces and carriage return characters from the schema definition in the properties of the PutBigQueryBatch process in Nifi
  8. Within Google BigQuery I created a table manually by importing the above json file and and having it auto detect the schema. Then pointed to this table in nifi
  9. Within Google BigQuery I created a table manually by importing the above json file and and having it defining the schema myself using the above definition. Then pointed to this table in nifi

All resulted in getting the same error as stated above

Note Elsewhere in my nifi process I successfully use the PutBigQueryBatch process and update the table. The difference being that json does not have an array present.

 

I really am at loss on how to solve this issue.

Thanks for any help forthcoming

Tim

avatar
Explorer

I'm facing the same problem, has anyone got a solution?