- Subscribe to RSS Feed
- Mark Question as New
- Mark Question as Read
- Float this Question for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
BigQuery schema issue
- Labels:
-
Apache NiFi
Created ‎10-21-2019 03:44 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
HI
I am using the nifi processor PutBigQueryBatch to put data into a Google Cloud BigQuery table.
However, I am getting the below error with regards to the schema I have defined in the properties
Below is the schema definition I have in the properties of the process
[
{
"name": "checknumber",
"type": "INTEGER",
"mode": "NULLABLE"
},
{
"name": "Altertions",
"type": "RECORD",
"mode": "REPEATED",
"fields": [
{
"name": "Alteration_Product_Qty",
"type": "FLOAT",
"mode": "NULLABLE"
},
{
"name": "Alteration_Product_Name",
"type": "STRING",
"mode": "NULLABLE"
},
{
"name": "Altered_Product_Name",
"type": "STRING",
"mode": "NULLABLE"
},
{
"name": "Alteration_Product_Code",
"type": "INTEGER",
"mode": "NULLABLE"
},
{
"name": "Altered_Product_Code",
"type": "INTEGER",
"mode": "NULLABLE"
},
{
"name": "Alteration_Product_Net_Amount",
"type": "FLOAT",
"mode": "NULLABLE"
},
{
"name": "Alteration_Product_Amount",
"type": "FLOAT",
"mode": "NULLABLE"
}
]
},
{
"name": "dob",
"type": "INTEGER",
"mode": "NULLABLE"
},
{
"name": "StoreId",
"type": "STRING",
"mode": "NULLABLE"
},
{
"name": "Order_Mode_Id",
"type": "INTEGER",
"mode": "NULLABLE"
},
{
"name": "checkClosetime",
"type": "TIMESTAMP",
"mode": "NULLABLE"
},
{
"name": "Revenue_Id",
"type": "INTEGER",
"mode": "NULLABLE"
},
{
"name": "Alteration_Flag",
"type": "STRING",
"mode": "NULLABLE"
},
{
"name": "Revenue_Label",
"type": "STRING",
"mode": "NULLABLE"
},
{
"name": "Order_Mode_Label",
"type": "STRING",
"mode": "NULLABLE"
},
{
"name": "StoreName",
"type": "STRING",
"mode": "NULLABLE"
}
]
and here is the definition of the table in BigQuery
I think it has something to do with how I have defined the Alterations array in the properties but can't figure out what.
Does anyone know how to define the array correctly in the schema definition in the properties of the PutBigQueryBatch process?
Thanks
Tim
Created ‎11-03-2019 05:29 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi
Sadly I have still not solved this issue so hopefully some more information and a list of what I have been trying without success will help.
Below is the json file in my flow file
{
"Revenue_Label": "Dining Room",
"StoreName": "STORE A",
"Revenue_Id": "1",
"Alteration_Flag": "False",
"Order_Mode_Label": "DriveThru",
"checkClosetime": "2019-10-24T13:43:19+13:00",
"Alterations": [ {
"Alteration_Product_Code": "211136",
"Alteration_Product_Net_Amount": 0.0,
"Altered_Product_Code": "211135",
"Alteration_Product_Amount": 0.0,
"Altered_Product_Name": "Burger",
"Alteration_Product_Name": "Add Sauce",
"Alteration_Product_Qty": 1.0 } ],
"StoreId": "1234",
"dob": "20191024",
"Order_Mode_Id": "3",
"checknumber": "54321"}
Below is the schema I am now using
[ {"mode": "NULLABLE","name": "Revenue_Label","type": "STRING"}, {"mode": "NULLABLE","name": "StoreName","type": "STRING"}, {"mode": "NULLABLE","name": "Revenue_Id", "type": "STRING" }, {"mode": "NULLABLE","name": "Alteration_Flag","type": "STRING"}, {"mode": "NULLABLE","name": "Order_Mode_Label","type": "STRING"}, {"mode": "NULLABLE","name": "checkClosetime","type": "TIMESTAMP" }, {"mode": "REPEATED", "name": "Altertions", "type": "RECORD", "fields": [ {"mode": "NULLABLE","name": "Alteration_Product_Code", "type": "STRING" }, {"mode": "NULLABLE", "name": "Alteration_Product_Net_Amount", "type": "FLOAT" }, {"mode": "NULLABLE", "name": "Altered_Product_Code","type": "STRING" }, {"mode": "NULLABLE", "name": "Alteration_Product_Amount", "type": "FLOAT" }, {"mode": "NULLABLE", "name": "Altered_Product_Name", "type": "STRING" }, {"mode": "NULLABLE", "name": "Alteration_Product_Name", "type": "STRING" }, {"mode": "NULLABLE","name": "Alteration_Product_Qty", "type": "FLOAT" } ] }, {"mode": "NULLABLE","name": "StoreId","type": "STRING"}, {"mode": "NULLABLE", "name": "dob","type": "STRING"}, {"mode": "NULLABLE","name": "Order_Mode_Id","type": "STRING"}, {"mode": "NULLABLE","name": "checknumber","type": "STRING" } ]
What I have tried:
- Removed all the white spaces and carriage return characters from the json flowflie
- Tried the array at the start and end of the json file and matched the BigQuery table to this.
- Changed the Names of the items to not include underscores.
- Built the insert one column at a time and only fails once the array is present.
- Used the schema definition generated from Google Cloud BigQuery directly when using the utility "bq show --schema --format=prettyjson.
- Defined the mode for each of the array items to REQUIRED instead of NULLABLE.
- Removed all the white spaces and carriage return characters from the schema definition in the properties of the PutBigQueryBatch process in Nifi
- Within Google BigQuery I created a table manually by importing the above json file and and having it auto detect the schema. Then pointed to this table in nifi
- Within Google BigQuery I created a table manually by importing the above json file and and having it defining the schema myself using the above definition. Then pointed to this table in nifi
All resulted in getting the same error as stated above
Note Elsewhere in my nifi process I successfully use the PutBigQueryBatch process and update the table. The difference being that json does not have an array present.
I really am at loss on how to solve this issue.
Thanks for any help forthcoming
Tim
Created ‎11-18-2021 06:13 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I'm facing the same problem, has anyone got a solution?
