Member since
08-13-2018
6
Posts
0
Kudos Received
0
Solutions
09-04-2018
03:18 PM
Hi, I have a series of questions related to Sqoop:
1) What is the difference between sqoop --incremental lastmodified --merge-key and the sqoop merge tool?
2) Suppose I have a dataset with 2 updates and 1 insert, and I am performing a sqoop merge using a jar created from the old dataset with the codegen tool. Which records will I get in my target dir?
3) If I get updates, inserts, deletes, and unchanged records (basically everything) but in a new directory, then what is the use of sqoop merge? Couldn't this be achieved by doing a full load?
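To make question 2 concrete, here is a minimal Python sketch of key-based merge semantics: rows in the newer dataset overwrite rows in the older dataset that share the same key, and every other row is kept. This is only an illustration, not Sqoop's implementation; the function name `merge_by_key`, the `id` column, and the sample rows are all hypothetical.

```python
def merge_by_key(old_rows, new_rows, key):
    """Merge two datasets; a newer row replaces an older row with the same key."""
    merged = {row[key]: row for row in old_rows}
    # Rows from the newer dataset win on key collisions and add new keys.
    merged.update({row[key]: row for row in new_rows})
    return [merged[k] for k in sorted(merged)]


# Hypothetical data: the new dataset carries 2 updates (ids 1, 2) and 1 insert (id 4).
old = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}, {"id": 3, "name": "c"}]
new = [{"id": 1, "name": "a2"}, {"id": 2, "name": "b2"}, {"id": 4, "name": "d"}]

result = merge_by_key(old, new, "id")
for row in result:
    print(row)
```

Under these assumptions the output holds the two updated rows, the unchanged row, and the inserted row. A row that is simply absent from the newer dataset is still kept, so deletes are not reflected by this kind of merge.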
Labels: Apache Hadoop, Apache Sqoop
08-23-2018
06:54 AM
Hi Matt Burgess, thanks for your help. It worked, and I am now able to convert CSV into nested JSON. Thanks a lot.
08-21-2018
03:49 PM
CSV file:
"Foo","12","newyork","North avenue","123213"
"Foo1","12","newyork","North avenue","123213"
"Foo2","12","newyork","North avenue","123213"
Output required:
{
"studentName":"Foo",
"Age":"12",
"address__city":"newyork",
"address":{
"address__address1":"North avenue",
"address__zipcode":"123213"
}
}
I am able to convert CSV to simple JSON, but not to nested JSON, using the ConvertRecord processor. Please check my Avro schema:
{
"type" : "record",
"name" : "MyClass",
"namespace" : "com.test.avro",
"fields" : [ {
"name" : "studentName",
"type" : "string"
}, {
"name" : "Age",
"type" : "string"
}, {
"name" : "address__city",
"type" : "string"
}, {
"name" : "address",
"type" : {
"type" : "record",
"name" : "address",
"fields" : [ {
"name" : "address__address1",
"type" : "string"
}, {
"name" : "address__zipcode",
"type" : "string"
} ]
}
} ]
}
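For reference, the target mapping from a flat CSV row to the nested record can be sketched in plain Python. This is not how ConvertRecord works internally; it only shows the shape the schema above describes. The column order in `FIELDS` is an assumption inferred from the sample rows, and `row_to_nested` is a hypothetical helper name.

```python
import csv
import io
import json

# Assumed column order, inferred from the sample CSV rows.
FIELDS = ["studentName", "Age", "address__city",
          "address__address1", "address__zipcode"]


def row_to_nested(row):
    """Map one flat CSV row onto the nested structure from the schema."""
    flat = dict(zip(FIELDS, row))
    return {
        "studentName": flat["studentName"],
        "Age": flat["Age"],
        "address__city": flat["address__city"],
        # The last two columns move under the nested "address" record.
        "address": {
            "address__address1": flat["address__address1"],
            "address__zipcode": flat["address__zipcode"],
        },
    }


SAMPLE = (
    '"Foo","12","newyork","North avenue","123213"\n'
    '"Foo1","12","newyork","North avenue","123213"\n'
    '"Foo2","12","newyork","North avenue","123213"\n'
)

records = [row_to_nested(row) for row in csv.reader(io.StringIO(SAMPLE))]
print(json.dumps(records[0], indent=2))
```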
Labels: Apache NiFi
08-14-2018
06:43 AM
Hi @Vikas Singh, the SplitJson processor is used to split a JSON array, and the EvaluateJsonPath processor is used to extract JSON fields as flowfile attributes or content. In your case:
Step 1: Use the EvaluateJsonPath processor to extract the info fields of the JSON. For example, if you want to extract the info fields, use $.info.seed, $.info.page, $.info.version, and $.info.results, and save them as flowfile attributes.
Step 2: Use the SplitJson processor with $.results.
Step 3: Use the EvaluateJsonPath processor to extract the fields of each JSON array element.
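The three steps can be sketched outside NiFi in plain Python, which may help when checking the JSON paths. The sample payload, the `name` field inside each result, and the helper `split_with_info` are hypothetical; only the info.seed, info.page, info.version, info.results, and results paths come from the steps above.

```python
import json

# Hypothetical input; only the "info" and "results" keys come from the post.
SAMPLE = json.dumps({
    "results": [{"name": "a"}, {"name": "b"}],
    "info": {"seed": "abc", "results": 2, "page": 1, "version": "1.2"},
})


def split_with_info(payload):
    """Return one flowfile-like dict per element of the results array."""
    doc = json.loads(payload)
    info = doc["info"]
    # Step 1: pull the info fields out as string attributes.
    attributes = {f"info.{k}": str(info[k])
                  for k in ("seed", "page", "version", "results")}
    # Step 2: split the results array; each element carries the attributes.
    return [{"attributes": dict(attributes), "content": json.dumps(item)}
            for item in doc["results"]]


flowfiles = split_with_info(SAMPLE)
# Step 3: extract a field from each split element.
names = [json.loads(f["content"])["name"] for f in flowfiles]
print(names)
```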