
Morphline ReadAvroParquetFile timestamp problem


Hi all, I'm using the following morphline to index some Parquet files:

 

morphlines : [
  {
    id : morphline1
    importCommands : ["org.kitesdk.**", "org.apache.solr.**"]

    commands : [
      {
        readAvroParquetFile {
          projectionSchemaString : """
            {
              "name": "record_parquet",
              "namespace": "parquet.avro",
              "type": "record",
              "fields": [
                { "name": "id", "type": ["null", "string"] },
                { "name": "date_time", "type": ["null", "string"] },
                { "name": "sessionid", "type": ["null", "string"] },
                { "name": "client_id", "type": ["null", "string"] }
              ]
            }
          """
          # supportedMimeTypes : [avro/binary]
          # projectionSchemaString : "" # optional, avro json schema blurb for getSchema()
          # projectionSchemaFile : /path/to/syslog.avsc
        }
      }

      {
        extractAvroPaths {
          flatten : true
          paths : {
            id : /id
            date_time : "/date_time"
            session_id : /sessionid
            client_id : /client_id
          }
        }
      }

      {
        addValues {
          # add values "text/log" and "text/log2" to the source_type output field
          Channel : [canal]
        }
      }

      { logDebug { format : "output record: {}", args : ["@{}"] } }

      # load the record into a Solr server or MapReduce Reducer
      {
        loadSolr {
          solrLocator : {
            collection : parquet_test   # Name of solr collection
            zkHost : "$IP:2181/solr"    # ZooKeeper ensemble
            batchSize : 1000            # batchSize
          }
        }
      }
    ]
  }
]

Everything works except that the date_time field (already in Solr date format, and declared as a date type in schema.xml) is converted to Unix epoch format. Any idea why? Thanks in advance.

Accepted Solutions

Re: Morphline ReadAvroParquetFile timestamp problem

I resolved this issue.
The projectionSchemaString option is not needed in readAvroParquetFile.
After removing it, everything worked.
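
For anyone hitting the same problem, here is a minimal sketch of what the start of the commands list might look like after the change. It assumes the rest of the morphline stays exactly as posted in the question; only the projectionSchemaString block is dropped, and the trailing comment stands in for the commands that were not touched.

```
commands : [
  {
    # projectionSchemaString removed, as described above;
    # no other options are set on the command
    readAvroParquetFile {}
  }

  {
    extractAvroPaths {
      flatten : true
      paths : {
        id : /id
        date_time : /date_time
        session_id : /sessionid
        client_id : /client_id
      }
    }
  }

  # addValues, logDebug and loadSolr follow here, unchanged from the question
]
```

Keeping the logDebug command in place makes it easy to check in the output record whether date_time now arrives in the expected format before it reaches Solr.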

Cheers.