New Contributor
Posts: 1
Registered: ‎08-01-2016

I want to automatically index CSV data from a directory on HDFS with Solr!

Hi!

I want to automatically index CSV data from a directory on HDFS with Solr.

How can I do this?

PS: CDH5/4

Contributor
Posts: 56
Registered: ‎02-09-2015

Re: I want to automatically index CSV data from a directory on HDFS with Solr!

You can use the Solr Hadoop connector from Lucidworks. Below is a sample command using this connector.

1 - Create a path on HDFS to hold your CSV file(s), e.g. "csv_directory/".

2 - Download the connector and run this command (update the CSV field mapping, the ZooKeeper address, and the Solr core name to match your setup):
hadoop jar /opt/lucidworks-hdpsearch/job/lucidworks-hadoop-job-2.0.3.jar \
  com.lucidworks.hadoop.ingest.IngestJob \
  -DcsvFieldMapping=0=id,1=cat,2=name,3=price,4=instock,5=author \
  -DcsvFirstLineComment \
  -DidField=id \
  -DcsvDelimiter="," \
  -Dlww.commit.on.close=true \
  -cls com.lucidworks.hadoop.ingest.CSVIngestMapper \
  -c solr_core_name \
  -i csv_directory/* \
  -of com.lucidworks.hadoop.io.LWMapRedOutputFormat \
  -zk localhost:2181
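
For the "update the CSV structure" part, a small shell helper can build the -DcsvFieldMapping value from your file's header row instead of typing the positions by hand. This is only a sketch: build_field_mapping is a made-up name (it is not part of the connector), and it assumes a plain comma-separated header with no quoted fields or embedded commas.

```shell
# Hypothetical helper (not part of the Lucidworks connector):
# turns a comma-separated header line into "0=first,1=second,...".
build_field_mapping() {
  header=$1
  mapping=""
  index=0
  old_ifs=$IFS
  IFS=','            # split the header on commas only
  for name in $header; do
    # Append ",<index>=<name>", omitting the comma for the first entry.
    mapping="${mapping:+$mapping,}$index=$name"
    index=$((index + 1))
  done
  IFS=$old_ifs       # restore the caller's field separator
  printf '%s\n' "$mapping"
}

# Example with the columns from the command above.
build_field_mapping "id,cat,name,price,instock,author"
# prints: 0=id,1=cat,2=name,3=price,4=instock,5=author
```

You could then pass the result as -DcsvFieldMapping="$(build_field_mapping "$header")", where $header holds the first line of one of your CSV files.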