Created on 06-24-2014 06:17 AM - edited 09-16-2022 02:00 AM
Hello,
I have installed Cloudera Manager 5 and used it to install the Solr, ZooKeeper, HDFS, and YARN services.
I am trying to do the following:
1. Load data into HDFS.
2. Access HDFS using Solr.
Please suggest the steps to achieve this.
Thanks
Bala
Created 06-25-2014 10:35 AM
Bala,
Follow these steps:
Create a local Solr project directory and schema
Through
Viewing the results
This will set you up with a Solr index in HDFS. You can use any CSV file for sample data.
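As a rough sketch only (not the exact commands from the guide), the sequence of steps might look like the following in Python. The script just builds and prints the command lines so you can review them before pasting them into a shell. The collection name, directories, host names, and the job jar location are all placeholders I made up for illustration, so adjust them for your cluster. The `dry_run` switch adds the indexer's `--dry-run` option, which is worth using first.

```python
# Placeholder values for illustration only; none of these come from a real cluster.
COLLECTION = "collection1"
CONFIG_DIR = "/home/user/solr_configs"
HDFS_INPUT = "/user/user/indir"
HDFS_OUTPUT = "/user/user/outdir"
ZK_HOST = "zk01.example.com:2181/solr"
NAMENODE = "hdfs://nn01.example.com:8020"
# Assumed location; the job jar path differs between package and parcel installs.
MR_JOB_JAR = "/usr/lib/solr/contrib/mr/search-mr-*-job.jar"


def build_commands(local_csv, morphline_file, dry_run=True):
    """Return the command lines, in order, as lists of arguments."""
    indexer = [
        "hadoop", "jar", MR_JOB_JAR,
        "org.apache.solr.hadoop.MapReduceIndexerTool",
        "--morphline-file", morphline_file,
        "--output-dir", NAMENODE + HDFS_OUTPUT,
        "--zk-host", ZK_HOST,
        "--collection", COLLECTION,
        "--go-live",
    ]
    if dry_run:
        # Run the morphline locally and print the documents instead of indexing.
        indexer.append("--dry-run")
    indexer.append(NAMENODE + HDFS_INPUT)

    return [
        # 1. Create a local Solr config directory (edit conf/schema.xml afterwards).
        ["solrctl", "instancedir", "--generate", CONFIG_DIR],
        # 2. Upload the config to ZooKeeper and create the collection.
        ["solrctl", "instancedir", "--create", COLLECTION, CONFIG_DIR],
        ["solrctl", "collection", "--create", COLLECTION, "-s", "1"],
        # 3. Load the sample data into HDFS.
        ["hdfs", "dfs", "-mkdir", "-p", HDFS_INPUT],
        ["hdfs", "dfs", "-put", local_csv, HDFS_INPUT],
        # 4. Index the HDFS files with the MapReduce indexer.
        indexer,
    ]


if __name__ == "__main__":
    # Printed rather than executed, so the shell expands the jar wildcard later.
    for cmd in build_commands("sample.csv", "morphline.conf", dry_run=True):
        print(" ".join(cmd))
```

Once the dry run prints the documents you expect, call `build_commands(..., dry_run=False)` to get the command for the real indexing job.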
Created 06-27-2014 06:44 AM
Hi Bala,
I have found that data is very rarely truly unstructured. What kind of data is it? Typically, there is some form of structure to the data. Can you send a sample file to kevin@cloudera.com?
Created 06-27-2014 06:49 AM
Created 06-27-2014 06:53 AM
Bala,
It absolutely is. I was just giving you a sample set of instructions so you could play with a CSV file ingest. You will be looking to use Apache Tika. The good news is that there is a morphline to help you with that. The bad news is that you will have to write that morphline. I would recommend starting here: https://github.com/cloudera/search#cdk-morphlines-solr-cell
Created 06-27-2014 06:57 AM
Created 06-27-2014 07:02 AM
You can follow the same steps I sent you, but you will need to switch to the cdk-morphline-solr-cell morphline (https://github.com/cloudera/search#cdk-morphlines-solr-cell) instead of the CSV one in the example.
Created 06-27-2014 08:03 AM
Created 07-01-2014 08:42 AM
Created 07-03-2014 07:32 AM
Kevin, I followed the steps, and it works as expected in a dry run. But when I run it without the --dry-run argument, it stops at this step 😞 😞
770 [main] INFO org.apache.solr.cloud.ZkController – Write file /tmp/1404354031741-0/velocity/facet_fields.vm
771 [main] INFO org.apache.solr.cloud.ZkController – Write file /tmp/1404354031741-0/elevate.xml
773 [main] INFO org.apache.solr.cloud.ZkController – Write file /tmp/1404354031741-0/admin-extra.menu-bottom.html
774 [main] INFO org.apache.solr.cloud.ZkController – Write file /tmp/1404354031741-0/schema.xml
897 [main] INFO org.apache.solr.hadoop.MapReduceIndexerTool – Indexing 1 files using 1 real mappers into 1 reducers
It stops at line 897 itself. I restarted and tried again, but the result is still the same.
Any help would be appreciated.
Thanks
Bala
Created 05-09-2018 02:25 AM
Is an incremental load into Solr possible? That is, if the dataset to be loaded contains unique keys that are already present in the Solr collection (with or without changes to the other fields of the record), I want the existing records to be updated and the new records to be inserted into the collection. Could you please let me know whether this is possible in Solr? If so, please advise on how to achieve it.