Created 12-10-2015 02:05 PM
I tried to index the files in a folder on HDFS. My Solr configuration is the following:
./solr start -cloud -s ../server/solr -p 8983 -z -Dsolr.directoryFactory=HdfsDirectoryFactory -Dsolr.lock.type=hdfs -Dsolr.updatelog=hdfs://
When I launch:
hadoop jar /opt/lucidworks-hdpsearch/job/lucidworks-hadoop-job-2.0.3.jar com.lucidworks.hadoop.ingest.IngestJob -Dlww.commit.on.close=true -cls com.lucidworks.hadoop.ingest.DirectoryIngestMapper -c Collezione -i /user/solr/documents -of -zk
I get the following error:
Solr server not available on: Make sure that collection [Collezione] exists
The collection exists and is valid, but it looks like it is not able to contact the server.
I'd really appreciate some help in solving this problem.
Created 12-13-2015 10:00 PM
@Davide Isoardi I was able to fix your problem. Please try the following solution:
1) Create a JAAS file called jaas.conf
This file can be empty; its content doesn't matter since your environment is not Kerberized.
2) Start your job with the following command:
hadoop jar /opt/lucidworks-hdpsearch/job/lucidworks-hadoop-job-2.0.3.jar com.lucidworks.hadoop.ingest.IngestJob -Dlww.commit.on.close=true -Dlww.jaas.file=jaas.conf -cls com.lucidworks.hadoop.ingest.DirectoryIngestMapper --collection test -i file:///data/* -of --zkConnect,,
The parameters need to be in the same order as in the command above; otherwise the job might not work.
I believe this is a bug. Could you please report it to Lucidworks? Thanks.
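The two steps above can be sketched as a short script. This is only a dry run that prints the launch command instead of executing it. The values of `-of` and `--zkConnect` are cut off in this thread, so `OUTPUT_FORMAT` and `ZK_CONNECT` are hypothetical placeholder variables that you would have to set for your own cluster.

```shell
# Step 1: an empty JAAS file is enough on a non-Kerberized cluster.
touch jaas.conf

# Placeholders (assumptions): the real values are not shown in the thread.
OUTPUT_FORMAT="${OUTPUT_FORMAT:-OUTPUT_FORMAT_CLASS_HERE}"
ZK_CONNECT="${ZK_CONNECT:-ZK_CONNECT_STRING_HERE}"

# Step 2 (dry run): print the command with the options in the same order
# as in the post. Remove the leading "echo" to actually submit the job.
echo hadoop jar /opt/lucidworks-hdpsearch/job/lucidworks-hadoop-job-2.0.3.jar \
  com.lucidworks.hadoop.ingest.IngestJob \
  -Dlww.commit.on.close=true \
  -Dlww.jaas.file=jaas.conf \
  -cls com.lucidworks.hadoop.ingest.DirectoryIngestMapper \
  --collection test \
  -i "file:///data/*" \
  -of "$OUTPUT_FORMAT" \
  --zkConnect "$ZK_CONNECT"
```

Printing the command first makes it easy to compare the option order against the working command before submitting anything to the cluster.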
Created 12-10-2015 02:34 PM
Is your cluster kerberized?
I saw this error a couple of days ago; an important piece was missing from the Solr documentation until now.
Your launch command should look similar to this:
hadoop jar /opt/lucidworks-hdpsearch/job/lucidworks-hadoop-job-2.0.3.jar com.lucidworks.hadoop.ingest.IngestJob -Dlww.commit.on.close=true -Dlww.jaas.file=/opt/lucidworks-hdpsearch/solr/bin/jaas.conf -cls com.lucidworks.hadoop.ingest.DirectoryIngestMapper --collection MyCollection -i hdfs://hortoncluster/data/* -of --zkConnect,,
Make sure you include the JAAS option in a Kerberized environment: -Dlww.jaas.file=/opt/lucidworks-hdpsearch/solr/bin/jaas.conf
Created 12-10-2015 04:02 PM
No, my cluster is not Kerberized.
Created 12-10-2015 02:40 PM
Hi Davide,
When indexing to SolrCloud, the zk list should contain all ZooKeeper instances plus the ZooKeeper ensemble root directory, if one was defined. I see in your call you have zk. Can you please confirm whether the root directory for the ZK ensemble is defined as solr? If not, remove /solr so that only the hosts are set, and try indexing like so: -zk
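The rule described above (every ZooKeeper host, followed once by the root directory if one is defined) can be sketched as a small shell helper. The function name `build_zk_connect` and the host names are hypothetical and only illustrate the `host:port,host:port/chroot` shape of the string.

```shell
# Hypothetical helper: join ZooKeeper hosts with commas and append an
# optional root directory (chroot) once, at the end of the string.
# Usage: build_zk_connect CHROOT HOST:PORT [HOST:PORT ...]
build_zk_connect() {
  local chroot="$1"
  shift
  local IFS=','
  local hosts="$*"          # "$*" joins the hosts with the first IFS character
  if [ -z "$chroot" ] || [ "$chroot" = "/" ]; then
    printf '%s\n' "$hosts"  # no chroot defined: hosts only
  else
    printf '%s/%s\n' "$hosts" "${chroot#/}"
  fi
}

# Made-up host names, with and without a /solr root directory.
build_zk_connect /solr zk1.example.com:2181 zk2.example.com:2181 zk3.example.com:2181
build_zk_connect "" zk1.example.com:2181 zk2.example.com:2181 zk3.example.com:2181
```

The first call prints the host list ending in `/solr`; the second prints the hosts only, which is the form to use when Solr keeps its data directly under the ZooKeeper root.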
Created 12-11-2015 01:26 PM
My ZooKeeper paths are:
[zk: localhost:2181(CONNECTED) 5] ls /
[configs, zookeeper, clusterstate.json, aliases.json, live_nodes, rmstore, overseer, overseer_elect, collections]
[zk: localhost:2181(CONNECTED) 6] ls /configs/mycollection
[currency.xml, protwords.txt, synonyms.txt, _rest_managed.json, solrconfig.xml, lang, stopwords.txt, schema.xml]
[zk: localhost:2181(CONNECTED) 7] ls /collections/mycollection
[state.json, leader_elect, leaders]
I created the collection by running:
./solr create -c mycollection -d ../server/solr/configsets/basic_configs/
The indexing command is slightly different from the previous one:
yarn jar /opt/lucidworks-hdpsearch/job/lucidworks-hadoop-job-2.0.3.jar com.lucidworks.hadoop.ingest.IngestJob -Dlww.commit.on.close=true -cls com.lucidworks.hadoop.ingest.DirectoryIngestMapper -c mycollection -i /user/solr/documents -of -zk
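The `ls /` listing above shows `collections`, `live_nodes` and `clusterstate.json` directly under the ZooKeeper root, which suggests that Solr is not using a `/solr` root directory here. A minimal sketch of that check follows; the helper name `znode_listed` is hypothetical, and the listing is pasted from the post rather than read from a live ZooKeeper.

```shell
# Hypothetical helper: succeed if a znode name appears in one line of
# zkCli "ls" output such as "[configs, zookeeper, collections]".
znode_listed() {
  local name="$1" listing="$2"
  # Drop the brackets, put one entry per line, trim spaces, match exactly.
  printf '%s\n' "$listing" | tr -d '[]' | tr ',' '\n' \
    | sed 's/^ *//; s/ *$//' | grep -qxF -- "$name"
}

# Root listing copied from the post above.
root_listing='[configs, zookeeper, clusterstate.json, aliases.json, live_nodes, rmstore, overseer, overseer_elect, collections]'

if znode_listed solr "$root_listing"; then
  echo "a /solr znode exists: the ZooKeeper string may need /solr appended"
elif znode_listed collections "$root_listing"; then
  echo "Solr znodes are at /: use the host list without a root directory"
fi
```

With the listing from this thread the second branch is taken, so the ZooKeeper string passed to the job would contain the hosts only.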
Created 12-11-2015 05:55 PM
did that work?
Created 12-12-2015 11:19 AM
SolrCloud works, but the command for indexing the files in the HDFS folder returns:
Solr server not available on: Make sure that collection [mycollection] exists
Created 12-23-2015 07:40 AM
@Davide Isoardi were you able to test the above solution?
Created 02-05-2016 08:18 PM
@Davide Isoardi are you still having issues with this? Can you accept the best answer or provide your own solution?