Integrating Cloudera Solr with a custom-built Apache Nutch 1.7

Hi,

 

I'm new to Solr and Nutch, and I'm trying to integrate Cloudera Solr with Apache Nutch 1.7. I built Nutch from source after adding mapred-site.xml, core-site.xml, hadoop-env.sh, hdfs-site.xml, and yarn-site.xml.

 

Normal crawling works with Apache Nutch. But when I try to crawl and index into the Solr provided by Cloudera, the job fails with the exception below. Since I'm very new to this, I can't figure out how to solve it. Can anyone tell me how to proceed?

 

request: http://bigdata-cl1-nn:8983/solr/update?wt=javabin&version=2
at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:430)
at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
at org.apache.nutch.indexwriter.solr.SolrIndexWriter.close(SolrIndexWriter.java:155)
at org.apache.nutch.indexer.IndexWriters.close(IndexWriters.java:118)
at org.apache.nutch.indexer.IndexerOutputFormat$1.close(IndexerOutputFormat.java:44)
at org.apache.hadoop.mapred.ReduceTask$OldTrackingRecordWriter.close(ReduceTask.java:502)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:456)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)

 

 

PS: I copied schema-solr4.xml to

/usr/share/doc/solr-doc-4.3.0+61/example/solr/collection1/conf, added the following at line 351: <field name="_version_" type="long" indexed="true" stored="true"/>

and restarted Solr.
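
One way to double-check a schema edit like this is to parse the file and confirm the `_version_` field is actually declared. The following is only a minimal sketch: `SAMPLE_SCHEMA` is a hypothetical, trimmed-down schema and `has_version_field` is an illustrative helper name, neither taken from Nutch or Solr. It also says nothing about whether the running Solr instance reads its configuration from that directory.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down schema used only for illustration; the real
# schema-solr4.xml shipped with Nutch contains many more fields and types.
SAMPLE_SCHEMA = """<schema name="nutch" version="1.5">
  <fields>
    <field name="id" type="string" indexed="true" stored="true"/>
    <field name="_version_" type="long" indexed="true" stored="true"/>
  </fields>
</schema>"""


def has_version_field(schema_xml: str) -> bool:
    """Return True if the schema text declares a field named _version_."""
    root = ET.fromstring(schema_xml)
    # iter() walks the whole tree, so this works whether or not the
    # <field> elements are wrapped in a <fields> element.
    return any(f.get("name") == "_version_" for f in root.iter("field"))


if __name__ == "__main__":
    print(has_version_field(SAMPLE_SCHEMA))
```

To check a real file, read its contents and pass the text to `has_version_field` in place of the sample string.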

 

CDH version: 5.1.0-1.cdh5.1.0.p0.53

 

I looked for a Solr-specific board but couldn't find one, so I'm posting here. Please redirect me if this is not the correct group.

 

Thanks in advance for any suggestions.

 

Thanks and Regards,

Sandeep B A

 

1 ACCEPTED SOLUTION

Hi All,

I got this working by upgrading from Nutch 1.7 to Nutch 1.8.

The cause of the issue is described in the mailing-list thread linked below:

 

http://www.mail-archive.com/user%40nutch.apache.org/msg12592.html

 

Thanks and Regards,

Sandeep B A
