Sporadic Live Merge Results With MapReduceBatchIndexer

We have an interesting situation with the MapReduceBatchIndexer tool: sometimes the job finishes successfully, but the indexes are not actually loaded into Solr via the live merge. The logs are too verbose to post in this thread, but the end of the job output looks like the excerpt below. If we rerun the job an indeterminate number of times, it eventually works.

 

82774 [pool-4-thread-1] INFO org.apache.solr.hadoop.GoLive - Live merge hdfs://nameservice1/tmp/solredh_admin_user/results/part-00000 into http://mapls188.bsci.bossci.com:8983/solr
...
83073 [main] INFO org.apache.solr.hadoop.MapReduceIndexerTool - Succeeded with job: jobName: org.apache.solr.hadoop.MapReduceIndexerTool/MorphlineMapper, jobId: job_1470234528819_0230
83073 [main] INFO org.apache.solr.hadoop.MapReduceIndexerTool - Success. Done. Program took 83.07273 secs. Goodbye.


Here is a snapshot of the current run. You can see that the index files produced by the job are in our temp location.

 

-bash-4.1$ hadoop fs -ls /tmp/solredh_admin_user/results/part-00000/data/index
Found 12 items
-rwxrwxr-x+ 3 edh_admin_user supergroup 8797979 2016-08-03 20:03 /tmp/solredh_admin_user/results/part-00000/data/index/_0.fdt
-rwxrwxr-x+ 3 edh_admin_user supergroup 1379 2016-08-03 20:03 /tmp/solredh_admin_user/results/part-00000/data/index/_0.fdx
-rwxrwxr-x+ 3 edh_admin_user supergroup 1248 2016-08-03 20:03 /tmp/solredh_admin_user/results/part-00000/data/index/_0.fnm
-rwxrwxr-x+ 3 edh_admin_user supergroup 44950 2016-08-03 20:03 /tmp/solredh_admin_user/results/part-00000/data/index/_0.nvd
-rwxrwxr-x+ 3 edh_admin_user supergroup 61 2016-08-03 20:03 /tmp/solredh_admin_user/results/part-00000/data/index/_0.nvm
-rwxrwxr-x+ 3 edh_admin_user supergroup 350 2016-08-03 20:03 /tmp/solredh_admin_user/results/part-00000/data/index/_0.si
-rwxrwxr-x+ 3 edh_admin_user supergroup 2218199 2016-08-03 20:03 /tmp/solredh_admin_user/results/part-00000/data/index/_0_Lucene41_0.doc
-rwxrwxr-x+ 3 edh_admin_user supergroup 1356123 2016-08-03 20:03 /tmp/solredh_admin_user/results/part-00000/data/index/_0_Lucene41_0.pos
-rwxrwxr-x+ 3 edh_admin_user supergroup 8914976 2016-08-03 20:03 /tmp/solredh_admin_user/results/part-00000/data/index/_0_Lucene41_0.tim
-rwxrwxr-x+ 3 edh_admin_user supergroup 113325 2016-08-03 20:03 /tmp/solredh_admin_user/results/part-00000/data/index/_0_Lucene41_0.tip
-rwxrwxr-x+ 3 edh_admin_user supergroup 53 2016-08-03 20:02 /tmp/solredh_admin_user/results/part-00000/data/index/segments_1
-rwxrwxr-x+ 3 edh_admin_user supergroup 131 2016-08-03 20:03 /tmp/solredh_admin_user/results/part-00000/data/index/segments_2

 

But after the job finishes successfully, the only files in the live Solr index directory are segments_1, segments_2, and this funny-looking lock file.

 

-bash-4.1$ hadoop fs -ls /solr/F0116/core_node1/data/index
Found 3 items
-rw-r--r-- 3 solr solr 0 2016-08-03 20:01 /solr/F0116/core_node1/data/index/HdfsDirectory@24d078d lockFactory=org.apache.solr.store.hdfs.HdfsLockFactory@3d20687c-write.lock
-rwxr-xr-x 3 solr solr 53 2016-08-03 20:01 /solr/F0116/core_node1/data/index/segments_1
-rwxr-xr-x 3 solr solr 82 2016-08-03 20:03 /solr/F0116/core_node1/data/index/segments_2
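
To make the comparison between the two listings repeatable across runs, here is a rough Python sketch that parses `hadoop fs -ls` output and reports whether an index directory holds anything besides commit-point and lock files. It is not part of MapReduceIndexerTool. The function names are made up for illustration, and it assumes that a successfully merged index would contain at least one file other than `segments_N`, `segments.gen`, or a `write.lock` file.

```python
import re

# A file row in `hadoop fs -ls` output starts with a permission string such as
# "-rwxr-xr-x", optionally followed by "+" when ACLs are set.
_FILE_ROW = re.compile(r"^-[rwxsStT-]{9}\+?\s")

# Commit-point and lock files. Assumption: these can exist even when no
# segment data was merged into the live index.
_BOOKKEEPING = re.compile(r"^segments(_\w+|\.gen)$|write\.lock$")


def index_file_names(ls_output):
    """Return the base names of the files in a `hadoop fs -ls` listing."""
    names = []
    for line in ls_output.splitlines():
        if not _FILE_ROW.match(line):
            continue  # skips "Found N items", shell prompts, and directories
        # The path starts at the first "/". Lock file names may contain
        # spaces, so the line is not split on whitespace.
        path = line[line.index("/"):]
        names.append(path.rsplit("/", 1)[-1])
    return names


def has_segment_data(names):
    """True if any file is something other than a commit point or a lock."""
    return any(not _BOOKKEEPING.search(name) for name in names)
```

Feeding it the two listings above, `has_segment_data` would be true for the results directory and false for the live core directory, which is the symptom described here. Running it after every job would at least flag the failed merges automatically.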

 

We have enabled DEBUG logging on both the MapReduceBatchIndexer and the Solr server and compared successful runs with unsuccessful ones, but have not been able to identify why this only works sporadically.

 

Anyone seen something like this before?