Member since
12-14-2016
14
Posts
14
Kudos Received
5
Solutions
My Accepted Solutions
Views | Posted |
---|---|
1389 | 09-07-2018 06:45 PM |
1437 | 10-05-2017 06:35 PM |
272 | 06-19-2017 06:35 PM |
957 | 02-28-2017 03:39 PM |
6110 | 01-10-2017 07:38 PM |
09-07-2018
06:45 PM
1 Kudo
It's coming soon (goal is by the end of October). There will be a new HDP Search package that has Solr 7.4 and connectors for the new versions of HDFS and Hive that are in HDP 3.
03-09-2018
07:30 PM
Banana can only take a single collection per dashboard; it doesn't support multiple collections in one dashboard.
10-06-2017
09:04 PM
1 Kudo
@tonybolt, Solr 7.0.1 was just released this afternoon.
10-05-2017
06:35 PM
1 Kudo
This is very likely https://issues.apache.org/jira/browse/SOLR-11406, which has the net effect that Solr 7.0 cannot read indexes from any Solr 6.x version. To resolve this, a new release of Solr is in the works (7.0.1). The vote for this release is just nearing the end, so I would expect it to be released in the next 1-3 days.
09-21-2017
07:17 PM
1 Kudo
The RandomSortField type doesn't hold any data, so you don't need to define any column in your database to go into it. When you query your data, you'll add "sort=random_<seed>" to the query parameters, where you define some value (whatever you want) for the <seed>. If you issue the same query later with the same seed value, you should get the same results (assuming you haven't indexed new data).
Note the field type definition in the example is not complete. It should be:
<fieldType name="random" class="solr.RandomSortField" />
You can of course give the field type any name you want, but you would then need to modify your dynamic field rule to reference the proper field type name.
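A minimal sketch of building such a query, assuming a dynamic field rule that maps `random_*` to the random field type. The host, collection name, and the `random_sort_url` helper are hypothetical. Solr's sort parameter also expects a direction (`asc` or `desc`) after the field name.

```python
from urllib.parse import urlencode


def random_sort_url(base_url, seed, rows=10, direction="desc"):
    """Build a select URL that sorts on a seeded random field.

    base_url is a hypothetical collection URL such as
    http://localhost:8983/solr/mycollection. The "random_" prefix assumes
    a dynamic field rule named random_* bound to solr.RandomSortField.
    """
    params = {
        "q": "*:*",
        "rows": rows,
        # Same seed + unchanged index should give the same ordering.
        "sort": f"random_{seed} {direction}",
    }
    return f"{base_url}/select?{urlencode(params)}"
```

Calling `random_sort_url("http://localhost:8983/solr/mycollection", 1234)` twice yields the same URL, so repeating the request should return the same ordering as long as no new data has been indexed. Change the seed to get a different shuffle.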
07-10-2017
06:58 PM
That means Solr thinks its main configuration file, solrconfig.xml, has two sections defining the directory factory, which controls how the indexes are stored. For storing indexes in HDFS, you need to change the directoryFactory configuration to use the proper directoryFactory (HdfsDirectoryFactory). I think Ambari will automatically configure this - is it possible you are also manually modifying the file? Or, is it possible you manually modified it but didn't remove the original directoryFactory configuration? Since in your example you are using the data_driven_schema_configs, you should look in ./server/solr/configsets/data_driven_schema_configs/conf/solrconfig.xml and see if you have two sections there.
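One quick way to check for the duplicate is to count the directoryFactory elements in the file. This is a rough sketch using only the Python standard library; the `find_directory_factories` helper is hypothetical, and it assumes the file parses as plain XML.

```python
import xml.etree.ElementTree as ET
from typing import List


def find_directory_factories(solrconfig_xml: str) -> List[str]:
    """Return the class attribute of every <directoryFactory> element.

    More than one entry in the result means solrconfig.xml defines the
    directory factory more than once.
    """
    root = ET.fromstring(solrconfig_xml)
    return [el.get("class", "") for el in root.iter("directoryFactory")]
```

To use it, read the solrconfig.xml from the configset path above into a string and pass it in. If the returned list has two entries, remove the one that is not the HdfsDirectoryFactory definition.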
06-19-2017
06:35 PM
1 Kudo
It's only documenting the fact that Solr has updated its embedded ZooKeeper to 3.4.10. I don't know of any changes that would impact Solr 6.6 working with ZK 3.4.6.
04-14-2017
06:34 PM
1 Kudo
Just to add to the previous reply, the version of Solr in HDP Search is *exactly* the same as the same version downloaded direct from the Apache Solr project. Besides Solr, the benefits of the package are the connectors for indexing from HDFS, Hive, Pig, HBase, plus Storm and Spark development kits, and the Ambari integration.
03-07-2017
06:57 PM
2 Kudos
Solr also supports TTL, although I think if the docs are deleted from HBase they should be deleted in Solr automatically. In case you are interested in the Solr TTL feature, it's done through an UpdateRequestProcessor (URP). It's currently only documented in Solr's Javadocs: http://lucene.apache.org/solr/6_4_0/solr-core/org/apache/solr/update/processor/DocExpirationUpdateProcessorFactory.html (replace the '6_4_0' part of that URL to get to the javadocs for your version; this URP has existed since Solr 4.8.0).
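As a rough illustration of how a document can carry a TTL, here is a sketch that builds a JSON update body. It assumes your update chain already includes DocExpirationUpdateProcessorFactory with a TTL field named `_ttl_` (the default described in that Javadoc) and an expiration field configured. The `doc_with_ttl` helper and the sample field values are hypothetical, and sending the body to Solr's update handler is left out.

```python
import json


def doc_with_ttl(doc_id, fields, ttl="+1DAY", ttl_field="_ttl_"):
    """Return a JSON update body for one document with a TTL value.

    ttl is a date math expression evaluated relative to NOW by the URP.
    ttl_field must match the ttlFieldName configured in your update
    chain; "_ttl_" is an assumption based on the documented default.
    """
    doc = {"id": doc_id}
    doc.update(fields)
    doc[ttl_field] = ttl
    return json.dumps([doc])
```

For example, `doc_with_ttl("row-1", {"title": "example"}, ttl="+7DAYS")` produces a body asking for the document to expire a week after indexing. Actual deletion timing depends on how often the URP's periodic delete is configured to run.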
03-06-2017
06:37 PM
I believe this is a known issue with .zip archives and the Solr ExtractingRequestHandler (aka Solr Cell): https://issues.apache.org/jira/browse/SOLR-2416. The short version of the story is that Tika in this case is not configured to parse the .zip recursively. One of the other suggestions for NiFi processing may be worth exploring in this case.
03-01-2017
01:51 PM
1 Kudo
Great, thanks @Tony Bolt. I was able to trace the cause of the problem to a seemingly unrelated commit that occurred for the 6.4.0 release, and the good news is the fix has already been committed for an upcoming 6.4.2 release. The release process for that has already started, and we'd expect it to be out within 1-2 weeks. There is no patch to apply, but if you have the ability to build Solr from source, you could try to build locally with "branch_6_4", which is where 6.4.2 will come from, or "branch_6x", which also contains the same fix. If you can't do a local build for any reason, we do already have a 2nd confirmation that the problem is fixed with this upcoming release, so it's certainly not required or expected of you to test it at this point.
02-28-2017
03:39 PM
2 Kudos
I have duplicated this problem, and filed an issue in the Solr community: https://issues.apache.org/jira/browse/SOLR-10215. I don't know what is causing it, but it seems limited to Solr 6.4. I tried the same setup with Solr 6.3.0 and it worked fine. If you don't mind, I'd like to post a comment to that issue with a link to this forum thread to show that others have had the same problem.
01-11-2017
02:42 PM
When speaking about scaling capabilities of Solr, there are really only two modes:
1. Master/slave, which is one or more Solr servers running in standalone mode with what the community calls "legacy scaling" features enabled.
2. SolrCloud, which is one or more Solr servers running with ZooKeeper coordinating activity between them.
Solr doesn't make the two really obvious, meaning there's no single switch to turn on or off to choose which mode. Both are a set of configuration options that are different depending on the mode you want to use. It's important to note master/slave is not really different from standalone mode because you can work for years with Solr running on a single server and then add index replication (and/or distributed queries) if you need it. It's harder to move from standalone to SolrCloud (although not impossible).
Some documentation from the Solr Reference Guide might help:
Master/Slave: https://cwiki.apache.org/confluence/display/solr/Legacy+Scaling+and+Distribution
SolrCloud: https://cwiki.apache.org/confluence/display/solr/SolrCloud
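If you are unsure which mode a running node is in, a sketch like the following may help. It assumes the node's system info endpoint (`/admin/info/system`) returns a `mode` value of `std` for standalone or `solrcloud` for SolrCloud; please verify that against your Solr version. The helper names and the base URL are hypothetical.

```python
import json
from urllib.request import urlopen


def describe_mode(system_info: dict) -> str:
    """Translate the assumed "mode" value into a readable description."""
    mode = system_info.get("mode")
    if mode == "solrcloud":
        return "SolrCloud (ZooKeeper-coordinated)"
    if mode == "std":
        return "standalone (master/slave replication, if configured)"
    return "unknown"


def fetch_system_info(base_url: str) -> dict:
    """Fetch system info from a node, e.g. http://localhost:8983/solr."""
    with urlopen(f"{base_url}/admin/info/system?wt=json") as resp:
        return json.load(resp)
```

Usage would be `describe_mode(fetch_system_info("http://localhost:8983/solr"))`. Note that a `std` result cannot tell you whether master/slave replication is actually enabled, since that lives in the replication handler configuration.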
01-10-2017
07:38 PM
3 Kudos
Good question, Avijeet, but I think there is a little fundamental confusion to start. Solr and SolrCloud are not separate things; Solr is the application while SolrCloud is a mode of running Solr. The alternative to running Solr in SolrCloud mode is running it in standalone mode.
SolrCloud mode offers index replication, failover, load balancing, and distributed queries with the help of ZooKeeper and other specialized features in Solr. In standalone mode, Solr still offers index replication and distributed queries in a master/slave model, but these activities are not coordinated with ZooKeeper; they are managed manually. Failover and load balancing also need to be configured and managed entirely outside Solr with 3rd party tools.
When using Solr in SolrCloud mode, every index update is distributed across the cluster to every shard and replica of the cluster. For some use cases, such as particularly high indexing, this is too heavyweight, and standalone mode is preferred. Others simply prefer to separate nodes used for indexing from nodes used for queries, which is only possible today with standalone mode. Still others started with Solr before SolrCloud was introduced and have not yet found a compelling reason to change.
Regarding the question about running Solr clusters separately, since a SolrCloud cluster is a Solr cluster, the recommendation would apply.
Regarding the last question, "SolrCloud works with HDFS and not Solr, if we run it in separate cluster then is there any use of using HDFS?", would you restate this question? Solr can store indexes in HDFS in both available modes of operation, which is perhaps the answer you're looking for. It's worth noting, though, that Solr has its own model for replicating the indexes, and it does not "hand off", as it were, this functionality to HDFS. Even if you store your indexes in HDFS (in either mode), you still need to consider your Solr-based replication strategy as you cannot rely on HDFS to handle it for you.
Hope this helps.