About james_jones

james_jones · 05-02-2017

If you mean Solr on HDFS, the answer is "it depends." If you have a high number of frequent updates to your index, I usually recommend local storage. On the other hand, if your updates are batch-oriented rather than a constant stream, then using HDFS is a convenient option. If you mean installing Solr on HDF, the only supported option and use case is installing Ambari Infra. The Ambari Infra component is Solr under the covers, but it is only supported for use with HDP and HDF components, such as Ranger for user audit records. There's no support for using Ambari Infra to index your own data.

james_jones · 02-24-2017

@Tony Bolt It seems like the solr.hdfs.confdir is not being used. It's unlikely to be the problem, but have you checked permissions?

james_jones · 02-03-2017

@mqureshi made great points. Also, note that you do not have to store any fields in Solr. Each field can be set independently: stored=true/false and indexed=true/false. Of course, if stored=false, you won't see the value in results, but you will, at a minimum, see the "uniqueKey", which would be your "id" field. You could also look at the HBase Indexer: https://community.hortonworks.com/articles/1181/hbase-indexing-to-solr-with-hdp-search-in-hdp-%2023.html
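To make the stored/indexed distinction concrete, here is a minimal sketch that builds a Solr Schema API add-field request for a field that is searchable but not returned in results. The field name body_text, the text_general field type, the collection name mycollection, and the host are all hypothetical, and the sketch assumes the collection uses a managed schema; with a hand-edited schema.xml you would set the same two attributes on the field definition instead.

```shell
# Print a Schema API "add-field" payload.
# Arguments: field name, field type, indexed (true/false), stored (true/false).
solr_add_field_payload() {
  printf '{"add-field":{"name":"%s","type":"%s","indexed":%s,"stored":%s}}\n' \
    "$1" "$2" "$3" "$4"
}

# Searchable but not stored: the field matches queries, but its value is
# not returned in results. Field name and type are hypothetical.
solr_add_field_payload body_text text_general true false

# To apply it (hypothetical host and collection, managed schema assumed):
# solr_add_field_payload body_text text_general true false |
#   curl -X POST -H 'Content-Type: application/json' --data-binary @- \
#     http://localhost:8983/solr/mycollection/schema
```

Swapping the last two arguments gives the opposite case: a field that is returned with each document but cannot be searched on.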

james_jones · 01-24-2017

I have a solution. We had this issue, caused by this: https://issues.apache.org/jira/browse/RANGER-1249

To fix it, create a file:
/etc/ranger/admin/ranger-admin-env-tz.sh

Add this:
JAVA_OPTS="$JAVA_OPTS -Duser.timezone=UTC"

Now run:
chmod +x /etc/ranger/admin/ranger-admin-env-tz.sh

Now restart Ranger Admin. I know this was asked long ago, but if this solves your problem (which it did for me), please accept the answer.
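The steps above can be sketched as a small shell function. The file path and the JAVA_OPTS line come from the post; the function name and the optional directory argument are only illustrative, added so the function can be tried against a scratch directory first.

```shell
# Write the timezone override file described in the post.
# $1: target directory; defaults to the path from the post. The argument
#     exists only so the function can be tried against a scratch directory.
write_ranger_tz_env() {
  dir="${1:-/etc/ranger/admin}"
  file="$dir/ranger-admin-env-tz.sh"
  # Single quotes keep $JAVA_OPTS literal, so it is expanded when the file
  # itself is evaluated, not when this function runs.
  printf '%s\n' 'JAVA_OPTS="$JAVA_OPTS -Duser.timezone=UTC"' > "$file" || return 1
  chmod +x "$file"
}

# Usage (as root on the Ranger Admin host), then restart Ranger Admin:
# write_ranger_tz_env
```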

james_jones · 12-14-2016

Hi Maxime, did you start ambari-agent on each node? SSH to each node as root (probably) or ambari (depending on how you set it up) and run ambari-agent status and/or ambari-agent start. I'm pretty sure you will see an error if there is no communication with the agents. If an operation failed, you will see a red line. As you drill in, you will probably see a number of green lines, but you need to find the failed ones, which are red, and drill into them.
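If there are many nodes, the per-node check can be looped. This is a rough sketch, assuming key-based SSH access to each node and assuming that ambari-agent status exits non-zero when the agent is not running; the host names, the function name, and the REMOTE_CMD variable are hypothetical.

```shell
# Command used to reach a node; ssh by default. Overridable so the loop can
# be dry-run with a stand-in command.
REMOTE_CMD="${REMOTE_CMD:-ssh}"

# Print one line per host and return non-zero if any agent check failed.
# Assumption: "ambari-agent status" exits non-zero when the agent is down.
check_ambari_agents() {
  failed=0
  for host in "$@"; do
    if $REMOTE_CMD "$host" ambari-agent status >/dev/null 2>&1; then
      echo "$host: ambari-agent running"
    else
      echo "$host: ambari-agent not running or host unreachable"
      failed=1
    fi
  done
  return "$failed"
}

# Usage with hypothetical host names:
# check_ambari_agents node1.example.com node2.example.com
```

On any host reported as not running, log in and run ambari-agent start, then check the Ambari operations list again.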

james_jones · 12-14-2016

@Maxime Savary - the logs from starting each service can be found by clicking the "Ops" button in Ambari, near the top by the "Alerts" button. Drill down to the failed (red) action and eventually you will get to the logs. There are two logs for each operation in this window: stderr and stdout. You don't have to go to the file system. These are the ambari-agent logs for specific operations. The other half of the equation is to look on the server where the service is starting. For example, look at /var/log/hadoop/hdfs/hadoop-hdfs-datanode.log or hadoop-hdfs-namenode.log. These files are going to be on whatever server the service is supposed to run on. Please click Reply rather than entering a new answer when you are replying.
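For the service-side logs, a quick way to find the relevant lines is to filter for the ERROR and FATAL levels. This is a minimal sketch; the function name is hypothetical, and the log path in the usage comment is the one given in the post, which may differ on your installation.

```shell
# Show the last N lines mentioning ERROR or FATAL in a service log,
# prefixed with their line numbers.
# $1: log file path, $2: number of matches to show (default 20).
recent_log_errors() {
  grep -n -E 'ERROR|FATAL' "$1" | tail -n "${2:-20}"
}

# Usage on the host where the DataNode runs (path taken from the post):
# recent_log_errors /var/log/hadoop/hdfs/hadoop-hdfs-datanode.log
```

The line numbers make it easy to open the file at the right place and read the full stack trace around each match.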

james_jones · 12-14-2016

@Maxime Savary - Once ambari-server is up, and ambari-agent is up on each node, in Ambari you can stop and start the cluster using the Action button below the list of services. https://docs.hortonworks.com/HDPDocuments/Ambari-2.1.0.0/bk_Ambari_Users_Guide/content/_starting_and_stopping_all_services.html If this does not work, you need to look at the logs. You can post logs here and we will try to help.

james_jones · 12-14-2016

Did you restart all services? If so, you'll need to look at the logs to see what's going on. Start by looking at the logs in Ambari from the start commands for each service. If that doesn't give answers, you may need to look at the logs on one of the data nodes.

james_jones · 12-12-2016

What is your environment? Are you using the HDP Sandbox in VirtualBox?

james_jones · 12-12-2016

@Avijeet Dash - Terry made all good points. Note that using SolrCloud does not require using HDFS. SolrCloud can also use local storage and it is not uncommon. Sometimes people misunderstand when we don't point this out. The optimal choice of HDFS vs local depends on the use case, but local storage is usually preferred over HDFS if your index has a high level of updates/adds. SolrCloud automatically replicates your data and is fault tolerant, but still, SolrCloud has the advantages Terry mentioned.

Member Since	01-18-2016 02:01 PM
Last Visited	11-22-2024 01:45 PM
Posts	163
Kudos received	31

Cloudera Community

Re: Ambari SPN creation on remote AD

Re: Solr on HDF

Re: Wrong timezone in Ranger admin

Re: SOLR server connection refused

Re: Solr Configuration - Error uploading file

Re: Solr on HDF

Re: SOLR V6.4.1 Unable to Create collection when u...

Re: SOLR - how to use it

Re: Wrong timezone in Ranger admin

Re: Ambari UI display datanode status as Stopped

Re: Ambari UI display datanode status as Stopped

Re: Ambari UI display datanode status as Stopped

Re: Ambari UI display datanode status as Stopped

Re: Ambari Infra Solr UI & Ranger UI having Proble...

Re: HDP Search