Adding TTL on Solr:

cd to this directory

Step1:

[Image: 13508-picture1.png]

Step2:

[Image: 13510-picture2.png]

Step3:

Open managed-schema in an editor (vi managed-schema) and add these 3 lines:

<field name="_timestamp_" type="date" indexed="true" stored="true" multiValued="false" />
<field name="_ttl_" type="string" indexed="true" multiValued="false" stored="true" />
<field name="_expire_at_" type="date" multiValued="false" indexed="true" stored="true" />
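Once a collection has been created from this config set, one way to confirm that the three fields are live is to ask Solr's Schema API for each of them. The following is a minimal sketch rather than part of the original steps: the `SOLR_URL` default and the helper names `schema_field_url` and `check_ttl_fields` are assumptions for illustration, and `tweets` is the collection name used later in this article.

```shell
# Assumed values (not from the article); override them for your cluster.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"
COLLECTION="${COLLECTION:-tweets}"

# Print the Schema API URL for a single field (hypothetical helper).
schema_field_url() {
  printf '%s/%s/schema/fields/%s\n' "$SOLR_URL" "$COLLECTION" "$1"
}

# Query each TTL-related field. curl -f turns an HTTP error
# (for example, an unknown field) into a non-zero exit status.
check_ttl_fields() {
  for field in _timestamp_ _ttl_ _expire_at_; do
    if curl -sf "$(schema_field_url "$field")" >/dev/null; then
      echo "$field: present"
    else
      echo "$field: missing or Solr unreachable"
    fi
  done
}
```

Running `check_ttl_fields` against a live collection prints one status line per field. If a field is reported missing, the edited managed-schema was probably not the copy the collection was created from.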

Step4:

Open solrconfig.xml in an editor (vi solrconfig.xml).

Find the 3 lines below and replace them with the lines that follow:

<updateRequestProcessorChain name="add-unknown-fields-to-the-schema">
  <!-- UUIDUpdateProcessorFactory will generate an id if none is present in the incoming document -->
  <processor class="solr.UUIDUpdateProcessorFactory"/>

Replacement:

<updateRequestProcessorChain name="add-unknown-fields-to-the-schema">
  <processor class="solr.TimestampUpdateProcessorFactory">
    <str name="fieldName">_timestamp_</str>
  </processor>
  <processor class="solr.DefaultValueUpdateProcessorFactory">
    <str name="fieldName">_ttl_</str>
    <str name="value">+30SECONDS</str>
  </processor>
  <processor class="solr.processor.DocExpirationUpdateProcessorFactory">
    <str name="ttlFieldName">_ttl_</str>
    <str name="ttlParamName">_ttl_</str>
    <int name="autoDeletePeriodSeconds">30</int>
    <str name="expirationFieldName">_expire_at_</str>
  </processor>
  <processor class="solr.FirstFieldValueUpdateProcessorFactory">
    <str name="fieldName">_expire_at_</str>
  </processor>
  <!-- UUIDUpdateProcessorFactory will generate an id if none is present in the incoming document -->
  <processor class="solr.UUIDUpdateProcessorFactory" />
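With this chain, a document that arrives without a `_ttl_` value gets the default of +30SECONDS, `_expire_at_` is computed from that TTL at index time, and a background task runs every 30 seconds (`autoDeletePeriodSeconds`) to delete documents whose `_expire_at_` has passed. A document should therefore disappear somewhere between 30 and roughly 60 seconds after it is indexed. The sketch below is one way to check this end to end. It assumes the `/admin/file` handler is enabled for the collection and that `/update` uses this chain, as is the case in the stock data_driven_schema_configs as far as I know. The `SOLR_URL` default and all function names are hypothetical.

```shell
# Assumed values (not from the article); override them for your cluster.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"
COLLECTION="${COLLECTION:-tweets}"

# Succeeds if the solrconfig.xml that Solr is serving for the collection
# mentions the expiration processor (assumes /admin/file is available).
expiration_chain_loaded() {
  curl -s "$SOLR_URL/$COLLECTION/admin/file?file=solrconfig.xml" \
    | grep -q 'DocExpirationUpdateProcessorFactory'
}

# Build a one-document JSON payload. With a second argument the document
# carries its own _ttl_; without it, the default from solrconfig.xml applies.
ttl_doc() {
  if [ -n "${2:-}" ]; then
    printf '[{"id":"%s","_ttl_":"%s"}]\n' "$1" "$2"
  else
    printf '[{"id":"%s"}]\n' "$1"
  fi
}

# Index the document and commit so it is searchable right away.
index_doc() {
  curl -s -H 'Content-Type: application/json' \
    "$SOLR_URL/$COLLECTION/update?commit=true" \
    --data-binary "$(ttl_doc "$1" "${2:-}")"
}

# Fetch the document together with its TTL and computed expiration time.
find_doc() {
  curl -s "$SOLR_URL/$COLLECTION/select?q=id:$1&fl=id,_ttl_,_expire_at_&wt=json"
}
```

For example, run `expiration_chain_loaded && echo "chain loaded"`, then `index_doc ttl-test-1 +10SECONDS`, and call `find_doc ttl-test-1` immediately and again after about a minute. The first response should show a populated `_expire_at_`, and the second should no longer return the document.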

Things that might be useful:

Make sure to start Solr like this so that the Solr configs are stored under /solr in ZooKeeper:

1. /opt/lucidworks-hdpsearch/solr/bin/solr start -c -z lake1.field.hortonworks.com:2181,lake2.field.hortonworks.com:2181,lake3.field.hortonworks.com:2181/solr

2. Create the collection like this: /opt/lucidworks-hdpsearch/solr/bin/solr create -c tweets -d data_driven_schema_configs -s 1 -rf 1

3. To delete the collection:

http://testdemo.field.hortonworks.com:8983/solr/admin/collections?action=DELETE&name=tweets

4. Also remove its config from ZooKeeper; in zkCli.sh, run: rmr /solr/configs/tweets
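The ZooKeeper connect string passed to `-z` is a single argument: the host:port entries joined by commas with no spaces, followed by the chroot once at the end. A stray space splits it into several arguments. As a small sketch, a helper can build the string so it cannot be mistyped; the function name `zk_connect_string` is made up for this example.

```shell
# Join host:port entries with commas (no spaces) and append the chroot once.
# Usage: zk_connect_string <chroot> <host:port> [<host:port> ...]
zk_connect_string() {
  zk_chroot="$1"
  shift
  zk_hosts=""
  for entry in "$@"; do
    # ${var:+...} adds the separating comma only after the first entry.
    zk_hosts="${zk_hosts:+$zk_hosts,}$entry"
  done
  printf '%s%s\n' "$zk_hosts" "$zk_chroot"
}
```

It could then be used as `/opt/lucidworks-hdpsearch/solr/bin/solr start -c -z "$(zk_connect_string /solr lake1.field.hortonworks.com:2181 lake2.field.hortonworks.com:2181 lake3.field.hortonworks.com:2181)"`, with the quotes keeping the result as one argument.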

Thanks,

Sujitha Sanku

Please ping me or email me at ssanku@hortonworks.com in case of any issues.


Comments
Rising Star

Thanks for your article on setting TTL for Solr documents. However, in my environment, Ambari Infra Solr auto-created cores for Hadoop logs, and they are taking up disk space.

I followed the steps above and updated managed-schema and solrconfig.xml under

/usr/lib/ambari-infra-solr/server/solr/configsets/data_driven_schema_configs/

I used the Ambari Dashboard to restart the Ambari Infra Solr and ZooKeeper services instead of manually starting Solr with the command above.

How would we know if ZooKeeper and Solr picked up these settings?

Thanks

Anitha

Rising Star

This worked. Thanks for the detailed steps.