Created on 08-10-2014 01:48 AM - edited 09-16-2022 02:04 AM
Hello!
I have files with data, for example web server logs:
/data/log/000001.txt
/data/log/000002.txt
/data/log/000003.txt
/data/log/000004.txt
I want to build a full-text search over them and get the file name in the search results.
How can I do this?
Created 08-11-2014 08:35 AM
I found the solution.
When a morphline processes data from HDFS, it adds the following fields to every record:
file_download_url=[hdfs://MYHOST:2080/testdata/log],
file_group=[nobody],
file_host=[MYHOST],
file_last_modified=[1405102390179],
file_length=[198923],
file_name=[log.txt],
file_owner=[pmezentsev],
file_path=[/testdata/log/log.txt],
file_permissions_group=[r--],
file_permissions_other=[r--],
file_permissions_stickybit=[false],
file_permissions_user=[rw-],
file_port=[8020],
file_scheme=[hdfs],
file_upload_url=[hdfs://MYHOST/testdata/log/log.txt],
So if you want the full file name (including its path) in your index, just add file_path to your schema.xml:
<field name="file_path" type="string" indexed="true" stored="true" />
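Once file_path is stored in the index, a query can ask Solr to return it for each hit. Here is a minimal sketch of how that might look from Python, using Solr's standard /select handler with the q, fl, rows and wt parameters. The base URL, the collection name "logs" and the searched field "text" are assumptions for illustration and will differ in your setup.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_select_url(base_url, collection, query, rows=10):
    """Build a Solr /select URL that returns only file_path for each hit."""
    params = urlencode({
        "q": query,         # e.g. "text:error"; the field name "text" is an assumption
        "fl": "file_path",  # return just the stored file_path field
        "rows": rows,
        "wt": "json",
    })
    return f"{base_url}/{collection}/select?{params}"


def search_file_paths(base_url, collection, query, rows=10):
    """Run the query and return the file_path value of each matching document."""
    with urlopen(build_select_url(base_url, collection, query, rows)) as resp:
        docs = json.load(resp)["response"]["docs"]
    return [doc.get("file_path") for doc in docs]
```

For example, search_file_paths("http://MYHOST:8983/solr", "logs", "text:error") would return the paths of the files whose records match, provided a collection with that name exists and the records are indexed into a field called text.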