I am playing around with the "post" command for Solr to ingest/index documents into my collection, and I was trying to use it with HDFS without any luck. Does anyone know if this is possible? Currently I have no issues pulling from a local path, but I can't figure out a way to pull from an HDFS path.
You can use standard hdfs commands to write to stdout and have bin/post read from stdin.
This blog entry is quite helpful:
hdfs dfs -cat /tmp/copyme.txt 2>&1 | /opt/lucidworks-hdpsearch/solr/bin/post -c tweets -type text/csv -d
The above sample was run on a sandbox.
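If you want to reuse the same pipe for other files, here is a minimal sketch of a wrapper. The names post_from_hdfs, HDFS_CMD and POST_CMD are made up for illustration and are not part of Solr or Hadoop; the bin/post path is the sandbox one from the command above and will likely differ on your install. Note that `2>&1` placed before the pipe also sends any hdfs error text to bin/post as data, so this sketch leaves it out and checks that the path exists first.

```shell
#!/usr/bin/env bash
# Sketch only: stream one HDFS file into Solr through bin/post reading stdin.
# HDFS_CMD, POST_CMD and post_from_hdfs are hypothetical names for this example.
HDFS_CMD="${HDFS_CMD:-hdfs}"
POST_CMD="${POST_CMD:-/opt/lucidworks-hdpsearch/solr/bin/post}"  # sandbox path, adjust as needed

post_from_hdfs() {
  local hdfs_path="$1" collection="$2" content_type="$3"

  # "hdfs dfs -test -e" exits 0 only when the path exists.
  if ! "$HDFS_CMD" dfs -test -e "$hdfs_path"; then
    echo "HDFS path not found: $hdfs_path" >&2
    return 1
  fi

  # Only stdout goes into the pipe; stderr stays on the terminal.
  "$HDFS_CMD" dfs -cat "$hdfs_path" | "$POST_CMD" -c "$collection" -type "$content_type" -d
}
```

After sourcing the file, the call matching the sample above would be: post_from_hdfs /tmp/copyme.txt tweets text/csv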