Member since: 07-16-2015
Posts: 177
Kudos Received: 28
Solutions: 19
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 9547 | 11-14-2017 01:11 AM |
| | 54681 | 11-03-2017 06:53 AM |
| | 3557 | 11-03-2017 06:18 AM |
| | 11722 | 09-12-2017 05:51 AM |
| | 1379 | 09-08-2017 02:50 AM |
01-19-2017
03:23 AM
I don't think Impala has such a feature (but I could be wrong). If I were you, I would try to answer these questions:
- "Why do I need this kind of output?"
- "What do I use it for?"
- "Can't I achieve my goal with another output?"
Maybe you will find another approach that is better adapted. By the way, I guess something like this would be better (but it will not make a huge difference):
SELECT a AS col FROM tmp
UNION ALL SELECT b AS col FROM tmp
UNION ALL SELECT c AS col FROM tmp
UNION ALL SELECT d AS col FROM tmp
... View more
01-17-2017
08:27 AM
Hive is not Oracle. You should not expect the same processing capabilities. Hive is designed to run long and heavy queries, whereas it performs poorly on small queries like the one you are trying to optimize. Also note that Hive runs on top of YARN, and by design YARN takes time to instantiate containers and the JVMs inside those containers. This should answer your question "why it takes so much time to start the Query Job". If you want a quick reply for a basic count(*) without any filter/condition, you might want to read about Hive statistics.
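As a minimal sketch (the table name and connection string are placeholders), computing statistics once lets Hive answer an unfiltered count(*) from the metastore instead of launching a full job:
# Hypothetical table/connection; hive.compute.query.using.stats makes Hive serve
# a bare count(*) from the stored statistics.
beeline -u "jdbc:hive2://<hiveserver2_host>:10000/default" -e "
  ANALYZE TABLE my_table COMPUTE STATISTICS;
  SET hive.compute.query.using.stats=true;
  SELECT count(*) FROM my_table;"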
... View more
01-10-2017
07:58 AM
What kind of web-based interface do you need from HiveServer2? If it is a user interface for querying Hive, then HiveServer2 does not provide one OOTB. But know that Hue uses HiveServer2 when you submit Hive queries inside the "Hive Editor".
... View more
12-19-2016
06:06 AM
Thanks, that was interesting to know!
... View more
12-19-2016
05:51 AM
Hi, If it's not working only on the edge nodes then there might be some configuration issue leading to that. What difference do you make between "cluster nodes" and "edge nodes"? Meaning: what roles are assigned to your edge nodes?
- For example, did you assign the HDFS & YARN "gateway" role to your edge nodes?
- If not, try doing it.
- If yes, try redeploying the client configuration.
It might be something else.
... View more
12-19-2016
05:30 AM
You are right, I just tested it and there is no need for additional settings (other than having initialized the Kerberos ticket). From what I read in your first post, it seems the same job does run successfully for users whose home folder in HDFS is not encrypted (for the same Kerberos realm)? If that is the case, I would open a support ticket in your shoes. It would be the quickest way to obtain feedback from Cloudera on the matter (whether there is an incompatibility or some particular setting for this particular use case).
... View more
12-19-2016
05:09 AM
2 Kudos
If you want to "drop" the categories table you should run a Hive query like this: DROP TABLE categories; If you want to "delete" the content of the table only, then try TRUNCATE TABLE categories; It should work, or try deleting the table content in HDFS directly. As for your use of "hadoop fs", you should know that "hadoop fs -ls rm" does not exist. For deleting HDFS files or folders it is simply "hadoop fs -rm".
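As a small sketch (connection string and warehouse path are assumptions about your layout), the three options look like this:
# Drop the table definition (and its data, if it is a managed table).
beeline -u "jdbc:hive2://<hiveserver2_host>:10000/default" -e "DROP TABLE categories;"
# Keep the table but empty it.
beeline -u "jdbc:hive2://<hiveserver2_host>:10000/default" -e "TRUNCATE TABLE categories;"
# Or remove the files directly in HDFS; -r is recursive, -skipTrash bypasses the trash.
hadoop fs -rm -r -skipTrash /user/hive/warehouse/categories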
... View more
12-19-2016
03:33 AM
Hi, I don't know if the map/reduce job you are submitting is Kerberos compatible. That is the first check to do. Then, if that job is Kerberos compatible, it might need some settings such as supplying a JAAS configuration. The kinit of a ticket is sometimes not enough. For example, when running the map/reduce job "MapReduceIndexerTool", you need to supply a JAAS configuration: HADOOP_OPTS="-Djava.security.auth.login.config=/home/user/jaas.conf" \
hadoop jar MapReduceIndexerTool See: https://www.cloudera.com/documentation/enterprise/5-4-x/topics/cdh_sg_search_security.html
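For reference, a ticket-cache-based jaas.conf along the lines the linked Cloudera Search security page describes could be written like this (the principal is a placeholder):
# Write a minimal JAAS file that reuses the ticket obtained with kinit.
cat > /home/user/jaas.conf <<'EOF'
Client {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=false
  useTicketCache=true
  principal="user@EXAMPLE.COM";
};
EOF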
... View more
12-09-2016
08:30 AM
2 Kudos
Yes there is. In Cloudera Manager:
- Go to the Key-Value Store Indexer configuration > service wide > advanced
- Add in the "Key-Value Store Indexer Service Environment Advanced Configuration Snippet (Safety Valve)" the following information: HBASE_INDEXER_CLASSPATH=<your_classpath>
Restart the service. Regards, Mathieu
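For example (the jar paths are hypothetical; adjust them to wherever your custom classes live), the safety-valve entry would look like:
# Hypothetical jar locations for custom morphline commands or indexer mappers.
HBASE_INDEXER_CLASSPATH=/opt/custom/libs/my-morphline-commands.jar:/opt/custom/libs/my-indexer-mapper.jar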
... View more
10-11-2016
12:25 AM
Hi, You're right, this is not something you will want to automate. This is only a workaround for when you can't afford a restart of Hive (for example, when other queries not related to the particular lock are being processed). And yes, I was referring to the professional support, where you can "submit" an issue you're facing for Cloudera to analyze. Good luck.
... View more
10-10-2016
12:45 AM
Hi, When it happens again you can work around the issue by deleting the lock inside ZooKeeper. This will be easier and quicker than restarting Hive, but it will not solve the underlying issue. For this kind of tricky issue I would open a ticket with Cloudera support. Regards, Mathieu
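As a rough sketch (host, znode path and lock name are placeholders; the namespace comes from hive.zookeeper.namespace and the path encodes database/table), deleting a stuck lock looks like this. Inspect before removing anything:
# Connect with the ZooKeeper CLI shipped with CDH.
zookeeper-client -server <zk_host>:2181
# Inside the CLI: list the lock znodes of the blocked table, then delete the stale one.
ls /hive_zookeeper_namespace/default/my_locked_table
rmr /hive_zookeeper_namespace/default/my_locked_table/LOCK-EXCLUSIVE-0000000000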
... View more
10-05-2016
05:37 AM
1 Kudo
Hi, Maybe you could share the Java source code doing the connection? Here is a really small working sample:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;
import org.apache.log4j.Logger;

public class ManageHive {

    private static String driverName = "org.apache.hive.jdbc.HiveDriver";
    private static Logger logger = Logger.getLogger(ManageHive.class);

    // Builds a JDBC connection to HiveServer2 for the given user.
    public static Connection getConnection(LoadProperties prop, String user) throws ClassNotFoundException, SQLException {
        String hiveJdbc = prop.getPropertyByName("hive_jdbc");
        try {
            Class.forName(driverName);
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
            throw e;
        }
        Connection conn2 = DriverManager.getConnection(hiveJdbc + "/extraction", user, "");
        return conn2;
    }

    // Runs a DDL/DML statement and fails loudly if the execution is rejected.
    public static void execSql(LoadProperties prop, String user, String sql) throws SQLException, ClassNotFoundException {
        Connection maConn = getConnection(prop, user);
        Statement stmt = maConn.createStatement();
        int result = stmt.executeUpdate(sql);
        if (result == Statement.EXECUTE_FAILED) {
            throw new SQLException("Execution error.");
        }
    }
}
... View more
09-08-2016
03:03 AM
After some more testing I found that the following command works:
split '<namespace>:<table_name>', 'NEW_SPLIT_VALUE'
I just need to call it once per "pre-split" value I need.
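To apply several split points in one go (namespace, table and values are placeholders), the calls can be piped into the hbase shell, one per boundary:
# Hypothetical table and split boundaries; one split call per value.
for key in value1 value2 value3; do
  echo "split 'my_namespace:my_table', '${key}'" | hbase shell
done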
... View more
09-08-2016
02:36 AM
Hi, I'm using CDH 5.5.2, so my version of HBase is not the same. I did try the | thing in the hbase shell command. I'm trying to "pre-split" an existing empty table, but the following command seems not to be correct for this version of HBase:
alter '<namespace>:<table_name>',{ SPLITS => ['value1','value2'] }
I get the following message, so I guess the "SPLITS" part is not taken into account:
Unknown argument ignored: SPLITS
Updating all regions with the new schema...
1/2 regions updated.
2/2 regions updated.
Done.
0 row(s) in 2.4780 seconds
Does someone know the syntax for pre-splitting an already existing (empty) table? CDH 5.5.2 uses HBase 1.0.0, I think.
... View more
09-06-2016
03:02 AM
Ok, I managed to make an HBase bulk load using Hive. There is a wiki article on that: https://cwiki.apache.org/confluence/display/Hive/HBaseBulkLoad The procedure described there does not work as-is; I guess it was made for older versions of Hive and HBase. With some work to adapt the procedure I managed to load an HBase table using completebulkload. Here is a working sample on that matter: sudo -u hdfs hdfs dfs -put -f /opt/cloudera/parcels/CDH/lib/hive/lib/hbase-client.jar /user/hive/
sudo -u hdfs hdfs dfs -put -f /opt/cloudera/parcels/CDH/lib/hive/lib/hbase-server.jar /user/hive/
sudo -u hdfs hdfs dfs -put -f /opt/cloudera/parcels/CDH/lib/hive/lib/hbase-common.jar /user/hive/
sudo -u hdfs hdfs dfs -put -f /opt/cloudera/parcels/CDH/lib/hive/lib/hbase-protocol.jar /user/hive/
sudo -u hdfs hdfs dfs -put -f /opt/cloudera/parcels/CDH/lib/hive/lib/hive-hbase-handler.jar /user/hive/
# These JARs need to be added to HiveServer2 with the property hive.aux.jars.path
sudo -u hdfs hdfs dfs -chmod 554 /user/hive/*.jar
sudo -u hdfs hdfs dfs -chown hive:hive /user/hive/*.jar
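# Count the rows of the source Hive table; the total drives how many evenly spaced split keys are sampled below.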
total=`beeline -n sp35517 -p "" -u "jdbc:hive2://dn060001:10000/default" --outputformat=csv2 --silent=true -e "SELECT count(*) FROM default.operation_client_001;"`
total=`echo $total | cut -d ' ' -f 2- `
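# Sample evenly spaced row keys into the hb_range_keys table; these become the split boundaries for the TotalOrderPartitioner.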
hdfs dfs -rm -r /tmp/hb_range_keys
hdfs dfs -mkdir /tmp/hb_range_keys
beeline -n sp35517 -p "" -u "jdbc:hive2://dn060001:10000/default" -e "CREATE EXTERNAL TABLE IF NOT EXISTS default.hb_range_keys(transaction_id_range_start string) row format serde 'org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe' stored as inputformat 'org.apache.hadoop.mapred.TextInputFormat' outputformat 'org.apache.hadoop.hive.ql.io.HiveNullValueSequenceFileOutputFormat' location '/tmp/hb_range_keys';"
beeline -n sp35517 -p "" -u "jdbc:hive2://dn060001:10000/default" -e "add jar /opt/cloudera/parcels/CDH/lib/hive/lib/hive-contrib.jar; create temporary function row_sequence as 'org.apache.hadoop.hive.contrib.udf.UDFRowSequence'; INSERT OVERWRITE TABLE default.hb_range_keys SELECT a.id FROM ( SELECT row_sequence() as num, t.id FROM default.operation_client_001 t order by t.id) a WHERE ( a.num % ( round( ${total} / 12) ) ) = 0;"
hdfs dfs -rm -r /tmp/hb_range_key_list;
hdfs dfs -cp /tmp/hb_range_keys/* /tmp/hb_range_key_list;
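# Create a Hive table backed by HiveHFileOutputFormat and fill it in sorted order; this writes HFiles under /tmp/hbsort/ti instead of writing to HBase directly.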
hdfs dfs -rm -r /tmp/hbsort;
hdfs dfs -mkdir /tmp/hbsort;
beeline -n sp35517 -p "" -u "jdbc:hive2://dn060001:10000/default" -e "set mapred.reduce.tasks=12; set hive.mapred.partitioner=org.apache.hadoop.mapred.lib.TotalOrderPartitioner; set total.order.partitioner.path=/tmp/hb_range_key_list; set hfile.compression=gz; CREATE TABLE IF NOT EXISTS default.hbsort (id string, id_courtier string, cle_recherche string, cle_recherche_contrat string, nom_sous string, nom_d_usage string, prenom_sous string, date_naissance_sous string, id_contrat string, num_contrat string, produit string, fiscalite string, dt_maj string, souscription timestamp, epargne double, dt_ope_ct timestamp, type_ope_ct string, montant string, frais string, dt_ope_ct_export string, souscription_export string, montant_export string, frais_export string, montant_encours_gbl_ct_export string ) STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.hbase.HiveHFileOutputFormat' TBLPROPERTIES ('hfile.family.path' = '/tmp/hbsort/ti');"
beeline -n sp35517 -p "" -u "jdbc:hive2://dn060001:10000/default" -e "INSERT OVERWRITE TABLE hbsort select t.* from default.operation_client_001 t cluster by t.id;"
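# Hand the generated HFiles over to the hbase user and run completebulkload to load them into the target table.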
sudo -u hdfs hdfs dfs -chgrp -R hbase /tmp/hbsort
sudo -u hdfs hdfs dfs -chmod -R 775 /tmp/hbsort
export HADOOP_CLASSPATH=`hbase classpath`
hadoop jar /opt/cloudera/parcels/CDH/lib/hive/lib/hbase-server.jar completebulkload /tmp/hbsort default_operation_client_001 c
... View more
08-17-2016
12:26 AM
2 Kudos
Hi, I think Sentry checks whether your user has the specific permission on the "LOCATION" URI you have provided (and this is not related to HDFS ACLs). Try to grant, in Sentry, that permission too. For example: GRANT ALL ON URI 'hdfs://hdfscluster/user/testuser/part' TO ROLE <a_role>; Regards, Mathieu
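As a fuller sketch (role and group names are hypothetical; Sentry grants are issued as SQL statements through HiveServer2):
# The URI grant complements the usual database/table grants for the role.
beeline -u "jdbc:hive2://<hiveserver2_host>:10000/default" -e "
  CREATE ROLE load_role;
  GRANT ROLE load_role TO GROUP testgroup;
  GRANT ALL ON URI 'hdfs://hdfscluster/user/testuser/part' TO ROLE load_role;"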
... View more
08-16-2016
02:05 AM
For those interested: the issue was confirmed by the support team, with no workaround until the JIRA ticket listed is fixed.
... View more
08-16-2016
02:03 AM
The "multiplier" is not a parameter. You directly set the number of vcpu you want yarn to use. I guess you have to do the math yourself before setting the value.
... View more
08-11-2016
06:36 AM
Ok, since the default behaviour is inefficient I have searched for a way to make the "bulk load" more efficient. I think I found a more efficient way, but there seems to be a blocker bug on that (referenced here: https://issues.apache.org/jira/browse/HIVE-13539 )
1- The point is to set these two properties before running the insert command:
SET hive.hbase.generatehfiles=true;
SET hfile.family.path=/<a_path>/<thecolumn_family_name>;
2- Then run the insert query, which will prepare HFiles at the designated location (instead of directly loading the HBase table).
3- And only then, perform a bulk load on HBase using the HFiles prepared:
export HADOOP_CLASSPATH=`hbase classpath`
yarn jar /usr/hdp/current/hbase-client/lib/hbase-server.jar completebulkload /<a_path>/<thecolumn_family_name>
Problem: the query creating the HFiles fails because it "finds" multiple column families, since it looks at the wrong folder. I'm doing my tests on CDH 5.7.1. Has someone already tested this method? If yes, are there some properties to set that I have forgotten? Or is this really a blocker issue? Then I'll raise it to the support. Regards, Mathieu
... View more
08-09-2016
05:53 AM
Did you specify the POST parameter "execute" with the query? https://cwiki.apache.org/confluence/display/Hive/WebHCat+Reference+Hive#WebHCatReferenceHive-URL
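As a rough sketch (host, user and query are placeholders; 50111 is the usual WebHCat port), a request against the /templeton/v1/hive endpoint would look like:
# "execute" carries the HiveQL string; "statusdir" is where WebHCat writes stdout/stderr.
curl -s -d user.name=myuser \
     -d execute="SELECT count(*) FROM default.my_table;" \
     -d statusdir=/tmp/webhcat_out \
     "http://<webhcat_host>:50111/templeton/v1/hive"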
... View more
08-09-2016
01:00 AM
In the HDFS role there is an "NFS Gateway" service that lets you mount an NFS image of HDFS. That is one way (you can directly copy files to it; check the performance). Hue (the web UI) also lets you upload files into HDFS (this is a more manual approach). In our enterprise, for an automated process, we are using a custom Java application that uses the HCatWriter API for writing into Hive tables. But you can also use HttpFS or WebHDFS.
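As a minimal WebHDFS sketch (host, port and paths are placeholders; the two-step PUT is how the REST API works, first against the NameNode and then against the DataNode it redirects to):
# Step 1: ask the NameNode where to write; the reply is a 307 redirect with a Location header.
curl -i -X PUT "http://<namenode_host>:50070/webhdfs/v1/user/myuser/data.csv?op=CREATE&user.name=myuser"
# Step 2: send the file content to the DataNode URL returned in that Location header.
curl -i -X PUT -T data.csv "<location_url_from_step_1>"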
... View more
08-09-2016
12:23 AM
Thank you for this explanation. This will help me a lot for the next steps.
... View more
08-08-2016
07:59 AM
Hi, We are facing some performance issues while loading data into HBase (using Hive queries). The Hive query is quite simple:
INSERT INTO TABLE <hive_table_name_targeting_hbase_table> SELECT * FROM <hive_table>
The table "<hive_table_name_targeting_hbase_table>" is a Hive table using the HBaseStorageHandler (so there is an HBase table as the storage). The table "<hive_table>" is a regular Hive table. There are millions of rows in <hive_table> and <hive_table_name_targeting_hbase_table> is empty. When running the query we can see that the YARN job generates 177 mappers (more or less depending on the data size in <hive_table>). This part is quite "normal". But when I check the execution log of each mapper, I can see that some mappers take A LOT MORE TIME than others. Some mappers can take up to an hour (whereas the normal time of a mapper is around 10 minutes). In the log file of the "slow" mappers I can see a lot of retries on HBase operations, and finally some exceptions about NotServingRegionException. After some time (and a lot of retries) it's OK, but unfortunately this slows down the processing a lot. Has anyone already encountered this (while loading an HBase table using Hive queries)? Could it be related to regions being split during the write? If yes, why? Is there some bug in the HBaseStorageHandler with too much data? Of course the HBase table is online and can be accessed normally after loading the data, so no HBase configuration issue here (at least not a basic one). HBase compaction is set to 0 (and is launched manually). Log sample:
2016-08-08 10:18:25,962 INFO [htable-pool1-t31] org.apache.hadoop.hbase.client.AsyncProcess: #2, table=prd_piste_audit_gsie_traite_001, attempt=13/35 failed=28ops, last exception: null on <a_host>,60020,1467474218569, tracking started null, retrying after=20126ms, replay=28ops
2016-08-08 10:18:46,091 INFO [htable-pool1-t31] org.apache.hadoop.hbase.client.AsyncProcess: #2, table=prd_piste_audit_gsie_traite_001, attempt=14/35 failed=28ops, last exception: org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Region prd_piste_audit_gsie_traite_001,15a55dd4-5c6e-41b3-9d2e-304015aae5e9,1470642880612.e8868eaa5ac33c4612632c2c89474ecc. is not online on <a_host>,60020,1467474218569
at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2786)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:922)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:1893)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32213)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2034)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107)
at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
at java.lang.Thread.run(Thread.java:745)
on <a_host>,60020,1467474218569, tracking started null, retrying after=20099ms, replay=28ops
... View more
Labels:
- Apache HBase
05-19-2016
12:41 AM
I'm not seeing the same issue here. Check the YARN application logs; they will surely contain information about the issue.
... View more
04-27-2016
07:53 AM
Not sure this is the latest documentation for Impala, but the Hive "Date" type is not supported in Impala. Use TIMESTAMP instead, for example. Check "Impala supported data types" on Google. (Sorry, I can't paste the URL, I don't know why.)
... View more
03-03-2016
02:28 AM
Why not create a Hive table on top of the text file and then simply use a Hive query to load the data into the Avro table?
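As a small sketch (table names, columns, delimiter and paths are hypothetical; the Avro-backed table is assumed to exist already):
# An external text table over the raw files, then a plain INSERT...SELECT into the Avro table.
beeline -u "jdbc:hive2://<hiveserver2_host>:10000/default" -e "
  CREATE EXTERNAL TABLE staging_txt (col1 STRING, col2 INT)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    STORED AS TEXTFILE
    LOCATION '/user/myuser/staging/';
  INSERT INTO TABLE my_avro_table SELECT col1, col2 FROM staging_txt;"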
... View more
12-18-2015
02:29 AM
That's great to know. Best regards.
... View more
12-17-2015
07:43 AM
1 Kudo
First of all, we are using the exact same pattern (create a new index, index into it and then switch the alias). But I think your question is not really related to this, and more to how Solr behaves (with its JVM). As such, I'm not sure there is a particular problem; I have seen this behavior with a lot of other products. The JVM tends not to release memory until the garbage collector reclaims it (because there is a need to re-use the memory). If you don't get an answer here, you may want to try to fine-tune the GC parameters (this is a real science). But I guess the support might help you also. Best of luck! Mathieu
... View more
11-09-2015
08:09 AM
Thank you for your answer. I can guess that in this particular case the replica is DOWN because of the incident we encountered (HDFS not available for a short period + restart of the whole cluster). But the question is: how to fix that after the problem is encountered? Checking the Solr log files did not really help on "why the replica is down". We can just observe that this particular shard does not "log" that it became active after the restart (and thus stays down). Of course, we might have missed something. I have opened a support ticket to help us; they might find the problem from the logs. Isn't there any "manual" way to restore it? Replace it? Regards.
... View more