Member since
01-30-2014
25
Posts
0
Kudos Received
3
Solutions
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 2405 | 08-27-2014 11:57 AM |
 | 4070 | 08-21-2014 05:31 AM |
06-24-2015
03:31 PM
To be complete: yes, you need to use the safety valves to get the coprocessors in the correct order. You also need to set the HFile version to 3, otherwise HBase won't start with these coprocessors. I find this last one odd, because HBase 1.0 should use version 3 by default, as per the docs. Anyway, use the sample config in the HBase documentation as a guide to which setting you need where: http://archive.cloudera.com/cdh5/cdh/5/hbase-1.0.0-cdh5.4.0/book.html#security.example.config
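As a quick sanity check on the order that ends up in the generated config, something like the following could be run against the value of the region coprocessor property. This is only a sketch: the helper name `access_before_visibility` is made up for illustration, and it assumes the ordering discussed in this thread, where AccessController has to come before VisibilityController.

```python
# Fully qualified coprocessor class names; VISIBILITY is the one quoted in this thread.
ACCESS = "org.apache.hadoop.hbase.security.access.AccessController"
VISIBILITY = "org.apache.hadoop.hbase.security.visibility.VisibilityController"


def access_before_visibility(classes_value: str) -> bool:
    """Return True if both coprocessors are listed and AccessController comes first.

    `classes_value` is the comma-separated string as it would appear in the
    coprocessor classes property (hypothetical helper, not part of HBase).
    """
    classes = [c.strip() for c in classes_value.split(",") if c.strip()]
    if ACCESS not in classes or VISIBILITY not in classes:
        return False
    return classes.index(ACCESS) < classes.index(VISIBILITY)
```

Feeding it the value shown in the Cloudera Manager config review would flag the case described in the question, where the visibility coprocessor is placed in front.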
05-20-2015
02:19 PM
Installing CDH5.4 with Kerberos security gives me the opportunity to make grants on namespaces etc., but I want to enable visibility labels as well, which seem to be disabled by default. The Cloudera documentation only tells me this feature is experimental, not how to enable it. The Apache HBase book shows which coprocessors to add, and it also states that they must be listed in the proper order. Following http://archive.cloudera.com/cdh5/cdh/5/hbase-1.0.0-cdh5.4.0/book.html#security.example.config I tried adding "org.apache.hadoop.hbase.security.visibility.VisibilityController" via Cloudera Manager, but when reviewing the config changes I see that the order is not correct: it adds the VisibilityController in front of the (apparently default) AccessController and TokenProvider. Is there any other way to enable this feature, or to maintain the proper order?
Labels: Apache HBase
09-02-2014
10:48 AM
To be clear, you will also only get the column families, not the columns within them. You didn't define those at create time either, but just to be complete 🙂
08-27-2014
11:57 AM
To answer my own question: no, this is not possible at this time, since Solr only started supporting nested documents in 4.5 and CDH5.1 is at 4.4 right now. Even if this becomes available in a future release, the question will be whether it can be easily integrated and used with Kite's morphlines. To get the job done I had to switch to Elasticsearch, which does support nested documents, and used Flume's ElasticSearchSink. Flume's official documentation on Elasticsearch and Avro is lacking, and I had to patch the Flume code to get it working with the UTF-8 charset and JSON, but it's working nonetheless. I hope I can move this dataflow to the better integrated SolrCloud in the future.
08-21-2014
05:45 AM
I have an Avro input source going through a morphline into Solr. For example, the following structure:

    {
      "username" : "alex"
      "date" : "21-08-2014"
      "attachments" : [
        "documents" : [
          { "title": "test" "tags" : [ "a", "b", "c" ] },
          { "optional1" : "test2" "title" : "test2" }
        ],
        "source" : "school"
      ]
    }

I can extract fields with extractAvroPaths, like so:

    ...
    {
      extractAvroPaths {
        flatten : true
        paths : {
          /my_user : /username # this works fine
          /my_attachments : "/attachments[]"
          /my_documents : "/attachments[]/documents[]"
        }
      }
    }
    .....

The problem is that /my_attachments and /my_documents now contain raw JSON/Avro structures instead of single field values. How would I go about 'unwrapping' these fields so that they all become part of one Solr document, while still retaining the context of the document they belong to?
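One way to picture the 'unwrapping' asked about here is to denormalise: emit one flat record per nested document and copy the parent fields onto each. The Python below is only a sketch of that idea, not a morphline command. The function name `flatten_documents`, the output field names, and the assumption that each attachment is a record holding `documents` and `source` are illustrative guesses at the intended structure.

```python
from typing import Any, Dict, List


def flatten_documents(record: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Emit one flat dict per nested document, carrying parent context along."""
    flat = []
    for attachment in record.get("attachments", []):
        for doc in attachment.get("documents", []):
            # Parent-level fields are repeated on every row to keep the context.
            row = {
                "my_user": record.get("username"),
                "my_date": record.get("date"),
                "my_source": attachment.get("source"),
            }
            # Prefix document-level fields so they cannot clash with parent fields.
            for key, value in doc.items():
                row["doc_" + key] = value
            flat.append(row)
    return flat


# Assumed shape of the example record from the question (hypothetical reading).
example = {
    "username": "alex",
    "date": "21-08-2014",
    "attachments": [
        {
            "documents": [
                {"title": "test", "tags": ["a", "b", "c"]},
                {"optional1": "test2", "title": "test2"},
            ],
            "source": "school",
        }
    ],
}

rows = flatten_documents(example)
```

Each resulting row could then be indexed as its own Solr document, with list values such as the tags going into a multi-valued field.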
Labels: Apache Solr
08-21-2014
05:31 AM
Sorry, too quick on the post trigger: this was just a matter of quoting the mapping: /my_attachments : "/attachments[]" The question remains how to map the structure to a Solr index, but I will post that in the appropriate section.
08-21-2014
04:35 AM
I'm having trouble extracting a nested structure from my Avro data.

    _attachment_body=[ {
      "username" : "alex"
      "date" : "21-08-2014"
      "attachments" : [
        "documents" : [
          { "title": "test" "tags" : [ "a", "b", "c" ] },
          { "optional1" : "test2" "title" : "test2" }
        ],
        "context" : "school"
      ]
    }

Extracting the paths with Avro:

    ...
    {
      extractAvroPaths {
        flatten : true
        paths : {
          /my_user : /username # this works fine
          # all three of these result in the same error message, with different parameters
          /my_attachments : /attachments[]
          /my_documents : /attachments[]/documents[]
          /my_contexts : /attachments[]/documents[]/context
        }
      }
    }

This results in the following error message:

    com.typesafe.config.ConfigException$WrongType: morph-solr.conf: 30: Cannot concatenate object or list with a non-object-or-list, ConfigString("/my_attachments") and SimpleConfigList([]) are not compatible.

Eventually I would like to map the fields to a Solr index. So if it's possible to extract the nested structures, the follow-up question would be how to map those to a Solr schema, but let's take it one step at a time 🙂
Labels: Apache Solr
04-17-2014
12:05 PM
I'm wondering whether there is ever a reason for Solr to live in the root of a ZooKeeper install. Shouldn't it always be under some path inside '/'? In that case --zk being '/' would indicate a problem, either in the configuration or a mistake by the user, which is something you could alert on or even refuse to run with. Adding the prompt on --force would be a great step, and I see the use of the -y option.
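The check suggested above could look roughly like this. It is a sketch in Python, not how solrctl is implemented; `zk_chroot` and `refuse_root` are made-up names, and it assumes the usual ZooKeeper connect-string form `host:port,host:port/chroot`.

```python
def zk_chroot(connect_string: str) -> str:
    """Return the chroot part of a ZooKeeper connect string, or '/' if none."""
    slash = connect_string.find("/")
    if slash == -1:
        # No path component at all, so the client would operate on the root.
        return "/"
    chroot = connect_string[slash:].rstrip("/")
    return chroot or "/"


def refuse_root(connect_string: str) -> None:
    """Raise if a destructive operation would target the ZooKeeper root."""
    if zk_chroot(connect_string) == "/":
        raise ValueError(
            "refusing to operate on ZooKeeper root '/'; "
            "expected a chroot such as /solr"
        )
```

With a guard like this, a connect string that is missing the /solr suffix fails loudly instead of pointing a destructive command at everything stored in the quorum.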
04-16-2014
10:35 AM
Yes, we manage the cluster with CM. Reading your reply, I'm now sure the new edge node we added did not get a 'deploy client config' and so was missing the proper settings. Not knowing this at the time, solrctl did not work as expected (without the proper client configs). I remember manually adding the settings to the solrctl command, most likely without the required /solr root, resulting in the wipe of ZooKeeper's '/'. Thanks for clearing this up. Still, for a Solr CLI tool to fall back to '/' of the entire quorum without any notice, and to clear it on a --force, is pretty scary and not what you expect as an end user of a Solr-specific tool. Thanks for filing the reports, Rob
04-03-2014
12:00 PM
I had an 'interesting' experience setting up Cloudera Search as an addition to a not too shabby HBase cluster. Problems started when I created a collection with a trailing '/', which apparently is not allowed. In hindsight I now know that this created an item in the overseer queue that could not be processed, blocking all further requests; it showed up in the logs as the overseer being stuck in a loop. Not knowing this at the time, I tried a 'solrctl init', which did not work. After reading the warnings that this could mess up any previous Solr state, which we didn't have, I continued with "solrctl init --force". I was a little surprised to see that the entire /hbase entry in ZooKeeper was wiped clean, leaving all of HBase in a state of panic, having lost its entire administration. Reverting to ZooKeeper snapshots got my HBase back up and running, but I'm still baffled:
1. How could this have happened?
2. If this is even a remote possibility with this command, I would recommend adding some extra red flags to the documentation recommending this option.
I'm running CDH4.5 with Solr 1.1.