Member since: 06-30-2016
Posts: 22
Kudos Received: 3
Solutions: 0
05-25-2017
07:15 PM
Hi @Guilherme Braccialli, so you did not run into this issue? https://issues.apache.org/jira/browse/SPARK-20033 Thank you.
02-01-2017
01:56 AM
Hi @Artem Ervits, yes, it looks like this is what I need, and it is perhaps the only way for my use case. Thank you for your help.
01-31-2017
01:24 AM
Hi @Artem Ervits, yes I checked that too, but I need an Object array, which seems to be the only case not covered...

Jackson:
    ObjectMapper mapper = new ObjectMapper();
    return mapper.readValue(json, Object[].class);
Error: Can not deserialize instance of java.lang.Object[] out of START_OBJECT token at [Source: {"f1":1,"f2":"abc"}; line: 1, column: 1]

GSON:
    Gson gson = new Gson();
    return gson.fromJson(json, Object[].class);
Error: Expected BEGIN_ARRAY but was BEGIN_OBJECT at line 1 column 2
01-31-2017
01:06 AM
I have a JSON string (of type object) as follows: {"field1":1,"field2":"abc"} and I would like to convert it to a Java Object[2], where the 1st element is new Integer(1) and the 2nd element is new String("abc"). I tried both Jackson and GSON but couldn't find a way to do this conversion. Any help would be appreciated, thank you.
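One possible workaround, sketched here with only the JDK: parse the JSON object into an insertion-ordered map first, then take the map's values as the Object[]. The LinkedHashMap below is a hand-built stand-in for the parser output, and the names JsonObjectValues and toValueArray are hypothetical. The assumption, to my understanding, is that Jackson yields such a map (with small integers as Integer) when asked to read a JSON object into a Map type; that is not verified here.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

final class JsonObjectValues {

    // Returns the map's values in iteration order. For a LinkedHashMap this is
    // insertion order, i.e. the order in which the JSON fields were added.
    static Object[] toValueArray(Map<String, Object> fields) {
        return fields.values().toArray();
    }

    public static void main(String[] args) {
        // Stand-in for the result of parsing {"field1":1,"field2":"abc"} into a Map.
        // Assumption: with Jackson this map would come from something like
        // mapper.readValue(json, LinkedHashMap.class) instead of being built by hand.
        Map<String, Object> parsed = new LinkedHashMap<>();
        parsed.put("field1", 1);
        parsed.put("field2", "abc");

        Object[] values = toValueArray(parsed);
        System.out.println(Arrays.toString(values)); // prints [1, abc]
    }
}
```

The element order depends entirely on the map implementation, so a map without a defined iteration order (such as a plain HashMap) would not be suitable for this.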
Labels: Apache Hive
07-21-2016
04:18 AM
Hi @Sunile Manjee, thanks for the reply. I can try this, but my understanding was that mapred.output.committer.class is for the old API and mapreduce.use.directfileoutputcommitter is for the new API, and that it's either-or. So would it really help to use both?
07-21-2016
12:25 AM
Could someone confirm the correct practice for setting the mapred.output.committer.class property? I was able to get it to work only when I set the property in my OutputFormat class. I also tried 1) setting it in my SerDe class and 2) putting it in mapred-site.xml, but in both cases there seems to be no effect. Thank you.
Labels: Apache Hadoop
07-14-2016
09:47 PM
@Benjamin Leonhardi, actually I missed something: it is still called when using Tez. Never mind the question.
07-14-2016
01:01 AM
Hi @Benjamin Leonhardi, I notice that if hive.execution.engine = tez, then SerDe.initialize() is NOT called from the mappers at all (i.e. it goes directly to deserialize()), which is causing me problems. Do you know whether this is expected and what the reasoning is? Thank you.
07-06-2016
09:50 PM
Hi @Benjamin Leonhardi, I still have the following questions regarding this topic and would appreciate your comments: 1) So why exactly is initialize()/getObjectInspector() called many times inside mappers? They are called after getRecordReader() which seems even more confusing to me... 2) Assuming there is nothing we can do about the above behavior, do you know of a good way for my code to tell whether I'm inside the mappers or prior to that (in other words, am I calling initialize() for the first time or not?)? As mentioned in my previous comment, I'm currently relying on the jobConf object to tell me that, but I would like to get rid of this dependency if possible... Thanks.
07-06-2016
12:51 AM
Hi @Benjamin Leonhardi, thank you kindly for the response. I was able to resolve it via the JobConf object, since it persists throughout. I save the string form of the metadata in the JobConf the first time and then only need to read it in the mappers. Of course, I can also simply check whether this string exists in the JobConf to know whether I'm doing it for the first time or whether I'm in the mappers. Does this all sound reasonable to you? I would still like to understand your approach. By "access objects", did you mean the OI (ObjectInspector)? As far as I could tell, class objects will always be null when called from the mappers; is that right? BTW, I have another critical question and would appreciate your comments there as well: https://community.hortonworks.com/questions/43603/hive-storage-handlers-control-thread.html.
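The store-once, read-later pattern described above could look roughly like the following. This is only a sketch under assumptions: java.util.Properties stands in for the Hadoop JobConf so that the example runs with the JDK alone (with a real JobConf the calls would be conf.get(key) and conf.set(key, value)), and the key name, class name and method names are all hypothetical.

```java
import java.util.Properties;

final class MetadataCache {

    // Hypothetical key name; not taken from the original post.
    static final String METADATA_KEY = "my.serde.metadata";

    // Stores the metadata only if the key is absent.
    // Returns true on the first call (metadata stored), false when the value
    // is already present, which is the "I am inside a mapper" case in the post.
    static boolean storeIfAbsent(Properties conf, String metadata) {
        if (conf.getProperty(METADATA_KEY) != null) {
            return false;
        }
        conf.setProperty(METADATA_KEY, metadata);
        return true;
    }

    // Reads back the previously stored metadata, or null if none was stored.
    static String load(Properties conf) {
        return conf.getProperty(METADATA_KEY);
    }

    public static void main(String[] args) {
        Properties conf = new Properties(); // stand-in for JobConf (assumption)

        boolean first = storeIfAbsent(conf, "cols=a,b");  // first initialize() call
        boolean again = storeIfAbsent(conf, "ignored");   // later call, e.g. in a mapper

        System.out.println(first + " " + again + " " + load(conf)); // prints true false cols=a,b
    }
}
```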