Member since: 11-30-2015
Posts: 39
Kudos Received: 23
Solutions: 3
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 1086 | 03-11-2016 07:31 PM
 | 6013 | 12-17-2015 12:33 AM
 | 1285 | 12-16-2015 10:46 PM
01-17-2016
04:04 PM
Thanks @Gangadhar. The schema changes should have been compatible in the way you describe; they are just in an HDFS file rather than in a schema literal. I'll try to recreate this problem with a simple change.
01-15-2016
09:33 PM
1 Kudo
Avro, in general, supports the idea of evolving schemas, and I'm trying to support that with an external Hive table. In other words, I declare an external Hive table and define its schema through the table properties, to be read from an HDFS file:

TBLPROPERTIES ('avro.schema.url'='hdfs://namenode/common/schemas/schema_v1.avsc')

This works fine when all of the files in the directory for the external table are created with schema version 1. However, if I add Avro files of version 2 and update the TBLPROPERTIES accordingly, the table becomes unusable. I see errors like this on a select count(*) statement:

Caused by: org.apache.avro.AvroTypeException: Found com.target.category_data, expecting com.target.category_data
at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:231)
at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
at org.apache.avro.io.ResolvingDecoder.readFieldOrder(ResolvingDecoder.java:127)
at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:176)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:151)
at org.apache.avro.generic.GenericDatumReader.readField(GenericDatumReader.java:193)
at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:183)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:151)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
at org.apache.avro.file.DataFileStream.next(DataFileStream.java:233)
at org.apache.avro.file.DataFileStream.next(DataFileStream.java:220)
at org.apache.hadoop.hive.ql.io.avro.AvroGenericRecordReader.next(AvroGenericRecordReader.java:153)
at org.apache.hadoop.hive.ql.io.avro.AvroGenericRecordReader.next(AvroGenericRecordReader.java:52)
at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:347)

It appears that the Hive Avro SerDe is using the schema in TBLPROPERTIES to decode the individual Avro files (or, more accurately, the individual file splits, I suppose) instead of the schema in the header of each Avro file. I would like binary Avro files created with different Avro schemas to be read by the same Hive table, which may itself have a different Avro schema. Could anyone suggest ways to do that? I am not opposed to making code changes on the read side (i.e. the Hive SerDe) or the write side (I'm using Storm's Avro bolt to write my files).

Thanks!
-Aaron
Labels: Apache Hive
01-13-2016
10:05 PM
In addition to whatever infrastructure Ambari needs, e.g. MySQL.
Labels: Apache Ambari, Apache Storm
12-17-2015
12:33 AM
1 Kudo
To answer my own question, the Oozie user mailing list led me to this option within the java action configuration:

<property>
  <name>oozie.launcher.mapreduce.user.classpath.first</name>
  <value>true</value>
</property>

https://mail-archives.apache.org/mod_mbox/oozie-us...
See also OOZIE-2066: https://issues.apache.org/jira/browse/OOZIE-2066
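The effect of a "user classpath first" setting can be illustrated outside Oozie with plain Java: when the same resource exists in two locations, a class loader returns the copy that comes first in its search path. The sketch below is only an illustration under stated assumptions. The class name `ClasspathOrderDemo`, the resource `demo/marker.txt`, and the two temp directories are hypothetical, and it does not reproduce how Oozie or YARN actually assemble the launcher classpath.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; not part of Oozie or Hadoop.
class ClasspathOrderDemo {

    /** Creates a temp directory containing one single-line resource file. */
    static Path dirWithResource(String prefix, String resource, String text) {
        try {
            Path dir = Files.createTempDirectory(prefix);
            Path file = dir.resolve(resource);
            Files.createDirectories(file.getParent());
            Files.write(file, text.getBytes(StandardCharsets.UTF_8));
            return dir;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /**
     * Looks up the resource through a loader that searches the given
     * directories in order, and returns the first line of whichever copy wins.
     * Returns null if no directory contains the resource.
     */
    static String resolve(String resource, Path... searchOrder) {
        try {
            URL[] urls = new URL[searchOrder.length];
            for (int i = 0; i < searchOrder.length; i++) {
                urls[i] = searchOrder[i].toUri().toURL();
            }
            // Parent is null so that only the listed directories are searched.
            try (URLClassLoader loader = new URLClassLoader(urls, null)) {
                InputStream in = loader.getResourceAsStream(resource);
                if (in == null) {
                    return null;
                }
                try (BufferedReader reader = new BufferedReader(
                        new InputStreamReader(in, StandardCharsets.UTF_8))) {
                    return reader.readLine();
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String resource = "demo/marker.txt";
        Path userDir = dirWithResource("user-lib", resource, "user");
        Path systemDir = dirWithResource("system-lib", resource, "system");
        System.out.println("user first: " + resolve(resource, userDir, systemDir));
        System.out.println("system first: " + resolve(resource, systemDir, userDir));
    }
}
```

Run as-is, the first line reports the copy from the "user" directory and the second the copy from the "system" directory, which is the same kind of ordering effect the property above is meant to change for the launcher job.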
12-16-2015
10:46 PM
3 Kudos
Thanks, @rich. It turns out that what I was after was the well-documented <java-opt> tag used when defining the java action. I apparently missed it the first few times I looked at the Oozie specification.
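One way to confirm that `-D` options actually reach the action's JVM is to print the corresponding system properties from the main class. This is a minimal sketch, assuming the options of interest are the log4j ones mentioned in the original question; the class name `JavaOptsCheck` and the `<unset>` placeholder are hypothetical.

```java
// Hypothetical helper; prints selected JVM system properties.
class JavaOptsCheck {

    /** Returns one "key=value" line per key, using "<unset>" for missing properties. */
    static String report(String... keys) {
        StringBuilder sb = new StringBuilder();
        for (String key : keys) {
            // A bare flag such as -Dlog4j.debug yields an empty string, not null.
            sb.append(key).append('=')
              .append(System.getProperty(key, "<unset>"))
              .append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(report("log4j.debug", "log4j.configuration"));
    }
}
```

If a property shows up as `<unset>` in the action's stdout log, the option did not reach the JVM that ran the main class.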
12-16-2015
05:38 PM
Thanks for all the suggestions! Unfortunately, that did not solve the problem. There still appear to be a lot of things in my classpath ahead of my lib directory. This line:

System.out.println("com.fasterxml.jackson.core.JsonFactory: [" + cl.getResource("com/fasterxml/jackson/core/JsonFactory.class") + "]");

prints this when run as a java action, regardless of the libpath setting:

com.fasterxml.jackson.core.JsonFactory: [jar:file:/hadoop/yarn/local/filecache/22/mapreduce.tar.gz/hadoop/share/hadoop/common/lib/jackson-core-2.2.3.jar!/com/fasterxml/jackson/core/JsonFactory.class]
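The same check can be wrapped in a small reusable probe. The post does not say which class loader `cl` refers to, so this sketch assumes the thread context class loader, falling back to the system class loader; the class name `ClasspathProbe` and the "not found" text are hypothetical.

```java
import java.net.URL;

// Hypothetical diagnostic class; reports where a class file would be loaded from.
class ClasspathProbe {

    /** Converts a fully qualified class name to its resource path. */
    static String toResourceName(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** Describes the first location on the loader's search path that holds the class. */
    static String describe(ClassLoader loader, String className) {
        URL location = loader.getResource(toResourceName(className));
        return className + ": [" + (location == null ? "not found" : location) + "]";
    }

    public static void main(String[] args) {
        // Assumption: the context class loader is the one of interest.
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        if (cl == null) {
            cl = ClassLoader.getSystemClassLoader();
        }
        String[] names = args.length > 0
                ? args
                : new String[] {"com.fasterxml.jackson.core.JsonFactory"};
        for (String name : names) {
            System.out.println(describe(cl, name));
        }
    }
}
```

Passing several class names as arguments makes it easy to compare, in one run, which jars win for each library suspected of conflicting.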
12-16-2015
02:00 PM
I do have a lib folder; it just seems to be last in classpath precedence. A way to force it to have a higher precedence would be ideal.
12-16-2015
01:50 PM
Thanks. Is there a way for an individual user to do this without access to add things to the sharelib?
12-15-2015
10:01 PM
1 Kudo
I want my own jar files searched first, ahead of, for example, the standard jar files included in Oozie actions, such as those under "/hadoop/yarn/local/filecache/22/mapreduce.tar.gz/hadoop/share/hadoop/common/lib/".
Labels: Apache Oozie
12-15-2015
08:39 PM
For example, "-Dlog4j.debug" or "-Dlog4j.configuration='foo'".
Labels: Apache Oozie