Member since: 01-18-2015 | Posts: 5 | Kudos Received: 0 | Solutions: 0
02-09-2016
11:45 AM
I have used the combination in cases where the data model was complex and changing over time. It's pretty easy to create an Avro schema and the Java bindings. There are cases where Avro is a better fit than Parquet. If you are not sure, it may be worthwhile to start with Avro, do a performance analysis, and switch to Parquet later if needed, which is easy to do. Nishan
06-12-2015
05:54 AM
Try the sanitizeUnknownSolrFields command, per http://kitesdk.org/docs/current/morphlines/morphlines-reference-guide.html#sanitizeUnknownSolrFields Wolfgang.
05-11-2015
05:17 AM
I just installed the CDH 5.4 Sandbox and, when trying to access HDFS from Java, I get this error:
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-545953227-127.0.0.1-1429800393650:blk_1073742225_1401 file=/tmp/b.txt
at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:888)
at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:568)
I have another VM (CDH 5.3) where the same code works just by including core-site.xml and hdfs-site.xml in the classpath, so something seems to be wrong in the 5.4 VM. I can read "b.txt" with hadoop fs -cat /tmp/b.txt, so the file is fine. I have checked the state of HDFS with hadoop fsck and dfsadmin, and there are no missing blocks. I also added the hostname/IP to the hosts file in Windows. What is wrong? Any clue?
02-08-2015
07:35 PM
Hi ortizg, that's because the firewall is blocking access to the CM web UI. You should add an exception to iptables, or you can simply stop the firewall by typing sudo service iptables stop.
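For the exception route, here is a minimal sketch rather than a definitive rule set. It assumes the CM web UI listens on its default port, 7180, and that the VM uses the classic iptables service; neither is stated in the post. The helper name cm_rule is made up for illustration. It only prints the rule so it can be reviewed before anything changes.

```shell
# Assumption: Cloudera Manager's web UI is on its default port, 7180.
# Override with CM_PORT=<port> if your install differs.
CM_PORT="${CM_PORT:-7180}"

# cm_rule is a hypothetical helper: it prints the iptables command
# that would accept inbound TCP traffic on the given port.
cm_rule() {
  echo "iptables -I INPUT -p tcp --dport $1 -j ACCEPT"
}

# Show the rule without applying it.
cm_rule "$CM_PORT"

# To apply it (requires root), run the printed command with sudo:
#   sudo $(cm_rule "$CM_PORT")
# On CentOS/RHEL 6 style systems (an assumption about the VM), persist it with:
#   sudo service iptables save
```

Opening a single port keeps the rest of the firewall in place, which is usually preferable to stopping iptables entirely on anything other than a throwaway sandbox.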