About hirschs

hirschs · ‎11-03-2016

@Timothy Spann I was finally able to get running again: - Removed references to lzo from all configurations (using Ambari) - Manually removed all RPM packages on all machines that match *lzo* Then I re-read the Ambari instructions for about the 40th time and realized where communication was breaking down. The only installation of packages I ever observed from Ambari was during initial installation and all of it was triggered from dialogs. It may be obvious to some folks that Ambari uses the presence of the code in the io.codecs list as a trigger for silent package install on restart, but it certainly wasn't to me (since NOTHING else I've encountered in the system works in that manner. All other installs have a progress indication and function test). Once I added the configuration (without having manually installed packages first), it indeed installed them itself during restart and everything worked when complete. I would strongly suggest adding a small paragraph to the lzo configuration page to explicitly and clearly explain that this process physically installs the packages with no visual indication that this is occurring.

hirschs · ‎11-02-2016

@Timothy Spann - Ambari on our system does not provide any facility to install lzo. You keep referring to this, but it isn't there. If you believe it should be, please tell me where I might find the dialog? - I followed ALL the steps you outlined above, except for Hive. I DO NOT WANT LZO COMPRESSION ON HIVE. If that's not optional, then it should be documented as such. - I did have things stopped when I installed the RPMs and updated configuration. We're in a real mess here and currently trying to find someone to help us recover. I wish your company provided per-incident support, but that doesn't seem to be the case.

hirschs · ‎11-02-2016

@Timothy Spann The cluster itself was installed through Ambari and has been running about a year. One of my users needed LZO compression enabled several days ago. You web site told me that Ambari does not install or configure LZO, so I followed the instructions as you entered them above. I added two changes to core-site.xml that were similarly documented in the HDP 2.3.2 web pages. After fixing an initial typo, we had working LZO and could explicitly invoke LzoIndexer on files in HDFS. Shortly after that I started receiving reports about Hive being broken. Originally it was complaining that it could not find the LzoCodec. I never told it to use the LzoCodec. I did not change Hive configuration. After removing the entries in core-site.xml, the Hive problems continued but it now tells me it cannot find "com" - a nonsense class name. I did restart everything that needed to be restarted - several times, in fact. The only thing amiss in the Hive logs is the same traceback the user gets on a failed query: 2016-11-02 12:57:18,278 WARN [HiveServer2-Handler-Pool: Thread-2740]: thrift.ThriftCLIService (ThriftCLIService.java:FetchResults(681)) - Error fetching results: org.apache.hive.service.cli.HiveSQLException: java.io.IOException: java.lang.RuntimeException: Error in configuring object at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:352) at org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:221) at org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:685) at org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:454) at org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:672) at org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1553) at org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1538) at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.io.IOException: java.lang.RuntimeException: Error in configuring object at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:508) at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:415) at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:140) at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1672) at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:347) ... 13 more Caused by: java.lang.RuntimeException: Error in configuring object at org.apache.hive.common.util.ReflectionUtil.setJobConf(ReflectionUtil.java:115) at org.apache.hive.common.util.ReflectionUtil.setConf(ReflectionUtil.java:103) at org.apache.hive.common.util.ReflectionUtil.newInstance(ReflectionUtil.java:87) at org.apache.hadoop.hive.ql.exec.FetchOperator.getInputFormatFromCache(FetchOperator.java:207) at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextSplits(FetchOperator.java:361) at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:295) at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:446) ... 17 more Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.GeneratedMethodAccessor194.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hive.common.util.ReflectionUtil.setJobConf(ReflectionUtil.java:112) ... 23 more Caused by: java.lang.IllegalArgumentException: Compression codec com not found. at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:139) at org.apache.hadoop.io.compress.CompressionCodecFactory.<init>(CompressionCodecFactory.java:179) at org.apache.hadoop.mapred.TextInputFormat.configure(TextInputFormat.java:45) ... 27 more Caused by: java.lang.ClassNotFoundException: Class com not found at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101) at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:132) ... 29 more 2016-11-02 12:57:18,281 INFO [HiveServer2-Handler-Pool: Thread-2740]: exec.ListSinkOperator (Operator.java:close(613)) - 10800 finished. closing... We are running HDP-2.3.2 on Centos 6.7. I do not know where to start troubleshooting this, particularly since it's not deterministic. Only some queries are blowing up with no obvious common denominator across them. Again, we made no changes to Hive and my users have made no changes in the way they are querying it.

hirschs · ‎11-02-2016

@Timothy Spann And that is precisely what I had done - to the letter. If there was an option for installation from Ambari, it is not evident. Where exactly is this "wizard" you refer to? Your web documentation states clearly that Ambari neither installs nor configures LZO. The proximate issue is that Hive is totally broken now - even after removal of the two changes made to core-site.xml. Why is Hive even TRYING to use LZO? I did not configure that - I did not so much as touch Hive.

hirschs · ‎11-02-2016

After following the directions here (I'm on Linux, but could not locate the page pertinent to the Linux HDP): http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.2-Win/bk_HDP_Install_Win/content/LZOCompression.html All attempts at inserting into existing Hive tables (which are NOT setup for LZO compression) yield a long traceback featuring this : Caused by: java.lang.IllegalArgumentException: Compression codec com.hadoop.compression.LzoCodec not found. at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:139) at org.apache.hadoop.io.compress.CompressionCodecFactory.<init>(CompressionCodecFactory.java:179) at org.apache.hadoop.mapred.TextInputFormat.configure(TextInputFormat.java:45) ... 21 more Caused by: java.lang.ClassNotFoundException: Class com.hadoop.compression.LzoCodec not found at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101) at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:132) ... 23 more Why on earth is Hive even trying to use LZO? Very frustrating to find this level of fragility. Any way to get LZO to coexist with a functional Hive? Update: I removed any and all mention of LZO from core-site.xml and Hive is still blowing up while search for codecs. Looks like we now have a completely hosed cluster.

hirschs · ‎10-31-2016

I was missing com.hadoop.compression.lzo.LzopCodec in the compression codecs listing... Grrr. The error message proved to be utterly misleading.

hirschs · ‎10-28-2016

I'm trying to get LZO compression to work on our HDP 2.3.2 cluster and getting nowhere. Here's what I've done: - Installed the hadooplzo and hadoop-lzo-native RPMs - Made the documented changes to add the codec and the lzo class spec to core-site.xml When I try to run a job thusly: yarn jar /usr/hdp/2.3.2.0-2950/hadoop/lib/hadoop-lzo-0.6.0.2.3.2.0-2950.jar com.hadoop.compression.lzo.LzoIndexer /path/to/lzofiles It tells me: [hirschs@sees24-lin ~]$ yarn jar /usr/hdp/2.3.2.0-2950/hadoop/lib/hadoop-lzo-0.6.0.2.3.2.0-2950.jar com.hadoop.compression.lzo.LzoIndexer /xxxx/yyy 16/10/28 16:44:56 ERROR lzo.GPLNativeCodeLoader: Could not load native gpl library java.lang.UnsatisfiedLinkError: no gplcompression in java.library.path at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1886) at java.lang.Runtime.loadLibrary0(Runtime.java:849) at java.lang.System.loadLibrary(System.java:1088) at com.hadoop.compression.lzo.GPLNativeCodeLoader.<clinit>(GPLNativeCodeLoader.java:32) at com.hadoop.compression.lzo.LzoCodec.<clinit>(LzoCodec.java:71) at com.hadoop.compression.lzo.LzoIndexer.<init>(LzoIndexer.java:36) at com.hadoop.compression.lzo.LzoIndexer.main(LzoIndexer.java:134) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) 16/10/28 16:44:56 ERROR lzo.LzoCodec: Cannot load native-lzo without native-hadoop 16/10/28 16:44:57 INFO lzo.LzoIndexer: LZO Indexing directory /xxxxx/yyyyy... 16/10/28 16:44:57 INFO lzo.LzoIndexer: [INDEX] LZO Indexing file hdfs://correct_path_to_file, size 1.08 GB... 16/10/28 16:44:57 INFO compress.LzoCodec: Bridging org.apache.hadoop.io.compress.LzoCodec to com.hadoop.compression.lzo.LzoCodec. 16/10/28 16:44:57 ERROR lzo.LzoIndexer: Error indexing hdfs://correct_path_to_file java.io.IOException: Could not find codec for file hdfs://correct_path_to_file - you may need to add the LZO codec to your io.compression.codecs configuration in core-site.xml at com.hadoop.compression.lzo.LzoIndex.createIndex(LzoIndex.java:212) at com.hadoop.compression.lzo.LzoIndexer.indexSingleFile(LzoIndexer.java:117) at com.hadoop.compression.lzo.LzoIndexer.indexInternal(LzoIndexer.java:98) at com.hadoop.compression.lzo.LzoIndexer.indexInternal(LzoIndexer.java:86) at com.hadoop.compression.lzo.LzoIndexer.index(LzoIndexer.java:52) at com.hadoop.compression.lzo.LzoIndexer.main(LzoIndexer.java:137) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) I get the feeling I'm missing a step somewhere. The shared libraries appear to be in place: [hirschs@sees24-lin native]$ rpm -ql hadoop-lzo-native /usr/hdp/current/share/lzo/0.6.0/lib/native /usr/hdp/current/share/lzo/0.6.0/lib/native/Linux-amd64-64 /usr/hdp/current/share/lzo/0.6.0/lib/native/Linux-amd64-64/libgplcompression.a /usr/hdp/current/share/lzo/0.6.0/lib/native/Linux-amd64-64/libgplcompression.la /usr/hdp/current/share/lzo/0.6.0/lib/native/Linux-amd64-64/libgplcompression.so /usr/hdp/current/share/lzo/0.6.0/lib/native/Linux-amd64-64/libgplcompression.so.0 /usr/hdp/current/share/lzo/0.6.0/lib/native/Linux-amd64-64/libgplcompression.so.0.0.0 /usr/hdp/current/share/lzo/0.6.0/lib/native/docs In core-site.xml: <property> <name>io.compression.codecs</name> <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.SnappyCodec,com.hadoop.compression.lzo.LzoCodec</value> </property> In hdfs-site.xml: <property> <name>io.compression.codec.lzo.class</name> <value>com.hadoop.compression.lzo.LzoCodec</value> </property> What more do I need to do in order for this to run? Even a guess would be helpful at this point.

hirschs · ‎09-15-2016

@ripunjay godhani I want to be sure I understand your post. Are you saying that modifying a single Ambari property will relocate logs for all components on a restart? If so, can share what the name of that property is? The page you linked to does not have a single mention of log location. In a perfect world, I would have left plenty of room under /var for logging, but we have a heavily used cluster with a lot of data and constant crashes from full /var on many of the machines. I need to move everything to a new location.

hirschs · ‎08-01-2016

@Benjamin Leonhardi Thanks for the explanation. We're having significant scaling issues with our 32-host Hortonworks HDP-2.3.2.0-2950 installation. How do I determine which version of ATS is installed? I do not see it listed in the 'Stacks and Versions' page in Ambari. Assuming we're running one of the troublesome versions, what's the most expedient way to disable reporting to ATS from Hive? Since it is useful for debugging, I'm hoping there's a session parameter we can set at query time to suppress reporting when performance is an issue.

hirschs · ‎07-28-2016

I'm curious what the advantage of suppressing Hive ATS reporting might be. From an esthetic standpoint we really don't want the UI filled up with myriads of successful, short-running queries, but it would be nice to switch it on in a case by case basis for debugging purposes. Beyond that, would turning it off improve query latency?

Online	Offline
Last Visited	‎11-03-2016 03:14 PM

Member Since	‎02-11-2016 06:43 PM
Last Visited	‎11-03-2016 03:14 PM
Posts	53
Kudos received	21

Cloudera Community

Re: Installing LZO compression broke Hive complete...

Re: Unable to use lzo codec

Re: HBase shell throws exception at startup

Re: Installing LZO compression broke Hive complete...

Re: Installing LZO compression broke Hive complete...

Re: Installing LZO compression broke Hive complete...

Re: Installing LZO compression broke Hive complete...

Installing LZO compression broke Hive completely

Re: Unable to use lzo codec

Unable to use lzo codec

Re: can i change log location in HDP installation

Re: How many HIVE concurrent queries can be execut...

Re: How many HIVE concurrent queries can be execut...