Member since: 11-17-2015
Posts: 33
Kudos Received: 12
Solutions: 6
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2538 | 06-20-2017 02:10 PM |
| | 64041 | 08-26-2016 01:14 PM |
| | 1140 | 07-03-2016 06:10 AM |
| | 30357 | 05-05-2016 02:58 PM |
| | 1500 | 05-04-2016 08:00 PM |
08-31-2017
06:30 PM
We either need to figure out how to get MirrorMaker 0.9.0 to use the new client API, or get MirrorMaker 0.10.1 to produce messages in the 0.9.0-compatible message format.
08-31-2017
05:11 PM
Hi,
We need to use MirrorMaker to replicate data between two Kafka clusters running different versions:
Source (Kafka 0.10.1) -> MirrorMaker -> Target (Kafka 0.9.0)
We are able to get this to work running MirrorMaker 0.9.0 using the "old zookeeper" consumer type, but would like to know if this is possible with MirrorMaker 0.10.1. We would like to limit the exposure of the source ZK cluster, hence the need to use the new MirrorMaker 0.10.1 consumer. Are there any consumer/producer configs we can use in MirrorMaker to allow us to use version 0.10.1?
# Exception in MirrorMaker when running version 0.10.1
[2017-08-31 16:43:15,799] ERROR Uncaught error in kafka producer I/O thread: (org.apache.kafka.clients.producer.internals.Sender) org.apache.kafka.common.protocol.types.SchemaException:
Error reading field 'brokers': Error reading field 'host': Error reading string
of length 26992, only 2176 bytes
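For reference, here is roughly how we are invoking MirrorMaker 0.10.1 (the host names, property file names, and topic are placeholders):
# consumer.properties points at the 0.10.1 source cluster:
#   bootstrap.servers=source-broker1:9092
#   group.id=mirrormaker
# producer.properties points at the 0.9.0 target cluster:
#   bootstrap.servers=target-broker1:9092
kafka-mirror-maker.sh --consumer.config consumer.properties --producer.config producer.properties --whitelist 'my-topic' --num.streams 2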
Thanks,
Jon
Labels:
- Apache Kafka
06-27-2017
04:30 PM
1 Kudo
We need to parse a real-time feed of XML documents. Using Storm, what is the best approach to processing a real-time XML feed in an input parser bolt? JAXB? What are the best practices, both good and bad, and the pros and cons of each approach?
Kinesis > Input XML Parser Bolt > Other Bolts
Labels:
- Apache Storm
06-20-2017
02:10 PM
Here are the final hive configs that seem to have fixed this issue. It seems to be related to timeouts.
set hive.execution.engine=mr;
set hive.default.fileformat=Orc;
set hive.exec.orc.default.compress=SNAPPY;
set hive.exec.copyfile.maxsize=1099511627776;
set hive.warehouse.subdir.inherit.perms=false;
set hive.metastore.pre.event.listeners=;
set hive.stats.fetch.partition.stats=false;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.dynamic.partition=true;
set fs.trash.interval=0;
set fs.s3.buffer.dir=/tmp/s3a;
set fs.s3a.attempts.maximum=50;
set fs.s3a.connection.establish.timeout=120000;
set fs.s3a.connection.timeout=120000;
set fs.s3a.fast.upload=true;
set fs.s3a.fast.upload.buffer=disk;
set fs.s3a.multiobjectdelete.enable=true;
set fs.s3a.max.total.tasks=2000;
set fs.s3a.threads.core=30;
set fs.s3a.threads.max=512;
set fs.s3a.connection.maximum=30;
set fs.s3a.fast.upload.active.blocks=12;
set fs.s3a.threads.keepalivetime=120;
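If you'd rather not paste these into every session, the same overrides can also be passed on the command line (the .hql file name here is just a placeholder):
hive --hiveconf fs.s3a.connection.timeout=120000 --hiveconf fs.s3a.attempts.maximum=50 -f load_to_s3.hql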
06-13-2017
03:35 PM
This seems to be random. Sometimes we see this error; if we run it again, it succeeds. Not sure why we're seeing it, though. Here are the hive properties we're using:
set hive.execution.engine=mr;
set hive.default.fileformat=Orc;
set hive.exec.orc.default.compress=SNAPPY;
set fs.s3a.attempts.maximum=50;
set fs.s3a.connection.establish.timeout=30000;
set fs.s3a.connection.timeout=30000;
set fs.s3a.fast.upload=true;
set fs.s3a.fast.upload.buffer=disk;
set fs.s3n.multipart.uploads.enabled=true;
set fs.s3a.threads.keepalivetime=60;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.dynamic.partition=true;
We're running HDP 2.4.2 (HDP-2.4.2.0-258).
06-13-2017
02:38 PM
We are using Hive to load data to S3 (using s3a). We've started seeing the following error: 2017-06-13 08:51:49,042 ERROR [main]: exec.Task (SessionState.java:printError(962)) - Failed with exception Unable to unmarshall response (Failed to parse XML document with handler class com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser$CopyObjectResultHandler). Response Code: 200, Response Text: OK com.amazonaws.AmazonClientException: Unable to unmarshall response (Failed to parse XML document with handler class com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser$CopyObjectResultHandler). Response Code: 200, Response Text: OK at com.amazonaws.http.AmazonHttpClient.handleResponse(AmazonHttpClient.java:738) at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:399) at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:232) at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3528) at com.amazonaws.services.s3.AmazonS3Client.copyObject(AmazonS3Client.java:1507) at com.amazonaws.services.s3.transfer.internal.CopyCallable.copyInOneChunk(CopyCallable.java:143) at com.amazonaws.services.s3.transfer.internal.CopyCallable.call(CopyCallable.java:131) at com.amazonaws.services.s3.transfer.internal.CopyMonitor.copy(CopyMonitor.java:189) at com.amazonaws.services.s3.transfer.internal.CopyMonitor.call(CopyMonitor.java:134) at com.amazonaws.services.s3.transfer.internal.CopyMonitor.call(CopyMonitor.java:46) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: com.amazonaws.AmazonClientException: Failed to parse XML document with handler class com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser$CopyObjectResultHandler at com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser.parseXmlInputStream(XmlResponsesSaxParser.java:150) at com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser.parseCopyObjectResponse(XmlResponsesSaxParser.java:417) at com.amazonaws.services.s3.model.transform.Unmarshallers$CopyObjectUnmarshaller.unmarshall(Unmarshallers.java:192) at com.amazonaws.services.s3.model.transform.Unmarshallers$CopyObjectUnmarshaller.unmarshall(Unmarshallers.java:189) at com.amazonaws.services.s3.internal.S3XmlResponseHandler.handle(S3XmlResponseHandler.java:62) at com.amazonaws.services.s3.internal.ResponseHeaderHandlerChain.handle(ResponseHeaderHandlerChain.java:44) at com.amazonaws.services.s3.internal.ResponseHeaderHandlerChain.handle(ResponseHeaderHandlerChain.java:30) at com.amazonaws.http.AmazonHttpClient.handleResponse(AmazonHttpClient.java:712) ... 
13 more Caused by: java.net.SocketTimeoutException: Read timed out at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) at java.net.SocketInputStream.read(SocketInputStream.java:170) at java.net.SocketInputStream.read(SocketInputStream.java:141) at sun.security.ssl.InputRecord.readFully(InputRecord.java:465) at sun.security.ssl.InputRecord.read(InputRecord.java:503) at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:973) at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:930) at sun.security.ssl.AppInputStream.read(AppInputStream.java:105) at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:166) at org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:90) at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:281) at org.apache.http.impl.io.ChunkedInputStream.getChunkSize(ChunkedInputStream.java:251) at org.apache.http.impl.io.ChunkedInputStream.nextChunk(ChunkedInputStream.java:209) at org.apache.http.impl.io.ChunkedInputStream.read(ChunkedInputStream.java:171) at org.apache.http.conn.EofSensorInputStream.read(EofSensorInputStream.java:138) at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284) at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326) at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178) at java.io.InputStreamReader.read(InputStreamReader.java:184) at java.io.BufferedReader.fill(BufferedReader.java:161) at java.io.BufferedReader.read1(BufferedReader.java:212) at java.io.BufferedReader.read(BufferedReader.java:286) at org.apache.xerces.impl.XMLEntityScanner.load(Unknown Source) at org.apache.xerces.impl.XMLEntityScanner.skipSpaces(Unknown Source) at org.apache.xerces.impl.XMLDocumentScannerImpl$PrologDispatcher.dispatch(Unknown Source) at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source) at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source) at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source) at org.apache.xerces.parsers.XMLParser.parse(Unknown Source) at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source) at com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser.parseXmlInputStream(XmlResponsesSaxParser.java:141) ... 20 more Anyone else seen this before? Is it a data size/length issue? Loading too much data at once? Timeout?
Labels:
- Apache Hive
02-21-2017
03:39 PM
@Tony Bolt After you do the downconfig, do your configs look correct? If you're not upconfig'ing them to the correct location in ZK, solr won't see the correct version of your configs. Also, check in the ZK CLI to make sure you're using the right znode. If your znode isn't /solr, you'll need to adjust the above commands accordingly, and make sure solr is looking in the right znode. I believe my znode was /solr and my configs were in /solr/configs.
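For example, with the ZooKeeper CLI (the path assumes a standard HDP client layout; adjust the server and chroot to match your cluster):
/usr/hdp/current/zookeeper-client/bin/zkCli.sh -server zk1.example.com:2181
ls /solr/configs
ls /solr/configs/collection
get /solr/configs/collection/solrconfig.xml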
01-13-2017
11:58 PM
Storm version: 0.10.0.2.4, using a Kafka spout.
How does Storm handle failed tuples? How many times will Storm retry a failed tuple? At what frequency will Storm retry the failed tuple? What is the maximum tuple count a topology can handle across all spouts and bolts?
Labels:
- Apache Kafka
- Apache Storm
12-01-2016
05:03 PM
The user's saved queries weren't in this table, which explains why they aren't seeing them. I opened one of our nightly pg dumps, pulled the user's query file locations from the ds_savedquery_* table, cat'd them from HDFS, and sent the output to the user.
cat hdfs_files.out
/user/xxxxxxx/hive/jobs/hive-job-813-2016-07-28_11-46/query.hql
/user/xxxxxxx/hive/jobs/hive-job-1952-2016-10-18_09-31/query.hql
...
for f in `cat hdfs_files.out`; do
  hdfs dfs -cat $f >> saved_queries.hql
  echo >> saved_queries.hql
  echo >> saved_queries.hql
done
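For reference, hdfs_files.out came from the saved-query table. If you restore the dump somewhere, the equivalent lookup would be roughly the following (the table suffix and the ds_queryfile/ds_owner column names are assumptions based on the ds_jobimpl_* layout):
psql ambari -t -A -c "select ds_queryfile from ds_savedquery_6 where ds_owner = 'xxxxxxx';" > hdfs_files.out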
Thanks @jss for your help with this.
12-01-2016
03:41 PM
Thanks @jss! Which column in ds_jobimpl_* indicates that the query is a saved one? The user doesn't have 2,100+ saved queries; this looks more like job history.
select count(*) from ds_jobimpl_6 where ds_owner = 'xxxxxxx';
2182
\d ds_jobimpl_6
ds_id | character varying(255) | not null
ds_applicationid | character varying(3000) |
ds_conffile | character varying(3000) |
ds_dagid | character varying(3000) |
ds_dagname | character varying(3000) |
ds_database | character varying(3000) |
ds_datesubmitted | bigint |
ds_duration | bigint |
ds_forcedcontent | character varying(3000) |
ds_globalsettings | character varying(3000) |
ds_logfile | character varying(3000) |
ds_owner | character varying(3000) |
ds_queryfile | character varying(3000) |
ds_queryid | character varying(3000) |
ds_referrer | character varying(3000) |
ds_sessiontag | character varying(3000) |
ds_sqlstate | character varying(3000) |
ds_status | character varying(3000) |
ds_statusdir | character varying(3000) |
ds_statusmessage | character varying(3000) |
ds_title | character varying(3000) |
hdfs dfs -find /user/xxxxxxx -name "*.hql" | wc -l
2546
12-01-2016
03:08 PM
Also, is there a difference in location between ambari-only users and ambari/linux users? (Still referring to the Hive view.)
linux user = the user has an account on the Linux box
ambari user = the user has an Ambari account
We have a user that only had an Ambari account. They seem to have lost their queries after we created a Linux account for them. Both the Ambari and Linux account names are the same.
12-01-2016
03:00 PM
1 Kudo
In the Ambari Hive view, there is a "Saved Queries" tab, where are these queries saved? Are they in the Ambari DB? A local file on the Ambari node? HDFS?
Labels:
- Apache Ambari
- Apache Hive
08-26-2016
01:14 PM
1 Kudo
Hi @Simran Kaur,
You can run the following command to execute an HQL file in hive:
hive -f filename.hql
Be sure each of the queries in your HQL file is terminated with a semicolon (;). Here's the help output of hive too:
$ hive -H
WARNING: Use "yarn jar" to launch YARN applications.
usage: hive
-d,--define <key=value> Variable subsitution to apply to hive
commands. e.g. -d A=B or --define A=B
--database <databasename> Specify the database to use
-e <quoted-query-string> SQL from command line
-f <filename> SQL from files
-H,--help Print help information
--hiveconf <property=value> Use value for given property
--hivevar <key=value> Variable subsitution to apply to hive
commands. e.g. --hivevar A=B
-i <filename> Initialization SQL file
-S,--silent Silent mode in interactive shell
-v,--verbose Verbose mode (echo executed SQL to the
console)
And here is great documentation for all hive cli options: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Cli
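As a quick illustration (the file, database, and table names below are made up), here is a small HQL file and a couple of ways to run it:
-- queries.hql
USE my_db;
SELECT count(*) FROM my_table;
SELECT * FROM my_table WHERE dt = '${hivevar:DT}' LIMIT 10;

hive -f queries.hql
hive --hivevar DT=2016-08-26 -f queries.hql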
08-15-2016
09:06 PM
1 Kudo
After setting this property to false, we are no longer seeing extremely high open-file counts for the hive user account.
hive.server2.logging.operation.enabled=false
Any idea what this feature is doing to consume so many open files?
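In case it helps anyone else, a quick way to watch the open operation-log handles over time is plain lsof/watch (nothing Hive-specific here):
watch -n 60 'lsof -u hive | grep operation_logs | wc -l'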
08-04-2016
01:06 PM
Thanks @Hajime for the suggestion. netstat looks OK; there aren't too many connections open.
08-02-2016
06:48 PM
Thanks @Benjamin Leonhardi. That's what I am wondering as well. Something doesn't seem right. Can you point me to the ATS issue you are referring to?
08-02-2016
06:47 PM
Thanks @Scott Shaw. We have found that the Ambari-managed per-user limits file (/etc/security/limits.d/hive.conf) is being used instead of the limits.conf file. Changing the hive.conf file manually gets overwritten once ambari sees that it has changed. But the real question is: should hive really have 64k+ files open?
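For anyone checking what the hive user actually gets at runtime, the effective limits can be verified with standard tools (the pgrep pattern is just a guess at how your HiveServer2 process is named):
su - hive -s /bin/bash -c 'ulimit -Sn'    # soft nofile limit
su - hive -s /bin/bash -c 'ulimit -Hn'    # hard nofile limit
cat /proc/$(pgrep -f hiveserver2 | head -1)/limits    # limits of the running process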
07-30-2016
06:56 PM
1 Kudo
By default, hive's ulimit (managed through ambari) is 32000. We reached that limit last week on our hiveserver2 server and decided to increase this value in ambari to 64000. We just hit the 64k nofile ulimit. This leads me to believe that hiveserver2 is not cleaning up connections like it should and files are not being released. Has anyone else experienced this issue? Any suggestions on what to check? What do you have your hive user's nofile limit set to?
# lsof -u hive | wc -l
64450
We are on HDP 2.4.2, Ambari 2.2.2. Should hive really have that many files open?
Update: We're approaching the 64k nofile ulimit setting again for the hive user.
# lsof -u hive | wc -l
57090
After digging through the output of lsof, I see a lot of temporary operation_logs, for example:
/tmp/hive/operation_logs/658c3930-8975-47db-ad7f-7cbef6279b11/acc2043a-d3bb-4a8c-9a7d-d0b743b9ce5d
Here is the total number of operation_logs files open right now:
# lsof -u hive | grep operation_logs | wc -l
56102
These files are 3 to 4 days old.
Labels:
- Apache Hive
07-03-2016
06:17 AM
Hi @Sunile Manjee, thank you for your response. This is the documentation I followed to set up this environment: https://doc.lucidworks.com/lucidworks-hdpsearch/2.3/Guide-Install.html I will be testing performance against HDFS indexing with an NRT setup. I have local SSD disks set up as a fallback if this isn't fast enough or is too unreliable. Thanks, Jon
07-03-2016
06:10 AM
After more digging, I discovered the solrconfig.xml in ZK was not the correct version. I did a series of downconfig and upconfig operations to load the correct configs and verify everything is OK. After loading the correct solrconfig.xml and restarting each solr node, the create collection command succeeded.
/opt/lucidworks-hdpsearch/solr/server/scripts/cloud-scripts/zkcli.sh -cmd downconfig -d collection -z $zk_quorum:2181/solr -n collection
/opt/lucidworks-hdpsearch/solr/server/scripts/cloud-scripts/zkcli.sh -cmd upconfig -d $path_to_configs -z $zk_quorum:2181/solr -n collection
07-02-2016
06:14 AM
Hello, I am trying to set up and configure HDPSearch. I have 4 solr boxes running 6 instances of solr. I have set up HDFS with NN HA. All 4 boxes can successfully reach HDFS using the NN HA name. However, I am receiving the error below when trying to create a collection in solr. What is solr missing that prevents it from connecting to HDFS?
126330 ERROR (qtp59559151-22) [c:collection s:shard23 r:core_node86 x:collection_shard23_replica3] o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: Error CREATEing SolrCore 'collection_shard23_replica3': Unable to create core [collection_shard23_replica3] Caused by: NN_HA_Name.
.. 31 more
Caused by: java.net.UnknownHostException: NN_HA_Name
... 45 more
Here is the command to start solr cloud:
solr -c -p 8983 -z $zk_quorum:2181/solr -Dsolr.directoryFactory=HdfsDirectoryFactory -Dsolr.lock.type=hdfs -Dsolr.hdfs.home=hdfs://NN_HA_Name/apps/solr
Here is the command to create the collection:
solr create -c collection -d collection -n collection -s 48 -rf 3
Here are my solrconfig.xml DirectoryFactory Settings:
<directoryFactory name="DirectoryFactory" class="solr.HdfsDirectoryFactory">
<str name="solr.hdfs.home">hdfs://NN_HA_Name/apps/solr</str>
<str name="solr.hdfs.confdir">/etc/hadoop/conf</str>
<bool name="solr.hdfs.blockcache.enabled">true</bool>
<int name="solr.hdfs.blockcache.slab.count">1</int>
<bool name="solr.hdfs.blockcache.direct.memory.allocation">true</bool>
<int name="solr.hdfs.blockcache.blocksperbank">16384</int>
<bool name="solr.hdfs.blockcache.read.enabled">true</bool>
<bool name="solr.hdfs.nrtcachingdirectory.enable">true</bool>
<int name="solr.hdfs.nrtcachingdirectory.maxmergesizemb">16</int>
<int name="solr.hdfs.nrtcachingdirectory.maxcachedmb">192</int>
</directoryFactory>
I have installed the hdfs clients on the solr nodes and can successfully run:
hdfs dfs -ls hdfs://NN_HA_Name/apps/solr
I also see core-site.xml and hdfs-site.xml (with the correct NN configurations) in the /etc/hadoop/conf directory. Thanks,
Jon
Labels:
- Apache Solr
05-05-2016
02:58 PM
2 Kudos
@Venkat ramanann On your Postgres server, you will need to update your pg_hba.conf file to allow access for the ambari user on the ambari database coming from 127.0.0.1. Here is the location of our pg_hba.conf file: /data/pghadoop/pg_hba.conf
If it's not there, run: find / -name pg_hba.conf
Here is the Postgres documentation for configuring this file: http://www.postgresql.org/docs/9.5/static/auth-pg-hba-conf.html
This is what our pg_hba.conf file has for the ambari user:
local all ambari trust
host all ambari 0.0.0.0/0 trust
host all ambari ::/0 trust
Once you have made these changes, you will need to restart the Postgres server: /etc/init.d/postgresql restart
Let me know if you have any other questions.
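As a side note, if you'd rather not open it up to every host, a single loopback-only entry for the ambari database is enough when Ambari runs on the same box as Postgres (md5 assumes the ambari DB user has a password set):
host    ambari    ambari    127.0.0.1/32    md5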
05-04-2016
08:21 PM
1 Kudo
@Predrag Minovic, can you explain why Kafka needs its own Zk quorum? Why can't it utilize an existing Zk quorum? We are migrating to Kafka in production and I would like to get your take on this.
05-04-2016
08:00 PM
@Shankar B On your ambari server, edit the nagios config file in this location:
/var/lib/ambari-server/resources/stacks/HDP/2.0.6/services/NAGIOS/package/templates/
This is the location where ambari stores the configuration files for nagios. Once you have edited the file, restart the ambari-server process. This will push your new nagios changes to your nagios server. Then in ambari, restart the nagios service. You should now see your changes in the Nagios Admin UI. We had a similar case where we needed more than one contact in nagios. Updating the contacts.cfg.j2 on the ambari server pushed the new contacts into the nagios config and everything worked like a charm.
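For illustration (the contact name and email address are made up, and the exact directives depend on the templates your configs already define), the kind of contact block that goes into contacts.cfg.j2 looks like this:
define contact{
        contact_name                    ops_oncall
        alias                           Ops On-Call
        service_notification_period     24x7
        host_notification_period        24x7
        service_notification_options    w,u,c,r
        host_notification_options       d,u,r
        service_notification_commands   notify-service-by-email
        host_notification_commands      notify-host-by-email
        email                           ops-oncall@example.com
        }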
05-04-2016
07:42 PM
1 Kudo
@Michel Brown, here is the HDP documentation for using non-default databases in Ambari. These are for Ambari 2.2.1.0:
- Using Non-Default Databases - Ambari <- for the Ambari server
- Using Non-Default Databases <- for the other HDP services
We currently use Postgres as our backend DB for all of our services.
05-04-2016
07:19 PM
@Sami Ahmad, The permissions on your /user/hadoop HDFS directory are incorrect. You need to run this command to change the ownership of the HDFS directory:
hdfs dfs -chown hadoop:hdfs /user/hadoop
The error message shows you the current permissions on the /user/hadoop HDFS directory along with the type of access the process is trying to perform. Changing the /user/hadoop HDFS directory's owner using the above command should get you going again.
ERROR : Failed to execute tez graph. org.apache.hadoop.security.AccessControlException: Permission denied: user=hadoop, access=WRITE, inode="/user/hadoop":hdfs:hdfs:drwxr-xr-x
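To confirm the change took effect, you can run:
hdfs dfs -ls -d /user/hadoop
and the listing should now show hadoop:hdfs as the owner and group.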