Member since 11-17-2015
33 Posts
12 Kudos Received
6 Solutions
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 4724 | 06-20-2017 02:10 PM |
 | 83771 | 08-26-2016 01:14 PM |
 | 2690 | 07-03-2016 06:10 AM |
 | 37631 | 05-05-2016 02:58 PM |
 | 3203 | 05-04-2016 08:00 PM |
08-31-2017
06:30 PM
We need to either figure out how to get MirrorMaker 0.9.0 to use the new client API, or get MirrorMaker 0.10.1 to use the 0.9.0-compatible message format.
08-31-2017
05:11 PM
Hi,

We need to use MirrorMaker to replicate data between two Kafka clusters running different versions:

Source ------------------------------> Target
Kafka 0.10.1 -> MirrorMaker -> Kafka 0.9.0

We are able to get this to work running MirrorMaker 0.9.0 with the old ZooKeeper-based consumer, but would like to know whether it is possible with MirrorMaker 0.10.1. We would like to limit the exposure of the source ZK cluster, hence the need to use the new consumer in MirrorMaker 0.10.1. Are there any consumer/producer configs we can use in MirrorMaker that would allow us to run version 0.10.1?

# Exception in MirrorMaker when running version 0.10.1
[2017-08-31 16:43:15,799] ERROR Uncaught error in kafka producer I/O thread: (org.apache.kafka.clients.producer.internals.Sender)
org.apache.kafka.common.protocol.types.SchemaException: Error reading field 'brokers': Error reading field 'host': Error reading string of length 26992, only 2176 bytes

Thanks,
Jon
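
For reference, the first option from the follow-up (MirrorMaker 0.9.0 with the new consumer, so the source ZooKeeper is never contacted) might look roughly like the sketch below. This is untested against these clusters. The broker hostnames, group id, working directory, and whitelist are placeholders, and `--new.consumer` is the switch I understand the 0.9.0 MirrorMaker to use for the new consumer API; please confirm it with `bin/kafka-mirror-maker.sh --help` on your build before relying on it.

```shell
#!/bin/sh
# Sketch: write minimal MirrorMaker configs and print the launch command.
# All hostnames, the group id, and the paths are placeholders (assumptions),
# not values from the original post.
workdir="${MM_WORKDIR:-/tmp/mirrormaker-sketch}"
mkdir -p "$workdir"

# New-consumer config: points at the source brokers, not at ZooKeeper.
cat > "$workdir/consumer.properties" <<'EOF'
bootstrap.servers=source-broker1:9092
group.id=mirrormaker-example
EOF

# Producer config: points at the target (0.9.0) brokers.
cat > "$workdir/producer.properties" <<'EOF'
bootstrap.servers=target-broker1:9092
EOF

# Print the command instead of running it, so it can be reviewed first.
mm_cmd="bin/kafka-mirror-maker.sh --new.consumer --consumer.config $workdir/consumer.properties --producer.config $workdir/producer.properties --whitelist '.*'"
echo "$mm_cmd"
```

The command is only printed, not executed, so the generated property files can be inspected and adjusted before starting MirrorMaker by hand.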
Labels: Apache Kafka
06-20-2017
02:10 PM
Here are the final Hive configs that seem to have fixed this issue. It seems to be related to timeouts.
set hive.execution.engine=mr;
set hive.default.fileformat=Orc;
set hive.exec.orc.default.compress=SNAPPY;
set hive.exec.copyfile.maxsize=1099511627776;
set hive.warehouse.subdir.inherit.perms=false;
set hive.metastore.pre.event.listeners=;
set hive.stats.fetch.partition.stats=false;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.dynamic.partition=true;
set fs.trash.interval=0;
set fs.s3.buffer.dir=/tmp/s3a;
set fs.s3a.attempts.maximum=50;
set fs.s3a.connection.establish.timeout=120000;
set fs.s3a.connection.timeout=120000;
set fs.s3a.fast.upload=true;
set fs.s3a.fast.upload.buffer=disk;
set fs.s3a.multiobjectdelete.enable=true;
set fs.s3a.max.total.tasks=2000;
set fs.s3a.threads.core=30;
set fs.s3a.threads.max=512;
set fs.s3a.connection.maximum=30;
set fs.s3a.fast.upload.active.blocks=12;
set fs.s3a.threads.keepalivetime=120;
06-13-2017
03:35 PM
This seems to be random. Sometimes we see this error, and if we run the job again it succeeds. We're not sure why we're seeing it. Here are the Hive properties we're using:
set hive.execution.engine=mr;
set hive.default.fileformat=Orc;
set hive.exec.orc.default.compress=SNAPPY;
set fs.s3a.attempts.maximum=50;
set fs.s3a.connection.establish.timeout=30000;
set fs.s3a.connection.timeout=30000;
set fs.s3a.fast.upload=true;
set fs.s3a.fast.upload.buffer=disk;
set fs.s3n.multipart.uploads.enabled=true;
set fs.s3a.threads.keepalivetime=60;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.dynamic.partition=true;
We're running HDP 2.4.2 (HDP-2.4.2.0-258).
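
Since a rerun usually succeeds, one stopgap is to wrap the load in a retry loop. The following is a minimal sketch, assuming the load is safe to run more than once (for example an INSERT OVERWRITE). The `retry` helper and the `load_to_s3.hql` file name are hypothetical and are not part of Hive or HDP.

```shell
#!/bin/sh
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it exits 0 or MAX_ATTEMPTS runs have failed.
retry() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  while true; do
    "$@" && return 0
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in ${delay}s" >&2
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Hypothetical usage (load_to_s3.hql is a placeholder script name):
# retry 3 60 hive -f load_to_s3.hql
```

This only hides the intermittent failure; it does not explain the read timeout itself.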
06-13-2017
02:38 PM
We are using Hive to load data to S3 (using s3a). We've started seeing the following error:

2017-06-13 08:51:49,042 ERROR [main]: exec.Task (SessionState.java:printError(962)) - Failed with exception Unable to unmarshall response (Failed to parse XML document with handler class com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser$CopyObjectResultHandler). Response Code: 200, Response Text: OK
com.amazonaws.AmazonClientException: Unable to unmarshall response (Failed to parse XML document with handler class com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser$CopyObjectResultHandler). Response Code: 200, Response Text: OK
    at com.amazonaws.http.AmazonHttpClient.handleResponse(AmazonHttpClient.java:738)
    at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:399)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:232)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3528)
    at com.amazonaws.services.s3.AmazonS3Client.copyObject(AmazonS3Client.java:1507)
    at com.amazonaws.services.s3.transfer.internal.CopyCallable.copyInOneChunk(CopyCallable.java:143)
    at com.amazonaws.services.s3.transfer.internal.CopyCallable.call(CopyCallable.java:131)
    at com.amazonaws.services.s3.transfer.internal.CopyMonitor.copy(CopyMonitor.java:189)
    at com.amazonaws.services.s3.transfer.internal.CopyMonitor.call(CopyMonitor.java:134)
    at com.amazonaws.services.s3.transfer.internal.CopyMonitor.call(CopyMonitor.java:46)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: com.amazonaws.AmazonClientException: Failed to parse XML document with handler class com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser$CopyObjectResultHandler
    at com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser.parseXmlInputStream(XmlResponsesSaxParser.java:150)
    at com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser.parseCopyObjectResponse(XmlResponsesSaxParser.java:417)
    at com.amazonaws.services.s3.model.transform.Unmarshallers$CopyObjectUnmarshaller.unmarshall(Unmarshallers.java:192)
    at com.amazonaws.services.s3.model.transform.Unmarshallers$CopyObjectUnmarshaller.unmarshall(Unmarshallers.java:189)
    at com.amazonaws.services.s3.internal.S3XmlResponseHandler.handle(S3XmlResponseHandler.java:62)
    at com.amazonaws.services.s3.internal.ResponseHeaderHandlerChain.handle(ResponseHeaderHandlerChain.java:44)
    at com.amazonaws.services.s3.internal.ResponseHeaderHandlerChain.handle(ResponseHeaderHandlerChain.java:30)
    at com.amazonaws.http.AmazonHttpClient.handleResponse(AmazonHttpClient.java:712)
    ... 13 more
Caused by: java.net.SocketTimeoutException: Read timed out
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
    at java.net.SocketInputStream.read(SocketInputStream.java:170)
    at java.net.SocketInputStream.read(SocketInputStream.java:141)
    at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
    at sun.security.ssl.InputRecord.read(InputRecord.java:503)
    at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:973)
    at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:930)
    at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
    at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:166)
    at org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:90)
    at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:281)
    at org.apache.http.impl.io.ChunkedInputStream.getChunkSize(ChunkedInputStream.java:251)
    at org.apache.http.impl.io.ChunkedInputStream.nextChunk(ChunkedInputStream.java:209)
    at org.apache.http.impl.io.ChunkedInputStream.read(ChunkedInputStream.java:171)
    at org.apache.http.conn.EofSensorInputStream.read(EofSensorInputStream.java:138)
    at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
    at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
    at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
    at java.io.InputStreamReader.read(InputStreamReader.java:184)
    at java.io.BufferedReader.fill(BufferedReader.java:161)
    at java.io.BufferedReader.read1(BufferedReader.java:212)
    at java.io.BufferedReader.read(BufferedReader.java:286)
    at org.apache.xerces.impl.XMLEntityScanner.load(Unknown Source)
    at org.apache.xerces.impl.XMLEntityScanner.skipSpaces(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentScannerImpl$PrologDispatcher.dispatch(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
    at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
    at com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser.parseXmlInputStream(XmlResponsesSaxParser.java:141)
    ... 20 more

Has anyone else seen this before? Is it a data size/length issue? Are we loading too much data at once? Is it a timeout?
Labels: Apache Hive
02-21-2017
03:39 PM
@Tony Bolt After you do the downconfig, do your configs look correct? If you're not upconfig'ing them to the correct location in ZK, Solr won't see the correct version of your configs. Also, check in the ZK CLI to make sure you're using the right znode. If your znode isn't /solr, you'll need to adjust the above commands accordingly. And make sure Solr is looking in the right znode. I believe my znode was /solr and my configs were in /solr/configs.
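
To make the znode point concrete, here is a rough sketch of how the ZooKeeper chroot ends up in the `-zkhost` argument of Solr's `zkcli.sh` for both downconfig and upconfig. The script path, ZooKeeper host, config name, and config directory are all placeholders; the `zk_config_cmd` helper is hypothetical and only prints the commands so they can be checked before running.

```shell
#!/bin/sh
# Sketch only: every value below is a placeholder, not taken from the thread.
ZKCLI="${ZKCLI:-server/scripts/cloud-scripts/zkcli.sh}"
ZK_HOST="${ZK_HOST:-zk1.example.com:2181}"
ZNODE="${ZNODE:-/solr}"                      # chroot znode that Solr is using
CONF_NAME="${CONF_NAME:-my_collection_conf}"
CONF_DIR="${CONF_DIR:-/tmp/my_collection_conf}"

# Print a zkcli.sh invocation for the given command (downconfig or upconfig).
# The znode is appended to the host so both commands use the same chroot.
zk_config_cmd() {
  echo "$ZKCLI -zkhost ${ZK_HOST}${ZNODE} -cmd $1 -confdir $CONF_DIR -confname $CONF_NAME"
}

zk_config_cmd downconfig
zk_config_cmd upconfig
```

With a chroot of /solr, the configs land under /solr/configs/<confname>, which matches the layout described above.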
01-13-2017
11:58 PM
Storm version: 0.10.0.2.4, using a Kafka spout. How does Storm handle failed tuples? How many times will Storm retry a failed tuple? How frequently will Storm retry a failed tuple? What is the maximum tuple count a topology can handle across all spouts and bolts?
Labels: Apache Kafka, Apache Storm
12-01-2016
05:03 PM
The user's saved queries weren't in this table, which explains why they aren't seeing them. I opened one of our nightly pg dumps, pulled the user's query file locations from the ds_savedquery_* table, cat'd the files from HDFS, and sent the output to the user.
cat hdfs_files.out
/user/xxxxxxx/hive/jobs/hive-job-813-2016-07-28_11-46/query.hql
/user/xxxxxxx/hive/jobs/hive-job-1952-2016-10-18_09-31/query.hql
...
# Append each saved query to one file, separated by two blank lines
for f in $(cat hdfs_files.out); do
  hdfs dfs -cat "$f" >> saved_queries.hql
  echo >> saved_queries.hql
  echo >> saved_queries.hql
done
Thanks @jss for your help with this.
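
If anyone needs to repeat this for more users, a slightly more defensive variant of the loop is sketched below. It reads the list line by line, skips blank lines, and labels each query with its source path. The `collect_queries` function name is made up for illustration; the file names are the same ones used above.

```shell
#!/bin/sh
# Sketch: collect the contents of every HDFS path listed in a file.
# Usage: collect_queries LIST_FILE OUTPUT_FILE
collect_queries() {
  list_file=$1
  out_file=$2
  : > "$out_file"                                  # start with an empty output file
  while IFS= read -r path; do
    [ -n "$path" ] || continue                     # skip blank lines in the list
    printf -- '-- %s\n' "$path" >> "$out_file"     # HQL comment naming the source file
    # </dev/null keeps the command from consuming the rest of the list on stdin
    hdfs dfs -cat "$path" < /dev/null >> "$out_file" || echo "could not read $path" >&2
    printf '\n\n' >> "$out_file"                   # two blank lines between queries
  done < "$list_file"
}

# Example call with the files from this post:
# collect_queries hdfs_files.out saved_queries.hql
```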
12-01-2016
03:41 PM
Thanks @jss! Which column in ds_jobimpl_* indicates that a query is a saved one? The user doesn't have 2100+ saved queries. This looks more like job history.
select count(*) from ds_jobimpl_6 where ds_owner = 'xxxxxxx';
2182
\d ds_jobimpl_6
ds_id | character varying(255) | not null
ds_applicationid | character varying(3000) |
ds_conffile | character varying(3000) |
ds_dagid | character varying(3000) |
ds_dagname | character varying(3000) |
ds_database | character varying(3000) |
ds_datesubmitted | bigint |
ds_duration | bigint |
ds_forcedcontent | character varying(3000) |
ds_globalsettings | character varying(3000) |
ds_logfile | character varying(3000) |
ds_owner | character varying(3000) |
ds_queryfile | character varying(3000) |
ds_queryid | character varying(3000) |
ds_referrer | character varying(3000) |
ds_sessiontag | character varying(3000) |
ds_sqlstate | character varying(3000) |
ds_status | character varying(3000) |
ds_statusdir | character varying(3000) |
ds_statusmessage | character varying(3000) |
ds_title | character varying(3000) |
hdfs dfs -find /user/xxxxxxx -name '*.hql' | wc -l
2546
12-01-2016
03:08 PM
Also, is there a difference in location between Ambari-only users and Ambari/Linux users? (Still referring to the Hive view.)
Linux user = the user has an account on the Linux box
Ambari user = the user has an Ambari account
We have a user who only had an Ambari account. They seem to have lost their queries after we created a Linux account for them. The Ambari and Linux account names are the same.