Member since: 11-17-2015
Posts: 33
Kudos Received: 12
Solutions: 6
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 4575 | 06-20-2017 02:10 PM |
| | 83009 | 08-26-2016 01:14 PM |
| | 2625 | 07-03-2016 06:10 AM |
| | 37223 | 05-05-2016 02:58 PM |
| | 3123 | 05-04-2016 08:00 PM |
08-31-2017
06:30 PM
We either need to figure out how to get MirrorMaker 0.9.0 to use the new client API, or get MirrorMaker 0.10.1 to use the 0.9.0-compatible message format.
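A minimal sketch of the second option, assuming the target brokers run 0.10.1: log.message.format.version is a standard Kafka broker setting that pins the on-disk message format so older clients can still read it (the properties file path below is illustrative; adjust for your install):
# keep the 0.10.1 target brokers writing the 0.9.0 message format,
# so messages produced by a 0.10.1 MirrorMaker stay readable by 0.9.0 clients
# (path is illustrative; set this in each broker's server.properties)
echo "log.message.format.version=0.9.0" >> /etc/kafka/conf/server.properties
# restart each broker for the setting to take effect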
06-29-2017
07:45 PM
That error from AWS is suspected to be the S3 connection being broken, and the XML parser in the Amazon SDK hitting the end of the document and failing. I'm surprised you are seeing it frequently, though; it's generally pretty rare (i.e. rare enough that we don't have much detail on what is going on). fs.s3a.connection.timeout might be the parameter to tune, but the other possibility is that you have too many threads/tasks talking to S3 and either your network bandwidth is used up or AWS S3 is actually throttling you. Try smaller values of fs.s3a.threads.max (say 64 or fewer) and of fs.s3a.max.total.tasks (try 128). That cuts down the number of threads which may write at a time, and then has a smaller queue of waiting blocks to write before it blocks whatever thread is actually generating lots of data.
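For example, a hedged sketch of passing those tunings to a single job (the source and destination paths are placeholders; the fs.s3a.* property names are the standard S3A ones):
# smaller upload thread pool and task queue for one distcp run;
# bucket and paths are placeholders, the fs.s3a.* names are real
hadoop distcp \
  -D fs.s3a.threads.max=64 \
  -D fs.s3a.max.total.tasks=128 \
  hdfs:///source/path s3a://your-bucket/dest/path
Setting them in core-site.xml instead would make the change apply cluster-wide rather than per job.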
01-23-2017
01:36 AM
1 Kudo
Good write-up from @Ambud Sharma; you can also visit http://storm.apache.org/releases/1.0.2/Guaranteeing-message-processing.html for info from the source. Additionally, take a peek at the picture below, which I just exported from our http://hortonworks.com/training/class/hdp-developer-storm-and-trident-fundamentals/ course; it might help visualize all of this information. Good luck and happy Storming!
12-01-2016
05:03 PM
The user's saved queries weren't in this table, which explains why they weren't seeing them. I opened one of our nightly pg dumps, pulled the user's query file locations from the ds_savedquery_* table, cat'd each file from HDFS, and sent the output to the user.
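A sketch of that lookup step, assuming the nightly dump has been restored into a scratch Postgres database (the ds_savedquery_* table suffix and the ds_owner/ds_queryfile column names are assumptions; check your own schema):
# hypothetical: list the user's saved-query file paths from the restored dump;
# the table suffix and column names are assumptions, adjust to your instance
psql -d ambari_restore -At \
  -c "SELECT ds_queryfile FROM ds_savedquery_<suffix> WHERE ds_owner = 'xxxxxxx';" \
  > hdfs_files.out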
cat hdfs_files.out
/user/xxxxxxx/hive/jobs/hive-job-813-2016-07-28_11-46/query.hql
/user/xxxxxxx/hive/jobs/hive-job-1952-2016-10-18_09-31/query.hql
...
for f in $(cat hdfs_files.out); do
  hdfs dfs -cat "$f" >> saved_queries.hql
  echo >> saved_queries.hql
  echo >> saved_queries.hql
done
Thanks @jss for your help with this.
08-16-2016
12:16 PM
My comment: on Hadoop 2.4 you must change the limit for max open files via Ambari (Hive -> Advanced hive-env -> hive_user_nofile_limit = 64000).
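A quick way to verify the new limit took effect after restarting Hive (a minimal check, assuming shell access on the Hive host; the pgrep pattern is illustrative):
# soft open-files limit for a fresh shell as the hive user
sudo -u hive bash -c 'ulimit -n'
# limits of the running HiveServer2 process (pgrep pattern is illustrative)
grep 'open files' /proc/$(pgrep -f hiveserver2 | head -1)/limits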
02-22-2017
11:11 PM
@Jon Maestas I'm still having the same problem. I've tried clearing the config and upconfig'ing multiple times, and in every instance the solrconfig.xml looks fine from the Solr UI. The HDFS side seems to be working OK; i.e. when I create the collection, the expected directories and files are created in HDFS. It is only after that, when Solr tries to instantiate the updateHandler, that we get the UnknownHostException referring to our HDFS nameservice name.

Unfortunately we changed multiple things going in here. Everything was working fine on Solr 5.3.1 and the embedded ZooKeeper; this problem arose when we went to Solr 6.4.1, but we simultaneously switched to using the Hadoop cluster's existing ZooKeeper quorum. We have the /solr chroot set up in ZooKeeper and it is referenced consistently across all the Solr config files and commands. Our next step is to start backing out our changes (which is a pain because we want some of the security enhancements in 6.4.1).

In your examples (above) you use $zk_quorum. Is that set to the name of a single ZooKeeper node, or is it a list of all the nodes? I've tried both approaches, but it doesn't make any difference.

Thanks, Tony
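P.S. For concreteness, the two forms I've tried for $zk_quorum (hostnames are placeholders; ZK_HOST is the standard setting in solr.in.sh):
# single node plus the /solr chroot
ZK_HOST="zk1.example.com:2181/solr"
# full quorum plus the /solr chroot
ZK_HOST="zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181/solr"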