Member since: 08-20-2015
Posts: 23
Kudos Received: 7
Solutions: 1
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 6202 | 12-10-2015 01:21 PM |
08-03-2016
01:38 PM
We have an interesting situation with the MapReduceIndexerTool where the job sometimes finishes successfully, yet the indexes are not actually loaded into Solr via the live merge. The logs are too verbose to paste into this thread, but at the end the job reports the following. If we rerun it an indeterminate number of times, it eventually works.

```
82774 [pool-4-thread-1] INFO org.apache.solr.hadoop.GoLive - Live merge hdfs://nameservice1/tmp/solredh_admin_user/results/part-00000 into http://mapls188.bsci.bossci.com:8983/solr
...
83073 [main] INFO org.apache.solr.hadoop.MapReduceIndexerTool - Succeeded with job: jobName: org.apache.solr.hadoop.MapReduceIndexerTool/MorphlineMapper, jobId: job_1470234528819_0230
83073 [main] INFO org.apache.solr.hadoop.MapReduceIndexerTool - Success. Done. Program took 83.07273 secs. Goodbye.
```

Here is a snapshot of the current run. You can see that the Solr results are in our temp location.

```
-bash-4.1$ hadoop fs -ls /tmp/solredh_admin_user/results/part-00000/data/index
Found 12 items
-rwxrwxr-x+  3 edh_admin_user supergroup  8797979 2016-08-03 20:03 /tmp/solredh_admin_user/results/part-00000/data/index/_0.fdt
-rwxrwxr-x+  3 edh_admin_user supergroup     1379 2016-08-03 20:03 /tmp/solredh_admin_user/results/part-00000/data/index/_0.fdx
-rwxrwxr-x+  3 edh_admin_user supergroup     1248 2016-08-03 20:03 /tmp/solredh_admin_user/results/part-00000/data/index/_0.fnm
-rwxrwxr-x+  3 edh_admin_user supergroup    44950 2016-08-03 20:03 /tmp/solredh_admin_user/results/part-00000/data/index/_0.nvd
-rwxrwxr-x+  3 edh_admin_user supergroup       61 2016-08-03 20:03 /tmp/solredh_admin_user/results/part-00000/data/index/_0.nvm
-rwxrwxr-x+  3 edh_admin_user supergroup      350 2016-08-03 20:03 /tmp/solredh_admin_user/results/part-00000/data/index/_0.si
-rwxrwxr-x+  3 edh_admin_user supergroup  2218199 2016-08-03 20:03 /tmp/solredh_admin_user/results/part-00000/data/index/_0_Lucene41_0.doc
-rwxrwxr-x+  3 edh_admin_user supergroup  1356123 2016-08-03 20:03 /tmp/solredh_admin_user/results/part-00000/data/index/_0_Lucene41_0.pos
-rwxrwxr-x+  3 edh_admin_user supergroup  8914976 2016-08-03 20:03 /tmp/solredh_admin_user/results/part-00000/data/index/_0_Lucene41_0.tim
-rwxrwxr-x+  3 edh_admin_user supergroup   113325 2016-08-03 20:03 /tmp/solredh_admin_user/results/part-00000/data/index/_0_Lucene41_0.tip
-rwxrwxr-x+  3 edh_admin_user supergroup       53 2016-08-03 20:02 /tmp/solredh_admin_user/results/part-00000/data/index/segments_1
-rwxrwxr-x+  3 edh_admin_user supergroup      131 2016-08-03 20:03 /tmp/solredh_admin_user/results/part-00000/data/index/segments_2
```

But after the job finishes successfully, the only files in the live index directory are segments_1, segments_2, and this odd-looking lock file.

```
-bash-4.1$ hadoop fs -ls /solr/F0116/core_node1/data/index
Found 3 items
-rw-r--r--   3 solr solr   0 2016-08-03 20:01 /solr/F0116/core_node1/data/index/HdfsDirectory@24d078d lockFactory=org.apache.solr.store.hdfs.HdfsLockFactory@3d20687c-write.lock
-rwxr-xr-x   3 solr solr  53 2016-08-03 20:01 /solr/F0116/core_node1/data/index/segments_1
-rwxr-xr-x   3 solr solr  82 2016-08-03 20:03 /solr/F0116/core_node1/data/index/segments_2
```

We have enabled DEBUG on both the MapReduceIndexerTool and the Solr server and have compared successful runs against unsuccessful runs, without any luck identifying why this only works sporadically. Has anyone seen something like this before?
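For context, here is a minimal sketch of how such a job is typically launched with the live-merge option. The jar path, morphline file, input path, ZooKeeper quorum, and collection name are placeholders rather than our exact values.

```bash
# Sketch only: run MapReduceIndexerTool with --go-live so the generated shards
# are merged into the live Solr collection once the MapReduce job completes.
# Paths, hosts, and the collection name are hypothetical placeholders.
hadoop jar /opt/cloudera/parcels/CDH/lib/solr/contrib/mr/search-mr-*-job.jar \
  org.apache.solr.hadoop.MapReduceIndexerTool \
  --morphline-file morphline.conf \
  --output-dir hdfs://nameservice1/tmp/solredh_admin_user/results \
  --zk-host zkhost1:2181/solr \
  --collection F0116 \
  --go-live \
  hdfs://nameservice1/path/to/input
```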
07-14-2016
12:22 PM
Thanks.
07-08-2016
05:58 AM
Good day. We have an environment that was originally undersized, with a single JBOD mount of 1 TB. When it filled up, we added four more 1 TB mounts on each host. The behavior we now get in CM is that the HDFS 'DataNode Data Directory' health check is red because the single original mount is past the free-space threshold. We don't think rebalancing will help, since we understand the balancer works at the DataNode level rather than at the volume level; at least that seemed to be the behavior the last time we tried it. Is there a way to adjust this, or to modify the monitor so it looks at the aggregate storage rather than a single volume?

```
The following DataNode Data Directory are on filesystems with less than 5.0 GiB of their space free.
/cloudera/data/04/dfs/dn (free: 2.4 GiB (0.24%), capacity: 1,007.8 GiB)

This role's DataNode Data Directory (/cloudera/data/02/dfs/dn, /cloudera/data/05/dfs/dn, /cloudera/data/1/dfs/dn, /cloudera/data/03/dfs/dn) are on a filesystem with more than 10.0 GiB of its space free
```

```
/dev/sdc1  1008G  459G  499G   48%  /cloudera/data/1
/dev/sdb1  1008G  454G  503G   48%  /cloudera/data/02
/dev/sde1  1008G  448G  509G   47%  /cloudera/data/03
/dev/sdf1  1008G  956G  1.3G  100%  /cloudera/data/04
/dev/sdg1  1008G  448G  509G   47%  /cloudera/data/05
```
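In the meantime, here is a rough sketch of how we survey per-volume usage across the DataNodes; the hostnames are placeholders and it assumes passwordless ssh from an admin host.

```bash
# Sketch: report usage of each DataNode data mount across the cluster.
# Hostnames and the mount list are hypothetical placeholders.
for host in dn01 dn02 dn03 dn04; do
  echo "== ${host} =="
  ssh "${host}" 'df -h /cloudera/data/1 /cloudera/data/0[2-5] | tail -n +2'
done
```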
04-21-2016
12:50 PM
Good day. We are looking for options to audit download operations from the Impala or Hive query editors in Hue. Our compliance group is looking for data to track who is moving results out of our Hadoop installation. We are running CDH 5.5 with auditing enabled for Hue, but we can't seem to find an 'operation' in Navigator that indicates a download via Hue. We haven't had auditing enabled for very long, but we actually don't get much for Hue; so far just USER_LOGIN. Looking outside of Navigator, the closest we can get is examining the Hue logs for request entries such as this:

```
GET /impala/download/709/csv
```

Is anything on the Navigator roadmap to help us with this? Or are there other suggestions for how to audit this behavior outside of the Hue logs?
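As a stopgap, this is a sketch of what we pull from the Hue access logs today. The log path is an assumption for a CM-managed host, and the /beeswax path for Hive editor downloads is an assumption as well; the /impala path matches the request entry above.

```bash
# Sketch: list download requests recorded in the Hue access log.
# The log location and the /beeswax path are assumptions; adjust to your deployment.
grep -E 'GET /(impala|beeswax)/download/' /var/log/hue/access.log
```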
03-30-2016
12:54 PM
Good day. We recently lowered our Hue session timeout (i.e., ttl) to 10 minutes per our security team's recommendation. One of the things we've found is that the cookie is not extended during activity: if a user logs in and works for 10 minutes, they will be forced to log in again after 10 minutes, no matter what. Is there any way, now or planned for the future, to modify this behavior so it is 10 minutes of inactivity? I know defining 'inactivity' is hard to do, but I wanted to check and see if anyone had thoughts on this.
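For context, a sketch of how we confirm the ttl the running Hue server actually picked up; the process-directory path is an assumption for a CM-managed deployment.

```bash
# Sketch: show the [[session]] block of the hue.ini the running Hue server was started with.
# The process-directory pattern is an assumption for CM-managed hosts.
grep -A 5 '\[\[session\]\]' /var/run/cloudera-scm-agent/process/*-hue-HUE_SERVER/hue.ini
```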
03-11-2016
06:44 AM
We have an MS SQL Server database that contains what we refer to as extended ASCII characters (internally we also call them special characters). The database collation is SQL_Latin1_General_CP1_CI_AS and the data type is Text. When we pull these data over to Hadoop via Sqoop, we end up with black diamonds with question marks mixed into the data. Below are examples of what the data look like in SQL Server, the Impala/Hive editors in Hue, and the Impala shell; notice the diamonds with question marks mixed into the data on the Hadoop side. What we think is happening is that we're somehow not successfully telling Sqoop that the character set we're pulling in is not UTF-8, or something along those lines. We're not 100% sure, but we've looked for something like the MySQL setting 'characterEncoding=UTF-8' and haven't found anything similar for the MS SQL JDBC connection string. Any advice on things we should look at?

(Screenshots: MS SQL Server, Hue Impala/Hive editors, Impala shell.)
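One thing we are experimenting with, purely as a sketch and not a confirmed fix, is casting the Text column to NVARCHAR in a free-form query so the driver hands Sqoop Unicode data. The connection string, table, and column names below are placeholders.

```bash
# Sketch: pull the Text column as NVARCHAR via a Sqoop free-form query.
# Connection string, table, and column names are hypothetical placeholders.
sqoop import \
  --connect 'jdbc:sqlserver://mssqlhost:1433;databaseName=ourdb' \
  --username our_user -P \
  --query 'SELECT Col1, CAST(Col4 AS NVARCHAR(MAX)) AS Col4 FROM dbo.OurTable WHERE $CONDITIONS' \
  --split-by Col1 \
  --target-dir /tmp/ourtable_utf8
```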
02-23-2016
06:51 AM
Good morning. We are writing a custom HTML page for our search index to present one of the indexed fields as a hyperlink. The users want to just click on the file links we present in the result set, and the only way we could find to do this is to use the HTML widget and wrap the field with an <a href="file://{{doc_link}}">{{doc_path}}</a> entry. This seems to be working OK, but one of the things we lost was the header values. We're trying to get to something like this:

```
Document Number   Document Link
12345             \\thenas\myfile.pdf
67890             \\thenas\myfile.doc
```

However, what we end up getting is the headers repeated for every row:

```
Document Number   Document Link
12345             \\thenas\myfile.pdf
Document Number   Document Link
67890             \\thenas\myfile.doc
Document Number   Document Link
...
```

Does anyone know of a way to include the header values in the result, but have them listed only once at the top? Another thing we thought would be a nice feature is having some control over the HTML rendered in the grid widget; however, that does not look to be a feature today. Just a thought, though, if anyone had feedback on adding that at some point.
02-12-2016
01:05 PM
Thanks!
02-03-2016
12:10 PM
Thanks. Indeed, that is exactly what we are looking for. I realize this may be something you don't know, but any idea whether Impala 2.5.0 might be included in a 5.5.* release, or whether this might be backported to a 2.3 release of Impala with CDH 5.5.*? We're at 5.4.7 right now, and I've been looking for opportunities to get the team to buy off on 5.5, so I'm just checking whether this might be an arrow in my life-cycle management quiver. If you don't know, no worries.
02-03-2016
11:36 AM
Good afternoon. We are using the unix_timestamp() function to convert dates arriving in string format into a timestamp. Unfortunately, the dates we receive in CSV files come in as something like 1/2/2007, where single-digit days and months are not padded with a '0'. For Hive, the missing padding works fine:

```sql
-- hive
select unix_timestamp('01/02/2007', 'MM/dd/yyyy'); -- returns 1167696000
select unix_timestamp('1/2/2007', 'MM/dd/yyyy');   -- returns 1167696000
```

But in Impala, if the value is not padded with '0', we get NULL:

```sql
-- impala
select unix_timestamp('01/02/2007', 'MM/dd/yyyy'); -- returns 1167696000
select unix_timestamp('1/2/2007', 'MM/dd/yyyy');   -- returns null
```

And if we change the format to a single 'd' or 'M', it works for the non-padded values but returns NULL for the two-digit ones:

```sql
-- impala
select unix_timestamp('1/2/2007', 'M/d/yyyy');   -- returns 1167696000
select unix_timestamp('10/20/2007', 'M/d/yyyy'); -- returns null
```

One workaround we use in other areas is to ingest the data and then use Hive to transform it into a new table. However, we currently have a use case where we would like to put Impala directly on top of the CSV file and not do a transformation. We think what we'll end up doing is either ask the creator of the CSV to produce the dates with padded '0's, or just treat them as strings. Before going down that route, we wanted to confirm whether the behavior we are seeing in Impala is expected, or whether we're missing something. Thanks in advance, Mac
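If we do end up preprocessing the files instead, here is a rough sketch of zero-padding the dates before loading. It assumes the dates are comma-delimited fields and that no other field contains slashes; treat it as a starting point rather than a tested fix.

```bash
# Sketch: zero-pad single-digit months/days in M/D/YYYY dates within a CSV.
# Assumes comma-delimited fields and that no other field contains '/'.
sed -E 's#(^|,)([0-9])/#\10\2/#g; s#/([0-9])/#/0\1/#g' input.csv > padded.csv
```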
01-11-2016
12:08 PM
Thanks Ben. That did it! Much appreciated.
01-11-2016
07:52 AM
Hi. We are looking into setting up HDFS snapshots for our 5.4 cluster. Unfortunately, when going into Cloudera Manager (CM) > HDFS > File Browser, we are getting the following errors in our DEV and TEST environments; PROD, surprisingly enough, works fine. While the error messages are slightly different, in both cases the keytab files referenced do not exist when we look for them. Is there anything you can advise us to look at to figure out these errors? We thought about regenerating the hdfs/ourdevhost.ourcompany.com@OURCOMPANY.COM credentials (CM > Administration > Kerberos > Credentials) but thought we'd ask before doing so. We can't remember the exact details, but I think we've had issues with this in the past where regeneration fails when trying to create the Active Directory account again because it already exists. We could be off on this, which is why we wanted to check and get a second opinion.

```
DEV:
Keytab file does not exist.
java.io.IOException: Login failure for hdfs/ourdevhost.ourcompany.com@OURCOMPANY.COM from keytab /tmp/142360606-0/hadoop7884985294202953.keytab
Reload root directory.

TEST:
Keytab file does not exist.
java.io.IOException: Failed on local exception: java.io.IOException: Login failure for hdfs/ourtesthost.ourcompany.com@OURCOMPANY.COM from keytab /tmp/144736658-0/hadoop5666268298612442343.keytab; Host Details : local host is: "ourtesthost.ourcompany.com/10.10.104.167"; destination host is: "ourtesthost.ourcompany.com":8020
```
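Before regenerating anything, this is a sketch of the sanity checks we plan to run on the affected hosts. The keytab path is an assumption for a CM-managed deployment; the principal is taken from the error above.

```bash
# Sketch: verify that an hdfs service keytab exists on the host and that its
# principal can actually log in. The process-directory path is an assumption
# for CM-managed clusters.
KEYTAB=$(ls -t /var/run/cloudera-scm-agent/process/*-hdfs-NAMENODE/hdfs.keytab | head -1)
klist -kt "$KEYTAB"
kinit -kt "$KEYTAB" hdfs/ourdevhost.ourcompany.com@OURCOMPANY.COM && echo "kinit OK"
```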
12-10-2015
01:21 PM
As so often happens, I went out for a walk, came back, and looked at a few other things. Sure enough, I now see that this is how you tune the job. I guess if any good can come from my lack of attention to detail, at least I now have it engraved in my mind.

```
sqoop import -D mapreduce.map.memory.mb=4096 -D mapreduce.map.java.opts=-Xmx3000m ....
```
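For anyone finding this later, here is a sketch of the fuller form; the connection details are placeholders. Note that the generic -D options must come immediately after `import`, before any Sqoop-specific arguments.

```bash
# Sketch: per-job YARN container size and map-task heap for a Sqoop 1 import.
# The -D generic options must precede tool-specific options; connection details
# below are hypothetical placeholders.
sqoop import \
  -D mapreduce.map.memory.mb=4096 \
  -D mapreduce.map.java.opts=-Xmx3000m \
  --connect 'jdbc:sqlserver://mssqlhost:1433;databaseName=ourdb' \
  --username our_user -P \
  --table OurTable \
  --target-dir /tmp/ourtable
```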
12-10-2015
12:55 PM
We have a Sqoop 1 job that is throwing an "Error: Java heap space" message on both our Sqoop driver and the Map/Reduce tasks running under YARN. We were able to increase the Sqoop driver heap by setting HADOOP_HEAPSIZE to 2000 (MB), and that solved the initial issue. It looks like the way the scripts work, you just pass in the number of megabytes and the script turns it into -Xmx with an 'm' appended.

```
export HADOOP_HEAPSIZE=2000
sqoop import ......
```

However, we can't find the correct place to set what we presume are the container memory and actual task heap-size settings. Our cluster is currently configured with the following YARN settings; these are set via Cloudera Manager and stored in mapred-site.xml. We don't want to adjust the cluster-wide settings, as they work fine for 99% of the jobs we run; we just have one problem child that we'd like to tune.

```
mapreduce.map.memory.mb=1024
mapreduce.map.java.opts=-Djava.net.preferIPv4Stack=true -Xmx825955249
```

We have tried the following without any luck. Are there any other suggestions for where we should configure these two settings for Sqoop 1 initiated jobs?

```
export HADOOP_OPTS="-Dmapreduce.map.memory.mb=2000 -Dmapreduce.map.java.opts=-Xmx1500m"
export HADOOP_CLIENT_OPTS="-Dmapreduce.map.memory.mb=2000 -Dmapreduce.map.java.opts=-Xmx1500m"
export YARN_OPTS="-Dmapreduce.map.memory.mb=2000 -Dmapreduce.map.java.opts=-Xmx1500m"
export YARN_CLIENT_OPTS="-Dmapreduce.map.memory.mb=2000 -Dmapreduce.map.java.opts=-Xmx1500m"
sqoop import ....
```
12-03-2015
09:15 AM
3 Kudos
This took us a bit to figure out, so I wanted to share in case anyone else runs into the issue. It's probably documented someplace, but we ended up needing to dig through the com.cloudera.impala.hivecommon.core.HiveJDBCConnection code to find it. Using the document below, what we're trying to do is hook our application up to Impala via JDBC using LDAP authentication with SSL/TLS enabled. Here was our original connection string:

```
jdbc:impala://<impalahost>:21050/;AllowSelfSignedCerts=true;AuthMech=4;SSLKeyStore=/keystores/clientkeystore;SSLKeyStorePwd=<password>;UID=<user>;PWD=<password>
```

This threw the error:

```
[Simba][ImpalaJDBCDriver](500204) Error in setting uid and password: null
```

We spent a while trying to figure out why our username and password were wrong, without much luck. So we ended up looking into the HiveJDBCConnection class and noticed that the exception was actually thrown because we were passing AllowSelfSignedCerts=true; the driver expects AllowSelfSignedCerts=1 to turn this feature on. We also found that the chart on the last few pages of the document indicates '5' is the correct AuthMech value when using LDAP (username and password) with SSL; it's actually '4', as shown earlier in the document's examples. Anyway, no action needed; just wanted to share and contribute back.

http://www.cloudera.com/content/www/en-us/documentation/other/connectors/impala-jdbc/2-5-5/Cloudera-JDBC-Driver-for-Impala-Install-Guide-2-5-5.pdf
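Putting the two corrections together, the corrected connection string looks like this; the host, keystore path, and credentials are placeholders.

```bash
# Working form per the findings above; host, keystore path, and credentials are placeholders.
JDBC_URL='jdbc:impala://<impalahost>:21050/;AllowSelfSignedCerts=1;AuthMech=4;SSLKeyStore=/keystores/clientkeystore;SSLKeyStorePwd=<password>;UID=<user>;PWD=<password>'
```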
12-02-2015
01:34 PM
Thanks! Indeed that seemed to take care of it. Pretty slick. You guys did a very good job on making that easy for us. Now to start testing. Thanks again!
12-02-2015
01:10 PM
Good day. We are working on a project to put our impalad nodes behind a load balancer. We are running CM/CDH 5.4.5 and using the instructions found here:

http://www.cloudera.com/content/www/en-us/documentation/enterprise/5-4-x/topics/impala_proxy.html#proxy_kerberos_unique_1

From what we observed, when using CM, after entering our load balancer dns:port (e.g., our.load.balancer.company.com:25003) into the "Impala Daemons Load Balancer" field, CM does most of the hard work of merging keytabs and setting the custom command-line arguments (e.g., --be_principal) for us, which is great. Unfortunately, after adding our load balancer we are running into the error "Role is missing Kerberos keytab." in CM. One thing we were unsure of is whether CM will actually create the principal and keytab for the load balancer; for example, when looking at our CM-managed Kerberos principals, there is no "impala/our.load.balancer.company.com@COMPANY.COM". Any idea whether we need to pre-create and manage impala/our.load.balancer.company.com@COMPANY.COM outside of CM? Or would you expect CM to create that principal for us as part of adding the "Impala Daemons Load Balancer" configuration? Thanks in advance.
11-11-2015
09:55 AM
1 Kudo
This post is for posterity, since we didn't find much on Google and would like to try to give back. In Sqoop we recently hit the errors "Error: java.io.IOException: SQLException in nextKeyValue" and "Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: Invalid length parameter passed to the LEFT or SUBSTRING function." when trying to import a specific view from SQL Server. To troubleshoot this, we listed out all the columns (--columns "Col1,Col2,Col3,Col4") and then removed them one by one until the error stopped. That gave us the column Sqoop was having trouble with. We then took that column and ran a simple SELECT Col4 FROM TABLE in our MS SQL query tool, and sure enough, the same "Invalid length parameter" message popped up. It turns out the SQL code creating the view on the MS SQL database had a bug in it, so the error really had nothing to do with Sqoop; it only bubbled up to us because we are apparently the only ones issuing a query against that column. I guess the takeaway for us was: when running into odd SQL errors, remove Sqoop from the equation and make sure the query Sqoop is issuing runs successfully on its own. Hope this helps someone, someday.
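Here is a sketch of the two steps, for anyone wanting to reproduce the approach; the connection string, view, and column names are placeholders.

```bash
# Sketch: narrow down the offending column by trimming --columns, then run the
# suspect column's query directly against SQL Server with sqoop eval.
# Connection string, view, and column names are hypothetical placeholders.
sqoop import \
  --connect 'jdbc:sqlserver://mssqlhost:1433;databaseName=ourdb' \
  --username our_user -P \
  --table OurView \
  --columns "Col1,Col2,Col3" \
  --target-dir /tmp/ourview_test \
  -m 1

sqoop eval \
  --connect 'jdbc:sqlserver://mssqlhost:1433;databaseName=ourdb' \
  --username our_user -P \
  --query 'SELECT TOP 10 Col4 FROM dbo.OurView'
```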
11-04-2015
09:28 AM
Thanks for the reply. We're running CDH 5.4.5, which is on HBase 1.0.0, so maybe we are indeed running into those two issues, unless they were backported into CDH. After enabling DEBUG on the Thrift server, we see the message below in the logs, so it seems we're onto something with the second issue for sure. We are also having issues with the REST interface, which seems to be related. I think we're going to submit a ticket to see if we can get some help from support. I appreciate your help in pointing us in the right direction. All the best, Mac

From the curl command against the REST interface:

```
-bash-4.1$ curl -X GET -i --negotiate -u : -b ~/cookiejar.txt -c ~/cookiejar.txt http://mapls189:20550/version/cluster
HTTP/1.1 401 Authentication required
WWW-Authenticate: Negotiate
Set-Cookie: hadoop.auth=; Path=/; Expires=Thu, 01-Jan-1970 00:00:00 GMT; HttpOnly
Content-Type: text/html; charset=iso-8859-1
Cache-Control: must-revalidate,no-cache,no-store
Content-Length: 1408

HTTP/1.1 413 FULL head
Connection: close
```

From the Thrift logs:

```
EXCEPTION
HttpException(413,FULL head,null)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:285)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
    at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
    at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
```
11-04-2015
07:59 AM
Good day, hope this message finds you well. We enabled HBase a few weeks back but just took the defaults, which don't enable authentication or authorization; we just wanted to do a quick POC. We are now going back and implementing authentication and authorization, as one of the POC projects has PHI data and we want to make sure we protect it. We are using CDH/CM 5.4, have gone through the following two articles, and seem to have everything working for our application. However, we are running into a problem with Hue's HBase Browser.

Authentication: http://www.cloudera.com/content/www/en-us/documentation/enterprise/latest/topics/cdh_sg_hbase_authentication.html
Authorization: http://www.cloudera.com/content/www/en-us/documentation/enterprise/latest/topics/cdh_sg_hbase_authorization.html

In addition to making the HBase changes, we have also gone through this article and made the changes for Hue:

Changes for Hue: http://www.cloudera.com/content/www/en-us/documentation/enterprise/latest/topics/admin_hue_enable_apps.html

In the HBase Browser we get the error "Api Error:" in a little red pop-up box. Below are the errors we get in Hue and the ones we get on our Thrift server. It seems to boil down to the message 'Authorization header received from the client is empty.' It looks like someone had a similar issue in the comments of this article, but I can't tell exactly what they did to fix it:

http://gethue.com/hbase-browsing-with-doas-impersonation-and-kerberos/

Are there any other things you can think of that we should be looking at? Thanks in advance, Mac

Here is the error message we get in the Hue logs when trying to access the HBase Browser after enabling authentication and authorization in HBase:

```
Nov 4, 1:59:31 AM INFO  access          10.42.63.12 NolandM - "POST /hbase/api/getTableList/HBase HTTP/1.1"
Nov 4, 1:59:31 AM INFO  connectionpool  Resetting dropped connection: mapls189.bsci.bossci.com
Nov 4, 1:59:31 AM ERROR kerberos_       handle_mutual_auth(): Mutual authentication unavailable on 413 response
Nov 4, 1:59:31 AM ERROR thrift_util     Thrift saw exception (this may be expected).
Traceback (most recent call last):
  File "/opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/lib/hue/desktop/core/src/desktop/lib/thrift_util.py", line 415, in wrapper
    ret = res(*args, **kwargs)
  File "/opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/lib/hue/apps/hbase/src/hbase/../../gen-py/hbased/Hbase.py", line 53, in decorate
    return func(*args, **kwargs)
  File "/opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/lib/hue/apps/hbase/src/hbase/../../gen-py/hbased/Hbase.py", line 832, in getTableNames
    self.send_getTableNames()
  File "/opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/lib/hue/apps/hbase/src/hbase/../../gen-py/hbased/Hbase.py", line 840, in send_getTableNames
    self._oprot.trans.flush()
  File "/opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/lib/hue/build/env/lib/python2.6/site-packages/thrift-0.9.1-py2.6-linux-x86_64.egg/thrift/transport/TTransport.py", line 170, in flush
    self.__trans.flush()
  File "/opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/lib/hue/desktop/core/src/desktop/lib/thrift_/http_client.py", line 84, in flush
    self._data = self._root.post('', data=data, headers=self._headers)
  File "/opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/lib/hue/desktop/core/src/desktop/lib/rest/resource.py", line 122, in post
    return self.invoke("POST", relpath, params, data, self._make_headers(contenttype, headers))
  File "/opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/lib/hue/desktop/core/src/desktop/lib/rest/resource.py", line 78, in invoke
    urlencode=self._urlencode)
  File "/opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/lib/hue/desktop/core/src/desktop/lib/rest/http_client.py", line 161, in execute
    raise self._exc_class(ex)
RestException: (error 413)
Nov 4, 1:59:31 AM INFO  thrift_util  Thrift saw exception: (error 413)
Nov 4, 1:59:31 AM INFO  middleware   Processing exception: Api Error: :
Traceback (most recent call last):
  File "/opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/lib/hue/build/env/lib/python2.6/site-packages/Django-1.6.10-py2.6.egg/django/core/handlers/base.py", line 112, in get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/lib/hue/build/env/lib/python2.6/site-packages/Django-1.6.10-py2.6.egg/django/db/transaction.py", line 371, in inner
    return func(*args, **kwargs)
  File "/opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/lib/hue/apps/hbase/src/hbase/views.py", line 76, in api_router
    return api_dump(HbaseApi(request.user).query(*url_params))
  File "/opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/lib/hue/apps/hbase/src/hbase/api.py", line 54, in query
    raise PopupException(_("Api Error: %s") % e.message)
PopupException: Api Error:
```

And here is the error we see on the Thrift server side:

```
Nov 4, 1:59:31.238 PM WARN  org.apache.hadoop.security.UserGroupInformation
PriviledgedActionException as:HTTP/ourhost@ourdomain.COM (auth:KERBEROS) cause:org.apache.hadoop.hbase.thrift.HttpAuthenticationException: Authorization header received from the client is empty.
Nov 4, 1:59:31.239 PM ERROR org.apache.hadoop.hbase.thrift.ThriftHttpServlet
Failed to perform authentication
Nov 4, 1:59:31.239 PM ERROR org.apache.hadoop.hbase.thrift.ThriftHttpServlet
Kerberos Authentication failed
org.apache.hadoop.hbase.thrift.HttpAuthenticationException: java.lang.reflect.UndeclaredThrowableException
    at org.apache.hadoop.hbase.thrift.ThriftHttpServlet.doKerberosAuth(ThriftHttpServlet.java:139)
    at org.apache.hadoop.hbase.thrift.ThriftHttpServlet.doPost(ThriftHttpServlet.java:86)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:401)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:767)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:326)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
    at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:945)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:756)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:218)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
    at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
    at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
Caused by: java.lang.reflect.UndeclaredThrowableException
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1684)
    at org.apache.hadoop.hbase.thrift.ThriftHttpServlet.doKerberosAuth(ThriftHttpServlet.java:134)
    ... 16 more
Caused by: org.apache.hadoop.hbase.thrift.HttpAuthenticationException: Authorization header received from the client is empty.
    at org.apache.hadoop.hbase.thrift.ThriftHttpServlet$HttpKerberosServerAction.getAuthHeader(ThriftHttpServlet.java:212)
    at org.apache.hadoop.hbase.thrift.ThriftHttpServlet$HttpKerberosServerAction.run(ThriftHttpServlet.java:176)
    at org.apache.hadoop.hbase.thrift.ThriftHttpServlet$HttpKerberosServerAction.run(ThriftHttpServlet.java:144)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
    ... 17 more
```
08-27-2015
08:26 AM
1 Kudo
Thanks for the response and for checking back in. I had to work on some other things for the client, so I'm finally getting back to this. We are running 5.4.3, so I used the following document under "Hue as an SSL Client", but unfortunately we're getting the same behavior.

http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/cm_sg_ssl_hue.html

I admittedly didn't quite follow the process for exporting the service JKS keystores for non-Java services (Impala in our case). While trying to work that out, I noticed a section in Impala's SSL configuration called "SSL/TLS Certificate for Clients". This references a file /opt/cloudera/security/x509/impala.cer that exists on each of our hosts running the impalad process. The explanation is: "Local path to the X509 certificate that will identify the Impala daemon to clients during SSL/TLS connections. This file must be in PEM format."

Since it seemed to indicate it was for 'clients' and was in PEM format (I double-checked), I grabbed this file, moved it to the Hue server, set the permissions so the Hue user could read it, set REQUESTS_CA_BUNDLE=/tmp/hue-cert/impala.cer in the "Hue Service Environment Advanced Configuration Snippet (Safety Valve)" section, and restarted Hue. I put it in /tmp because I'm not an elevated user on the system, so I had to pick a spot where I could put it; we'd have our Unix team do all this and put it in a better location once we get it figured out, of course. Unfortunately, this didn't fix the issue.

Any ideas, or am I possibly doing something wrong? I did notice a section in Impala called "SSL/TLS Private Key for Clients" but didn't quite understand whether that is for clients, or is the private key Impala uses to unseal what a client encrypts with the public key. Thanks in advance for your help.
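One check that might help isolate whether the certificate file is the right trust anchor for what Impala actually presents; this is only a sketch, and the impalad hostname and port are placeholders for one of our nodes.

```bash
# Sketch: confirm that the copied PEM file validates the certificate the impalad
# presents on its HiveServer2 port. Hostname and port are placeholders.
openssl s_client -connect impalad-host.company.com:21050 -CAfile /tmp/hue-cert/impala.cer </dev/null \
  | grep -E 'Verify return code|subject=|issuer='
```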
08-21-2015
11:12 AM
2 Kudos
Hi. I'm posting this because I didn't find a lot of help on Google, so maybe it will help others who run across it. It could very well be that this is fixed in a future release of Hive, so this might end up being dated at some point, if it isn't already; but in any case, I wanted to share.

We are running Hive 0.13.1-cdh5.3.2. We have some very large schemas we are ingesting into Hive using the Avro file format. On the larger ones, we were getting the following error when trying to create a table:

```
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:javax.jdo.JDODataStoreException: Put request failed : INSERT INTO TABLE_PARAMS (PARAM_VALUE,TBL_ID,PARAM_KEY) VALUES (?,?,?)
NestedThrowables:
org.datanucleus.store.rdbms.exceptions.MappedDatastoreException: INSERT INTO TABLE_PARAMS (PARAM_VALUE,TBL_ID,PARAM_KEY) VALUES (?,?,?) ) (state=08S01,code=1)
```

There wasn't much out there, but we did stumble upon this JIRA, and while I can't tell 100% whether it's the same issue we have, it at least got us pointed in the right direction:

https://issues.cloudera.org/browse/KITE-469

In any case, when using the avro.schema.literal name/value pair in Hive's TBLPROPERTIES, we get this error when the Avro schema is larger than 4K; in other words, in the first example below, if the literal is more than 4K, you'll get the error. The fix on our end was simply to make the Avro schema a separate file in HDFS and reference it via URL, as in the second example. Again, no action needed; just posting for reference.

```sql
-- errors when the schema literal exceeds 4K
CREATE EXTERNAL TABLE TEST_TABLE_1 (
  COL1 STRING
)
STORED AS AVRO
LOCATION '/archive/TEST_TABLE_1'
TBLPROPERTIES ('avro.schema.literal'='{<4K of data>}');

-- works
CREATE EXTERNAL TABLE TEST_TABLE_1 (
  COL1 STRING
)
STORED AS AVRO
LOCATION '/archive/TEST_TABLE_1'
TBLPROPERTIES ('avro.schema.url'='/archive/avro_schemas/TEST_TABLE_1.avsc');
```
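For completeness, a sketch of staging the schema file that the second statement references; the paths mirror the example above, so adjust to your layout.

```bash
# Sketch: stage the Avro schema in HDFS so avro.schema.url can reference it.
# Paths mirror the example above; adjust to your layout.
hadoop fs -mkdir -p /archive/avro_schemas
hadoop fs -put -f TEST_TABLE_1.avsc /archive/avro_schemas/TEST_TABLE_1.avsc
```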
08-20-2015
09:38 AM
Good day, hope this message finds you well. We recently updated our TEST environment to CDH 5.4.3 and, as part of that, enabled SSL for Impala. After doing so, we noticed that when connecting via impala-shell we had to pass the --ssl parameter; if we didn't, Impala would show the following in our logs:

```
TThreadPoolServer: TServerTransport died on accept: SSL_accept: wrong version number
```

We are now going through Hue to test the Impala query editor and are facing a similar issue. When we click on Query Editor > Impala, we get the infinite spinning wheel on the database load, and the Impala logs show the same message as when we didn't pass --ssl to impala-shell:

```
TThreadPoolServer: TServerTransport died on accept: SSL_accept: wrong version number
```

I noticed there was a 5.4.4 release that addressed an issue where Hue wouldn't start when SSL was enabled, but I'm not sure whether that is the same issue here. We do plan on going to 5.4.4, but I was trying to get some high-level verification done in advance. Has anyone seen something like this before? My thought is that we need to somehow indicate to Hue that it, too, needs to talk to Impala via SSL, the equivalent of the --ssl parameter we used for impala-shell, but I'm not seeing an option for that in Cloudera Manager's Hue configuration section. Thanks for your thoughts in advance, Mac