About Harsh J

Harsh J · ‎12-08-2015

> As commands in shell scripts are only able to recognize hdfs directories

This is an incorrect assumption. The shell action will merely execute any given script file (as normally executed from a process), and does not care about what is within it. Does your script fail with an error? If so, please post the error.

LovekeshBansal · ‎12-07-2015

Thanks for such an informative reply. I have already implemented s3a://, and yes, it is the only solution. The other one, i.e. changing to the /tmp dir, is an intelligent workaround.

Harsh J · ‎12-06-2015

I am not quite sure I follow what the problem is. Could you post the differing outputs or a screenshot thereof? This may not be the issue but note that printing the representation of the string in Python will not print out unicode characters (and instead print hexes).
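The note about representations can be illustrated with a minimal sketch, assuming Python 3 and assuming the value comes back from the HBase client as UTF-8 encoded bytes (the sample value is made up for illustration). The representation of the bytes shows hex escapes, while the decoded string holds the actual characters:

```python
# Assumed sample value: UTF-8 bytes for "caf" followed by U+00E9,
# standing in for a cell value returned by an HBase client.
raw = b"caf\xc3\xa9"

# The representation escapes non-ASCII bytes as hex. This is also what
# you see when printing a list or dict that contains the value.
shown = repr(raw)

# Decoding yields real text: one character instead of two bytes.
decoded = raw.decode("utf-8")

print(shown)                    # the escaped representation
print(len(raw), len(decoded))   # byte count vs. character count
```

Printing `decoded` itself on a UTF-8 terminal shows the accented character, so differing outputs may simply mean one code path prints the representation and the other prints the decoded text.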

Harsh J · ‎12-06-2015

> So, to generalize, the mechanism level subcodes can always be taken as some failure in communicating with KDC, right?

Yes, it can always be taken as something wrong in the Kerberos layer (not necessarily only the KDC; it could also be things such as bad enctypes in the keytab, but always Kerberos mechanism related).

> I also see that despite this error, ZK does continue to function ... so is this error to be really treated seriously?

Did a retry of the auth perhaps succeed? It's not normal for it to repeat the errors.

nitin · ‎11-25-2015

When you ingest the data from an edge node that is also running datanode role, the 1st copy will always be written to that DN and it will use space much faster than any other datanode. To re-distribute space usage among all datanodes, you must run hdfs balancer.

James K · ‎11-23-2015

I'm running into something similar. I'm on 5.4.2, building tables with Hive and then analyzing with Impala, and I get the same warnings, although the queries execute OK. Can you please share with me what you scripted to make "when one partition is always less than 800MB I set the block size for this table to 1GB" as you mention in your post?

MartinEK · ‎11-19-2015

Thanks Harsh, yes that is helpful.

Harsh J · ‎11-16-2015

Your endpoint is incorrect: you're trying /jobs/ (which gives a list of WFs with high-level info) and not /job/WFID (which gives a specific WF and all its details). The latter is what you need. Do this:

req = urllib2.Request('http://xx.xx.xxx.xx:11000/oozie/v1/job/0000096-151104073848042-oozie-hado-W')

(Or use /jobs to iterate over the list of all WFs, calling /job/ID for each item's id field.)
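As a rough sketch of the /job/WFID approach, assuming Python 3 (`urllib.request` instead of `urllib2`) and assuming the job info JSON carries an `actions` list whose items have `name` and `status` fields, the running action names could be pulled out like this. The helper names (`job_url`, `running_action_names`, `fetch_job_info`) are hypothetical, and the host is the placeholder from the post:

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Placeholder host copied from the post; replace with a real Oozie server.
OOZIE_BASE = "http://xx.xx.xxx.xx:11000/oozie/v1"


def job_url(workflow_id, base=OOZIE_BASE):
    """Build the single-workflow URL (/job/WFID), not the /jobs listing."""
    return "{}/job/{}?show=info".format(base, quote(workflow_id))


def running_action_names(job_info):
    """Return names of actions whose status is RUNNING.

    Assumes job_info is the parsed JSON of a /job/WFID response with an
    'actions' list; missing keys are treated as 'no actions'.
    """
    return [
        action["name"]
        for action in job_info.get("actions", [])
        if action.get("status") == "RUNNING"
    ]


def fetch_job_info(workflow_id, base=OOZIE_BASE):
    """Fetch and parse the job info (requires a reachable Oozie server)."""
    with urlopen(job_url(workflow_id, base)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With a reachable server, `running_action_names(fetch_job_info('0000096-151104073848042-oozie-hado-W'))` would give the names of the currently running actions, provided the response has the assumed shape.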

scratch28 · ‎11-11-2015

Thanks for the input. I suspect that the region settings I am using may be splitting regions prematurely, contributing to the latency. Originally I set the region size at the default of 10 GB, then shrank it to 4 GB and did a compaction and flush. The first writes were trying to reorganize the regions, which stems from my first problem: http://community.cloudera.com/t5/Storage-Random-Access-HDFS/hbase-closed-upon-write/m-p/34001#U34001 I then increased the region size to 4.9 GB, and it seems to be fine now. The reason I am staying under 5 GB is so that I can transfer to S3, which has a file size limit.

Harsh J · ‎11-11-2015

The only way I can think of to do that is to spawn a whole new application, impersonating the user, from within your AM. That may work, although I've never tried it.

Member Since	‎07-31-2013 07:21 AM
Posts	1,924
Kudos received	461

Cloudera Community

Re: S3Guard Suggested to help fix Consistency

Re: Failed to start namenode. java.io.FileNotFound...

Re: sqoop import issue

Re: Efficient ways to store many images files

Re: S3 loading into HDFS

Re: how to execute oozie shell action with script ...

Re: disk space issue on local disk.. due to buffer...

Re: a problem with the encoding in HBASE and pytho...

Re: Zookeeper kerberos issue or quorum issue?

Re: Force block redistribution for some particular...

Re: ERROR: Parquet file should not be split into m...

Re: File concatenation (HDFS-222)

Re: Oozie Workflow: Get running action name

Re: hbase warning response too slow

Re: How to set user in LinuxContainerExecutor from...