Member since 07-05-2016 | 17 Posts | 1 Kudos Received | 0 Solutions
08-30-2017
04:04 PM
Also, from your log and post, hadoopctrl is the NameNode, ResourceManager, and Oozie server. Is it also a DataNode and NodeManager? It may be hitting a memory bottleneck: Oozie is trying to use memory, but YARN cannot allocate memory or write the data. Potentially try moving your Oozie server to another node, or reduce or redistribute the memory allocation; Oozie usually doesn't need too much. This would probably explain the heartbeat issue.
08-29-2017
09:07 PM
By doing hadoop fs -chmod -R 777 on your Hive table directory, we can probably rule out permission issues. This is a great puzzle. It should have shown up in the logs, but is there anything strange about your data: nulls, NAs, empty values, strange date formats, decimals, special characters? Did anything in @Artem Ervits' post help?
08-28-2017
08:15 PM
From the log, it seems your Sqoop job gets stuck in a 'Heart beat, Heart beat...' loop. This is a common symptom when something has gone wrong; try searching for 'oozie sqoop import heart beat'. But I believe it is potentially a permissions issue, since it gets through about 95%. I suspect that when you run the Sqoop job manually you run it as the 'hdfs' user. Can you confirm this? USER="hdfs" and realUser=oozie are mentioned in the logs, and I suspect the 'oozie' user does not have permission to overwrite the table. Check the ownership and permissions of the table directory (for example with hadoop fs -ls on its warehouse path), change them for diagnosis if needed, and try again.
03-23-2017
08:54 AM
So annoyingly, the nvarchar/numeric issue was resolved and now I receive a generic error message:
31728 [main] ERROR org.apache.sqoop.mapreduce.ExportJobBase - Export job failed!
31728 [main] ERROR org.apache.sqoop.tool.ExportTool - Error during export: Export job failed!
03-22-2017
11:30 AM
Thanks, the log gave me:
... 2017-03-13 14:07:47,804 ERROR [Thread-12] org.apache.sqoop.mapreduce.AsyncSqlOutputFormat: Got exception in update thread: com.microsoft.sqlserver.jdbc.SQLServerException: Error converting data type nvarchar to numeric. ...
I will investigate which column is causing this issue and try to resolve it.
03-21-2017
11:24 AM
In the release notes (https://sqoop.apache.org/docs/1.4.6/sqoop-1.4.6.releasenotes.html) it states that this is supported, under new features:
SQOOP-1403 Upsert export for SQL Server
03-20-2017
02:45 PM
Thanks Mark. I have looked into your suggestions, which have led me to LZO compression: http://blog.cloudera.com/blog/2009/11/hadoop-at-twitter-part-1-splittable-lzo-compression/ I think this may be something I try next. Do you have any suggestions on this? Doesn't HDP already come with LZO? The link is a good few years old, so should I try something else before I spend a few hours on this? My company is not keen on me spending a few hours writing a Java SequenceFile jar.
03-13-2017
02:27 PM
I have a Hive table that I can successfully export to an mssql table:
sqoop export --connect jdbc:sqlserver://{some.ip.address};database={somedatabase} \
  --username 'someuser' \
  --password-file '/some/password/file' \
  --table 'Sometable' \
  --columns ID,value1,value2 \
  --export-dir /apps/hive/warehouse/some.db/Sometable \
  --input-fields-terminated-by "||" \
  -m 2 \
  /user/oozie/share/lib/sqoop/sqljdbc4.jar
However, I wish to update on a key, so I run:
sqoop export --connect jdbc:sqlserver://{some.ip.address};database={somedatabase} \
  --username 'someuser' \
  --password-file '/some/password/file' \
  --table 'Sometable' \
  --columns ID,value1,value2 \
  --export-dir /apps/hive/warehouse/some.db/Sometable \
  --input-fields-terminated-by "||" \
  --update-key ID \
  --update-mode allowinsert \
  -m 2 \
  /user/oozie/share/lib/sqoop/sqljdbc4.jar
The logs are not very helpful (note: the Sqoop job is run through an Oozie job): ...
5972 [main] INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1485423751090_3566
6016 [main] INFO org.apache.hadoop.mapreduce.Job - The url to track the job: http://had003.headquarters.7layer.net:8088/proxy/application_1485423751090_3566/
6017 [main] INFO org.apache.hadoop.mapreduce.Job - Running job: job_1485423751090_3566
20284 [main] INFO org.apache.hadoop.mapreduce.Job - Job job_1485423751090_3566 running in uber mode : false
20287 [main] INFO org.apache.hadoop.mapreduce.Job - map 0% reduce 0%
27001 [main] INFO org.apache.hadoop.mapreduce.Job - map 50% reduce 0%
Heart beat
37117 [main] INFO org.apache.hadoop.mapreduce.Job - map 100% reduce 0%
38139 [main] INFO org.apache.hadoop.mapreduce.Job - Job job_1485423751090_3566 failed with state FAILED due to: Task failed task_1485423751090_3566_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0
38292 [main] INFO org.apache.hadoop.mapreduce.Job - Counters: 32
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=338177
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=166
HDFS: Number of bytes written=0
HDFS: Number of read operations=4
HDFS: Number of large read operations=0
HDFS: Number of write operations=0
Job Counters
Failed map tasks=1
Launched map tasks=2
Other local map tasks=1
Rack-local map tasks=1
Total time spent by all maps in occupied slots (ms)=16369
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=16369
Total vcore-milliseconds taken by all map tasks=16369
Total megabyte-milliseconds taken by all map tasks=25142784
Map-Reduce Framework
Map input records=0
Map output records=0
Input split bytes=156
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=79
CPU time spent (ms)=960
Physical memory (bytes) snapshot=230920192
Virtual memory (bytes) snapshot=3235606528
Total committed heap usage (bytes)=162529280
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=0
38319 [main] INFO org.apache.sqoop.mapreduce.ExportJobBase - Transferred 166 bytes in 34.0574 seconds (4.8741 bytes/sec)
38332 [main] INFO org.apache.sqoop.mapreduce.ExportJobBase - Exported 0 records.
38332 [main] ERROR org.apache.sqoop.mapreduce.ExportJobBase - Export job failed!
38333 [main] ERROR org.apache.sqoop.tool.ExportTool - Error during export: Export job failed!
<<< Invocation of Sqoop command completed <<<
Hadoop Job IDs executed by Sqoop: job_1485423751090_3566
Intercepting System.exit(1)
<<< Invocation of Main class completed <<<
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SqoopMain], exit code [1]
Oozie Launcher failed, finishing Hadoop job gracefully
Oozie Launcher, uploading action data to HDFS sequence file: hdfs://{Something}
38406 [main] INFO org.apache.hadoop.io.compress.zlib.ZlibFactory - Successfully loaded & initialized native-zlib library
38407 [main] INFO org.apache.hadoop.io.compress.CodecPool - Got brand-new compressor [.deflate]
Oozie Launcher ends
38538 [main] INFO org.apache.hadoop.mapred.Task - Task:attempt_1485423751090_3565_m_000000_0 is done. And is in the process of committing
38601 [main] INFO org.apache.hadoop.mapred.Task - Task attempt_1485423751090_3565_m_000000_0 is allowed to commit now
38641 [main] INFO org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - Saved output of task 'attempt_1485423751090_3565_m_000000_0' to hdfs://{Something}
38692 [main] INFO org.apache.hadoop.mapred.Task - Task 'attempt_1485423751090_3565_m_000000_0' done.
Does anyone have an idea why I cannot update with inserts to mssql?
Labels: Apache Sqoop
01-25-2017
11:12 AM
So I have changed the way I tar.gz the files. At first I tried to create files of about 128 MB (about 4 files), then 64 MB (about 8-10 files), and then 1 MB (100+). Obviously, this alters the number of tasks that run. The tasks run faster the smaller the files, except one: one task always takes ~50 minutes. Why does this happen, and how do I speed up this task? A small check I plan to run is sketched below.
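For reference, a minimal diagnostic I plan to run from spark-shell (the directory path below is a placeholder for our real location): list the archive sizes to see whether one .tar.gz part is much larger than the rest, since a gzipped archive is not splittable and is parsed by a single task, so one oversized part would explain a straggler.
import org.apache.hadoop.fs.{FileSystem, Path}
// Placeholder path: the directory holding the split .tar.gz parts; sc is the SparkContext provided by spark-shell.
val fs = FileSystem.get(sc.hadoopConfiguration)
fs.listStatus(new Path("hdfs://MyCluster/RawXMLData/RecievedToday/File/"))
  .filter(_.getPath.getName.endsWith(".tar.gz"))
  .sortBy(-_.getLen)                                  // largest first
  .foreach(s => println(f"${s.getLen}%12d bytes  ${s.getPath.getName}"))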
01-18-2017
10:19 AM
Please see my post below
01-18-2017
10:04 AM
Thanks for the quick reply.
I am using a mixed environment (for dev): one node with 64 GB, three with 32 GB, and three with 16 GB of memory; CPU(s): 4, Thread(s) per core: 1, Core(s) per socket: 4, Socket(s): 1; all running CentOS 7. The reason we tar.gz the files is that we receive many small XML files, around 25,000. Loading these files into Hadoop as-is takes over 4 hours; tar.gz reduces the load time to around 10 minutes, as well as reducing the size from 14 GB to 0.4 GB. I have tried removing the tar.gz, and the time becomes 1h45; this is likely the result of the many small files. To add, the Pig parser may be faster because the XML structure is hardcoded there, but we want to avoid that: we have seen machines change the way the XML is produced, so the Spark parsing is more robust. Ideally, we would like to keep the more robust Spark parser but have the load into Hadoop take around 10 minutes and the processing around 10 minutes.
Any ideas?
One idea is to tar.gz into multiple files, i.e. the 25,000 files into 10 archives: the load time should stay at ~10 minutes and the processing time should land somewhere between 10 and 50 minutes (a related variant is sketched below). Does anyone have a better idea, reasons why this may not be a good idea, or issues I may come across?
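For context, here is a minimal sketch of a related variant I am considering, assuming the raw XML files are somewhere Spark can read them (the paths below are placeholders, not our real ones): pack the small files into a handful of splittable SequenceFiles straight from spark-shell, instead of writing a separate Java SequenceFile jar.
// Placeholder paths: raw small XML files in, packed SequenceFiles out.
val rawXml = "hdfs://MyCluster/RawXMLData/RecievedToday/File/*.xml"
val packed = "hdfs://MyCluster/RawXMLData/RecievedToday/Packed"
// wholeTextFiles keeps (file path, file content) pairs, so no data is lost by packing;
// sc is the SparkContext provided by spark-shell.
sc.wholeTextFiles(rawXml)
  .coalesce(10)                // ~10 output files instead of ~25,000
  .saveAsSequenceFile(packed)
Parsing would then need to read the SequenceFiles back (e.g. with sc.sequenceFile) rather than pointing spark-xml at a .tar.gz, so this is only a rough direction, not something I have tested.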
01-16-2017
10:03 AM
Hello all, I need to import and parse XML files in Hadoop. I have an old Pig 'REGEX_EXTRACT' script parser that works fine but takes some time to run, around 10-15 minutes. In the last 6 months I have started to use Spark, with great success in improving run times, so I am trying to move the old Pig script into Spark using the Databricks XML parser, mentioned in the following posts:
http://community.hortonworks.com/questions/71538/parsing-xml-in-spark-rdd.html
http://community.hortonworks.com/questions/66678/how-to-convert-spark-dataframes-into-xml-files.html
The version used is:
http://github.com/databricks/spark-xml/tree/branch-0.3
The script I try to run is similar to:
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.orc._
import org.apache.spark.sql._
import org.apache.hadoop.fs._
import com.databricks.spark
import com.databricks.spark.xml
import org.apache.spark.sql.functions._
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.types.{StructType, StructField, StringType, IntegerType}
// Hive context (sc is the SparkContext provided by spark-shell)
val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
// drop the existing table
val dfremove = hiveContext.sql("DROP TABLE FileExtract")
// Create schema
val xmlSchema = StructType(Array(
StructField("Text1", StringType, nullable = false),
StructField("Text2", StringType, nullable = false),
StructField("Text3", StringType, nullable = false),
StructField("Text4", StringType ,nullable = false),
StructField("Text5", StringType, nullable = false),
StructField("Num1", IntegerType, nullable = false),
StructField("Num2", IntegerType, nullable = false),
StructField("Num3", IntegerType, nullable = false),
StructField("Num4", IntegerType, nullable = false),
StructField("Num5", IntegerType, nullable = false),
StructField("Num6", IntegerType, nullable = false),
StructField("AnotherText1", StringType, nullable = false),
StructField("Num7", IntegerType, nullable = false),
StructField("Num8", IntegerType, nullable = false),
StructField("Num9", IntegerType, nullable = false),
StructField("AnotherText2", StringType, nullable = false)
))
// Read file
val df = hiveContext.read.format("com.databricks.spark.xml").option("rootTag", "File").option("rowTag", "row").schema(xmlSchema).load("hdfs://MyCluster/RawXMLData/RecievedToday/File/Files.tar.gz")
// select
val selectedData = df.select("Text1",
"Text2",
"Text3",
"Text4",
"Text5",
"Num1",
"Num2",
"Num3",
"Num4",
"Num5",
"Num6",
"AnotherText1",
"Num7",
"Num8",
"Num9",
"AnotherText2"
)
selectedData.write.format("orc").mode(SaveMode.Overwrite).saveAsTable("FileExtract")
The XML file looks similar to:
<?xml version="1.0"?>
<File>
<row>
<Text1>something here</Text1>
<Text2>something here</Text2>
<Text3>something here</Text3>
<Text4>something here</Text4>
<Text5>something here</Text5>
<Num1>2</Num1>
<Num2>1</Num2>
<Num3>1</Num3>
<Num4>0</Num4>
<Num5>1</Num5>
<Num6>0</Num6>
<AnotherText1>something here</AnotherText1>
<Num7>2</Num7>
<Num8>0</Num8>
<Num9>0</Num9>
<AnotherText2>something here</AnotherText2>
</row>
<row>
<Text1>something here</Text1>
<Text2>something else here</Text2>
<Text3>something new here</Text3>
<Text4>something here</Text4>
<Text5>something here</Text5>
<Num1>2</Num1>
<Num2>1</Num2>
<Num3>1</Num3>
<Num4>0</Num4>
<Num5>1</Num5>
<Num6>0</Num6>
<AnotherText1>something here</AnotherText1>
<Num7>2</Num7>
<Num8>0</Num8>
<Num9>0</Num9>
<AnotherText2>something here</AnotherText2>
</row>
...
...
</File>
Many XML files are zipped together, hence the tar.gz file. This runs, but for a 400 MB file it takes 50 minutes to finish. Does anyone have an idea why it is so slow, or how I might speed it up? (A small diagnostic I could add is sketched below.)
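For what it's worth, a small diagnostic I could add right after the load above (it only uses names already defined in the script) to see how many tasks the read produces; a gzipped archive is not splittable, so if it lands in a single partition, one task does all of the parsing:
// df is the DataFrame returned by the spark-xml load above.
// A partition count of 1 would mean a single task parses the whole 400 MB archive.
println(s"Input partitions: ${df.rdd.partitions.length}")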
I am running on a 7-machine cluster with about 120 GB of YARN memory, on Hortonworks HDP 2.5.3.0 with Spark 1.6.2. Many thanks in advance!
Labels: Apache Spark