Member since 07-05-2016 · 17 Posts · 1 Kudos Received · 0 Solutions
08-30-2017 04:04 PM
Also, from your log and post, hadoopctrl is the NameNode, ResourceManager, and Oozie server. Is it also a DataNode and NodeManager? If so, it may be hitting a memory bottleneck: Oozie tries to use memory, but YARN cannot allocate it or write the data. Try moving your Oozie server to another node, or reduce/redistribute the memory allocation; Oozie usually doesn't need much. That would probably explain the heartbeat issue.
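As a rough check, you could compare the NodeManager's memory ceiling against what the box actually has (the config path below is the stock HDP default; adjust for your install):

grep -A1 'yarn.nodemanager.resource.memory-mb' /etc/hadoop/conf/yarn-site.xml   # memory YARN may hand out on this node
free -m   # physical memory actually available

If the YARN ceiling plus the NameNode, ResourceManager, and Oozie heaps add up to more than the node has, something will starve.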
08-29-2017 09:07 PM
Doing hadoop fs -chmod -R 777 on your Hive table should eliminate permission issues. This is a great puzzle; it should have shown up in the logs. Is there anything strange about your data: nulls, NAs, empty fields, strange date formats, decimals, special characters? Did anything in @Artem Ervits' post help?
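For instance, a quick sanity query can surface odd rows (database, table, and column names here are just placeholders for yours):

hive -e "SELECT * FROM your_db.your_table WHERE your_col IS NULL OR your_col = '' LIMIT 10"

Run it for each column you suspect; anything that comes back is a candidate for breaking the import.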
08-28-2017 08:15 PM
From the log, it seems your Sqoop job gets stuck in a 'heart beat, heart beat...' loop. This is a common symptom when something has gone wrong; do search for 'oozie sqoop import heart beat'. But since it got through 95%, I believe it is potentially a permissions issue. I suspect that when you run the Sqoop job manually, you run it as the 'hdfs' user. Can you confirm this? USER="hdfs" and realUser=oozie are mentioned in the logs, and I suspect the 'oozie' user does not have permission to overwrite the table. Check the permissions on the table; maybe change the permissions or ownership for diagnosis, and try again.
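Something along these lines would show who owns the table files and let you hand them over for the test (the warehouse path is the HDP default; substitute your own database and table):

hadoop fs -ls /apps/hive/warehouse/your_db.db/your_table   # check current owner and mode
sudo -u hdfs hadoop fs -chown -R oozie:hdfs /apps/hive/warehouse/your_db.db/your_table   # hand ownership to oozie for diagnosis

If the Oozie-launched run then succeeds, you have confirmed it is permissions rather than the data.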
03-20-2017 02:45 PM
Thanks Mark. I have looked into your suggestions, which led me to LZO compression: http://blog.cloudera.com/blog/2009/11/hadoop-at-twitter-part-1-splittable-lzo-compression/ I think this may be something I try next. Do you have any suggestions with this? Doesn't HDP already come with LZO? The link is a good few years old; should I try something else before I spend a few hours on this? My company is not keen on me spending a few hours writing a Java SequenceFile jar.
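In case LZO is already on the cluster, I gather something like this would confirm it and make a file splittable (the grep path is the usual HDP config location, and the jar path varies by HDP version, so treat both as placeholders):

grep -A1 'io.compression.codecs' /etc/hadoop/conf/core-site.xml   # LzoCodec/LzopCodec should be listed here if installed
hadoop jar /usr/hdp/current/hadoop-client/lib/hadoop-lzo-*.jar com.hadoop.compression.lzo.DistributedLzoIndexer /data/myfile.lzo   # builds the .index file MapReduce needs to split it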
01-25-2017 11:12 AM
So I have changed the way I tar.gz the files. First I created files of about 128 MB (about 4 files), then 64 MB (about 8-10 files), and then 1 MB (100+). Obviously, this alters the number of tasks that run. The tasks run faster the smaller the file, except one! One task always takes ~50 minutes. Why does this happen? How do I speed up this task?
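One diagnostic I can run is to list the input files by size, to see whether a single oversized archive is behind the straggler (the HDFS path is a placeholder for wherever my archives land):

hadoop fs -du /data/input_archives | sort -rn | head   # largest files first; one outlier would map to one slow task

Since gzip is not splittable, each .tar.gz feeds exactly one map task, so one much larger file would mean one much slower task.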
01-18-2017 10:19 AM
Please see my post below.