Member since: 10-13-2016
Posts: 68
Kudos Received: 10
Solutions: 3
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2111 | 02-15-2019 11:50 AM |
| | 4641 | 10-12-2017 02:03 PM |
| | 873 | 10-13-2016 11:52 AM |
11-13-2018
06:55 AM
Eventually, after restarting everything (not only the services flagged as requiring a restart), it went OK.
11-01-2018
10:57 AM
Hello, I installed a new (not an upgrade) HDP 3.0.1 and seem to have many issues with the timeline server.

1) The first weird thing is that the Yarn tab in Ambari keeps showing this error:

ATSv2 HBase Application
The HBase application reported a 'STARTED' state. Check took 2.125s

2) The second issue seems to be with Oozie. Running a job, it starts but stalls, with the following line repeated hundreds of times in the log:

2018-11-01 11:15:37,842 INFO [Thread-82] org.apache.hadoop.yarn.event.AsyncDispatcher: Waiting for AsyncDispatcher to drain. Thread state is :WAITING

Then with:

2018-11-01 11:15:37,888 ERROR [Job ATS Event Dispatcher] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Exception while publishing configs on JOB_SUBMITTED Event for the job : job_1541066376053_0066
org.apache.hadoop.yarn.exceptions.YarnException: Failed while publishing entity
at org.apache.hadoop.yarn.client.api.impl.TimelineV2ClientImpl$TimelineEntityDispatcher.dispatchEntities(TimelineV2ClientImpl.java:548)
at org.apache.hadoop.yarn.client.api.impl.TimelineV2ClientImpl.putEntities(TimelineV2ClientImpl.java:149)
at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.publishConfigsOnJobSubmittedEvent(JobHistoryEventHandler.java:1254)
at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.processEventForNewTimelineService(JobHistoryEventHandler.java:1414)
at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.handleTimelineEvent(JobHistoryEventHandler.java:742)
at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.access$1200(JobHistoryEventHandler.java:93)
at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler$ForwardingEventHandler.handle(JobHistoryEventHandler.java:1795)
at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler$ForwardingEventHandler.handle(JobHistoryEventHandler.java:1791)
at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:197)
at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126)
at java.lang.Thread.run(Thread.java:745)
Caused by: com.sun.jersey.api.client.ClientHandlerException: java.net.SocketTimeoutException: Read timed out

3) In hadoop-yarn-timelineserver-${hostname}.log I see, repeated many times:

2018-11-01 11:32:47,715 WARN timeline.EntityGroupFSTimelineStore (LogInfo.java:doParse(208)) - Error putting entity: dag_1541066376053_0144_2 (TEZ_DAG_ID): 6

4) In hadoop-yarn-timelinereader-${hostname}.log I see, repeated many times:

Thu Nov 01 11:34:10 CET 2018, RpcRetryingCaller{globalStartTime=1541068444076, pause=1000, maxAttempts=4}, java.net.ConnectException: Call to /192.168.x.x:17020 failed on connection exception: org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: /192.168.x.x:17020
at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:145)
at org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:80)
... 3 more
Caused by: java.net.ConnectException: Call to /192.168.x.x:17020 failed on connection exception: org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: /192.168.x.x:17020
at org.apache.hadoop.hbase.ipc.IPCUtil.wrapException(IPCUtil.java:165)

And indeed, there is nothing listening on port 17020 on 192.168.x.x.

5) I cannot find a process named ats-hbase on any server, which might be the reason for everything else. The only relevant queue setting is yarn_hbase_system_service_queue_name=default, and that queue has no limit that would prevent HBase from starting.

I am sure that something is very wrong here, and any help would be appreciated.
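In case it helps whoever hits the same thing, here is the minimal check script I would run. It assumes the HDP 3 convention that the embedded ATSv2 HBase runs as a YARN service named ats-hbase and that its region server is what should be listening on 17020; the host below is just the placeholder from my logs.

```python
#!/usr/bin/env python3
# Diagnostic sketch for the ATSv2 embedded HBase symptoms above.
# Assumptions: Python 3.7+, HDP 3 defaults (YARN service "ats-hbase",
# region server port 17020). The host is a placeholder.
import socket
import subprocess

REGIONSERVER_HOST = "192.168.x.x"   # placeholder from the logs above
REGIONSERVER_PORT = 17020

# 1) Ask YARN whether the ats-hbase service is known and running at all
#    (run on a cluster node, typically as the yarn-ats user).
try:
    out = subprocess.run(["yarn", "app", "-status", "ats-hbase"],
                         capture_output=True, text=True, timeout=60)
    print(out.stdout or out.stderr)
except (FileNotFoundError, subprocess.TimeoutExpired) as e:
    print("could not query YARN: %s" % e)

# 2) Check whether anything accepts connections on the region server port.
try:
    with socket.create_connection((REGIONSERVER_HOST, REGIONSERVER_PORT),
                                  timeout=5):
        print("port %d is open" % REGIONSERVER_PORT)
except OSError as e:
    print("port %d unreachable: %s" % (REGIONSERVER_PORT, e))
```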
01-16-2018
06:46 AM
@Jordan Moore Not really relevant to the question, but no, that is not the point. The use case here is data export: some clients have their own BI tools, processes and so on, and just need the data as CSV in a zip file. Other clients do not have this in place and access this data differently.
01-15-2018
06:08 AM
The zip file is the output of the process, not meant to be read from HDFS again - it will just end up being downloaded and sent to a user. In this context using zip makes sense, as I am only looking at *compressing* multiple CSVs together, not reading them afterwards. Using beeline with formatted output is what I do currently, but I end up downloading multiple gigabytes locally, compressing, and re-uploading. This is wasteful and could actually fill my local disks up. Using coalesce in Spark is the best option I have found, but the compression step is still not easy. Thanks!
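For reference, the coalesce approach looks roughly like this in PySpark; the database, table and output path are just placeholders.

```python
# Sketch of the Spark coalesce approach mentioned above: one CSV file
# (with header) per query. Names and paths are placeholders.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("csv-export")
         .enableHiveSupport()
         .getOrCreate())

df = spark.sql("SELECT * FROM my_db.my_table")     # placeholder query

(df.coalesce(1)                                    # force a single output file
   .write.option("header", True)
   .mode("overwrite")
   .csv("/tmp/export/my_table"))                   # a directory with one part-*.csv
```

Even then, the output is a directory containing a single part-*.csv, so the renaming and the zip step remain.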
01-09-2018
01:45 PM
My end goal is to run a few Hive queries, get one CSV file (with headers) per query, compress all those files together into one zip (not gzip or bzip, unfortunately - it needs to open natively under Windows) and hopefully get the zip back into HDFS. My current solution (CTAS) ends up creating one directory per table, with possibly multiple files under it (depending on the number of reducers and the presence/absence of UNION). I can easily generate a header file per table as well, with only one line in it. Now how to put all that together? The only option I could find implies doing all the processing locally (hdfs dfs -getmerge followed by an actual zip command). This adds a lot of overhead and could technically fill up the local disk. So my questions are:

- Is there a way to concatenate files inside HDFS without getting them locally?
- Is there a way to compress a bunch of files together (not individually) into a zip, inside HDFS?

Thanks
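The closest thing to doing this "inside" HDFS that I can picture is streaming through a client pipe, so nothing is ever staged on local disk. Below is a sketch of that idea, with hypothetical paths; it assumes Python 3.6+ (writable zip entries on a non-seekable stream) and the hdfs CLI on the PATH. Each table's header and part files are piped from hdfs dfs -cat straight into a zip that is itself piped into hdfs dfs -put -.

```python
# Sketch: build one zip from several HDFS CSV "tables" without local staging.
# Assumptions: Python 3.6+, hdfs CLI on PATH, hypothetical paths below.
import subprocess
import zipfile

TABLES = {   # zip entry name -> HDFS paths to concatenate (header first)
    "table_a.csv": ["/warehouse/a/header.csv", "/warehouse/a/data/*"],
    "table_b.csv": ["/warehouse/b/header.csv", "/warehouse/b/data/*"],
}
DEST = "/exports/tables.zip"          # hypothetical HDFS destination

# `hdfs dfs -put -` reads from stdin, so the archive goes straight to HDFS.
put = subprocess.Popen(["hdfs", "dfs", "-put", "-f", "-", DEST],
                       stdin=subprocess.PIPE)
with zipfile.ZipFile(put.stdin, "w", zipfile.ZIP_DEFLATED) as zf:
    for name, paths in TABLES.items():
        with zf.open(name, "w") as entry:             # streamed entry (3.6+)
            cat = subprocess.Popen(["hdfs", "dfs", "-cat"] + paths,
                                   stdout=subprocess.PIPE)
            for chunk in iter(lambda: cat.stdout.read(1 << 20), b""):
                entry.write(chunk)
            cat.wait()
put.stdin.close()
put.wait()
```

Whether this counts as "inside HDFS" is debatable (the bytes still flow through the client), but at least no local disk is involved.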
Labels: Apache Hadoop
10-12-2017
02:03 PM
The answer pointed to at https://community.hortonworks.com/questions/57795/how-to-fix-under-replicated-blocks-fasly-its-take.html is the right one. Those are undocumented features in Hadoop 2.7, but they can be set up and used, and I now see that replication is sped up.
10-12-2017
01:23 PM
Hi, I had an issue with datanodes, resulting in about 300k under-replicated blocks. The DNs are back and blocks are being replicated, but this is very slow (about 1 per second), and I am trying to find a way to speed replication up. I checked dfs.datanode.balance.bandwidthPerSec, which is set at about 6 MB/second. My monitoring shows that on average the rx/tx on each node is about 200 kB/second, so I am way below this limit. I followed this link https://community.hortonworks.com/articles/4427/fix-under-replicated-blocks-in-hdfs-manually.html (use setrep -w 3 on all under-replicated files), which did not help. This link https://community.hortonworks.com/questions/57795/how-to-fix-under-replicated-blocks-fasly-its-take.html is not fully applicable (Hadoop 2.7), but I set dfs.namenode.replication.work.multiplier.per.iteration to 100 (default is 2) without visible speed-up. So my question is: what can I do to speed replication up? Context: HDP 2.6, AWS, 3 nodes, replication factor = 3.
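In case it is useful to others, this is how I am watching whether the queue actually drains faster: a small script that samples hdfs dfsadmin -report once a minute and prints the rate. Plain Python, assuming only Python 3.7+ and the hdfs CLI on the PATH.

```python
# Sketch: track the drain rate of under-replicated blocks by sampling
# `hdfs dfsadmin -report`. Assumptions: Python 3.7+, hdfs CLI on PATH.
import re
import subprocess
import time

def under_replicated():
    """Parse the 'Under replicated blocks' line out of dfsadmin -report."""
    out = subprocess.run(["hdfs", "dfsadmin", "-report"],
                         capture_output=True, text=True).stdout
    m = re.search(r"Under replicated blocks:\s*(\d+)", out)
    return int(m.group(1)) if m else None

prev = under_replicated()
while prev:                        # stop once the count reaches 0
    time.sleep(60)
    cur = under_replicated()
    if cur is None:
        break
    print("under-replicated: %d (%.1f blocks/s)" % (cur, (prev - cur) / 60.0))
    prev = cur
```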
Labels: Apache Hadoop
10-04-2017
09:04 AM
1) was OK, 2) and 3) were not good, and no, it is not a kerberized cluster. After the fixes you suggested it all seems to work as expected. Thanks a million!
10-03-2017
01:38 PM
I am using HDP 2.6 and would like to properly use the Tez UI. The Tez view is available; if I go there I see queries, can click on a query ID and follow it to the DAG ID, but I do not get everything I expect. DAG Details and DAG Counters look good. Graphical View tells me:

Data not available to display graphical view!
No vertex data found in YARN Timeline Server.

All Vertices, All Tasks and All Task Attempts tell me: No records available! Vertex Swimlane tells me:

Data not available to display swimlane!
No vertex data found in YARN Timeline Server.

I have seen the documentation about the manual install of HDP, which says to download a war file, but I do not believe this is what I should be doing here, as I am using the Ambari install on the cluster. tez.tez-ui.history-url.base is http://$ambari_ip:8080/#/main/view/TEZ/tez_cluster_instance, which is indeed the URL where I can reach the Tez view. Is there anything obvious I could have forgotten?
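For what it is worth, this is the kind of check that can tell whether the Timeline Server holds any Tez data at all, which is what those panels complain about. It queries the ATS v1 REST API directly; the host is a placeholder and I am assuming the default timeline web port 8188.

```python
# Sketch: ask the YARN Timeline Server whether it has Tez DAG/vertex entities.
# Assumptions: ATS v1.x REST API, default web port 8188, placeholder host.
import json
import urllib.request

ATS = "http://timeline-host:8188"   # placeholder for the timeline server host

for entity_type in ("TEZ_DAG_ID", "TEZ_VERTEX_ID"):
    url = "%s/ws/v1/timeline/%s?limit=1" % (ATS, entity_type)
    with urllib.request.urlopen(url, timeout=10) as resp:
        entities = json.load(resp).get("entities", [])
        print("%s: %d entity(ies) returned" % (entity_type, len(entities)))
```

If TEZ_DAG_ID comes back non-empty but TEZ_VERTEX_ID is empty, that would match the symptom above: DAG details fine, no vertex data.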
09-28-2017
11:14 AM
Fair enough about the *.period. As I did get metrics, there is probably a smart default, but an explicit value would still be nice to have. I indeed found some messages in the service logs, and all looks good. To be honest, it all worked today. I then happily applied the settings to prod and, lo and behold, I only got 2 metrics there. Thinking it through, what I understood is that in metrics2.properties I say that I want, for instance, NodeManager metrics, but I then actually need to restart the NodeManagers to see those metrics. Indeed, the cluster I worked on yesterday had been rebooted (dev cluster, switched off at night). Now all works as expected. Thanks!