Member since 11-30-2018
Posts: 7
Kudos Received: 0
Solutions: 0
01-06-2019
12:31 PM
Thanks for your reply, but we are not copying to S3 here; we are copying to another cluster. The referenced article discusses ways to speed up copies to S3. Do you know how we can speed up copying to another cluster rather than S3?
01-03-2019
09:53 AM
We copy 100+ tables daily between the production cluster and the DR cluster. These tables are growing 0.5-1% daily. Tables that took 5 minutes to copy a few months ago now take 15 minutes. We understand that volume growth can cause performance issues, but not to this drastic an extent. Collectively, this is causing a lot of delay: copying the entire database now takes 3-4 hours longer. Can you please suggest how we can improve the performance of the copy operation?
Labels:
- Apache Hadoop
- Apache YARN
01-02-2019
11:54 AM
We are trying to find the top 10 Hive queries that use the most resources (memory, CPU, etc.). We got this data from the History Server API and the ResourceManager (RM), but it only gives us the application ID, not the Hive query that we need to work on. Can you please help us meet this use case?
Labels:
- Apache Hadoop
- Apache Hive
- Apache YARN
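For the top-10 question above, one possible approach is sketched here in Python. It assumes the ResourceManager response from `/ws/v1/cluster/apps` has already been loaded as a dict, and that Hive is running on MapReduce so that the job configuration returned by the History Server (`/ws/v1/history/mapreduce/jobs/{jobid}/conf`) carries a `hive.query.string` property. That second assumption may not hold for Hive on Tez or other engines. The function names `top_apps`, `job_id_for`, and `hive_query_from_conf` are hypothetical helpers, not part of any Hadoop library.

```python
from typing import Any, Dict, List, Optional


def top_apps(apps_response: Dict[str, Any], n: int = 10,
             key: str = "memorySeconds") -> List[Dict[str, Any]]:
    """Return the n applications with the largest value of `key`.

    `apps_response` is the parsed JSON from the RM endpoint
    /ws/v1/cluster/apps. "memorySeconds" and "vcoreSeconds" are the
    aggregate resource fields reported per application.
    """
    # The RM returns {"apps": null} when nothing matches, so guard for None.
    apps = (apps_response.get("apps") or {}).get("app") or []
    return sorted(apps, key=lambda a: a.get(key, 0), reverse=True)[:n]


def job_id_for(app_id: str) -> str:
    """Map a YARN application ID to the matching MapReduce job ID."""
    return app_id.replace("application_", "job_", 1)


def hive_query_from_conf(conf_response: Dict[str, Any]) -> Optional[str]:
    """Extract the Hive query text from a parsed job-conf response.

    Assumption: the job was submitted by Hive on MapReduce, which stores
    the query under the property name "hive.query.string".
    """
    props = (conf_response.get("conf") or {}).get("property") or []
    for prop in props:
        if prop.get("name") == "hive.query.string":
            return prop.get("value")
    return None
```

A caller would rank the applications with `top_apps`, convert each `id` with `job_id_for`, fetch that job's configuration from the History Server, and pass the parsed JSON to `hive_query_from_conf`. Ranking by `vcoreSeconds` instead only requires passing `key="vcoreSeconds"`.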
11-30-2018
02:16 PM
Hello all, I am trying to create a script that pulls all the ResourceManager / History Server data via the API for a 24-hour period. I want the output in JSON format so that I can later parse and persist it for trend analysis. Any idea how I can proceed? Any pointers would be very helpful. Thanks.
Labels:
- Apache YARN
- Cloudera Manager
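A minimal sketch of such a script, in Python with only the standard library, could look like the following. It relies on the `startedTimeBegin` and `startedTimeEnd` query parameters (epoch milliseconds) that both the ResourceManager apps endpoint and the MapReduce History Server jobs endpoint accept. The host names in the usage note, the helper names `windowed_url` and `fetch_json`, and the assumption of an unsecured HTTP endpoint (no Kerberos/SPNEGO) are illustrative, not taken from the post.

```python
import json
import time
import urllib.parse
import urllib.request
from typing import Any, Optional

DAY_MS = 24 * 60 * 60 * 1000

# REST paths for the two services.
RM_APPS_PATH = "/ws/v1/cluster/apps"
JHS_JOBS_PATH = "/ws/v1/history/mapreduce/jobs"


def windowed_url(base: str, path: str, end_ms: Optional[int] = None,
                 window_ms: int = DAY_MS) -> str:
    """Build a URL selecting items started in the window ending at end_ms."""
    if end_ms is None:
        end_ms = int(time.time() * 1000)  # now, in epoch milliseconds
    params = {
        "startedTimeBegin": end_ms - window_ms,
        "startedTimeEnd": end_ms,
    }
    return base.rstrip("/") + path + "?" + urllib.parse.urlencode(params)


def fetch_json(url: str, timeout: float = 30.0) -> Any:
    """GET the URL and return the parsed JSON body.

    Assumes the endpoint is reachable without authentication.
    """
    req = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


def save_snapshot(data: Any, out_path: str) -> None:
    """Persist one day's response as a JSON file for later trend analysis."""
    with open(out_path, "w", encoding="utf-8") as fh:
        json.dump(data, fh, indent=2)
```

For example, `save_snapshot(fetch_json(windowed_url("http://rm-host:8088", RM_APPS_PATH)), "rm_apps.json")` would store the last 24 hours of applications, where `rm-host:8088` stands in for the actual ResourceManager address. The same call with `JHS_JOBS_PATH` and the History Server address would cover MapReduce jobs. Scheduling it once a day (for instance from cron) yields one file per day to parse later.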