Migrate yarn logs from CDH 6.3.2 to CDP 7.1.6


Hey everyone.

I'm in the process of migrating data from a CDH 6.3.2 cluster to CDP 7.1.6.

I'm using distcp to migrate the data, because an in-place upgrade from CDH 6.3.2 to CDP 7.1.6 does not seem to be possible (according to the Cloudera documentation: https://docs.cloudera.com/cdp-private-cloud/latest/upgrade/topics/cdpdc-upgrade-paths.html).

I have migrated all the resources I need from HDFS, but now I would also like to migrate the YARN application logs (the ones we can access in the ResourceManager UI or with the yarn logs command).

I tried using distcp to copy the logs from the old cluster into the new cluster's log directory, which is configured in yarn-site.xml. However, the migrated logs appear neither in the ResourceManager UI nor in the output of yarn logs -applicationId <app-id>. Is there any way to make the logs from the old cluster available in the new cluster and accessible via the ResourceManager UI or the yarn logs command?

Thanks in advance.

Cheers!

1 ACCEPTED SOLUTION


Hi @apedroso,

Thank you for starting this thread.

In this reply, I will focus on how the YARN ResourceManager (RM) stores data about historical applications, which can be accessed via the RM Web UI.

The RM keeps data about applications in its state store [1]. The state store can be LeveldbRMStateStore, FileSystemRMStateStore or ZKRMStateStore.

We recommend ZKRMStateStore, which is also what we use in YARN HA, because it is the more robust implementation. For example, in an RM HA setup you can migrate the standby RM while the active RM is still running, and the state store stays intact.

Because the RM Web UI reads its data from the state store, what it shows does not depend on whether the YARN application logs are present, so copying log files alone does not add the old applications to the UI.
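
To check which state store and which log directory a cluster is configured with, the two settings can be read from yarn-site.xml. The following is only a minimal sketch in Python: the XML content and the helper name read_properties are made up for illustration, and it assumes the log directory mentioned in the question is the one set by yarn.nodemanager.remote-app-log-dir. Your real file and values will differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical yarn-site.xml content; the values are placeholders,
# not recommendations for any particular cluster.
SAMPLE_YARN_SITE = """
<configuration>
  <property>
    <name>yarn.resourcemanager.store.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
  </property>
  <property>
    <name>yarn.nodemanager.remote-app-log-dir</name>
    <value>/tmp/logs</value>
  </property>
</configuration>
"""


def read_properties(xml_text):
    """Return a dict of property name -> value from Hadoop-style XML."""
    root = ET.fromstring(xml_text.strip())
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


props = read_properties(SAMPLE_YARN_SITE)
# The state store backs the application list in the RM Web UI.
print("state store:", props.get("yarn.resourcemanager.store.class"))
# The aggregated application logs live in a separate location.
print("log dir:", props.get("yarn.nodemanager.remote-app-log-dir"))
```

The two properties are independent, which matches the point above: changing or filling the log directory does not touch the state store that the RM Web UI reads.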

 

What are your exact migration steps? Are you upgrading your existing cluster to CDP, or do you need to move services to a new cluster?

[1] Please read the section on "yarn.resourcemanager.store.class" in https://hadoop.apache.org/docs/r3.1.2/hadoop-yarn/hadoop-yarn-site/ResourceManagerRestart.html


Ferenc Erdelyi, Technical Solutions Manager

