About JamesMillere

JamesMillere · ‎08-18-2023

Yes, Cloudera's management tools, especially Cloudera Manager, provide insights and metrics about jobs running on Hadoop, including both MapReduce and Spark jobs.

For job completion time estimation:

Cloudera Manager: Within the Cloudera Manager interface, you can navigate to the specific service (such as YARN or Spark) to view details about running or completed jobs. For each job, there's an estimated time of completion based on the progress and resources available. However, these estimates can vary with data skew, resource contention, and other factors.

Resource Manager UI: For YARN-based jobs, the YARN Resource Manager UI provides information about running applications, including their progress. The percentage completion might give a rough idea, but it doesn't directly estimate the completion time.

Spark UI: For Spark jobs, the Spark UI provides insights into job stages, tasks, and their durations. While it doesn't give a direct "time remaining" estimate, you can use the durations of completed stages and tasks to infer how long the remaining ones might take.

That being said, while these tools provide some insight, predicting the exact completion time of a distributed computing job is difficult because of the dynamic nature of distributed resources, data imbalances, and similar factors.

To get more accurate estimates, it's recommended to:

Monitor Resource Usage: Ensure you have enough resources (memory, CPU, etc.) for your jobs.

Optimize Your Jobs: Depending on the nature of your job, consider optimizing your code or the configuration.

Historical Data: Look at the historical runtimes of similar jobs to get a ballpark figure for future runs.

I hope this answers your query. If you have any more questions or need further insights, please let us know.

Resource: Cloudera
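The progress percentage mentioned above can be turned into a rough remaining-time figure by linear extrapolation. The following is only a minimal sketch in Python, not something Cloudera Manager or YARN provides: it assumes the job progresses at a constant rate (which data skew and resource contention will break), and it assumes an unsecured cluster where the ResourceManager REST endpoint /ws/v1/cluster/apps/{appid} is reachable and returns the `progress` (percent) and `elapsedTime` (milliseconds) fields. The host name and the function names are hypothetical.

```python
import json
import urllib.request
from typing import Optional


def estimate_remaining_ms(elapsed_ms: float, progress_pct: float) -> Optional[float]:
    """Linearly extrapolate remaining time from elapsed time and percent progress.

    Returns None when progress is 0, since there is nothing to extrapolate from.
    """
    if progress_pct <= 0:
        return None
    if progress_pct >= 100:
        return 0.0
    # Assumes a constant rate: projected total = elapsed / fraction complete.
    total_ms = elapsed_ms * 100.0 / progress_pct
    return total_ms - elapsed_ms


def fetch_app(rm_url: str, app_id: str) -> dict:
    """Fetch one application's record from the ResourceManager REST API.

    rm_url is a placeholder such as "http://rm-host:8088" (hypothetical host).
    A Kerberos-secured cluster would need authentication that is not shown here.
    """
    with urllib.request.urlopen(f"{rm_url}/ws/v1/cluster/apps/{app_id}") as resp:
        return json.load(resp)["app"]


if __name__ == "__main__":
    # Offline example: 10 minutes elapsed at 40% progress.
    print(estimate_remaining_ms(600_000, 40.0))
```

With a live cluster, the two pieces would be combined as `app = fetch_app(rm_url, app_id)` followed by `estimate_remaining_ms(app["elapsedTime"], app["progress"])`. Treat the result as a ballpark only, for the reasons given in the post.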

JamesMillere · ‎08-18-2023

Hi, if you're using Apache NiFi and the token you're trying to capture with the InvokeHTTP processor is too large to be stored as an attribute, you can work around this limitation as follows:

Keep the token in the content of the FlowFile if it's returned by the InvokeHTTP processor.

Use a processor like ReplaceText to wrap the token in the header format you need. For instance, if the header must be Authorization: Bearer {token}, configure a ReplaceText processor to rewrite the content (i.e., the token) into this format.
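To make the rewrite concrete, here is a small Python sketch of the transformation that ReplaceText would be asked to perform on the FlowFile content. It is an illustration only, not NiFi code: the function name is hypothetical, and it assumes the content holds nothing but the raw token (possibly with surrounding whitespace). In NiFi, the rough equivalent would be a Search Value that captures the whole content, such as (?s)(^.*$), with a Replacement Value of Authorization: Bearer $1; check those property values against your NiFi version.

```python
import re

# Capture the entire content as one group, including any embedded newlines.
WHOLE_CONTENT = re.compile(r"(?s)(^.*$)")


def wrap_token(content: str) -> str:
    """Rewrite raw token content into the 'Authorization: Bearer {token}' format."""
    # Assumption: surrounding whitespace (e.g. a trailing newline) is not part of the token.
    token = content.strip()
    # A function replacement avoids misreading backslashes in the token as escapes.
    return WHOLE_CONTENT.sub(
        lambda m: f"Authorization: Bearer {m.group(1)}", token, count=1
    )


if __name__ == "__main__":
    print(wrap_token("abc123\n"))
```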

JamesMillere · ‎08-18-2023

Hi, I agree with the solution. In addition, run tail -f /var/log/ambari-server/ambari-server.log to watch the logs while you try to start the service from the UI. This will give you real-time feedback.

Member Since	‎08-18-2023 02:25 AM
Last Visited	‎08-23-2023 03:28 AM
Posts	4
Kudos received	1

Cloudera Community

Re: Use InvokeHTTP response body in another Invoke...

Re: Is Cloudera have estimation time for jobs comp...

Re: Use InvokeHTTP response body in another Invoke...

Re: Memory corruption error while starting any ser...