Created on 06-03-2015 02:35 PM - edited 09-16-2022 02:30 AM
Hi everyone,
every time new data arrives and updates occur in our cluster, an undesirable file is created in each worker's directory. To clean these up automatically, I changed the value of Spark (Standalone) Client Advanced Configuration Snippet (Safety Valve) for spark-conf/spark-env.sh under Gateway Default Group->Advanced Settings to:
export SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true -Dspark.worker.cleanup.interval=60 -Dspark.worker.cleanup.appDataTtl=60"
using Cloudera Manager. After I restart the cluster, the change shows up in spark/conf/spark-env.sh, but no cleanup happens. Does anyone know where the mistake is, or another way to clean up automatically?
I am using CDH 4 and Spark 1.2.2 in the cluster.
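For what it's worth, this is a minimal sketch of how I check that the snippet actually reached a node's spark-env.sh (assuming the default CDH client configuration path; yours may differ):
# Confirm Cloudera Manager deployed the worker cleanup options to the client config
grep SPARK_WORKER_OPTS /etc/spark/conf/spark-env.sh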
Created 06-16-2015 08:03 PM
Specifying worker options in the client does not really make sense. The worker needs to know what to clean up, so the setting belongs on the worker.
Try adding the whole string (the part between the quotes) to the "Additional Worker args" for the worker.
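As a minimal sketch, the value to paste into "Additional Worker args" would be (assuming that field passes the options straight through to the worker JVM):
-Dspark.worker.cleanup.enabled=true -Dspark.worker.cleanup.interval=60 -Dspark.worker.cleanup.appDataTtl=60
Note that spark.worker.cleanup.interval and spark.worker.cleanup.appDataTtl are both in seconds (defaults 1800 and 604800), so 60 is a very aggressive setting.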
Wilfred
Created 11-30-2016 04:46 AM
Hi Team,
I am not able to find these properties in CM 5.8.2. Could you please let me know where I can see or add these properties in CM?
Regards,
Umesh
Created 11-30-2016 09:56 PM
Umesh,
Those settings are for Spark standalone clusters only. I would strongly advise you not to run standalone but to use Spark on YARN.
When you use YARN the problem does not exist, as YARN handles the cleanup for you.
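As a rough sketch, running the same application on YARN instead of standalone looks like this (the class and jar names here are placeholders):
# Submit to YARN; executor working directories are then managed and removed by the NodeManagers
spark-submit --master yarn --deploy-mode cluster --class com.example.MyApp /path/to/my-app.jar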
Wilfred
Created on 12-01-2016 04:50 AM - edited 12-01-2016 04:59 AM
Hi wilfred,
these are the kinds of directories being generated in /tmp for us:
1)
spark-cefa5a3d-bce2-45d2-9490-1ee19b9ac8b8
spark-d0806158-ece7-4d80-896b-a815e2e18e8a
spark-d0813ff3-9f4c-4fd8-8208-8596469e805e
spark-d1e55364-7207-4203-a583-df1face35096
spark-d26618a3-ba93-4d91-a5ea-b2d873735f97
spark-d49288de-a99d-4ede-af6f-fbf0a276c4e7
spark-d81273b6-f5da-4eed-a42e-913d414018cb
spark-df75486f-1f04-4838-bc07-4196172c42c8
spark-dfbbabf5-034e-47d5-9246-3edac5742558
spark-dfd16e79-6a67-4a7a-89dc-98d8c9f83df3
spark-e0783ace-5897-46a9-a073-4e4431a521f0
spark-e1429fea-160e-4d37-a349-053553a197a5
2)
f23f0701-76d3-4449-834e-d9ce33a009c3_resources
f268f04c-38cb-4b0a-9382-3c0fa5df0486_resources
f2a04084-dd40-4bd7-9243-25ce110fe10d_resources
f2bf7f23-eb3d-490f-a55f-be4984ab6858_resources
f2c0813b-adf2-400c-9f32-4b1e8619a5ed_resources
f2cee78a-b8cc-4d24-b10e-7955064e5a94_resources
f2ee0bdf-72a2-40d1-ae97-99036b81dd3d_resources
f2f60b55-c602-4fd9-a1f9-57973abf13c3_resources
f304c39b-8801-4d82-bc83-748e0a5720a9_resources
f310055d-f431-4749-8366-d708ca1eedd8_resources
f37d51bb-14d4-4ea7-9f8b-cdb5ca4d83cc_resources
f3a60213-8d7c-4183-be97-ee5a02bb165d_resources
regards,
Umesh
Created 12-01-2016 05:56 AM
Those directories should be cleaned up in current releases via SPARK-7503 and/or SPARK-7705. Both fixes are specific to YARN-based setups, and the cleanup should still happen automatically.
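If stale entries still accumulate in the meantime, a minimal interim sketch of a manual sweep would be the following (assuming directories in /tmp matching these patterns and untouched for over a week are safe to remove on your nodes; adjust the age to your job runtimes):
# Remove spark-* and *_resources directories in /tmp not modified for more than 7 days
find /tmp -maxdepth 1 -type d \( -name 'spark-*' -o -name '*_resources' \) -mtime +7 -exec rm -rf {} +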
Wilfred
Created 12-01-2016 06:01 AM
Hi Wilfred,
Thanks for your reply, but we are using Spark 1.6.0 on CDH 5.8.2 and we are still seeing that these directories are not removed automatically.
Could you please confirm whether this is a known bug in CDH 5.8.2 with Spark 1.6.0?
Regards,
Umesh
Created 12-01-2016 06:24 AM
No, this is not a known issue as far as I know. If you have a support contract, please open a support case and we can look into it further for you.
Wilfred