New Contributor
Posts: 2
Registered: ‎12-08-2017
Run a Spark job on Altus on Azure with jars on Azure FS

Hello,

 

I am trying to run a Spark job on Altus on Azure with the jars stored on Azure FS.

The Altus CLI command looks like this:

 

altus dataeng submit-jobs --cluster-name lbourgeois-rd2 \
--jobs '{"name": "LOCAL_PROJECT_altus_azure_sp_0_1_20171222111323", "sparkJob": {
"jars": ["wasb://lbourgeoisstoragecontainer@lbourgeoisstorageaccount.blob.core.windows.net//remoteFolder/jar/altus_azure_sp_0_1.jar",
"wasb://lbourgeoisstoragecontainer@lbourgeoisstorageaccount.blob.core.windows.net//remoteFolder/libjars/talend-bigdata-launcher-1.2.2-20171206.jar",
"wasb://lbourgeoisstoragecontainer@lbourgeoisstorageaccount.blob.core.windows.net//remoteFolder/libjars/hadoop-azure-2.7.4.jar",
"wasb://lbourgeoisstoragecontainer@lbourgeoisstorageaccount.blob.core.windows.net//remoteFolder/libjars/dom4j-1.6.1.jar",
"wasb://lbourgeoisstoragecontainer@lbourgeoisstorageaccount.blob.core.windows.net//remoteFolder/libjars/log4j-1.2.16.jar",
"wasb://lbourgeoisstoragecontainer@lbourgeoisstorageaccount.blob.core.windows.net//remoteFolder/libjars/azure-storage-2.2.0.jar",
"wasb://lbourgeoisstoragecontainer@lbourgeoisstorageaccount.blob.core.windows.net//remoteFolder/libjars/antlr-runtime-3.5.2.jar",
"wasb://lbourgeoisstoragecontainer@lbourgeoisstorageaccount.blob.core.windows.net//remoteFolder/libjars/talend-mapred-lib.jar",
"wasb://lbourgeoisstoragecontainer@lbourgeoisstorageaccount.blob.core.windows.net//remoteFolder/libjars/org.talend.dataquality.parser.jar",
"wasb://lbourgeoisstoragecontainer@lbourgeoisstorageaccount.blob.core.windows.net//remoteFolder/libjars/jetty-util-6.1.26.cloudera.2.jar",
"wasb://lbourgeoisstoragecontainer@lbourgeoisstorageaccount.blob.core.windows.net//remoteFolder/libjars/talend_file_enhanced_20070724.jar",
"wasb://lbourgeoisstoragecontainer@lbourgeoisstorageaccount.blob.core.windows.net//remoteFolder/libjars/routines-6.2.0.jar",
"wasb://lbourgeoisstoragecontainer@lbourgeoisstorageaccount.blob.core.windows.net//remoteFolder/libjars/altus_azure_sp_0_1.jar"],
"applicationArguments": ["--context=Default", "-calledByAltus"],
"sparkArguments": "--conf spark.kryo.registrator=local_project.altus_azure_sp_0_1.altus_azure_sp$TalendKryoRegistrator --conf spark.serializer=org.apache.spark.serializer.KryoSerializer --conf spark.app.name=LOCAL_PROJECT_altus_azure_sp_0.1 --conf spark.yarn.submit.waitAppCompletion=true",
"mainClass": "local_project.altus_azure_sp_0_1.altus_azure_sp"}}'

 

The jars are on Azure FS.

The job is reported as FAILED (INTERNAL_ERROR) in the Altus console, and I can't see the YARN application in the YARN ResourceManager web UI or any error in the logs.

 

Is it possible to keep the jars on Azure FS as I did?

 

Regards,

 

Laurent

Cloudera Employee
Posts: 1
Registered: ‎01-14-2016
Answered

Hi Laurent,

 

Cloudera Altus does not support Azure FS (wasb://). You can use Azure Data Lake Store (ADLS) instead to store the jar files and data; the URIs will then have the adl:// prefix. Refer to the Altus (Azure) documentation, accessible through the Altus console, for details on how to configure ADLS so that Altus can access it.
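
As a rough illustration of the URI change only, the sketch below rewrites a wasb:// jar URI from the question into an adl:// URI. The account name myadlsaccount and the helper name wasb_to_adl are hypothetical, and the sketch assumes the usual adl://<account>.azuredatalakestore.net/<path> form and that the jars keep the same folder layout. It does not copy any files; the jars still have to be uploaded to ADLS separately.

```shell
# Hypothetical ADLS account name (an assumption); replace with your own.
ADLS_ACCOUNT="myadlsaccount"

# Rewrite a wasb:// (or wasbs://) URI to an adl:// URI, keeping the path.
# The container@account authority and any repeated leading slashes are dropped.
wasb_to_adl() {
  printf '%s\n' "$1" |
    sed -E "s#^wasbs?://[^/]+/+#adl://${ADLS_ACCOUNT}.azuredatalakestore.net/#"
}

wasb_to_adl "wasb://lbourgeoisstoragecontainer@lbourgeoisstorageaccount.blob.core.windows.net//remoteFolder/jar/altus_azure_sp_0_1.jar"
```

With the hypothetical account name above, this prints adl://myadlsaccount.azuredatalakestore.net/remoteFolder/jar/altus_azure_sp_0_1.jar, which is the kind of value that would go in the "jars" list of the --jobs JSON.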

 

Thanks,

Tony Wu
