Member since: 03-31-2017
Posts: 57
Kudos Received: 1
Solutions: 0
07-05-2018
09:51 AM
Hi @Felix Albani, thanks.
06-28-2018
10:20 AM
Hi,
I want to fetch stock exchange data from the Alpha Vantage API using Spark Streaming.
I am using the API below, which returns data in JSON format: https://www.alphavantage.co/query?function=TIME_SERIES_INTRADAY&symbol=TCS&interval=1min&apikey=apikey
How can I fetch a continuous stream of stock exchange data using the Spark Streaming Java API?
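There is no built-in Alpha Vantage source in Spark, so one possible approach (a minimal, untested sketch rather than a verified solution) is a custom receiver that polls the REST endpoint and hands each JSON response to Spark Streaming. The class names here are placeholders, YOUR_API_KEY must be filled in, the spark-streaming dependency is assumed to be on the classpath, and the JSON parsing and sink logic are left out:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

import org.apache.spark.SparkConf;
import org.apache.spark.storage.StorageLevel;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaReceiverInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.receiver.Receiver;

public class AlphaVantageStreaming {

    // Custom receiver: polls the REST endpoint once per minute and stores the raw JSON response.
    public static class AlphaVantageReceiver extends Receiver<String> {
        private final String endpoint;

        public AlphaVantageReceiver(String endpoint) {
            super(StorageLevel.MEMORY_AND_DISK());
            this.endpoint = endpoint;
        }

        @Override
        public void onStart() {
            new Thread(this::poll).start();
        }

        @Override
        public void onStop() {
            // the polling loop checks isStopped(), nothing else to clean up
        }

        private void poll() {
            while (!isStopped()) {
                try {
                    HttpURLConnection conn = (HttpURLConnection) new URL(endpoint).openConnection();
                    StringBuilder body = new StringBuilder();
                    try (BufferedReader reader =
                             new BufferedReader(new InputStreamReader(conn.getInputStream()))) {
                        String line;
                        while ((line = reader.readLine()) != null) {
                            body.append(line);
                        }
                    }
                    store(body.toString());   // hand the JSON payload to Spark Streaming
                    Thread.sleep(60_000);     // the endpoint serves 1min bars, so poll once a minute
                } catch (Exception e) {
                    restart("Error polling Alpha Vantage", e);
                    return;                   // let the framework restart the receiver
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        SparkConf conf = new SparkConf().setAppName("AlphaVantageStreaming");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(60));

        String url = "https://www.alphavantage.co/query?function=TIME_SERIES_INTRADAY"
                   + "&symbol=TCS&interval=1min&apikey=YOUR_API_KEY";
        JavaReceiverInputDStream<String> quotes = jssc.receiverStream(new AlphaVantageReceiver(url));

        quotes.print();   // replace with JSON parsing and whatever sink you need
        jssc.start();
        jssc.awaitTermination();
    }
}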
Labels:
- Apache Spark
06-14-2018
07:44 AM
Hi @Felix Albani, I set the driver memory to 20 GB. I tried the spark-submit parameters below:

./bin/spark-submit --driver-memory 20g --executor-cores 3 --num-executors 20 --executor-memory 2g --conf spark.yarn.executor.memoryOverhead=1024 --conf spark.yarn.driver.memoryOverhead=1024 --class org.apache.TransformationOper --master yarn-cluster /home/hdfs/priyal/spark/TransformationOper.jar

The cluster configuration is 1 master node (r3.xlarge) and 1 worker node (r3.xlarge): 4 vCPUs, 30 GB memory, 40 GB storage each. I am still getting the same issue: the Spark job stays in the RUNNING state and YARN memory is 95% used.
06-13-2018
01:39 PM
Hi @Vinicius Higa Murakami, @Felix Albani, I have set spark.yarn.driver.memoryOverhead=1 GB, spark.yarn.executor.memoryOverhead=1 GB, and spark_driver_memory=12 GB. I have set the storage level to MEMORY_AND_DISK_SER(). The Hadoop cluster configuration is 1 master node (r3.xlarge) and 1 worker node (m4.xlarge).
Here is the spark-submit command:

./bin/spark-submit --driver-memory 12g --executor-cores 2 --num-executors 3 --executor-memory 3g --class org.apache.TransformationOper --master yarn-cluster /spark/TransformationOper.jar

The Spark job entered the RUNNING state, but it has been executing for the last hour and has still not completed.
06-11-2018
07:07 AM
Hi @Vinicius Higa Murakami, I want to process a 4 GB file, so I configured the executor memory to 10 GB and the number of executors to 10 in the spark-env.sh file. Here are the spark-submit parameters:

./bin/spark-submit --class org.apache.TransformationOper --master local[2] /root/spark/TransformationOper.jar /Input/error.log

I also tried to set the configuration manually using the spark-submit parameters below:

./bin/spark-submit --driver-memory 5g --num-executors 10 --executor-memory 10g --class org.apache.TransformationOper --master local[2] /root/spark/TransformationOper.jar

I tried setting the master to yarn-cluster as well and still got the OutOfMemoryError.
06-08-2018
11:48 AM
@Jay Kumar SenSharma, thanks.
06-08-2018
11:17 AM
Hi, I have created HDP 2.6 on AWS with 1 master node and 4 worker nodes. I am using Ambari as the cluster management tool.
I have configured the spark-env.sh file on the master node, and now I want to apply those settings to all worker nodes in the cluster. How do I refresh the cluster configuration so that the latest configs are reflected on all nodes?
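If Spark is managed by Ambari, the usual approach is to change the settings under Spark > Configs in the Ambari UI, which pushes them to every host and prompts for the required restarts. For a hand-edited spark-env.sh outside Ambari, a rough, untested sketch of pushing the file to the workers is below; the worker hostnames are placeholders and passwordless SSH is assumed:

# copy the locally edited file to each worker's Spark conf directory
for host in worker1 worker2 worker3 worker4; do
  scp /usr/hdp/current/spark-client/conf/spark-env.sh "$host":/usr/hdp/current/spark-client/conf/
done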
Tags:
- aws
- hadoop
- Hadoop Core
Labels:
- Apache Hadoop
06-08-2018
11:11 AM
Hi, I have created HDP 2.6 on AWS with a master node (m4.2xlarge) and 4 worker nodes (m4.xlarge).
I want to process a 4 GB log file with a Spark job, but I get the error below while executing it:

Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure:
Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:3236)

I have configured the spark-env.sh file on the master node with:

SPARK_EXECUTOR_MEMORY="5G"
SPARK_DRIVER_MEMORY="5G"

but it throws the same error. I also configured the worker nodes with those settings and increased the Java heap size for the Hadoop client, ResourceManager, NodeManager, and YARN, yet the Spark job is still aborted. Thanks,
Labels:
- Apache Spark
04-17-2018
07:58 AM
Hi @Marcos Jimenez Rodriguez, have you tried executing the Oozie job as the admin user from Ambari? Check that the admin user has write permissions:

hdfs dfs -chown -R admin:hadoop /user/admin
04-05-2018
09:22 AM
@Aditya Sirna I want to post Pig relation output to an external service using curl. How do I pass Pig relation values into the curl command below?

curl -X POST http://xxx.xx.xxx.xx/services//api/data -H "accept: application/json" -H "authorization: authorization-token value" -H "cache-control: no-cache" -H "content-type: multipart/form-data;boundary=----xxxxxxxxxxxxxxxxx" -H "postman-token:postman-token value" -F "title=the value which I want to fetch from the Pig relation" -F "description=description"

I want to fetch the value for title from the Pig relation.
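Pig does not call curl natively, so one workaround (a rough, untested sketch) is to STORE the relation to HDFS first, for example with STORE G INTO '/tmp/pig_out' USING PigStorage(',');, and then read each record back in a shell loop and POST it. The /tmp/pig_out path, the comma delimiter, and the single title field are assumptions; the URL and header values are the placeholders from the command above:

# read each stored record and POST the first field as "title"
hdfs dfs -cat /tmp/pig_out/part-* | while IFS=',' read -r title rest; do
  curl -X POST "http://xxx.xx.xxx.xx/services//api/data" \
    -H "accept: application/json" \
    -H "authorization: authorization-token value" \
    -F "title=${title}" \
    -F "description=description"
done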
04-03-2018
09:28 AM
Hi, I want to post data to an external service using a curl command from Pig. Is it possible to run a curl command from Pig?
Labels:
- Apache Pig
03-31-2018
06:04 AM
@Rahul Soni Hi, I already tried the script you sent, but it inserts NULL values; I mentioned that in my previous comment. The column datatypes in Pig are (id:int, name:chararray, salary:float) and in MSSQL are (id int, name varchar, salary float). I tried different datatypes as well, but it still inserts only NULL values. I am not able to fetch the values from the Pig relation G that I mentioned in my question.
03-30-2018
01:19 PM
@schhabra Hi Shubham, I am able to DUMP the data successfully, but I get an error with the STORE function. I tried the script below:

STORE G INTO 'emp' USING org.apache.pig.piggybank.storage.DBStorage('com.microsoft.sqlserver.jdbc.SQLServerDriver', 'jdbc:sqlserver://xxx.x.xx.xx:1433;databaseName=test', 'username', 'password', 'INSERT INTO emp (id,name,email) VALUES (?,?,?)');

It stores the exact row count of my input file in MSSQL, but with NULL values, so the problem seems to be fetching the values from the relation in Pig. I also tried the script below, but it did not work:

STORE G INTO 'emp' USING org.apache.pig.piggybank.storage.DBStorage('com.microsoft.sqlserver.jdbc.SQLServerDriver','jdbc:sqlserver://xxx.x.xx.xx:1433;databaseName=test','username','password','INSERT INTO emp (id,name,email) VALUES (G.id,G.name,G.email)');
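For reference, a hedged, untested sketch of how this usually has to look with DBStorage: the SQL must keep the ? placeholders, because DBStorage binds values through a JDBC prepared statement, and the fields need real types by the time they reach STORE, which is easiest to guarantee by declaring the schema on LOAD (one common cause of all-NULL inserts is fields still being bytearrays). Paths, connection string, and credentials below are the placeholders from the original script:

REGISTER /usr/hdp/2.5.0.0-1245/pig/lib/piggybank.jar;
REGISTER /usr/hdp/2.5.0.0-1245/pig/lib/sqljdbc41.jar;

-- declare types on LOAD so id/name/email are not left as bytearrays
A = LOAD '/user/Employee.csv' USING PigStorage(',')
    AS (id:int, name:chararray, email:chararray);
G = FOREACH A GENERATE id, name, email;

-- DBStorage fills the ? placeholders from the tuple fields, in order
STORE G INTO 'emp' USING org.apache.pig.piggybank.storage.DBStorage(
    'com.microsoft.sqlserver.jdbc.SQLServerDriver',
    'jdbc:sqlserver://xxx.x.xx.xx:1433;databaseName=test',
    'username', 'password',
    'INSERT INTO emp (id, name, email) VALUES (?, ?, ?)');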
03-29-2018
06:22 AM
Hi, I want to store Pig output to an MSSQL server using DBStorage. I tried the script below:

REGISTER /usr/hdp/2.5.0.0-1245/pig/lib/piggybank.jar;
REGISTER /usr/hdp/2.5.0.0-1245/pig/lib/sqljdbc41.jar;
A= LOAD '/user/Employee.csv' USING PigStorage(',') ;
G = FOREACH A GENERATE $0 as id:int,$1 as name:chararray,$2 as email:chararray;
STORE G INTO 'emp' USING org.apache.pig.piggybank.storage.DBStorage('com.microsoft.sqlserver.jdbc.SQLServerDriver', 'jdbc:sqlserver://xxx.x.xx.xx:1433;databaseName=test', 'username', 'password', 'INSERT INTO emp (id,name,email) VALUES (G.id,G.name,G.email)');

But it throws the following error:

Error: Failure while running task:org.apache.pig.backend.executionengine.ExecException: ERROR 2135: Received error from store function.java.lang.RuntimeException: JDBC error
at org.apache.pig.backend.hadoop.executionengine.tez.plan.operator.POStoreTez.getNextTuple(POStoreTez.java:148)
at org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.runPipeline(PigProcessor.java:376)
at org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.run(PigProcessor.java:241)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)

Input(s): Failed to read data from "/user/Employee.csv"
Output(s): Failed to produce result in "hdfs://xxxxxxxxxx:8020/user/root/emp"
Labels:
- Apache Pig
03-23-2018
09:50 AM
@Rahul Soni, hi, I edited the comment. Please check it.
03-23-2018
05:41 AM
@Rahul Soni, thanks. It was actually a typo: I edited my question and found that I had forgotten to close ' ')) '. I want to fetch the following values: [/aLog/transaction], POST, [application/vnd.app.v1+json || application/json]
I tried the script below:

extract = FOREACH matched GENERATE FLATTEN(REGEX_EXTRACT_ALL(logmessage,'^(\\S+)\\s+"(\\{(\\S+),.*=(.*),.*=(.*)\\})"+\\s+(\\S+)\\s+(\\S+)\\s+(\\S+)\\s+(\\S+)\\s+(\\S+)\\s+(\\S+).*
(t1:chararray,t2:chararray,t3:chararray,t4:chararray,url:chararray,type:chararray,produces:chararray,t5:chararray,t6:chararray,classes:chararray,throw:chararray,exception:chararray);

Output:

(Mapped,{[/auditConfirmation/businessDates],methods=[GET],produces=[application/vnd.app.v1+json || application/json]},[/auditConfirmation/businessDates],[GET],[application/vnd.app.v1+json || application/json],onto,public,java.lang.String,com.fhlb.controllers.rest.auditconfirmation.AuditConfirmationRestService.getCloseOFBusinessDates(java.lang.String),throws,com.fhlb.commons.CustomException)

I fetched the output I want, but I am getting one extra captured group. Could you help me with a regex that extracts only the expected output? I want to remove "{[/auditConfirmation/businessDates],methods=[GET],produces=[application/vnd.app.v1+json || application/json]}" from the output. I got the expected output using the script below:

output = FOREACH extract GENERATE $4 as url,$5 as requesttype,$6 as produces;
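One hedged alternative (untested against the full log) is to capture only the three bracketed values instead of the whole "{...}" group, so no extra group appears and the second FOREACH becomes unnecessary. REGEX_EXTRACT_ALL has to match the entire line, hence the leading and trailing .*; the relation and field names are the ones from the script above:

extract = FOREACH matched GENERATE FLATTEN(
    REGEX_EXTRACT_ALL(logmessage,
        '.*\\{\\[([^\\]]+)\\],methods=\\[([^\\]]+)\\],produces=\\[([^\\]]+)\\]\\}.*'))
    AS (url:chararray, requesttype:chararray, produces:chararray);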
03-22-2018
01:29 PM
Hi, I want to fetch the url, methods, and class from the line below using a Pig script:

Mapped "{[/aLog/transaction],methods=[POST],produces=[application/vnd.app.v1+json || application/json]}" onto public org.springframework.http.ResponseEntity<java.lang.Object> com.fhlb.user.controller.rest.ALogService.aTransactionDetails(com.fhlb.user.beans.TansactionReportRequest,javax.servlet.http.HttpServletRequest) throws com.fhlb.commons.CustomException,java.io.FileNotFoundException

Here is my Pig script:

extract = FOREACH logs_entry GENERATE FLATTEN(REGEX_EXTRACT_ALL(logmessage,'^(Mapped)\\"(\\{+(\\[+([^/].*)+\\]),methods=(\\[+([A-Z].*)+\\]),produces=(\\[+([^ ].*)+\\])+\\}\\)"\\s+(\\S+)\\s+(\\S+)\\s+(\\S+)\\s+(throws)\\s+(.*)
AS (t1:chararray,url:chararray,type:chararray,produces:chararray,t2:chararray,t3:chararray,classes:chararray,throw:chararray,exception:chararray);

But I got the error below:

ERROR 1200: <line 9, column 229> mismatched input 'AS' expecting RIGHT_PAREN

I am not good at regex, so please help me find the solution. Thanks,
Labels:
- Apache Pig
03-15-2018
09:56 AM
Hi, I want to create lineage in Atlas for a flow that reads data from AWS S3, processes it with a Pig script, and stores the processed data into a Hive table, but I can only see the Hive lineage in Atlas. Is it possible to create lineage for Pig, S3, and Talend in Atlas? Thanks, Priyal
Labels:
- Apache Pig
03-01-2018
07:46 AM
@Geoffrey Shelton Okot @Sharmadha Sainath It turned out my Atlas service is running on a worker node, but I had registered the Atlas port to the master node. After registering Atlas port 21000 to the worker node, the Atlas UI works fine. Thanks.
03-01-2018
06:51 AM
@Geoffrey Shelton Okot I have set 777 permissions on /var/log/atlas. I was starting the Atlas service from the Ambari UI.

$ ps aux | grep -i Atlas
atlas 1351 6.3 3.9 5748700 655316 ? Sl 06:36 0:14 /usr/lib/jvm/java/bin/java -Datlas.log.dir=/var/log/atlas -Datlas.log.file=application.log -Datlas.home=/usr/hdp/2.6.1.4-2/atlas -Datlas.conf=/usr/hdp/current/atlas-server/conf -Xms2048m -Xmx2048m -XX:MaxNewSize=600m -XX:MetaspaceSize=100m -XX:MaxMetaspaceSize=512m -server -XX:SoftRefLRUPolicyMSPerMB=0 -XX:+CMSClassUnloadingEnabled -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+PrintTenuringDistribution -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/atlas/atlas_server.hprof -Xloggc:/var/log/atlas/gc-worker.log -verbose:gc -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=1m -XX:+PrintGCDetails -XX:+PrintHeapAtGC -XX:+PrintGCTimeStamps -Dlog4j.configuration=atlas-log4j.xml -classpath /usr/hdp/current/atlas-server/conf:/usr/hdp/current/atlas-server/server/webapp/atlas/WEB-INF/classes:/usr/hdp/current/atlas-server/server/webapp/atlas/WEB-INF/lib/*:/usr/hdp/2.6.1.4-2/atlas/libext/*:/etc/hbase/conf org.apache.atlas.Atlas -app /usr/hdp/current/atlas-server/server/webapp/atlas
1003 1763 0.0 0.0 110456 2188 pts/0 S+ 06:40 0:00 grep --color=auto -i Atlas

$ netstat -an | grep 21000 | grep -i listen
tcp 0 0 0.0.0.0:21000 0.0.0.0:* LISTEN

I have created HDP 2.6 on AWS. I have registered the Atlas port to the master node's public DNS name, and my Atlas service is on a worker node. I added the sample data from the worker node successfully.
03-01-2018
06:40 AM
@Sharmadha Sainath here is the application log:

2018-03-01 06:27:48,736 WARN - [pool-1-thread-1:] ~ Failed to remove shutdown hook (StandardTitanGraph:194)
java.lang.IllegalStateException: Shutdown in progress
at java.lang.ApplicationShutdownHooks.remove(ApplicationShutdownHooks.java:82)
at java.lang.Runtime.removeShutdownHook(Runtime.java:239)
at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph.removeHook(StandardTitanGraph.java:192)
at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph.shutdown(StandardTitanGraph.java:160)
at org.apache.atlas.repository.graphdb.titan0.Titan0Graph.shutdown(Titan0Graph.java:180)
at org.apache.atlas.web.listeners.GuiceServletConfig.contextDestroyed(GuiceServletConfig.java:177)
at org.eclipse.jetty.server.handler.ContextHandler.callContextDestroyed(ContextHandler.java:808)
at org.eclipse.jetty.servlet.ServletContextHandler.callContextDestroyed(ServletContextHandler.java:457)
at org.eclipse.jetty.server.handler.ContextHandler.doStop(ContextHandler.java:842)
at org.eclipse.jetty.servlet.ServletContextHandler.doStop(ServletContextHandler.java:215)
at org.eclipse.jetty.webapp.WebAppContext.doStop(WebAppContext.java:529)
at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:89)
at org.eclipse.jetty.util.component.ContainerLifeCycle.stop(ContainerLifeCycle.java:143)
at org.eclipse.jetty.util.component.ContainerLifeCycle.doStop(ContainerLifeCycle.java:162)
at org.eclipse.jetty.server.handler.AbstractHandler.doStop(AbstractHandler.java:73)
at org.eclipse.jetty.server.Server.doStop(Server.java:456)
at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:89)
at org.apache.atlas.web.service.EmbeddedServer.stop(EmbeddedServer.java:104)
at org.apache.atlas.Atlas.shutdown(Atlas.java:73)
at org.apache.atlas.Atlas.access$100(Atlas.java:42)
at org.apache.atlas.Atlas$1.run(Atlas.java:62)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
2018-03-01 06:27:48,737 INFO - [pool-1-thread-1:] ~ Shutting down log4j (/:2052)
02-28-2018
01:35 PM
Hi, I have created HDP 2.6 on AWS. I have registered Atlas port 21000 to the master node and set atlas.server.bind.address to the host on which the Atlas service is running, but the Atlas UI is not working. I have loaded sample data to the Atlas server successfully, and the Atlas UI still does not work.
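A hedged first check (it assumes the default admin/admin login is still in place): hit the Atlas REST API directly on the host where the Atlas Metadata Server actually runs. If this returns version JSON but the browser cannot open the UI on the same host and port, the usual culprits are the AWS security group not exposing port 21000, or the UI being opened against a different host than the one Atlas is bound to.

# <atlas-host> is a placeholder for the node running the Atlas Metadata Server
curl -u admin:admin http://<atlas-host>:21000/api/atlas/admin/version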
Labels:
- Apache Atlas
02-09-2018
08:01 AM
Hi @Ashutosh Mestry, I found the solution. I added the sample data to the Atlas server successfully using the command below:

sudo su atlas -c '/usr/hdp/current/atlas-server/bin/quick_start.py'

Thank you for your help.
02-07-2018
05:21 AM
Hi @Ashutosh Mestry,
I am not able to truncate the HBase table, nor even disable it. When I tried to truncate the HBase table, it threw this error:

Truncating 'ATLAS_ENTITY_AUDIT_EVENTS' table (it may take a while): ERROR: Unknown table ATLAS_ENTITY_AUDIT_EVENTS!

I have created HDP on AWS, and when I tried to disable the HBase table it threw a permission issue:

ERROR: org.apache.hadoop.hbase.security.AccessDeniedException: Insufficient permissions for user 'cloudbreak' (action=create)

I tried the command below as the atlas user:

su atlas -c '/usr/hdp/current/atlas-server/bin/quick_start.py'

I only have the Atlas UI user (admin) and password (admin); I am not sure which password I should use here.
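A hedged sketch of what sometimes helps here (untested on this cluster): open the HBase shell as the hbase superuser, confirm the exact table name (it may sit under a namespace, which would explain the "Unknown table" message), and grant the failing user rights before disabling or truncating. The user and table names below are taken from the errors above.

sudo su hbase -c 'hbase shell'
# inside the shell:
#   list                                        (confirm the exact table name / namespace)
#   grant 'cloudbreak', 'RWXCA', 'ATLAS_ENTITY_AUDIT_EVENTS'
#   disable 'ATLAS_ENTITY_AUDIT_EVENTS'
#   truncate 'ATLAS_ENTITY_AUDIT_EVENTS'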
02-06-2018
07:55 AM
Hi, I have created HDP on AWS, but the Atlas web UI is not working. I have installed Atlas, HBase, Kafka, and Ambari Infra (Solr). I tried to load the sample model and data using:

bin/quick_start.py http://localhost:21000/

But it throws an exception:

Creating sample types:
Exception in thread "main" org.apache.atlas.AtlasServiceException: Metadata service API org.apache.atlas.AtlasBaseClient$APIInfo@5d534f5d failed with status 409 (Conflict) Response Body ({"errorCode":"ATLAS-409-00-001","errorMessage":"Given type Dimension already exists"})
at org.apache.atlas.AtlasBaseClient.callAPIWithResource(AtlasBaseClient.java:337)
at org.apache.atlas.AtlasBaseClient.callAPIWithResource(AtlasBaseClient.java:287)
at org.apache.atlas.AtlasBaseClient.callAPI(AtlasBaseClient.java:429)
at org.apache.atlas.AtlasClientV2.createAtlasTypeDefs(AtlasClientV2.java:217)
at org.apache.atlas.examples.QuickStartV2.createTypes(QuickStartV2.java:191)
at org.apache.atlas.examples.QuickStartV2.runQuickstart(QuickStartV2.java:147)
at org.apache.atlas.examples.QuickStartV2.main(QuickStartV2.java:132)
No sample data added to Apache Atlas Server.
Labels:
- Apache Atlas
06-21-2017
07:34 AM
I am using Flume on Ambari. I want to fetch data from Facebook, but I am confused about choosing between the Avro source and the HTTP source. Which source should I use for fetching data from Facebook? Can you please provide an example of the Avro source and the HTTP source?
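For what it's worth, Flume has no built-in Facebook source; both the Avro source and the HTTP source only receive events that something else sends to them (an Avro client or upstream agent, or an HTTP POST). Below is a hedged example of an agent with an HTTP source; the agent name a1, port 44444, channel sizing, and HDFS path are placeholders, and an external script would have to pull data from the Facebook API and POST it as JSON events to this port.

a1.sources = r1
a1.channels = c1
a1.sinks = k1

# HTTP source: accepts JSON-formatted events POSTed to port 44444
a1.sources.r1.type = http
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 44444
a1.sources.r1.handler = org.apache.flume.source.http.JSONHandler

a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000

a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = /flume/facebook

a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1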
Labels:
- Apache Flume
- Apache Hadoop
06-02-2017
06:38 AM
I am using Ambari. I run the command below:

sqoop import --connect "jdbc:sqlserver://localhost:1433;database=db_name;username=user_name;pasword=password" --table table_name --target-dir /Sqoop/output --append --incremental append --check-column ID --last-value 100

While executing this command, it loads the newly added rows into the existing directory, but it also shows duplicated records. How can I avoid these duplicated records using Sqoop?
Labels:
- Apache Hadoop
- Apache Sqoop
06-02-2017
06:37 AM
yarn logs -applicationId application_1496289796598_0013 > appln_logs.txt
yarn logs -applicationId application_1496289796598_0013
06-01-2017
09:46 AM
I am using Ambari. I want to export data from HDFS to SQL Server. I have installed the JDBC driver and created a table in SQL Server which has no primary key. I execute this command:

sqoop export --connect "jdbc:sqlserver://localhost:1433;database=db_name;username=user_name;password=*****" --table table_name --export-dir /SqoopData/output -m 1

But it displays a message like "Error during export: Export job failed!".
Which parameter should we use in the export command so that the data is loaded successfully into the table (which has no primary key)?
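A hedged note: sqoop export does not require a primary key on the target table, so the failure is usually a field-parsing or type mismatch rather than the missing key. Declaring the field delimiter explicitly and reading the failed map task's log often reveals the real cause; the comma delimiter below is an assumption about how the HDFS files are stored, and the application ID is a placeholder:

sqoop export \
  --connect "jdbc:sqlserver://localhost:1433;database=db_name;username=user_name;password=*****" \
  --table table_name \
  --export-dir /SqoopData/output \
  --input-fields-terminated-by ',' \
  -m 1

# inspect the actual export failure
yarn logs -applicationId <application_id_of_the_failed_export_job>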
Labels:
- Apache Hadoop
- Apache Sqoop
05-30-2017
10:58 AM
I am working with Sqoop. I want to get both newly added and updated records using a single command. Is it possible using the --incremental import option?
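A hedged example: --incremental lastmodified picks up rows that were inserted or updated after --last-value, provided the table has a timestamp column that is maintained on every insert and update (LAST_UPD below is a placeholder), and --merge-key folds updated rows into the existing files instead of duplicating them. The connection details are placeholders in the same style as the earlier posts:

sqoop import \
  --connect "jdbc:sqlserver://localhost:1433;database=db_name;username=user_name;password=password" \
  --table table_name \
  --target-dir /Sqoop/output \
  --incremental lastmodified \
  --check-column LAST_UPD \
  --last-value "2017-05-01 00:00:00" \
  --merge-key ID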
Labels:
- Apache Hadoop
- Apache Sqoop