Member since
03-01-2018
7
Posts
0
Kudos Received
0
Solutions
05-29-2019
05:50 AM
I have a few questions about the HDPCD2019 Spark Certification Exam. Please help me with these:
1. Will the expected output be provided?
2. Will any data (CSV or TXT) be given within the question itself?
3. Will clear instructions be given on where to store the output and where to store the commands?
4. Why is a higher screen resolution required?
5. Will everything run within the browser, or do we need to install any supporting software?
6. If everything runs in the browser, why is a higher system configuration (16 GB RAM) required?
7. What does "Store output of jar run on Yarn" mean?
8. What type of questions will there be for Spark Streaming? What about a live data feed for streaming?
Thanks in advance.
08-07-2018
05:04 AM
How do I dispose of or destroy a broadcast DataFrame?
07-25-2018
07:05 PM
I am working on large data volumes, as Spark is meant for. Recently I was
facing an Executor Lost exception, and it was resolved by increasing the
executor memoryOverhead. Can anyone help me understand how memoryOverhead
works internally and what exactly the memory given to memoryOverhead is used
for? I understand the equation used to derive memoryOverhead, but it is still
a black box to me which objects are stored in this memory. Do those objects
belong to user classes (user-defined classes) or to Spark's own classes? In
the first attempt the executors are lost and the task fails, while in the
second attempt the tasks complete successfully. Why does this memory depend on
my data volume? Relying only on automatic resubmission is not a solution, even
though it is part of Spark's goodness. Also, please let me know how to reduce
these objects if they are the only reason all this happened.
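For reference, the equation mentioned above can be sketched in plain Python. This is only an illustration, assuming the defaults documented for Spark on YARN (an overhead of 10% of executor memory, with a minimum of 384 MiB); the function names are hypothetical and are not part of any Spark API, and the factor may differ by Spark version or configuration. The Spark documentation describes this overhead as covering things like VM overheads, interned strings and other native overheads, which live outside the executor heap.

```python
# Assumed defaults, taken from the Spark-on-YARN configuration docs.
MIN_OVERHEAD_MIB = 384   # documented minimum overhead, in MiB
OVERHEAD_FACTOR = 0.10   # documented default fraction of executor memory


def default_memory_overhead_mib(executor_memory_mib):
    """Default overhead: 10% of executor memory, but never below 384 MiB."""
    return max(int(executor_memory_mib * OVERHEAD_FACTOR), MIN_OVERHEAD_MIB)


def container_request_mib(executor_memory_mib, overhead_mib=None):
    """Approximate memory requested from YARN per executor container.

    If overhead_mib is None, the default rule above is applied; otherwise the
    explicitly configured overhead is used.
    """
    if overhead_mib is None:
        overhead_mib = default_memory_overhead_mib(executor_memory_mib)
    return executor_memory_mib + overhead_mib


if __name__ == "__main__":
    # An 8 GiB executor with the default overhead.
    print(container_request_mib(8192))
    # The same executor with the overhead raised explicitly to 2 GiB.
    print(container_request_mib(8192, overhead_mib=2048))
```

Under these assumptions, raising memoryOverhead enlarges the container that YARN grants without changing the executor heap size, which would be consistent with the Executor Lost errors disappearing after the increase.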
04-07-2018
09:08 AM
I have two different HBase tables in the same cluster. Both have the same row key format and length, and also the same column family. Both tables hold a large volume of data. Now I need one table containing the data of both tables (a union of the two). Merging one table's data into the other is also fine. Is there any way to achieve this, either using the HBase shell or a MapReduce command?
03-05-2018
07:42 PM
Hi, I am currently preparing for HDPCD-Spark. I am reading books and practicing as well. I have gone through many suggestions about the Practice Exam. I created an AWS account for it, but I am not able to find "HDPCDeveloper_x.x PracticeExam_vx" when searching the Community AMIs (screenshot: nofound.png). Please let me know where I can get the practice exam. I am ready to pay for a paid EC2 instance. If I cannot find the practice exam AMI, can I create the same environment by launching the sandbox in a separate EC2 instance with 16 GB RAM?
03-01-2018
05:34 AM
My system has 8 GB of RAM and I have allocated the whole 8 GB. Does it require more, or is this much enough? (Screenshot: conf.png)
03-01-2018
04:21 AM
I am new to the Sandbox. I have installed VirtualBox using virtualbox-4.3_4.3.20-96996~Ubuntu~precise_amd64.deb and imported HDP_2.6.4_virtualbox_01_02_2018_1428.ova. I am getting an error and am not able to see the Ambari UI: "REASON: Server not yet listening on http port 8080 after 50 seconds. Exiting." Can anyone please help me, or let me know if more details are required? (Screenshots: ambari.png, console.png)