Member since 02-24-2016

175 Posts
56 Kudos Received
3 Solutions

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1924 | 06-16-2017 10:40 AM |
| | 16481 | 05-27-2016 04:06 PM |
| | 1632 | 03-17-2016 01:29 PM |
			
    
	
		
		
03-17-2016 01:29 PM

Well, I found the solution 🙂 Posting the answer in case others face this issue in the future. The following line was missing:

import sqlContext.implicits._
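
For anyone landing here later, this is a minimal sketch of where that import goes, assuming Spark 1.x (where `SQLContext` is the entry point) and a standalone application rather than the shell. The object name `ContactsToDF`, the helper `parseContact`, and the in-memory sample rows are illustrative assumptions, not part of the original code. In `spark-shell`, `sc` and `sqlContext` already exist, so only the import line is needed there.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Define the case class at the top level so Spark can derive the schema by reflection.
case class Contact(name: String, email: String)

object ContactsToDF {
  // Hypothetical helper: turns one "name,email" line into a Contact.
  def parseContact(line: String): Contact = {
    val fields = line.split(",")
    Contact(fields(0), fields(1))
  }

  def main(args: Array[String]): Unit = {
    // Assumption: local mode, for illustration only.
    val conf = new SparkConf().setAppName("ContactsToDF").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    // This import brings the implicit RDD-to-DataFrame conversions into scope.
    // Without it, the compiler reports "value toDF is not a member of RDD[Contact]".
    import sqlContext.implicits._

    // Sample rows stand in for sc.textFile(...) on the real CSV (no header row).
    val texts = sc.parallelize(Seq("abc,abc@xyz.com", "xyz,xyz@abc.com"))
    val contacts = texts.map(parseContact)
    val contactsDF = contacts.toDF()

    contactsDF.show()
    sc.stop()
  }
}
```

Note that the import has to come after `sqlContext` is defined as a `val`, because `implicits` is a member of that specific instance. On Spark 2.x and later, the equivalent is `import spark.implicits._` on a `SparkSession`.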
						
					
03-17-2016 01:06 PM
1 Kudo

Hi guys,

I have a CSV file that contains contact details (name, email), like:

abc,abc@xyz.com
xyz,xyz@abc.com

I am building a case class and then trying to register the RDD[Contact] as a DataFrame using the steps below, but I end up getting this error:

error: value toDF is not a member of org.apache.spark.rdd.RDD[Contact]

case class Contact(name: String, email: String)
val texts = sc.textFile("\pathto\contacts.csv") // doesn't contain headers
val contacts = texts.map(s => s.split(",")).map(s => Contact(s(0), s(1)))
val contactsDF = contacts.toDF()

Can anyone help me understand what's going wrong here?

Thanks
						
					
Labels: Apache Spark
			
    
	
		
		
03-15-2016 11:45 AM
1 Kudo

							 Thanks @Neeraj Sabharwal 
						
					
03-14-2016 10:15 AM
1 Kudo

Guys,

We have configured A/D with Ranger for a single domain on the Kerberized cluster. The customer requirement is to set up policies for multiple domains (namely, 5 domains). How do we achieve this on an HDP 2.3 cluster?

Thank you.
						
					
Labels: Apache Ranger
			
    
	
		
		
03-02-2016 03:15 PM
1 Kudo

Thanks for the document and the high-level comparison. Do you also have a cost comparison for running HDI for a full month (30 days) against HDP on Azure with a similar configuration?
						
					
03-02-2016 01:32 PM

I wanted to know a couple of things here.

1) Suppose I have a few MapReduce jobs that need to run on HDI. As I understand the HDI approach, it is build, run, and delete. If I have placed all my jars, Oozie jobs, and configurations on the cluster and then delete them today, do I need to copy all the jars and reconfigure the Oozie jobs again when I want to run the same batch job in the future?

2) Is it possible to configure Solr to run on HDInsight?
						
					
03-02-2016 12:24 PM
1 Kudo

Hello experts,

I am building a use case, one part of which is fetching emails from email server(s) and running some analytics on them. Given the credentials, can HDF connect to the email servers, fetch the emails, and sink them to HDFS? Is this feature available out of the box?

What could be an alternative or other solution approach to this?
						
					
Labels: Apache NiFi, Cloudera DataFlow (CDF)
			
    
	
		
		
03-01-2016 12:49 PM
2 Kudos

Hi Guys,

I am trying to Kerberize the cluster and want to integrate with A/D for user authentication. Earlier, I did this using an MIT KDC in the HDP cluster and setting up a bi-directional trust with A/D. But as I remember, that approach adds a couple of entries in A/D, and the customer does not want to give write access to A/D. How should I proceed in this scenario?

Thanks,
SS.
						
					
Labels: Hortonworks Data Platform (HDP)
			
    
	
		
		
03-01-2016 12:30 PM
1 Kudo

Hi guys,

I've installed HDP 2.3.x on a single node for now (not the sandbox), on CentOS. The machine was shut down abruptly. I am now trying to start Ambari, but I cannot find the start script. Could you please help me locate it?
						
					