Member since 10-19-2015
18 Posts
3 Kudos Received
0 Solutions

01-22-2019 04:27 PM
1 Kudo

This is an old question, but I thought I would reply anyway. You can do df.groupBy().pivot("pivotcolname").agg(...). Notice that the groupBy clause is empty.
						
					
05-29-2018 07:20 PM

No responses? Does that mean it is not possible, or is there something very obvious that I am missing? 🙂
						
					
05-29-2018 02:06 PM

Hello,

I am struggling to find suitable APIs to process multiple data frames in parallel. My requirement is the following:

I have tens of distinct Spark data frames. A certain set of operations must be performed on each DF (treating each as a single partition), and some results must be returned from each. For example:

Apply func1, func2 and func3 to DF1, DF2 and DF3, and return list1, list2 and list3 respectively.

So, in theory, func1, func2 and func3 can run in parallel. I am wondering if there is any PySpark pattern I can follow.

Thanks!
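
One pattern that may fit the requirement above is to submit each function from its own driver-side thread, since Spark can run jobs concurrently when they are submitted from different threads of the same application. The sketch below is only an illustration under assumptions: `run_in_parallel` is a hypothetical helper, and plain Python lists stand in for DF1, DF2 and DF3 so that it runs without a cluster. With real DataFrames, each function would presumably end in an action such as `collect()`.

```python
from concurrent.futures import ThreadPoolExecutor


def run_in_parallel(tasks, max_workers=4):
    """Run (func, data) pairs in separate threads; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(func, data) for func, data in tasks]
        return [future.result() for future in futures]


# Hypothetical stand-ins for func1..func3; real versions would take a DataFrame
# and return a list, for example by calling an action such as collect().
def func1(rows):
    return [r * 2 for r in rows]


def func2(rows):
    return [r + 1 for r in rows]


def func3(rows):
    return sorted(rows, reverse=True)


# Plain lists stand in for DF1, DF2 and DF3 (assumption, for illustration only).
df1, df2, df3 = [1, 2, 3], [10, 20], [5, 9, 7]

list1, list2, list3 = run_in_parallel([(func1, df1), (func2, df2), (func3, df3)])
```

Threads are used here rather than processes because the driver mostly waits on the cluster while each job runs. How much real overlap you get depends on the scheduler configuration and on free executor capacity.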
						
					
Labels: Apache Spark

10-21-2015 01:38 PM

Thanks. That got me a bit closer.

I discovered that no core had been created on solrserver2, so I created one and restarted both servers.

Now I am getting: Node: solrserver2:8983_solr is not live!

But I see it started in the manager, and there are no logs either. What am I missing?

Your help probably saved a couple of days of reading and frustration!
						
					
10-21-2015 11:26 AM

I tried ADDREPLICA, but I get an error message that the specified shard does not exist.

I have a CDH 5.4 cluster with two Solr roles:

solrserver1
solrserver2

On solrserver1 I have:

1- collection name: mycollection, instanceDir: /var/lib/solr/mycollection_shard1_replica1/

I want to replicate it on solrserver2, so I tried the command

http://solrserver1:8983/solr/admin/collections?action=ADDREPLICA&collection=mycollection&shard=mycollection_shard1_replica1&node=solrserver2:8983_solr

and I got the message

Collection: mycollection shard: mycollection_shard1_replica1 does not exist

I would really appreciate your help.
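
The error message suggests that the `shard` parameter was given the core name (mycollection_shard1_replica1) rather than a shard name. As a hedged sketch, the request can be rebuilt on the assumption that the shard is called `shard1`. That name is only inferred from the core name pattern and should be checked against the cluster state before use. `add_replica_url` is a hypothetical helper, not part of Solr.

```python
from urllib.parse import urlencode


def add_replica_url(host, collection, shard, node):
    """Build a Collections API ADDREPLICA URL from its query parameters."""
    params = {
        "action": "ADDREPLICA",
        "collection": collection,
        "shard": shard,  # the shard name, not the core name
        "node": node,
    }
    # safe=":" keeps the colon in the node name readable instead of %3A
    return f"http://{host}/solr/admin/collections?{urlencode(params, safe=':')}"


# "shard1" is an assumption inferred from the core name mycollection_shard1_replica1
url = add_replica_url("solrserver1:8983", "mycollection", "shard1", "solrserver2:8983_solr")
print(url)
```

The target node must also be registered as live in the cluster before a replica can be placed on it.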
						
					
10-19-2015 11:54 AM

Can you please give me the steps to add replicas of an existing shard? Thanks!
						
					