Member since 02-27-2017
171 Posts | 9 Kudos Received | 0 Solutions
08-11-2017 02:03 AM

I think you've got the point: dfs.datanode.dir in Ambari is a global setting, so it assumes every host has the same directories (/grid/data1, /grid/data2, /grid/data3). In your case you need to create a config group to suit your environment.

There are two ways to deal with the existing data under your directories, but first increase the dfs.datanode.balance.bandwidthPerSec value (bytes/sec) in the HDFS settings via the Ambari UI to match your network speed; this will help speed up the process.

The safe way is to decommission the DataNodes, reconfigure your group setting, and then recommission the nodes one by one:
https://community.hortonworks.com/articles/69364/decommission-and-reconfigure-data-node-disks.html

The unsafe way is to reconfigure the setting and remove the directories directly, relying on your replication setting, then wait for re-replication to complete by checking that the "under replicated blocks" value in the hdfs dfsadmin -report output reaches 0.
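
For reference, a minimal command-line sketch of that last step (the 100 MB/s figure is only an illustration; pick a value that suits your network):

    # Raise the DataNode replication/balancer bandwidth at runtime
    # (the live counterpart of dfs.datanode.balance.bandwidthPerSec)
    hdfs dfsadmin -setBalancerBandwidth 104857600

    # Watch re-replication progress; wait for this count to reach 0
    hdfs dfsadmin -report | grep "Under replicated"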
						
					
07-26-2017 05:47 AM

@rahul gulati I don't think we can handle \n characters with the RegexSerDe, as by default every '\n' is treated as a line delimiter by Hive. You might need to handle newlines using the Omniture Data SerDe; refer to the link for details.
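
For context, a typical RegexSerDe table definition looks like the sketch below (the table name and regex are hypothetical); the limitation is that the default text input format has already split the file on '\n' before the SerDe ever sees a row:

    CREATE TABLE clicks_raw (col1 STRING, col2 STRING)
    ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
    WITH SERDEPROPERTIES ("input.regex" = "([^\\t]*)\\t([^\\t]*)")
    STORED AS TEXTFILE;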
						
					
06-14-2017 07:08 AM

Hi @rahul gulati,

Apparently the number of partitions of your DataFrame / RDD is creating the issue. This can be controlled by adjusting the spark.default.parallelism parameter in the Spark context, or by using .repartition(<desired number>); see the sketch after this post.

When you run in spark-shell, please check the mode and the number of cores allocated for the execution, and adjust the value to whichever works for shell mode. Alternatively, you can observe the same from the Spark UI and come to a conclusion on partitions.

From the Spark website, on spark.default.parallelism: for distributed shuffle operations like reduceByKey and join, it is the largest number of partitions in a parent RDD. For operations like parallelize with no parent RDDs, it depends on the cluster manager:

- Local mode: number of cores on the local machine
- Others: total number of cores on all executor nodes or 2, whichever is larger
						
					
05-10-2017 03:31 AM

Using HDP 2.5 with Spark 2. If you define the code as follows:

    val spark = SparkSession
      .builder
      .appName("my app")
      .getOrCreate()

    import spark.implicits._

    val test = spark.sqlContext.sql("select max(test_dt) as test_dt from abc").as[String]
    val test1 = spark.sqlContext.table("testing")

then the following two statements will compile:

    val output2 = test1.filter(test1("audit_date").gt(test).toString())
    val output2 = test1.filter(test1("audit_date").gt(test))

Of course, you can always convert test to a String and use the variable in the filter clause.
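
A minimal sketch of that last alternative, assuming the aggregate query returns exactly one row:

    // Materialize the single max value as a String, then filter with it
    val maxDt: String = test.collect().head
    val output3 = test1.filter(test1("audit_date").gt(maxDt))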
						
					
05-05-2017 05:28 PM

@rahul gulati Yes, it's okay to install the MIT KDC on the Ambari server node. But in a real production cluster, we should clearly separate these two roles onto two different nodes. Hope this helps!
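
For reference, a minimal sketch of installing an MIT KDC on a RHEL/CentOS 7 node (standard MIT Kerberos package and service names; the realm is a placeholder):

    # Install the MIT Kerberos server packages
    yum install -y krb5-server krb5-libs krb5-workstation

    # Create the KDC database (after editing /etc/krb5.conf and kdc.conf)
    kdb5_util create -s -r EXAMPLE.COM

    # Start and enable the KDC and admin services
    systemctl enable --now krb5kdc kadmin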
						
					
04-11-2017 04:58 PM

Hello @rahul gulati, it's not good community practice to ask a similar question multiple times. I've provided an answer in the comments here; please check.
						
					
04-03-2017 11:53 AM (1 Kudo)

@rahul gulati This is how I connect to Hive via Knox through beeline:

    beeline --silent=true -u "jdbc:hive2://<knox_host>:8443/;ssl=true;sslTrustStore=/usr/hdp/current/knox-server/data/security/keystores/gateway.jks;trustStorePassword=knoxsecret;transportMode=http;httpPath=gateway/default/hive;hive.server2.use.SSL=true" -d org.apache.hive.jdbc.HiveDriver -n sam -p sam-password

And there are a few references too:
https://cwiki.apache.org/confluence/display/KNOX/Examples+Hive
https://community.hortonworks.com/questions/16887/beeline-connect-via-knox-ssl-issue.html
						
					
02-12-2019 07:02 AM

@Priyanka This is a closed thread (2017); could you open a new one and copy-paste this content?
						
					
03-21-2019 04:58 AM

Is there an example of creating a Hive table to show the files of the hdfs_path entity? Thank you.
						
					
09-05-2018 12:05 PM

@Artem Ervits

> Both can add and remove instances as well as provision new instances with a new machine type easily.

Could you please point out where that option is located in the UI or CLI of Cloudbreak? Thank you!
						
					