Member since 02-16-2016

- 45 Posts
- 24 Kudos Received
- 2 Solutions

        My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 8056 | 07-28-2016 03:37 PM |
|  | 11138 | 02-20-2016 11:34 PM |

04-07-2016 07:59 PM
1 Kudo

I decided to use multiple streams instead; life is easier that way. Thank you.

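For context, a minimal sketch of the multiple-streams approach (one Kafka receiver stream per record format, so each DStream maps to a single bean class); the topic names, ZooKeeper address, consumer group, and batch interval are illustrative, not from the thread:

    import java.util.HashMap;
    import java.util.Map;

    import org.apache.spark.SparkConf;
    import org.apache.spark.streaming.Duration;
    import org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;
    import org.apache.spark.streaming.kafka.KafkaUtils;

    public class MultiStreamSketch {
      public static void main(String[] args) throws Exception {
        SparkConf conf = new SparkConf().setAppName("MultiStreamSketch");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, new Duration(5000));

        // One receiver-based stream per schema, so each DStream carries
        // records of a single format and needs no per-record type dispatch.
        Map<String, Integer> gprsTopic = new HashMap<String, Integer>();
        gprsTopic.put("gprs", 1);
        Map<String, Integer> activationTopic = new HashMap<String, Integer>();
        activationTopic.put("activation", 1);

        JavaPairReceiverInputDStream<String, String> gprsStream =
            KafkaUtils.createStream(jssc, "zkhost:2181", "consumer-group", gprsTopic);
        JavaPairReceiverInputDStream<String, String> activationStream =
            KafkaUtils.createStream(jssc, "zkhost:2181", "consumer-group", activationTopic);

        // Each stream is then parsed and saved independently.
        jssc.start();
        jssc.awaitTermination();
      }
    }
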
04-07-2016 05:21 PM

Yes, the data is in the same stream. For example, one string will have 6 columns and the second one will have 8. Thank you, I will try this and see if it works.

04-07-2016 05:07 PM

Thanks, it is really helpful for beginners like me.

04-07-2016 05:00 PM

In my data I have 8 different schemas. I want to create 8 different DataFrames for them and save them in 8 different Hive tables. So far I have created a super bean class which holds the shared attributes, and each bean class extends it. Based on the type attribute I create different objects. The problem is that I am unable to save them in different DataFrames. Is there any way I can do that? Here is my code so far, which works fine for one schema.

    xmlData.foreachRDD(
        new Function2<JavaRDD<String>, Time, Void>() {
          public Void call(JavaRDD<String> rdd, Time time) {
            HiveContext hiveContext = JavaHiveContext.getInstance(rdd.context());
            // Convert RDD[String] to RDD[bean] to DataFrame.
            JavaRDD<JavaRow> rowRDD = rdd.map(new Function<String, JavaRow>() {
              public JavaRow call(String line) throws Exception {
                String[] fields = line.split("\\|");
                // JavaRow is the shared super class.
                JavaRow record = null;
                if (fields[2].trim().equalsIgnoreCase("CDR")) {
                  record = new GPRSClass(fields[0], fields[1]);
                }
                if (fields[2].trim().equalsIgnoreCase("Activation")) {
                  record = new GbPdpContextActivation(fields[0], fields[1], fields[2], fields[3]);
                }
                return record;
              }
            });

            DataFrame df = hiveContext.createDataFrame(rowRDD, JavaRow.class);
            df.registerTempTable("Consumer");
            System.out.println(df.count() + " ************ Records Received ************");

            df = hiveContext.createDataFrame(rowRDD, GPRSClass.class);
            hiveContext.sql("CREATE TABLE IF NOT EXISTS gprs_data (processor string, fileName string, type string, version string, id string) STORED AS ORC");
            df.save("/apps/hive/warehouse/data", "org.apache.spark.sql.hive.orc", SaveMode.Append);
            return null;
          }
        });

Labels: Apache Spark

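One possible way to get per-schema DataFrames from the same stream is to filter the raw lines on the discriminator field before mapping, so each filtered RDD maps to exactly one bean class. A minimal sketch for one of the 8 schemas, reusing the class, path, and format names from the code above; the helper class/method names and the fields[2] predicate are assumptions:

    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.function.Function;
    import org.apache.spark.sql.DataFrame;
    import org.apache.spark.sql.SaveMode;
    import org.apache.spark.sql.hive.HiveContext;

    public class SchemaSplitter {
      // Hypothetical helper: build and save the DataFrame for one schema.
      // Calling a variant of this per schema yields 8 separate DataFrames.
      public static void saveGprs(JavaRDD<String> rdd, HiveContext hiveContext) {
        JavaRDD<GPRSClass> gprsRecords = rdd
            .filter(new Function<String, Boolean>() {
              public Boolean call(String line) {
                String[] fields = line.split("\\|");
                // Keep only the lines whose type field marks this schema.
                return fields.length > 2 && fields[2].trim().equalsIgnoreCase("CDR");
              }
            })
            .map(new Function<String, GPRSClass>() {
              public GPRSClass call(String line) {
                String[] fields = line.split("\\|");
                return new GPRSClass(fields[0], fields[1]);
              }
            });

        // The RDD is now homogeneous, so the concrete bean class can be
        // used directly instead of the shared super class.
        DataFrame df = hiveContext.createDataFrame(gprsRecords, GPRSClass.class);
        df.save("/apps/hive/warehouse/data", "org.apache.spark.sql.hive.orc", SaveMode.Append);
      }
    }
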
04-01-2016 08:48 PM

That Hue version has spark-submit. So there is no way to do it in Hue 2.6? @Divakar Annapureddy

04-01-2016 08:25 PM
1 Kudo

I am new to Oozie. I am using Hue 2.6.1-2950 and Oozie 4.2. I developed a Spark program in Java which gets the data from a Kafka topic and saves it in a Hive table. I pass my arguments to my .ksh script to submit the job. It works perfectly; however, I have no idea how to schedule this with Oozie and Hue to run every 5 minutes. I have a jar file, which is my Java code, and a consumer.ksh which gets the arguments from my configuration file and runs my jar file using the spark-submit command. Please give me suggestions on how to do this.

Labels: Apache Oozie, Apache Spark

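A common pattern for this is an Oozie coordinator that triggers, every 5 minutes, a workflow whose shell action runs the script. A minimal config sketch, assuming consumer.ksh is uploaded alongside the workflow; every name, path, and property below is a placeholder, not taken from the post:

    <!-- coordinator.xml: trigger the workflow every 5 minutes -->
    <coordinator-app name="consumer-coord" frequency="${coord:minutes(5)}"
                     start="${startTime}" end="${endTime}" timezone="UTC"
                     xmlns="uri:oozie:coordinator:0.2">
      <action>
        <workflow>
          <app-path>${nameNode}/user/${user.name}/consumer-wf</app-path>
        </workflow>
      </action>
    </coordinator-app>

    <!-- workflow.xml: shell action that runs consumer.ksh (which calls spark-submit) -->
    <workflow-app name="consumer-wf" xmlns="uri:oozie:workflow:0.4">
      <start to="run-consumer"/>
      <action name="run-consumer">
        <shell xmlns="uri:oozie:shell-action:0.2">
          <job-tracker>${jobTracker}</job-tracker>
          <name-node>${nameNode}</name-node>
          <exec>consumer.ksh</exec>
          <file>consumer.ksh#consumer.ksh</file>
        </shell>
        <ok to="end"/>
        <error to="fail"/>
      </action>
      <kill name="fail"><message>consumer.ksh failed</message></kill>
      <end name="end"/>
    </workflow-app>
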
04-01-2016 06:13 PM
2 Kudos

I am new to Oozie. I have a Java program which produces data into a Kafka topic (it is not a MapReduce job). I am trying to schedule it with Oozie. However, I am getting this error:

    JA009: Could not load history file hdfs://sandbox.hortonworks.com:8020/mr-history/tmp/hue/job_1459358290769_0012-1459533575025-hue-oozie%3Alauncher%3AT%3Djava%3AW%3DData+Producer%3AA%3DproduceDat-1459533591693-1-0-SUCCEEDED-default-1459533581542.jhist
        at org.apache.hadoop.mapreduce.v2.hs.CompletedJob.loadFullHistoryData(CompletedJob.java:349)
        at org.apache.hadoop.mapreduce.v2.hs.CompletedJob.<init>(CompletedJob.java:101)
        at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager$HistoryFileInfo.

I read that this can be a permission or ownership problem, so I changed the owner to mapred and gave it 777 permissions, but I still get the same error. I am using a java action to schedule my jar file.

Labels: Apache Oozie, Apache Spark

03-25-2016 03:45 PM

@Brandon Wilson I tried your suggestion; it creates the Hive table, but I get this error:

    org.apache.hadoop.hive.ql.metadata.HiveException: Unable to alter table.

and it does not load data into my table. Do you have any idea how to solve this?

03-23-2016 02:30 PM

@Benjamin Leonhardi Thank you for your response. Based on your suggestion, I have to apply the mapPartitions method to my JavaDStream, and that method will return another JavaDStream. I cannot call saveAsTextFile() on a JavaDStream, so I have to use foreachRDD to be able to call saveAsTextFile(). Therefore, I will have the same problem again. Correct me if I am wrong, because I am new to Spark.

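For what it's worth, the two calls compose rather than compete: mapPartitions transforms each batch partition by partition, and foreachRDD stays as the output step. A minimal sketch of that combination against the Spark 1.x Java API; the class, method, and output path names are illustrative:

    import java.util.ArrayList;
    import java.util.Iterator;
    import java.util.List;

    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.function.FlatMapFunction;
    import org.apache.spark.api.java.function.Function;
    import org.apache.spark.streaming.api.java.JavaDStream;

    public class MapPartitionsSketch {
      public static void process(JavaDStream<String> lines) {
        // mapPartitions runs once per partition, so per-partition setup
        // (a parser, a connection) is created once instead of per record.
        JavaDStream<String> parsed = lines.mapPartitions(
            new FlatMapFunction<Iterator<String>, String>() {
              public Iterable<String> call(Iterator<String> partition) {
                List<String> out = new ArrayList<String>();
                while (partition.hasNext()) {
                  out.add(partition.next().trim()); // stand-in for real parsing
                }
                return out;
              }
            });

        // foreachRDD remains the output action that writes each batch.
        parsed.foreachRDD(new Function<JavaRDD<String>, Void>() {
          public Void call(JavaRDD<String> rdd) {
            if (!rdd.isEmpty()) {
              rdd.saveAsTextFile("/tmp/parsed-" + System.currentTimeMillis());
            }
            return null;
          }
        });
      }
    }
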
03-21-2016 07:59 PM
2 Kudos

@Benjamin Leonhardi Do you have any sample code for Java which uses mapPartitions instead of foreachRDD?