Member since 05-17-2016

190 Posts | 46 Kudos Received | 11 Solutions
        My Accepted Solutions
| Title | Views | Posted | 
|---|---|---|
|  | 1737 | 09-07-2017 06:24 PM |
|  | 2288 | 02-24-2017 06:33 AM |
|  | 3433 | 02-10-2017 09:18 PM |
|  | 7956 | 01-11-2017 08:55 PM |
|  | 5893 | 12-15-2016 06:16 PM |
01-09-2020 07:45 AM

Hi Matt, the case was a bit different from the one in the screenshot. This was a multi-node cluster, and instead of "localhost", @VijaySankar had one of the hostnames configured in the Hostname field. The processor, however, was configured to run on all nodes, which was causing the error messages. Clearing the Hostname field lets the processor spin up an HTTP service on each host:port, and the error no longer occurs.
11-09-2018 04:22 AM

Tested against HDF Version 3.1.0.
11-09-2018 04:11 AM
2 Kudos
Hi,

In this article, let us take a look at how to delete a schema from the Hortonworks Schema Registry.

A word of caution first: this approach is not recommended for production systems, so use these steps at your own risk. I would also like to thank Brian Goerlitz for his ideas towards this post.

Currently it is not possible to delete a schema from the UI, so the steps below show how to delete a schema from its backend datastore. I am using MySQL as the backend datastore for the Schema Registry, so the queries are MySQL-specific; adjust them for your database type.

Step 1: Verify that the two tables schema_version_info and schema_field_info have CASCADE ON UPDATE and CASCADE ON DELETE enabled. This can be checked with the following queries against the information_schema database:

select UPDATE_RULE,DELETE_RULE,REFERENCED_TABLE_NAME from REFERENTIAL_CONSTRAINTS where table_name='schema_version_info';
select UPDATE_RULE,DELETE_RULE,REFERENCED_TABLE_NAME from REFERENTIAL_CONSTRAINTS where table_name='schema_field_info';

Step 2: Stop the Schema Registry service from Ambari.

Step 3: Back up the database. Below is the content of my Schema Registry before the delete operation; I am interested in deleting the person.demographic.details schema.

Step 4: Identify the id of the schema to be deleted. For this, switch to the database provisioned to store the Schema Registry information (in my case it is 'registry') and issue the select query:

select id from schema_metadata_info where name ='person.demographic.details';

Step 5: Delete the schema from schema_serdes_mapping based on the id queried in Step 4 above:

delete from schema_serdes_mapping where schemaMetadataId=1;

Step 6: Delete the schema from schema_metadata_info based on the id queried in Step 4 above:

delete from schema_metadata_info where id =1;

We observe that the schema has been deleted from the tables.

Step 7: Start the Schema Registry service via Ambari and verify that the schema is deleted.

Optionally, we can recreate a schema with the same name in the UI and explore the front end and back end to confirm that it can be re-created with no issues. We observe that the new schema is created with the same name and a different id.

Thanks
-Arun A K-
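As an addendum, the lookup and deletes from Steps 4 to 6 can also be scripted rather than typed into the MySQL shell. Below is a minimal Scala/JDBC sketch of those steps, not part of the original walkthrough; the JDBC URL, user and password are placeholders for your own registry database, and the table and column names are the ones used above.

import java.sql.DriverManager

// Sketch only: run Steps 4-6 over JDBC. URL and credentials are placeholders.
val conn = DriverManager.getConnection("jdbc:mysql://localhost:3306/registry", "registry_user", "registry_password")
try {
  // Step 4: look up the id of the schema to be deleted
  val lookup = conn.prepareStatement("select id from schema_metadata_info where name = ?")
  lookup.setString(1, "person.demographic.details")
  val rs = lookup.executeQuery()
  if (rs.next()) {
    val id = rs.getLong("id")
    // Step 5: remove the serde mappings for that schema id
    val delMapping = conn.prepareStatement("delete from schema_serdes_mapping where schemaMetadataId = ?")
    delMapping.setLong(1, id)
    delMapping.executeUpdate()
    // Step 6: remove the schema metadata; the CASCADE rules verified in Step 1
    // take the dependent rows in schema_version_info and schema_field_info with it
    val delMeta = conn.prepareStatement("delete from schema_metadata_info where id = ?")
    delMeta.setLong(1, id)
    delMeta.executeUpdate()
  }
} finally {
  conn.close()
}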
						
					
10-18-2018 01:27 AM

It was an access issue on the Buckets; setting the right permissions on the bucket fixed it.
06-14-2018 11:28 PM

Output Data of the form
06-14-2018 11:19 PM
This may not be the best approach, but we could do it as a two-step process.

Step 1
- Load the content into a data frame.
- Apply a UDF to derive the set of period_end_date values for the given row.
- Explode the row based on the period_end_date.

Step 2
- Derive the period_start_date for each period_end_date based on the pa_start_date.

You can derive the end date first and the start date next, or vice versa. Below is a code snippet; it can be optimized further.

import org.apache.spark.sql.types.{StructType,StructField,StringType,IntegerType};
import org.apache.spark.sql.Row;
import java.util.Date
import scala.collection.mutable.ListBuffer
import java.util.GregorianCalendar
import java.util.Calendar
import java.text.SimpleDateFormat
val csv = sc.textFile("/user/hdfs/ak/spark/197905/")
val rows = csv.map(line => line.split(",").map(_.trim))
val rdd = rows.map(row => Row(row(0),row(1),row(2),row(3),row(4),row(5)))
val schema = new StructType().add(StructField("c0", StringType, true)).add(StructField("c1", StringType, true)).add(StructField("c2", StringType, true)).add(StructField("c3", StringType, true)).add(StructField("c4", StringType, true)).add(StructField("c5", StringType, true))
val df = sqlContext.createDataFrame(rdd, schema)
df.registerTempTable("raw_data");

def getLastDateOfMonth(date:Date) : Date ={
        val cal = Calendar.getInstance()
        cal.setTime(date);
        cal.set(Calendar.DAY_OF_MONTH, cal.getActualMaximum(Calendar.DAY_OF_MONTH));
        cal.getTime();
    }
 def getFirstDateOfMonth(date:Date) : Date ={
        val cal = Calendar.getInstance()
        cal.setTime(date);
        cal.set(Calendar.DAY_OF_MONTH, cal.getActualMinimum(Calendar.DAY_OF_MONTH));
        cal.getTime();
    }
def getLastDaysBetweenDates = (formatString:String, startDateString:String, endDateString:String) => {
    val format = new SimpleDateFormat(formatString)
    val startdate = getLastDateOfMonth(format.parse(startDateString))
    val enddate =getLastDateOfMonth(format.parse(endDateString))
    var dateList = new ListBuffer[Date]()
    var calendar = new GregorianCalendar()
    calendar.setTime(startdate)
    var yearMonth="";
    var maxDates = scala.collection.mutable.Map[String, Date]()
    while (calendar.getTime().before(enddate)) {
      yearMonth = calendar.getTime().getYear()+"_"+calendar.getTime.getMonth()
      maxDates += (yearMonth -> calendar.getTime())
      calendar.add(Calendar.DATE, 1)
    }
    maxDates += (yearMonth -> calendar.getTime())
    for(eachMonth <- maxDates.keySet){
      dateList += maxDates(eachMonth) 
    }
    var dateListString = "";
     for( date <- dateList.sorted){
    dateListString=dateListString+","+format.format(date)
  }
     dateListString.substring(1, dateListString.length())
  }
def getFirstDateFromLastDateAndReference = (formatString:String, refDateString:String, lastDate:String) => {
    val format = new SimpleDateFormat(formatString)
    val firstDay = getFirstDateOfMonth(format.parse(lastDate))
    val year = firstDay.getYear;
    val month = firstDay.getMonth;
    val refDate = format.parse(refDateString)
    val cal = Calendar.getInstance()
    cal.setTime(refDate)
    val refDateTime = cal.getTime();
    val refYear=refDateTime.getYear;
    val refMonth = refDateTime.getMonth();
    if(year==refYear&& month==refMonth){
      refDateString
    }else{
      format.format(firstDay)
    }
 }
  sqlContext.udf.register("lastday",getLastDaysBetweenDates)
sqlContext.udf.register("firstday",getFirstDateFromLastDateAndReference)
  sqlContext.sql("select *,lastday('d-MMM-yy',c4,c5) from raw_data").show();
  sqlContext.sql("select c0,c1,c2,c3,c4,c5,explode(split(lastday('d-MMM-yy',c4,c5),',')) as lastday from hello").registerTempTable("data_with_end_date");
  sqlContext.sql("select c0,c1,c2,c3,c4,c5,lastday,firstday('d-MMM-yy',c4,lastday) from data_with_end_date").show()
I used two UDFs here:

1) getLastDaysBetweenDates - consumes a date format, a start date and an end date, and returns the list of month-end dates in that range.
2) getFirstDateFromLastDateAndReference - consumes a date format, a reference (start) date and a month-end date, and returns the first date of that month. For the first month, however, it returns the pa_start_date instead of the first calendar date.
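As a side note, the list of month-end dates could also be built without GregorianCalendar and the deprecated Date.getYear/getMonth calls by stepping over java.time.YearMonth values. The snippet below is only a sketch of that alternative (it assumes the same d-MMM-yy format and English month names), not the code used in the flow above.

import java.time.LocalDate
import java.time.YearMonth
import java.time.format.DateTimeFormatter
import java.util.Locale

// Sketch: last calendar day of every month between two dates, both months inclusive.
val fmt = DateTimeFormatter.ofPattern("d-MMM-yy", Locale.ENGLISH)

def monthEnds(startDateString: String, endDateString: String): Seq[String] = {
  val first = YearMonth.from(LocalDate.parse(startDateString, fmt))
  val last  = YearMonth.from(LocalDate.parse(endDateString, fmt))
  Iterator.iterate(first)(_.plusMonths(1))
    .takeWhile(!_.isAfter(last))
    .map(_.atEndOfMonth.format(fmt))
    .toSeq
}

// For example, monthEnds("15-Jan-18", "10-Mar-18") gives 31-Jan-18, 28-Feb-18, 31-Mar-18.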
						
					
06-14-2018 12:23 PM

@AArora, is the requirement to create multiple rows from one row, where you need all "First & Last Day of the Month" values between pa_start_date and pa_end_date as the period_end_date?
06-13-2018 06:26 PM

Check out https://community.hortonworks.com/answers/77558/view.html
03-27-2018 09:27 PM

@Scott Aslan: Thanks, the build was successful after skipping the tests. The test failure trace is in the previous comment.
03-27-2018 08:41 PM
I will run it again, skipping the tests. Here is the test failure trace:

[INFO]
[INFO] --- maven-remote-resources-plugin:1.5:process (process-resource-bundles) @ nifi-solr-processors ---
[INFO] Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.167 s - in org.apache.nifi.processors.standard.TestParseCEF
[INFO] Running org.apache.nifi.processors.standard.TestGetFile
[ERROR] Tests run: 7, Failures: 0, Errors: 3, Skipped: 0, Time elapsed: 0.138 s <<< FAILURE! - in org.apache.nifi.processors.standard.TestGetFile
[ERROR] testWithUnreadableDir(org.apache.nifi.processors.standard.TestGetFile)  Time elapsed: 0.028 s  <<< ERROR!
java.lang.NullPointerException
at org.apache.nifi.processors.standard.TestGetFile.testWithUnreadableDir(TestGetFile.java:92)
[ERROR] testWithInaccessibleDir(org.apache.nifi.processors.standard.TestGetFile)  Time elapsed: 0.006 s  <<< ERROR!
java.lang.NullPointerException
at org.apache.nifi.processors.standard.TestGetFile.testWithInaccessibleDir(TestGetFile.java:64)
[ERROR] testWithUnwritableDir(org.apache.nifi.processors.standard.TestGetFile)  Time elapsed: 0.007 s  <<< ERROR!
java.lang.NullPointerException
at org.apache.nifi.processors.standard.TestGetFile.testWithUnwritableDir(TestGetFile.java:120)
[INFO] Running org.apache.nifi.processors.standard.TestGenerateFlowFile
[INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.02 s - in org.apache.nifi.processors.standard.TestGenerateFlowFile
[INFO] Running org.apache.nifi.processors.standard.TestExtractGrok 
						
					