Member since
05-17-2016
190
Posts
46
Kudos Received
11
Solutions
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 1371 | 09-07-2017 06:24 PM
 | 1777 | 02-24-2017 06:33 AM
 | 2544 | 02-10-2017 09:18 PM
 | 7045 | 01-11-2017 08:55 PM
 | 4656 | 12-15-2016 06:16 PM
01-09-2020
07:45 AM
Hi Matt, the case was a bit different from the one in the screenshot. This was a multi-node cluster, and instead of "localhost", @VijaySankar had one of the node hostnames configured in the hostname field. The processor, however, was configured to run on all nodes, which caused the error messages. Clearing the hostname field lets the processor spin up an HTTP service on each host:port, and the error no longer occurs.
11-09-2018
04:22 AM
Tested against HDF Version 3.1.0
11-09-2018
04:11 AM
2 Kudos
Hi,

In this article, let us take a look at how to delete a schema from the Hortonworks Schema Registry. A word of caution first: this approach is not recommended for production systems, so use these steps at your own risk. I would also like to thank Brian Goerlitz for his ideas towards this post.

Currently it is not possible to delete a schema from the UI, so the steps below show how to delete the schema from its backend datastore. I am using MySQL as the backend datastore for the schema registry, so the queries are MySQL-specific. You should change them according to your database type.

Step 1
Verify that the foreign keys on the two tables schema_version_info and schema_field_info have ON UPDATE CASCADE and ON DELETE CASCADE enabled. This can be done with the queries below on the information_schema database; both UPDATE_RULE and DELETE_RULE should read CASCADE.
select UPDATE_RULE,DELETE_RULE,REFERENCED_TABLE_NAME from REFERENTIAL_CONSTRAINTS where table_name='schema_version_info';
select UPDATE_RULE,DELETE_RULE,REFERENCED_TABLE_NAME from REFERENTIAL_CONSTRAINTS where table_name='schema_field_info';

Step 2
Stop the Schema Registry service from Ambari.

Step 3
Back up the database. Below is the content of my schema registry before the delete operation; I am interested in deleting the person.demographic.details schema.

Step 4
Identify the id of the schema to be deleted. For this, switch to the database provisioned to store the schema registry information (in my case it is 'registry') and issue the select query.
select id from schema_metadata_info where name ='person.demographic.details';

Step 5
Delete the schema from schema_serdes_mapping based on the id queried in Step 4.
delete from schema_serdes_mapping where schemaMetadataId=1;

Step 6
Delete the schema from schema_metadata_info based on the id queried in Step 4.
delete from schema_metadata_info where id =1;
We observe that the schema has been deleted from the tables.

Step 7
Start the Schema Registry service via Ambari and verify that the schema is deleted. Optionally, recreate the schema with the same name in the UI and explore the front end and back end to ensure the schema can be re-created with no issues. We observe that the new schema was created with the same name and a different id.

Thanks,
Arun A K
10-18-2018
01:27 AM
It was an access issue on the buckets. Setting the right permissions on the bucket fixed it.
06-14-2018
11:28 PM
Output Data of the form
06-14-2018
11:19 PM
This may not be the best approach, but we could do this in a two-step process.

Step 1
Load the content into a data frame.
Apply a UDF to derive the set of period_end_date values for the given row.
Explode the row based on period_end_date.

Step 2
Derive the period_start_date for each period_end_date based on the pa_start_date.

You can either derive the end date first and the start date next, or vice versa. Below is a code snippet; it can be optimized further.

import org.apache.spark.sql.types.{StructType,StructField,StringType,IntegerType};
import org.apache.spark.sql.Row;
import java.util.Date
import scala.collection.mutable.ListBuffer
import java.util.GregorianCalendar
import java.util.Calendar
import java.text.SimpleDateFormat
val csv = sc.textFile("/user/hdfs/ak/spark/197905/")
val rows = csv.map(line => line.split(",").map(_.trim))
val rdd = rows.map(row => Row(row(0),row(1),row(2),row(3),row(4),row(5)))
val schema = new StructType().add(StructField("c0", StringType, true)).add(StructField("c1", StringType, true)).add(StructField("c2", StringType, true)).add(StructField("c3", StringType, true)).add(StructField("c4", StringType, true)).add(StructField("c5", StringType, true))
val df = sqlContext.createDataFrame(rdd, schema)
df.registerTempTable("raw_data");
def getLastDateOfMonth(date:Date) : Date ={
val cal = Calendar.getInstance()
cal.setTime(date);
cal.set(Calendar.DAY_OF_MONTH, cal.getActualMaximum(Calendar.DAY_OF_MONTH));
cal.getTime();
}
def getFirstDateOfMonth(date:Date) : Date ={
val cal = Calendar.getInstance()
cal.setTime(date);
cal.set(Calendar.DAY_OF_MONTH, cal.getActualMinimum(Calendar.DAY_OF_MONTH));
cal.getTime();
}
def getLastDaysBetweenDates = (formatString:String, startDateString:String, endDateString:String) => {
val format = new SimpleDateFormat(formatString)
val startdate = getLastDateOfMonth(format.parse(startDateString))
val enddate =getLastDateOfMonth(format.parse(endDateString))
var dateList = new ListBuffer[Date]()
var calendar = new GregorianCalendar()
calendar.setTime(startdate)
var yearMonth="";
var maxDates = scala.collection.mutable.Map[String, Date]()
while (calendar.getTime().before(enddate)) {
yearMonth = calendar.getTime().getYear()+"_"+calendar.getTime.getMonth()
maxDates += (yearMonth -> calendar.getTime())
calendar.add(Calendar.DATE, 1)
}
maxDates += (yearMonth -> calendar.getTime())
for(eachMonth <- maxDates.keySet){
dateList += maxDates(eachMonth)
}
var dateListString = "";
for( date <- dateList.sorted){
dateListString=dateListString+","+format.format(date)
}
dateListString.substring(1, dateListString.length())
}
def getFirstDateFromLastDateAndReference = (formatString:String, refDateString:String, lastDate:String) => {
val format = new SimpleDateFormat(formatString)
val firstDay = getFirstDateOfMonth(format.parse(lastDate))
val year = firstDay.getYear;
val month = firstDay.getMonth;
val refDate = format.parse(refDateString)
val cal = Calendar.getInstance()
cal.setTime(refDate)
val refDateTime = cal.getTime();
val refYear=refDateTime.getYear;
val refMonth = refDateTime.getMonth();
if(year==refYear&& month==refMonth){
refDateString
}else{
format.format(firstDay)
}
}
sqlContext.udf.register("lastday",getLastDaysBetweenDates)
sqlContext.udf.register("firstday",getFirstDateFromLastDateAndReference)
sqlContext.sql("select *,lastday('d-MMM-yy',c4,c5) from raw_data").show();
sqlContext.sql("select c0,c1,c2,c3,c4,c5,explode(split(lastday('d-MMM-yy',c4,c5),',')) as lastday from raw_data").registerTempTable("data_with_end_date");
sqlContext.sql("select c0,c1,c2,c3,c4,c5,lastday,firstday('d-MMM-yy',c4,lastday) from data_with_end_date").show()
I used two UDFs here:
1) getLastDaysBetweenDates - Takes a date format, a start date and an end date, and returns a comma-separated list of the month-end dates in that range.
2) getFirstDateFromLastDateAndReference - Takes a date format, a reference start date and a month-end date, and returns the first date of that month. For the first month, however, it returns the pa_start_date instead of the first calendar date.
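For comparison, here is a minimal sketch of the same period logic in plain Scala using java.time instead of Calendar and the deprecated Date.getYear/getMonth. It is not the code from the post above: the object name PeriodSketch and the helper monthlyPeriods are hypothetical, and it assumes the dates have already been parsed into LocalDate values. It only illustrates the rule described above: one period per calendar month, ending on the month's last day, with the first period starting at pa_start_date.

```scala
import java.time.{LocalDate, YearMonth}

object PeriodSketch {
  // Hypothetical helper: returns one (periodStart, periodEnd) pair for every
  // calendar month touched by the range [start, end].
  // periodEnd is always the last day of the month; periodStart is the first
  // day of the month, except in the first month, where it is `start` itself.
  def monthlyPeriods(start: LocalDate, end: LocalDate): List[(LocalDate, LocalDate)] = {
    require(!end.isBefore(start), "end must not be before start")
    val firstMonth = YearMonth.from(start)
    val lastMonth = YearMonth.from(end)
    Iterator.iterate(firstMonth)(_.plusMonths(1))
      .takeWhile(ym => !ym.isAfter(lastMonth))
      .map { ym =>
        val periodStart = if (ym == firstMonth) start else ym.atDay(1)
        (periodStart, ym.atEndOfMonth())
      }
      .toList
  }
}
```

Calling PeriodSketch.monthlyPeriods(LocalDate.of(2018, 1, 15), LocalDate.of(2018, 3, 10)) would give three periods, one each for January, February and March 2018. A function like this could presumably be wrapped in a UDF and exploded in the same way as the snippet above, but that wiring is left out here.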
06-14-2018
12:23 PM
@AArora, is the requirement to create multiple rows from one row, where you need all "First & Last Day of the Month" values between pa_start_date and pa_end_date as the period_end_date?
06-13-2018
06:26 PM
check out https://community.hortonworks.com/answers/77558/view.html
03-27-2018
09:27 PM
@Scott Aslan: Thanks, the build was successful after skipping the tests. The test failure trace is in the previous comment.
03-27-2018
08:41 PM
I will run again, skipping the tests.
[INFO]
[INFO] --- maven-remote-resources-plugin:1.5:process (process-resource-bundles) @ nifi-solr-processors ---
[INFO] Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.167 s - in org.apache.nifi.processors.standard.TestParseCEF
[INFO] Running org.apache.nifi.processors.standard.TestGetFile
[ERROR] Tests run: 7, Failures: 0, Errors: 3, Skipped: 0, Time elapsed: 0.138 s <<< FAILURE! - in org.apache.nifi.processors.standard.TestGetFile
[ERROR] testWithUnreadableDir(org.apache.nifi.processors.standard.TestGetFile) Time elapsed: 0.028 s <<< ERROR!
java.lang.NullPointerException
at org.apache.nifi.processors.standard.TestGetFile.testWithUnreadableDir(TestGetFile.java:92)
[ERROR] testWithInaccessibleDir(org.apache.nifi.processors.standard.TestGetFile) Time elapsed: 0.006 s <<< ERROR!
java.lang.NullPointerException
at org.apache.nifi.processors.standard.TestGetFile.testWithInaccessibleDir(TestGetFile.java:64)
[ERROR] testWithUnwritableDir(org.apache.nifi.processors.standard.TestGetFile) Time elapsed: 0.007 s <<< ERROR!
java.lang.NullPointerException
at org.apache.nifi.processors.standard.TestGetFile.testWithUnwritableDir(TestGetFile.java:120)
[INFO] Running org.apache.nifi.processors.standard.TestGenerateFlowFile
[INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.02 s - in org.apache.nifi.processors.standard.TestGenerateFlowFile
[INFO] Running org.apache.nifi.processors.standard.TestExtractGrok