Member since: 10-20-2017
Posts: 63
Kudos Received: 0
Solutions: 0
09-24-2018
11:21 PM
@Raj ji Looks like this thread is older than the duplicate thread linked above, so I am posting my update from the other HCC thread here so that the other thread can be deleted. If you want to rotate as well as compress your logs (such as the audit log), you can use "RollingFileAppender" instead of "DailyRollingFileAppender". With "RollingFileAppender" you get more options to rotate the logs based on various policies, such as "TimeBasedRollingPolicy", and you can also compress the rolled files (for example to "log.gz"). Please refer to the following example for more details:
https://community.hortonworks.com/articles/50058/using-log4j-extras-how-to-rotate-as-well-as-zip-th.html
TimeBasedRollingPolicy example:
hdfs.audit.logger=WARN,console
log4j.logger.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=${hdfs.audit.logger}
log4j.additivity.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=false
log4j.appender.DRFAAUDIT=org.apache.log4j.rolling.RollingFileAppender
log4j.appender.DRFAAUDIT.File=${hadoop.log.dir}/hdfs-audit.log
log4j.appender.DRFAAUDIT.layout=org.apache.log4j.PatternLayout
log4j.appender.DRFAAUDIT.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n
log4j.appender.DRFAAUDIT.rollingPolicy=org.apache.log4j.rolling.TimeBasedRollingPolicy
log4j.appender.DRFAAUDIT.rollingPolicy.ActiveFileName=${hadoop.log.dir}/${hadoop.log.file}
log4j.appender.DRFAAUDIT.rollingPolicy.FileNamePattern=${hadoop.log.dir}/${hadoop.log.file}-.%d{yyyyMMdd}.log.gz
Please make sure to copy the "apache-log4j-extras-1.2.17.jar" file into the /usr/hdp/x.x.x.x.x/hadoop/lib/ directory as mentioned in the article above, followed by a restart of all required services.
Similarly, "SizeBasedTriggeringPolicy" can be used as follows:
hdfs.audit.logger=WARN,console
log4j.logger.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=${hdfs.audit.logger}
log4j.additivity.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=false
log4j.appender.DRFAAUDIT=org.apache.log4j.rolling.RollingFileAppender
log4j.appender.DRFAAUDIT.rollingPolicy=org.apache.log4j.rolling.FixedWindowRollingPolicy
log4j.appender.DRFAAUDIT.rollingPolicy.maxIndex=10
log4j.appender.DRFAAUDIT.rollingPolicy.ActiveFileName=${hadoop.log.dir}/hdfs-audit.log
log4j.appender.DRFAAUDIT.rollingPolicy.FileNamePattern=${hadoop.log.dir}/hdfs-audit.log-%i.gz
log4j.appender.DRFAAUDIT.triggeringPolicy=org.apache.log4j.rolling.SizeBasedTriggeringPolicy
log4j.appender.DRFAAUDIT.triggeringPolicy.MaxFileSize=10485760
log4j.appender.DRFAAUDIT.layout=org.apache.log4j.PatternLayout
log4j.appender.DRFAAUDIT.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
Please change the value of "log4j.appender.DRFAAUDIT.triggeringPolicy.MaxFileSize" according to your requirement; the value "10485760" used here is 10 MB (10 x 1024 x 1024 bytes). Reference: https://community.hortonworks.com/questions/212567/log4g-logs-not-rotated-and-zipped.html
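For the jar copy step, a minimal sketch (the source path of the jar is an assumption; adjust it and the HDP version directory for your cluster):
cp /path/to/apache-log4j-extras-1.2.17.jar /usr/hdp/x.x.x.x.x/hadoop/lib/
After copying the jar on the relevant hosts, restart the required services as noted above so the rolling appender classes are picked up.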
09-20-2018
01:56 AM
@Raj ji Consider using the PutHBaseRecord processor instead of PutHBaseJson, and use an UpdateRecord processor to add the CompositePrimaryKey field to the JSON document. Then use the CompositePrimaryKey field as the row identifier for the PutHBaseRecord processor and adjust the Batch Size property value to control the maximum number of puts sent to HBase per batch.
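For illustration, a minimal sketch of the two processors' configuration, assuming record fields named ServerName and ServerNo (as in the related reply below) and a hypothetical /CompositePrimaryKey record path; adjust the reader/writer services and table settings for your flow:
UpdateRecord (Replacement Value Strategy = Record Path Value):
  /CompositePrimaryKey = concat(/ServerName, ',', /ServerNo)
PutHBaseRecord:
  Row Identifier Field Name = CompositePrimaryKey
  Batch Size = 1000 (tune this to control how many puts go to HBase per batch)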
09-18-2018
09:09 PM
@Raj ji Yes, you can use it. PutHBaseJson processor: 1. It expects individual JSON messages (not an array). 2. You need to extract the values of ServerName and ServerNo from the content using an EvaluateJsonPath processor, then use Row Identifier ${ServerName},${ServerNo}. (or) PutHBaseRecord processor: with the record-based processor you don't need to split the array of JSON messages, but you need to prepare the row id with an UpdateRecord processor using the concat(/ServerName,',',/ServerNo) function. Refer to this link for more details regarding the UpdateRecord processor's concat function usage.
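For the PutHBaseJson route, a minimal sketch of the EvaluateJsonPath setup (the JSON paths assume ServerName and ServerNo are top-level fields of each message, which is an assumption about the message layout):
EvaluateJsonPath:
  Destination = flowfile-attribute
  ServerName = $.ServerName
  ServerNo = $.ServerNo
PutHBaseJson:
  Row Identifier = ${ServerName},${ServerNo}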
08-30-2018
11:33 AM
@Raj ji Check out the solution here: https://community.hortonworks.com/questions/147226/replacetextprocessor-remove-blank-lines.html If this answer is helpful, please choose ACCEPT to mark the question resolved.
03-06-2019
07:07 PM
@Eugene Koifman We are facing an issue which seems to be a limitation of Hive 1.2 ACID tables. We are using MERGE for loading mutable data into Hive ACID tables, but loading/reading these ACID tables using Pig or Spark seems to be an issue. Can Hive ACID tables on Hive 1.2 be read into Apache Pig using HCatLoader (or other means), or into Spark using SQLContext (or other means)? For Spark, it seems it is only possible to read ACID tables if the table is fully compacted, i.e. no delta folders exist in any partition; details in the following JIRA: https://issues.apache.org/jira/browse/SPARK-15348. However, I wanted to know whether reading Hive ACID tables is supported at all in Apache Pig. When I tried reading both an un-partitioned and a partitioned ACID table in Pig 0.16, I get 0 records read: Successfully read 0 records from: "dwh.acid_table". Environment: HDP 2.6.5, Spark 2.3, Pig 0.16, Hive 1.2.
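For context, a minimal sketch of the kind of HCatLoader read that returns 0 records (the table name is taken from the message above; the rest is a generic Pig example):
A = LOAD 'dwh.acid_table' USING org.apache.hive.hcatalog.pig.HCatLoader();
DUMP A;
-- Pig reports: Successfully read 0 records from: "dwh.acid_table"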
08-17-2018
09:04 AM
@Jay Kumar SenSharma Everything is configured properly in Ambari; however, it is not working as expected. Please refer to the attached screenshots (capture1.png, capture.png); I am not sure why it is not removing the older logs that are more than 5 days old, as configured.
07-13-2018
02:16 PM
From GenerateTableFetch I'm able to get the flow file. Upon running the flow, the select query failed to execute on DB2. On investigating, we found that the query generated by GenerateTableFetch looked like this:
select userid,timestamp from user11 where timestamp<='01-01-2018 12:00:00' order by timestamp limit 10000
I have used the NiFi Expression Language as per https://community.hortonworks.com/articles/167733/using-generatetablefetch-against-db2.html and created a query like this:
select ${generatetablefetch.columnnames} from ${generatetablefetch.tablename} where ${generatefetchtable.whereClause} order by ${generatetablefetch.maxColumnNames} fetch first ${generatetablefetch.limit} rows only
which I expected to resolve to:
select userid,timestamp from user11 where timestamp >= '01-01-2018 12:00:00' order by timestamp limit 1000
but I'm getting:
select userid,timestamp from user11 where order by timestamp limit 1000
In the above example the where condition is not taking the value. Please refer to the screenshot of my configuration: 80483-nififlow.png. I think I have made it halfway through this and am stuck here. What is missing?
07-04-2018
08:25 PM
@Raj ji First, make sure the Avro schema field name matches (case-sensitively) the Partition Columns property value you specified. If the chrono field in your Avro data file is in capital letters, then you need to change the property value to match the Avro schema field name. Please refer to this link and this link, which explain the Hive streaming API.
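For illustration only (the field name and casing here are assumptions): if the Avro schema declares the partition field as
{"name": "CHRONO", "type": "string"}
then the Partition Columns property value must be CHRONO, not chrono, so that it matches the Avro schema field name exactly.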