Member since: 01-17-2016
Posts: 42
Kudos Received: 50
Solutions: 4
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 3408 | 04-21-2016 10:41 PM |
| | 953 | 04-15-2016 03:22 AM |
| | 1365 | 04-13-2016 04:03 PM |
| | 4354 | 04-12-2016 01:59 PM |
02-20-2020
09:15 AM
@BI_Gabor,
Yes, this thread is older and was marked 'Solved' in April of 2016; you would have a better chance of receiving a resolution by starting a new thread. This will also give you the opportunity to include details specific to your question, which could help others give you a more accurate answer.
01-17-2017
06:43 PM
7 Kudos
If you have ever tried to spawn multiple Cloudbreak shells you may have run into an error. That is because the default "cbd util cloudbreak-shell" uses Docker containers. The fastest workaround is to use the jars directly. These jars can be run remotely from your personal machine or run on the Cloudbreak machine itself.

Prepping the Cloudbreak machine (only needed if running the jar locally on the AWS image)

- Log into your Cloudbreak instance and go to /etc/yum.repos.d
- Remove the Centos-Base.repo file (this is a Red Hat machine and the file can cause conflicts)
- Install Java 8: yum install java-1.8.0*
- Change directory back to /home/cloudbreak

Downloading the Jar

- Set a global variable equal to your Cloudbreak version: export CB_SHELL_VERSION=1.6.1
- Download the jar: curl -o cloudbreak-shell.jar https://s3-eu-west-1.amazonaws.com/maven.sequenceiq.com/releases/com/sequenceiq/cloudbreak-shell/$CB_SHELL_VERSION/cloudbreak-shell-$CB_SHELL_VERSION.jar

Using the Jar

- Interactive mode: java -jar ./cloudbreak-shell.jar --cloudbreak.address=https://<your-public-hostname> --sequenceiq.user=admin@example.com --sequenceiq.password=cloudbreak --cert.validation=false
- Using a command file: java -jar ./cloudbreak-shell.jar --cloudbreak.address=https://<your-public-hostname> --sequenceiq.user=admin@example.com --sequenceiq.password=cloudbreak --cert.validation=false --cmdfile=<your-FILE>
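For convenience, the download and launch steps can be collected into a single script. This is a minimal sketch assuming version 1.6.1 and the same placeholder hostname and credentials used above; adjust them for your own deployment.

```bash
#!/usr/bin/env bash
# Minimal sketch: download the Cloudbreak shell jar and start an interactive session.
# The address, user, and password below are the placeholders from the steps above.
set -euo pipefail

export CB_SHELL_VERSION=1.6.1
CB_ADDRESS="https://<your-public-hostname>"   # replace with your Cloudbreak endpoint

# Fetch the shell jar for the chosen version
curl -o cloudbreak-shell.jar \
  "https://s3-eu-west-1.amazonaws.com/maven.sequenceiq.com/releases/com/sequenceiq/cloudbreak-shell/${CB_SHELL_VERSION}/cloudbreak-shell-${CB_SHELL_VERSION}.jar"

# Start the shell; append --cmdfile=<your-FILE> to run a scripted session instead
java -jar ./cloudbreak-shell.jar \
  --cloudbreak.address="${CB_ADDRESS}" \
  --sequenceiq.user=admin@example.com \
  --sequenceiq.password=cloudbreak \
  --cert.validation=false
```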
11-09-2016
08:58 PM
Was the HTML content actually valid XML? Did the content viewer open? If so, is the content viewer unable to show the content in its 'formatted' form? What about the 'original' or 'hex' forms?
10-02-2016
06:21 PM
@Chris Gambino Currently you cannot create autoscaling policies with the CLI, but here is the link for the REST API on the hosted Cloudbreak: https://cloudbreak.sequenceiq.com/as/api/index.html Br, Richard
10-03-2016
06:32 AM
Yes, the 50 GB fixed root disk is for the OS, RPMs, configs, etc., but not for HDFS.
09-15-2016
11:28 AM
6 Kudos
In this article I will review the steps required to enrich and filter logs. It is assumed that the logs are landing one at a time as a stream into the NiFi cluster. The steps involved are:

- Extract attributes - IP and action
- Cold store non-IP logs
- GeoEnrich the IP address
- Cold store local IP addresses
- Route the remaining logs based on threat level
- Store the low-threat logs in HDFS
- Place high-threat logs into an external table

Extract IP Address and Action - ExtractText Processor

This processor evaluates each log and parses the information into attributes. To create a new attribute, add a property and give it a name (soon to be the attribute name) and a Java-style regex. As the processor runs it evaluates the regex and creates an attribute with the result.
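As a quick illustration, a candidate regex can be sanity-checked against a sample log line before it goes into the processor. This is only a rough local sketch: the sample iptables line and the ipaddr pattern are hypothetical, not taken from the original flow, and grep's extended regex dialect is close to, but not identical to, the Java regex that ExtractText uses.

```bash
# Hypothetical example: test candidate patterns for the 'ipaddr' and 'IsDenied'
# properties against a sample log line. This is only a rough local check,
# not a NiFi run.
LOG='Jan 12 03:14:15 gateway kernel: iptables denied: SRC=203.0.113.7 DST=10.0.0.5'

# First IP found would become the 'ipaddr' attribute
echo "$LOG" | grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}' | head -n1

# Non-empty output here means the 'IsDenied' attribute would be populated
echo "$LOG" | grep -oE 'iptables denied'
```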
If there is no match, the log is sent to the 'unmatched' relationship, which is a simple way of filtering out different logs.

GeoEnrichIP - GeoEnrichIP Processor

This processor takes the ipaddr attribute generated in the previous step and looks it up in a geo-database ('mmdb'). I am using the GeoLite - City database found here (an optional way to inspect that database locally is sketched after the routing step below).

Route on Threat - RouteOnAttribute Processor

This processor takes the IsDenied attribute from the previous step and tests whether it is present. The attribute will only exist if the "Extract IP Address" processor found "iptables denied" in the log. The log is then routed to a connection with that property's name. More properties can be added with their own rules following the NiFi Expression Language.
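As an optional aside, if you want to see what the enrichment will return for a given address outside of NiFi, the mmdblookup utility from libmaxminddb can query the same database file. This is not part of the flow; it assumes the GeoLite2-City database has been downloaded locally and that mmdblookup is installed.

```bash
# Hypothetical local check of the geo-database used by GeoEnrichIP.
# Assumes libmaxminddb's mmdblookup is installed and GeoLite2-City.mmdb is in
# the current directory; 8.8.8.8 is just a well-known public test address.
mmdblookup --file ./GeoLite2-City.mmdb --ip 8.8.8.8 country names en
```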
Note: I plan on adding location filtering but did not want to obscure the demo with too many steps.

Cold and Medium Storage - Processor Groups

These two processor groups are very similar in function. Eventually they could be combined into one shared group using attributes for rules, but for now they are separate.

- MergeContent - Takes each individual line and combines them into a larger aggregated file. This helps avoid the too-many-small-files problem that arises in large clusters.
- CompressContent - Simply saves disk space by compressing the aggregates.
- Set Filename As Timestamp - UpdateAttribute Processor - Takes each aggregate and sets the 'filename' attribute to the current time. This allows us to sort the aggregates by when they were written for later review.
- PutHDFS Processor - Takes the aggregate and saves it to HDFS.

High Threat - Processor Group

In order to be read by a Hive external table, we need to convert the data to JSON format and save it to the correct directory.

- Rename Attributes - UpdateAttribute Processor - Renames the fields to match the Hive field format.
- Put Into JSON - AttributesToJSON - Takes the renamed fields and saves them as a JSON string that the Hive SerDe can read natively.
- Set Filename As Timestamp - UpdateAttribute Processor - Once again this sets the filename to the timestamp. This may be better served as system name + timestamp moving forward.
- PutHDFS - Stores the data to the Hive external table location.

Hive Table Query

Using the Ambari Hive view I am now able to query my logs with SQL-style queries:

CREATE EXTERNAL TABLE `securitylogs`(
  `ctime` varchar(255) COMMENT 'from deserializer',
  `country` varchar(255) COMMENT 'from deserializer',
  `city` varchar(255) COMMENT 'from deserializer',
  `ipaddr` varchar(255) COMMENT 'from deserializer',
  `fullbody` varchar(5000) COMMENT 'from deserializer')
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION 'hdfs://sandbox.hortonworks.com:8020/user/nifi/High_Threat'
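As a quick example of such a query, the same table can also be hit from the command line with Beeline. This is a sketch only: the HiveServer2 JDBC URL is a placeholder, and the aggregation simply groups on the columns defined in the table above.

```bash
# Hypothetical example: count high-threat log lines per country/city from the
# command line. The JDBC URL is a placeholder for your HiveServer2 endpoint.
beeline -u 'jdbc:hive2://<hiveserver2-host>:10000/default' -e "
  SELECT country, city, COUNT(*) AS hits
  FROM securitylogs
  GROUP BY country, city
  ORDER BY hits DESC
  LIMIT 20;
"
```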
02-27-2019
09:37 AM
Thanks Chris, this is also my use case for NiFi. Can you provide a download link for the template? Thanks.
08-26-2016
08:31 PM
I entered the inner path within the mounted directory. Then I started it, and this time it wrote the XMLs to HDFS. So it seems that the Recurse Subdirectories property is not working. Were you able to use the Recurse Subdirectories property correctly? I still can't get it to write all the XMLs in all subdirectories automatically!
05-06-2016
10:37 AM
Hi @Mike Vogt, thanks, and glad to hear it worked. Could you kindly accept the answer and thus help us manage answered questions? Tnx!
04-22-2016
02:33 PM
@Francis Apel Awesome! Glad to hear it and thanks for letting everyone know it worked.