1973 Posts | 1225 Kudos Received | 124 Solutions
My Accepted Solutions
| Views | Posted |
|---|---|
| 2487 | 04-03-2024 06:39 AM |
| 3845 | 01-12-2024 08:19 AM |
| 2080 | 12-07-2023 01:49 PM |
| 3067 | 08-02-2023 07:30 AM |
| 4202 | 03-29-2023 01:22 PM |
02-04-2019
06:56 PM
4 Kudos
Log Log Log

Sudo logs have a lot of useful information on hosts, users and auditable actions that may be useful for cybersecurity, capacity planning, user tracking, data lake population, user management and general security.

Steps:
Step 1 - Get a File
Step 2 - Split Into Lines
Step 3 - Set the Mime Type to Plain Text
Step 4 - Extract Grok
Step 5 - Action! Checking for More Options (All Named Elements in the Grok Patterns)

Example Sudo Log (could be /auth.log, /var/log/sudo.log, secure, ...):

Jan 31 19:17:20 princeton0 su: pam_unix(su-l:session): session opened for user ambari-qa by (uid=0)
Jan 31 19:17:20 princeton0 su: pam_unix(su-l:session): session closed for user ambari-qa
Jan 31 19:18:19 princeton0 su: pam_unix(su-l:session): session opened for user zeppelin by (uid=0)
Jan 31 19:18:19 princeton0 su: pam_unix(su-l:session): session closed for user zeppelin
Jan 31 19:18:20 princeton0 su: pam_unix(su-l:session): session opened for user ambari-qa by (uid=0)
Jan 31 19:18:20 princeton0 su: pam_unix(su-l:session): session closed for user ambari-qa

Grok Patterns:

SUDO_TTY TTY=%{NOTSPACE:sudo_tty}
SUDO_PWD PWD=%{DATA:sudo_pwd}
SUDO_COMMAND COMMAND=%{DATA:sudo_command}
SUDO_USER %{NOTSPACE:sudo_user}
SUDO_RUNAS USER=%{SUDO_USER:sudo_runas}
SUDO_REMOVE_SESSION %{SYSLOGTIMESTAMP:timestamp8} %{NOTSPACE:hostname8} %{NOTSPACE:appcaller} \[%{NOTSPACE:pid7}\]: %{GREEDYDATA:sessionremoval}
SUDO_INFO_COMMAND_SUCCESSFUL %{SUDO_USER:sudo_user2} : %{SUDO_TTY:sudo_tty2} ; %{SUDO_PWD:sudo_pwd2} ; %{SUDO_RUNAS:sudo_runas2} ; %{SUDO_COMMAND:sudo_command2}
SUDO_INFO_PAM_UNIX_SESSION_OPENED pam_unix\(%{NOTSPACE:user1}:session\): session opened for user %{NOTSPACE:sudo_runas3} by %{SUDO_USER:sudo_user3}\(uid=%{NUMBER:uid3}\)
SUDO_INFO_PAM_UNIX_SESSION_CLOSED pam_unix\(%{NOTSPACE:user4}:session\): session closed for user %{NOTSPACE:sudo_runas4}
SUDO_PAM_OPEN2 %{SYSLOGTIMESTAMP:timestamp8} %{NOTSPACE:hostname8} %{NOTSPACE:appcaller}: pam_unix\(%{NOTSPACE:user1}:session\): session opened for user %{NOTSPACE:sudo_runas81} by \(uid=%{NUMBER:uid81}\)
SUDO_SEAT %{SYSLOGTIMESTAMP:timestamp77} %{NOTSPACE:hostname77} %{NOTSPACE:appcaller77}\[%{NOTSPACE:pid77}\]: %{GREEDYDATA:message77}
SUDO_INFO %{SUDO_INFO_COMMAND_SUCCESSFUL:cmdsuccess}|%{SUDO_INFO_PAM_UNIX_SESSION_OPENED:pam_opened}|%{SUDO_INFO_PAM_UNIX_SESSION_CLOSED:pam_closed}
SUDO_ERROR_INCORRECT_PASSWORD_ATTEMPTS %{SUDO_USER} : %{NUMBER} incorrect password attempts ; %{SUDO_TTY:sudo_tty5} ; %{SUDO_PWD:sudo_pwd5} ; %{SUDO_RUNAS:sudo_runas5} ; %{SUDO_COMMAND:sudo_cmd5}
SUDO_ERROR_FAILED_TO_GET_PASSWORD %{NOTSPACE:person6} failed to get password: %{NOTSPACE:autherror6} authentication error
SUDO_PUBLICKEY %{SYSLOGTIMESTAMP:timestamp7} %{NOTSPACE:hostname7} sshd\[%{NOTSPACE:pid7}\]: Accepted publickey for %{NOTSPACE:username} from %{NOTSPACE:sourceip} port %{NOTSPACE:port} ssh2: RSA %{NOTSPACE:rsakey}
SUDO_OPEN_PAM %{SYSLOGTIMESTAMP:timestamp8} %{NOTSPACE:hostname8} %{NOTSPACE:appcaller}\[%{NOTSPACE:pid8}\]: pam_unix\(%{NOTSPACE:user1}:session\): session opened for user %{NOTSPACE:sudo_runas} by \(uid=%{NUMBER:uid}\)
SYSLOGBASE2 (?:%{SYSLOGTIMESTAMP:timestamp9}|%{TIMESTAMP_ISO8601:timestamp8601}) (?:%{SYSLOGFACILITY} )?%{SYSLOGHOST:logsource} %{SYSLOGPROG}:
SYSLOGPAMSESSION %{SYSLOGBASE} (?=%{GREEDYDATA:message})%{WORD:pam_module}\(%{DATA:pam_caller}\): session %{WORD:pam_session_state} for user %{USERNAME:username}(?: by %{NOTSPACE:pam_by})?
CRON_ACTION [A-Z ]+
CRONLOG %{SYSLOGBASE} \(%{USER:user9}\) %{CRON_ACTION:action9} \(%{DATA:message9}\)
SYSLOGLINE %{SYSLOGBASE2} %{GREEDYDATA:message10}
SUDO_ERROR %{SUDO_ERROR_FAILED_TO_GET_PASSWORD}|%{SUDO_ERROR_INCORRECT_PASSWORD_ATTEMPTS}
GREEDYMULTILINE (.|\n)*
AUTH1 %{SYSLOGTIMESTAMP:systemauthtimestamp} %{SYSLOGHOST:systemauthhostname11} sshd(?:\[%{POSINT:systemauthpid11}\])?: %{DATA:systemauthsshevent} %{DATA:systemauthsshmethod} for (invalid user )?%{DATA:systemauthuser} from %{IPORHOST:systemauthsship} port %{NUMBER:systemauthsshport} ssh2(: %{GREEDYDATA:systemauthsshsignature})?
AUTH2 %{SYSLOGTIMESTAMP:systemauthtimestamp} %{SYSLOGHOST:systemauthhostname12} sshd(?:\[%{POSINT:systemauthpid12}\])?: %{DATA:systemauthsshevent} user %{DATA:systemauthuser} from %{IPORHOST:systemauthsship}
AUTH3 %{SYSLOGTIMESTAMP:systemauthtimestamp} %{SYSLOGHOST:systemauthhostname14} sshd(?:\[%{POSINT:systemauthpid13}\])?: Did not receive identification string from %{IPORHOST:systemauthsshdroppedip}
AUTH4 %{SYSLOGTIMESTAMP:systemauthtimestamp} %{SYSLOGHOST:systemauthhostname15} sudo(?:\[%{POSINT:systemauthpid14}\])?: \s*%{DATA:systemauthuser} :( %{DATA:systemauthsudoerror} ;)? TTY=%{DATA:systemauthsudotty} ; PWD=%{DATA:systemauthsudopwd} ; USER=%{DATA:systemauthsudouser} ; COMMAND=%{GREEDYDATA:systemauthosudocmd}
AUTH5 %{SYSLOGTIMESTAMP:systemauthtimestamp} %{SYSLOGHOST:systemauthhostname16} groupadd(?:\[%{POSINT:systemauthpid15}\])?: new group: name=%{DATA:systemauthgroupaddname}, GID=%{NUMBER:systemauthgroupaddgid}
AUTH6 %{SYSLOGTIMESTAMP:systemauthtimestamp} %{SYSLOGHOST:systemauthhostname17} useradd(?:\[%{POSINT:systemauthpid16}\])?: new user: name=%{DATA:systemauthuseraddname}, UID=%{NUMBER:systemauthuseradduid}, GID=%{NUMBER:systemauthuseraddgid}, home=%{DATA:systemauthuseraddhome}, shell=%{DATA:systemauthuseraddshell}$
AUTH7 %{SYSLOGTIMESTAMP:systemauthtimestamp} %{SYSLOGHOST:systemauthhostname18} %{DATA:systemauthprogram17}(?:\[%{POSINT:systemauthpid17}\])?: %{GREEDYMULTILINE:systemauthmessage}
AUTH_LOG %{AUTH1}|%{AUTH2}|%{AUTH3}|%{AUTH4}|%{AUTH5}|%{AUTH6}|%{AUTH7}
SU \+\s+%{DATA:su_tty19}\s+%{USER:su_user19}:%{USER:su_targetuser19}
SSH_AUTHFAIL_WRONGUSER Failed %{WORD:ssh_authmethod} for invalid user %{USERNAME:ssh_user} from %{IP:ssh_client_ip} port %{NUMBER:ssh_client_port} %{GREEDYDATA:message}
SSH_AUTHFAIL_WRONGCREDS Failed %{WORD:ssh_authmethod} for %{USERNAME:ssh_user} from %{IP:ssh_client_ip} port %{NUMBER:ssh_client_port} %{GREEDYDATA:message}
SSH_AUTH_SUCCESS Accepted %{WORD:ssh_authmethod} for %{USERNAME:ssh_user} from %{IP:ssh_client_ip} port %{NUMBER:ssh_client_port} %{WORD:ssh_x} %{WORD:ssh_pubkey_type} %{GREEDYDATA:ssh_pubkey_fingerprint}
SSH_DISCONNECT Received disconnect from %{IP:ssh_client_ip} port %{INT:ssh_client_port}.*?:\s+%{GREEDYDATA:ssh_disconnect_reason}
SSH %{SSH_DISCONNECT}|%{SSH_AUTH_SUCCESS}|%{SSH_AUTHFAIL_WRONGUSER}|%{SSH_AUTHFAIL_WRONGCREDS}
SUDO %{SUDO_INFO}|%{SUDO_ERROR}|%{SUDO_PUBLICKEY}|%{SSH}|%{SUDO_OPEN_PAM}|%{SUDO_REMOVE_SESSION}|%{SUDO_PAM_OPEN2}|%{SUDO_SEAT}
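As a quick way to sanity-check one of these patterns outside NiFi, here is a plain-Python approximation of the SUDO_PAM_OPEN2 pattern using a named regex (this is an illustration with simplified group names, not the Grok engine itself):

```python
import re

# Named-group approximation of the SUDO_PAM_OPEN2 Grok pattern
PAM_OPENED = re.compile(
    r"(?P<timestamp>\w{3} +\d+ [\d:]{8}) (?P<hostname>\S+) (?P<appcaller>\S+): "
    r"pam_unix\((?P<service>\S+):session\): "
    r"session opened for user (?P<runas>\S+) by \(uid=(?P<uid>\d+)\)"
)

# Sample line from the example sudo log above
line = ("Jan 31 19:17:20 princeton0 su: pam_unix(su-l:session): "
        "session opened for user ambari-qa by (uid=0)")

fields = PAM_OPENED.match(line).groupdict()
```

Each named group corresponds to one `%{PATTERN:name}` capture, which is exactly what the ExtractGrok processor puts into flowfile attributes.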
These patterns came from experimentation with http://grokdebug.herokuapp.com/ and from known patterns found online. You can easily add more patterns to capture many different log types. All of these fields can be pulled out in a processor, as seen in the diagram above.

Source Code: https://github.com/tspannhw/nifi-logs

Example Template: sudo.xml
01-25-2019
09:01 PM
2 Kudos
Introduction
SoChain provides a fast set of freely available public APIs (don't abuse them) for accessing information on various cryptocurrency networks.
If you need this for critical work, please donate: https://chain.so/address/devfund.
One of the things you will see in this simple flow is that NiFi excels at ingesting REST APIs and working with JSON. With a split we can shred it, filter it, manipulate it and extract from it. With the resulting usable objects we build a schema that allows us to do record processing. Once we have a set of records with a schema, we can store them to the destination of our choice.
I just hosted a Future of Data Princeton Meetup in Woodbridge, New Jersey, with some amazing speakers, sponsored by ChainNinja. While this was all about Blockchain for Enterprise and no cryptocurrency was involved, it made me want to investigate some cryptocurrency data. As you can see, manipulating complex JSON data, filtering, modifying, routing and scripting with its values is trivial in Apache NiFi.
In my next article I will investigate Hyperledger and Ethereum for enterprise solutions integration with Apache NiFi, Impala, Hive, Kudu, HBase, Spark, Kafka and other enterprise technologies.

Steps:
- We read from the URL and send the original file to immutable HDFS storage.
- In another branch, I use EvaluateJSONPath to pull out one attribute ($.data.blocks) to use to get detail records.
- I use that attribute to build a deeper REST call to get the details for the latest block: https://chain.so/api/v2/block/BTC/${block_no}. This is done in InvokeHTTP, which is a scriptable HTTP(S) call that comes in handy often.
- In the next EvaluateJSONPath I pull out all the high-level attributes of the JSON file. I want these for all the records as master fields, so they are repeated.
- After that I split the two arrays of data beneath them into two separate branches. I break these down into individual records for parsing. I could also apply a schema and handle them as groups of records.

This is an example of reading a REST API and creating a unique file name per call. Also notice it's easy to handle HTTPS as well as HTTP.
SoChain Ingest Flow for REST APIs Calls
Example Unique File Name we can script
${filename:append('btc.'):append(${now():format('yyyyMMddHHmmss')}):append(${md5}):append('.json')}
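As a rough Python sketch of what that expression produces (the `md5` attribute is assumed to be set by an upstream hashing step; `unique_filename` and its `base` parameter are hypothetical names for illustration):

```python
import hashlib
from datetime import datetime

def unique_filename(base: str, content: bytes) -> str:
    """Mimic the NiFi expression: <base>btc.<yyyyMMddHHmmss><md5-of-content>.json"""
    stamp = datetime.now().strftime("%Y%m%d%H%M%S")  # 14-character timestamp
    md5 = hashlib.md5(content).hexdigest()           # 32 hex characters
    return f"{base}btc.{stamp}{md5}.json"

name = unique_filename("", b'{"status": "success"}')
```

The timestamp plus the content hash gives a name that is unique per call even when two calls land in the same second with different payloads.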
REST URLs
https://chain.so/api/v2/get_info/BTC
https://chain.so/api/v2/get_price/BTC/USD
https://chain.so/api/v2/get_info/DOGE
https://chain.so/api/v2/get_info/LTC
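The EvaluateJSONPath step that pulls out $.data.blocks can be sketched in plain Python against a sample get_info response (the sample JSON here is a trimmed, illustrative subset, not real API output):

```python
import json

# Illustrative subset of a SoChain get_info response
sample = json.loads('{"status": "success", "data": {"name": "Bitcoin", "blocks": 559950}}')

# Equivalent of EvaluateJSONPath extracting $.data.blocks into an attribute
block_no = sample["data"]["blocks"]

# Build the deeper REST call for the latest block's details,
# as the ${block_no} expression does in InvokeHTTP
detail_url = f"https://chain.so/api/v2/block/BTC/{block_no}"
```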
An example of the value of Apache NiFi provenance (these are the attributes acquired for one flowfile):
| Attribute | Value |
|---|---|
| Access-Control-Allow-Headers | Origin,Accept,Content-Type,X-Requested-With,X-CSRF-Token |
| Access-Control-Allow-Methods | GET,POST |
| Access-Control-Allow-Origin | * |
| CF-RAY | 49e564b17e23923c-EWR |
| Cache-Control | no-cache, no-store, max-age=0, must-revalidate |
| Connection | keep-alive |
| Content-Type | application/json; charset=utf-8 |
| Date | Thu, 24 Jan 2019 20:54:07 GMT |
| Expect-CT | max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct" |
| Expires | Fri, 01 Jan 1990 00:00:00 GMT |
| Pragma | no-cache |
| Server | cloudflare |
| Set-Cookie | __cfduid=d6f52ee1552c73223442296ff7230e9fd1548363246; expires=Fri, 24-Jan-20 20:54:06 GMT; path=/; domain=.chain.so; HttpOnly, _mkra_ctxt=1a7dafd219c4972a7562f232dc63f524--200; path=/; max-age=5 |
| Status | 200 OK |
| Strict-Transport-Security | max-age=31536000;includeSubDomains |
| Transfer-Encoding | chunked |
| X-Content-Type-Options | nosniff |
| X-Download-Options | noopen |
| X-Frame-Options | SAMEORIGIN |
| X-Request-Id | 20d3f592-50b6-40cf-a496-a6f915eb463b |
| X-Runtime | 1.018401 |
| X-XSS-Protection | 1; mode=block |
| bits | 172fd633 |
| block_no | 559950 |
| blockhash | 0000000000000000001c68f61ddcc30568536a583c843a7d0c9606b9582fd7e5 |
| fee | 0.05142179 |
| filename | btc.201949241501759.json |
| fragment.count | 1 |
| fragment.identifier | cec10691-82e9-402b-84a9-7901b084f10a |
| fragment.index | 0 |
| gethttp.remote.source | chain.so |
| invokehttp.remote.dn | CN=ssl371663.cloudflaressl.com,OU=PositiveSSL Multi-Domain,OU=Domain Control Validated |
| invokehttp.request.url | https://chain.so/api/v2/block/BTC/559950 |
| invokehttp.status.code | 200 |
| invokehttp.status.message | OK |
| invokehttp.tx.id | bc8a0a18-0685-4a2c-97fa-34541b9ea929 |
| merkleroot | 41eb6f68477e96c9239ae1bbe4e5d4d02529c6f7faebc4ad801730d09609a0ef |
| mime.type | application/json; charset=utf-8 |
| mining_difficulty | 5883988430955.408 |
| network | BTC |
| next_blockhash | Empty string set |
| nonce | 1358814296 |
| path | ./ |
| previous_blockhash | 0000000000000000001b2b3d3b5741462fe31981a6c0ae9335ed8851e936664b |
| schema | chainsotxinputinfo |
| schema.name | chainsotxinputinfo |
| segment.original.filename | btc.201949241501759.json |
| sent_value | 3977.10078351 |
| size | 470242 |
| time | 1548362873 |
| uuid | 3c1d72b4-e993-4b32-a679-0741a44aeefb |
An example input record:
{
"input_no" : 0,
"address" : "3N7Vid17hE1ofGcWR6bWEmtQBQ8kKQ7iKW",
"value" : "0.20993260",
"received_from" : {
"txid" : "4e0f00cddb8e3d98de7f645684dc7526468d1dc33efbbf0bc173ed19c6556896",
"output_no" : 4
}
}
An Example LiteCoin Record
{
"status" : "success",
"data" : {
"name" : "Litecoin",
"acronym" : "LTC",
"network" : "LTC",
"symbol_htmlcode" : "Ł",
"url" : "http://www.litecoin.com/",
"mining_difficulty" : "6399667.35869154",
"unconfirmed_txs" : 8,
"blocks" : 1567929,
"price" : "0.00000000",
"price_base" : "BTC",
"price_update_time" : 1548451214,
"hashrate" : "178582229079753"
}
}
Example NiFi Flow chainso.xml
01-24-2019
09:34 AM
2 Kudos
See https://community.hortonworks.com/questions/107816/unable-to-get-schema-registry-working-in-hdf-30.html and https://github.com/hortonworks/registry/issues/339. You need permissions to your database from SAM. Did you install from HDF Ambari? Did you set up the database for SAM? See https://docs.hortonworks.com/HDPDocuments/HDF3/HDF-3.3.1/getting-started-with-streaming-analytics/content/building_an_end-to-end_stream_application.html. Also note that SAM needs HDP: https://docs.hortonworks.com/HDPDocuments/HDF3/HDF-3.3.1/installing-hdf-and-hdp/content/deploying_an_hdp_cluster_using_ambari.html
01-22-2019
09:50 PM
4 Kudos
Working with a Proximity Beacon Network Part 1

Introduction: In a retail environment, we want to be able to interact with people within the store. Our beacons can provide hyperlocalized information and also help us determine what's going on in the store for traffic patterns. Our beacons also give us temperature and other readings, like any sensor we may have in our IoT retail environment. I have set up three Estimote Proximity beacons in an indoor environment broadcasting both Estimote and iBeacon messages. iBeacon is a Bluetooth advertising protocol by Apple that is built into iPhones. In this article we ingest, filter and route the data based on the beacon IDs. In a following article we will stream our data to one or more data stores and run machine learning and analytics on our streaming time series data. I tested the BLE library from the command line with a Python script.

Cloud Setup: Since we are using our own IoT gateways and networks, I am only using the Estimote Cloud to check the beacons and make sure they are registered. Estimote provides an iPhone application (Estimote) that you can download and use to do some basic programming and range testing of the beacons.

MiniFi Setup: We run our shell script that has the Python BLE scanner run for 40 seconds and append JSON rows to a file. We then continuously tail that file, read new lines and send them to NiFi for processing. As you can see, MiniFi sends a steady stream of data to an Apache NiFi instance via the HTTP S2S API.

NiFi Setup: The flow is pretty simple. We add a schema name and split the text to get one JSON row per line. We calculate record stats, just to check. The main logic is the PartitionRecord processor, which helps us partition records into different categories based on their Estimote beacon IDs. We then route based on those partitions. We can do different filtering and handling after that if we need to.
We partition our JSON records into Avro records and pull out the Estimote id as a new attribute to use for routing. In the route we look at the partition key, which is an address for the device.

Software: Apache NiFi 1.8.0, MiniFi 0.5.0, Java JDK 1.8, Ubuntu 16.04, Apple iPhone SE, Python BLE / Beacon libraries
Beacon Types: iBeacon, Estimote
Networks: BLE and WiFi
Source Code: https://github.com/tspannhw/minifi-estimote

Schema:

{ "type" : "record",
  "name" : "estimote",
  "fields" : [
    { "name" : "battery", "type" : "int", "doc" : "Type inferred from '80'" },
    { "name" : "id", "type" : "string", "doc" : "Type inferred from '\"47a038d5eb032640\"'" },
    { "name" : "magnetic_fieldz", "type" : "double", "doc" : "Type inferred from '-0.1484375'" },
    { "name" : "magnetic_fieldx", "type" : "double", "doc" : "Type inferred from '-0.3125'" },
    { "name" : "magnetic_fieldy", "type" : "double", "doc" : "Type inferred from '0.515625'" },
    { "name" : "end", "type" : "string", "doc" : "Type inferred from '\"1547679071.5\"'" },
    { "name" : "temperature", "type" : "double", "doc" : "Type inferred from '25.5'" },
    { "name" : "cputemp1", "type" : "double", "doc" : "Type inferred from '38.0'" },
    { "name" : "memory", "type" : "double", "doc" : "Type inferred from '26.1'" },
    { "name" : "protocol_version", "type" : "int", "doc" : "Type inferred from '2'" },
    { "name" : "current_motion", "type" : "int", "doc" : "Type inferred from '420'" },
    { "name" : "te", "type" : "string", "doc" : "Type inferred from '\"0.362270116806\"'" },
    { "name" : "systemtime", "type" : "string", "doc" : "Type inferred from '\"01/16/2019 17:51:11\"'" },
    { "name" : "cputemp", "type" : "double", "doc" : "Type inferred from '39.0'" },
    { "name" : "uptime", "type" : "int", "doc" : "Type inferred from '4870800'" },
    { "name" : "host", "type" : "string", "doc" : "Type inferred from '\"Laptop\"'" },
    { "name" : "diskusage", "type" : "string", "doc" : "Type inferred from '\"418487.1\"'" },
    { "name" : "ipaddress", "type" : "string", "doc" : "Type inferred from '\"192.168.1.241\"'" },
    { "name" : "uuid", "type" : "string", "doc" : "Type inferred from '\"20190116225111_2cbbac13-fed0-4d81-a24a-3aa593b5f674\"'" },
    { "name" : "is_moving", "type" : "boolean", "doc" : "Type inferred from 'false'" },
    { "name" : "accelerationy", "type" : "double", "doc" : "Type inferred from '0.015748031496062992'" },
    { "name" : "accelerationx", "type" : "double", "doc" : "Type inferred from '0.0'" },
    { "name" : "accelerationz", "type" : "double", "doc" : "Type inferred from '1.0236220472440944'" },
    { "name" : "starttime", "type" : "string", "doc" : "Type inferred from '\"01/16/2019 17:51:11\"'" },
    { "name" : "rssi", "type" : "int", "doc" : "Type inferred from '-60'" },
    { "name" : "bt_addr", "type" : "string", "doc" : "Type inferred from '\"fa:e2:20:6e:d4:a5\"'" }
  ] }
Python Snippet:

from beacontools import parse_packet
from beacontools import BeaconScanner, EstimoteTelemetryFrameA, EstimoteTelemetryFrameB, EstimoteFilter

telemetry_b_packet = (b"\x02\x01\x04\x03\x03\x9a\xfe\x17\x16\x9a\xfe\x22\x47\xa0\x38\xd5"
                      b"\xeb\x03\x26\x40\x01\xd8\x42\xed\x73\x49\x25\x66\xbc\x2e\x50")
telemetry_b = parse_packet(telemetry_b_packet)

telemetry_a_packet = (b"\x02\x01\x04\x03\x03\x9a\xfe\x17\x16\x9a\xfe\x22\x47\xa0\x38\xd5"
                      b"\xeb\x03\x26\x40\x00\x00\x01\x41\x44\x47\xfa\xff\xff\xff\xff")
telemetry = parse_packet(telemetry_a_packet)

Example Data:

{"battery": 80, "id": "47a038d5eb032640", "magnetic_fieldz": -0.1484375, "magnetic_fieldx": -0.3125, "magnetic_fieldy": 0.515625, "end": "1548194024.99", "temperature": 25.5, "cputemp1": 45.0, "memory": 42.6, "protocol_version": 2, "current_motion": 420, "te": "39.767373085", "systemtime": "01/22/2019 16:53:44", "cputemp": 43.0, "uptime": 4870800, "host": "Laptop", "diskusage": "418124.2", "ipaddress": "192.168.1.241", "uuid": "20190122215344_2a41168e-31da-4ae7-bf62-0b300c69cd5b", "is_moving": false, "accelerationy": 0.015748031496062992, "accelerationx": 0.0, "accelerationz": 1.0236220472440944, "starttime": "01/22/2019 16:53:05", "rssi": -63, "bt_addr": "fa:e2:20:6e:d4:a5"}

We have several values from the Ubuntu MiniFi host machine: host, diskusage, ipaddress, cputemp, memory. We have important values from the three beacons: battery, magnetic_field(x, y, z), current_motion, id, bt_addr, rssi, estimoteid, temperature.

Reference Articles:
https://community.hortonworks.com/articles/99861/ingesting-ibeacon-data-via-ble-to-mqtt-wifi-gatewa.html
https://community.hortonworks.com/articles/108947/minifi-for-ble-bluetooth-low-energy-beacon-data-in.html
https://community.hortonworks.com/articles/131320/using-partitionrecord-grokreaderjsonwriter-to-pars.html

Resources:
https://en.wikipedia.org/wiki/Bluetooth_low_energy_beacon
https://cloud.estimote.com/#/beacons
https://developer.estimote.com/ibeacon/
https://developer.apple.com/ibeacon/Getting-Started-with-iBeacon.pdf
https://pypi.org/project/beacontools/
https://www.instructables.com/id/iBeacon-Entry-System-with-the-Raspberry-Pi-and-Azu/#step0
https://github.com/switchdoclabs/iBeacon-Scanner-
https://developer.estimote.com/android/tutorial/part-1-setting-up/
https://developer.estimote.com/
https://github.com/flyinactor91/RasPi-iBeacons
https://github.com/GillisWerrebrouck/BeaconScanner
https://github.com/emanuele-falzone/pedestrian-gate-automation
https://github.com/biagiobotticelli/SmartTeamTrackingServer
https://github.com/citruz/beacontools/blob/master/examples/parser_example.py
https://github.com/citruz/beacontools/blob/master/beacontools/packet_types/ibeacon.py
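The PartitionRecord/RouteOnAttribute logic described above can be sketched in plain Python: group incoming JSON rows by their Estimote id, then route each group by its partition key. The records here are trimmed-down samples, and the second beacon id is a hypothetical placeholder:

```python
import json
from collections import defaultdict

# Trimmed-down sample rows like those emitted by the BLE scanner
lines = [
    '{"id": "47a038d5eb032640", "temperature": 25.5, "rssi": -63}',
    '{"id": "47a038d5eb032640", "temperature": 25.6, "rssi": -60}',
    '{"id": "aabbccddeeff0011", "temperature": 24.1, "rssi": -71}',  # hypothetical second beacon
]

# Equivalent of PartitionRecord on /id: one bucket of records per beacon
partitions = defaultdict(list)
for line in lines:
    record = json.loads(line)
    partitions[record["id"]].append(record)

# Equivalent of RouteOnAttribute: choose a relationship per partition key
routes = {beacon_id: ("known" if beacon_id == "47a038d5eb032640" else "unmatched")
          for beacon_id in partitions}
```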
01-18-2019
09:29 PM
3 Kudos
I need to parse Kerberos KDC log files (including the currently filling file) to find users, and their hosts, that are connecting. Using Grok in NiFi, we can parse out a lot of different parts of these files and use them for filtering and alerting with ease. This is what many of the lines in the log file look like:

Jan 01 03:31:01 somenewserver-310 krb5kdc[28593](info): AS_REQ (4 etypes {18 17 16 23}) 192.168.237.220: ISSUE: authtime 1546278185, etypes {rep=18 tkt=16 ses=18}, nn/somenewserver-310.field.hortonworks.com@HWX.COM for nn/somenewserver-310.field.hortonworks.com@HWX.COM

We also have the option of using the GrokReader (covered in an article linked below) to immediately convert matching records to output formats like JSON or Avro and then partition them into groups. We'll do that in a later article. In this one, we can get a line from the file via the TailFile processor, read a list of files and fetch one at a time, or generate a flow file for testing. Once we have some data, we'll start parsing it into different message types. These messages can then be used for alerting, routing, and permanent storage in Hive/Impala/HBase/Kudu/Druid/S3/object storage/etc. In the next step we will do some routing and alerting, followed by some natural language processing (NLP) and machine learning, and then we'll use various tools to search, aggregate, query, catalog, report on and build dashboards from this type of log and others.
Example Output JSON Formatted
PREAUTH
{
"date" : "Jan 07 02:25:15",
"etypes" : "2 etypes {23 16}",
"MONTH" : "Jan",
"HOUR" : "02",
"emailhost" : "cloudera.net",
"TIME" : "02:25:15",
"pid" : "21546",
"loghost" : "KDCHOST1",
"kuser" : "krbtgt",
"message" : "Additional pre-authentication required",
"emailuser" : "user1",
"MINUTE" : "25",
"SECOND" : "15",
"LOGLEVEL" : "info",
"MONTHDAY" : "01",
"apphost" : "APP_HOST1",
"kuserhost" : "cloudera.net@cloudera.net"
}
ISSUE
{
"date" : "Jan 01 03:20:09",
"etypes" : "2 etypes {23 18}",
"MONTH" : "Jan",
"HOUR" : "03",
"BASE10NUM" : "1546330809",
"emailhost" : "cloudera.net",
"TIME" : "03:20:09",
"pid" : "24546",
"loghost" : "KDCHOST1",
"kuser" : "krbtgt",
"message" : "",
"emailuser" : "user1",
"authtime" : "1546330809",
"MINUTE" : "20",
"SECOND" : "09",
"etypes2" : "rep=23 tkt=18 ses=23",
"LOGLEVEL" : "info",
"MONTHDAY" : "01",
"apphost" : "APP_HOST1",
"kuserhost" : "cloudera.net@cloudera.net"
}
Grok Expressions
For Parsing Failure Records
%{SYSLOGTIMESTAMP:date} %{HOSTNAME:loghost} krb5kdc\[%{POSINT:pid}\]\(%{LOGLEVEL}\): %{GREEDYDATA:premessage}failure%{GREEDYDATA:postmessage}
For Parsing PREAUTH Records
%{SYSLOGTIMESTAMP:date} %{HOSTNAME:loghost} krb5kdc\[%{POSINT:pid}\]\(%{LOGLEVEL}\): AS_REQ \(%{GREEDYDATA:etypes}\) %{GREEDYDATA:apphost}: NEEDED_PREAUTH: %{USERNAME:emailuser}@%{HOSTNAME:emailhost} for %{GREEDYDATA:kuser}/%{GREEDYDATA:kuserhost}, %{GREEDYDATA:message}
For Parsing ISSUE Records
%{SYSLOGTIMESTAMP:date} %{HOSTNAME:loghost} krb5kdc\[%{POSINT:pid}\]\(%{LOGLEVEL}\): AS_REQ \(%{GREEDYDATA:etypes}\) %{GREEDYDATA:apphost}: ISSUE: authtime %{NUMBER:authtime}, etypes \{%{GREEDYDATA:etypes2}\}, %{USERNAME:emailuser}@%{HOSTNAME:emailhost} for %{GREEDYDATA:kuser}/%{GREEDYDATA:kuserhost}%{GREEDYDATA:message}
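The ISSUE expression above can be sanity-checked with a plain named-group regex in Python (an approximation of the Grok pattern for illustration, with simplified group names, not the GrokReader itself):

```python
import re

# Named-group approximation of the ISSUE Grok expression
ISSUE = re.compile(
    r"(?P<date>\w{3} \d{2} [\d:]{8}) (?P<loghost>\S+) "
    r"krb5kdc\[(?P<pid>\d+)\]\((?P<loglevel>\w+)\): "
    r"AS_REQ \((?P<etypes>[^)]+)\) (?P<apphost>\S+): ISSUE: "
    r"authtime (?P<authtime>\d+), etypes \{(?P<etypes2>[^}]+)\}, "
    r"(?P<client>\S+) for (?P<service>\S+)"
)

# The sample KDC line shown at the top of this article
line = ("Jan 01 03:31:01 somenewserver-310 krb5kdc[28593](info): "
        "AS_REQ (4 etypes {18 17 16 23}) 192.168.237.220: ISSUE: "
        "authtime 1546278185, etypes {rep=18 tkt=16 ses=18}, "
        "nn/somenewserver-310.field.hortonworks.com@HWX.COM for "
        "nn/somenewserver-310.field.hortonworks.com@HWX.COM")

fields = ISSUE.match(line).groupdict()
```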
Resources:
For Testing Grok Against Your Files
http://grokdebug.herokuapp.com/
A Great Article on Using GrokReader for Record Oriented Processing
https://community.hortonworks.com/articles/131320/using-partitionrecord-grokreaderjsonwriter-to-pars.html

More About Grok:
https://datahovel.com/2018/07/
https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-record-serialization-services-nar/1.7.1/org.apache.nifi.grok.GrokReader/additionalDetails.html
http://grokconstructor.appspot.com/do/automatic?example=0
https://gist.github.com/acobaugh/5aecffbaaa593d80022b3534e5363a2d
01-11-2019
07:44 PM
Very nice! I did two articles a few years ago, but things have advanced greatly. https://community.hortonworks.com/articles/88404/adding-and-using-hplsql-and-hivemall-with-hive-mac.html https://community.hortonworks.com/articles/67983/apache-hive-with-apache-hivemall.html
01-11-2019
02:50 PM
Whenever you send something from MiniFi to NiFi over Site-to-Site, as soon as it arrives in NiFi it has all the provenance data.
01-07-2019
08:46 PM
6 Kudos
Ingesting Drone Data From DJI Ryze Tello Drones Part 1 - Setup and Practice

In Part 1, we will set up our drone and our communication environment, capture the data and do initial analysis. We will eventually grab the live video stream for object detection, real-time flight control and real-time data ingest of photos, videos and sensor readings. We will have Apache NiFi react to live situations facing the drone and have it issue flight commands via UDP. In this initial section, we will control the drone with Python, which can be triggered by NiFi. Apache NiFi will ingest log data that is stored as CSV files on a NiFi node connected to the drone's WiFi. This will eventually move to a dedicated embedded device running MiniFi. This is a small personal drone with less than 13 minutes of flight time per battery. It is not a commercial drone, but it gives you an idea of what you can do with drones.

Drone Live Communications for Sensor Readings and Drone Control
You must connect to the drone's WiFi network, which will be named Tello(Something).
Tello IP: 192.168.10.1, UDP port: 8889

Receive Tello Video Stream
Tello IP: 192.168.10.1, UDP server: 0.0.0.0, UDP port: 11111

Example Install:

pip3.6 install tellopy
git clone https://github.com/hanyazou/TelloPy
pip3.6 install av
pip3.6 install opencv-python
pip3.6 install image
python3.6 -m tellopy.examples.video_effect

Example Run Video: https://www.youtube.com/watch?v=mYbStkcnhsk&t=0s&list=PL-7XqvSmQqfTSihuoIP_ZAnN7mFIHkZ_e&index=18
Example Flight Log: Tello-flight log.pdf

Let's build a quick ingest with Apache NiFi 1.8. As our first step, we use a local Apache NiFi to read the CSVs from the drone run locally. We read the CSVs from the Tello logging directory, add a schema definition and query it. We have a controller service for CSV processing, using the posted schema and the Jackson CSV parser. We want to ignore the header as it has invalid characters. We use a QueryRecord processor to find whether the position in Z has changed:

SELECT * FROM FLOWFILE WHERE mvo_pos_z IS NOT NULL AND CAST(mvo_pos_z AS FLOAT) <> 0.0

We also convert from CSV to Apache Avro format for further processing. Valid records are sent over HTTP(S) Site-to-Site to a cloud-hosted Apache NiFi cluster and saved to an HBase table. Our data didn't have a record identifier, so I use the UpdateRecord processor to create one and add it to the data. I updated the schema to have this field (with a default, allowing nulls). As you can see, it's trivial to store these records in HBase.

Schema:

{ "type" : "record", "name" : "drone",
"fields" : [
{ "name" : "drone_rec_id", "type" : [ "string", "null" ], "default": "1000" },
{ "name" : "mvo_vel_x", "type" : ["double","null"], "default": "0.00" },
{ "name" : "mvo_vel_y", "type" : ["string","null"], "default": "0.00" },
{ "name" : "mvo_vel_z", "type" : ["double","null"], "default": "0.00" },
{ "name" : "mvo_pos_x", "type" : ["string","null"], "default": "0.00" },
{ "name" : "mvo_pos_y", "type" : ["double","null"], "default": "0.00"},
{ "name" : "mvo_pos_z", "type" : ["string","null"], "default": "0.00" },
{ "name" : "imu_acc_x", "type" : ["double","null"], "default": "0.00" },
{ "name" : "imu_acc_y", "type" : ["double","null"], "default": "0.00" },
{ "name" : "imu_acc_z", "type" : ["double","null"], "default": "0.00" },
{ "name" : "imu_gyro_x", "type" : ["double","null"], "default": "0.00" },
{ "name" : "imu_gyro_y", "type" : ["double","null"], "default": "0.00" },
{ "name" : "imu_gyro_z", "type" : ["double","null"], "default": "0.00" },
{ "name" : "imu_q0", "type" : ["double","null"], "default": "0.00" },
{ "name" : "imu_q1", "type" : ["double","null"], "default": "0.00" },
{ "name" : "imu_q2", "type" : ["double","null"], "default": "0.00" },
{ "name" : "self_q3", "type" : ["double","null"], "default": "0.00" },
{ "name" : "imu_vg_x", "type" : ["double","null"], "default": "0.00" },
{ "name" : "imu_vg_y", "type" : ["double","null"], "default": "0.00" },
{ "name" : "imu_vg_z", "type" : ["double","null"], "default": "0.00" } ] }
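The QueryRecord filter above can be sketched in Python over rows of the Tello CSV. The column names come from the schema; the sample rows are illustrative, not real flight data:

```python
import csv
import io

# Illustrative rows: a header plus two samples (subset of the real Tello log columns)
raw = """mvo_vel_x,mvo_pos_z
0.01,0.0
0.02,-1.25
"""

rows = list(csv.DictReader(io.StringIO(raw)))

# Equivalent of: SELECT * FROM FLOWFILE WHERE mvo_pos_z IS NOT NULL
#                AND CAST(mvo_pos_z AS FLOAT) <> 0.0
moved = [r for r in rows if r["mvo_pos_z"] and float(r["mvo_pos_z"]) != 0.0]
```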
The updated schema now has a record id; the original schema derived from the raw data does not.

Store the Data in an HBase Table

Soon we will be storing in Kudu, Impala, Hive, Druid and S3.

create 'drone', 'drone'

Source: We are using the TelloPy interface. You need to clone that GitHub repository and drop in the files from nifi-drone.
https://github.com/hanyazou/TelloPy
https://github.com/tspannhw/nifi-drone

Apache NiFi Flows: dronelocal.xml dronecloud.xml

References:
https://gobot.io/blog/2018/04/20/hello-tello-hacking-drones-with-go/
https://github.com/grofattila/dji-tello
https://github.com/dbaldwin/droneblocks-tello-python
https://medium.com/@makerhacks/programming-the-ryze-dji-tello-with-python-eecd56fc2c27
https://github.com/Ubotica/telloCV/
https://www.instructables.com/id/Ultimate-Intelligent-Fully-Automatic-Drone-Robot-w/
https://github.com/hybridgroup/gobot/tree/master/platforms/dji/tello
https://www.ryzerobotics.com/tello
https://dl-cdn.ryzerobotics.com/downloads/Tello/20180404/Tello_User_Manual_V1.2_EN.pdf
https://dl-cdn.ryzerobotics.com/downloads/Tello/20180212/Tello+Quick+Start+Guide_V1.2+multi.pdf
https://dl-cdn.ryzerobotics.com/downloads/tello/20180910/Tello%20Scratch%20README.pdf
https://dl-cdn.ryzerobotics.com/downloads/tello/20180910/scratch0907.7z
https://www.ryzerobotics.com/tello/downloads
https://www.hackster.io/econnie323/alexa-voice-controlled-tello-drone-760615
https://tellopilots.com/forums/tello-development.8/
https://medium.com/@swalters/dji-ryze-tello-drone-gets-reverse-engineered-46a65d83e6b5
http://www.fabriziomarini.com/2018/04/java-udp-drone-tello.html?m=1
https://github.com/microlinux/tello/blob/master/tello.py
https://github.com/hybridgroup/gophercon-2018/blob/master/drone/tello/README.md
https://tellopilots.com/threads/object-tracking-with-tello.1480/
https://github.com/gnamingo/jTello/blob/master/JTello.java
https://github.com/microlinux/tello/blob/master/README.md
https://steemit.com/python/@makerhacks/programming-the-ryze-dji-tello-with-python
https://github.com/dji-sdk/Tello-Python
https://github.com/dji-sdk/Tello-Python/tree/master/Tello_Video_With_Pose_Recognition
https://github.com/DaWelter/h264decoder
https://github.com/twilightdema/h264j
http://jcodec.org/
https://github.com/cisco/openh264
https://github.com/hanyazou/TelloPy/blob/develop-0.7.0/tellopy/examples/video_effect.py
01-07-2019
06:17 PM
It will not work. See https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+and+Java+9%2C+10%2C+11 — support is still being developed, and lots of things were deprecated and changed in JDK 11.
12-29-2018
06:15 AM
An example of an image.