Member since: 08-13-2019
Posts: 37
Kudos Received: 25
Solutions: 6
My Accepted Solutions
| Title | Views | Posted |
| --- | --- | --- |
|  | 3202 | 12-31-2018 08:44 AM |
|  | 987 | 12-18-2018 08:39 PM |
|  | 891 | 08-27-2018 11:29 AM |
|  | 1778 | 10-12-2017 08:35 PM |
|  | 1241 | 08-06-2017 02:57 PM |
01-07-2019
12:27 PM
1 Kudo
Hi @haco fayik That looks great. Sounds like you got around the initial problem of ingesting data into Metron. There could be multiple reasons for the new issue, e.g. the parser, enrichment, and indexing topologies not running or being misconfigured. Would you create a new question for this and provide more details, such as the worker logs of those topologies? Would you also mark the answer that helped you most with the ingest problem as "Best Answer"? Thanks!
12-31-2018
12:59 PM
1 Kudo
@haco fayik There are many ways to do this. You should probably search this community in the NiFi section or get familiar with NiFi in general. As a short overview, these are the most common cases for Metron ingestion I'm encountering in the field:
- Your sources push their messages to a syslog server. You can configure your syslog server to forward the data to your NiFi instance over TCP or UDP. In this case you'd need a "ListenSyslog" processor and a "PublishKafka" processor.
- You already have a log forwarder capable of pushing data to Kafka (e.g. winlogbeat): https://www.elastic.co/guide/en/beats/winlogbeat/current/configuring-output.html . In this case you won't need NiFi, if you are comfortable using winlogbeat.
- You install MiNiFi on all servers to act as a simple log forwarder over TCP. You'd send those packets to a NiFi instance/cluster (similar to the syslog approach), receive them via a "ListenTCP" processor, and push your messages into Kafka using the "PublishKafka" processor. You could also send data directly into Kafka from MiNiFi.
Note: If your Kafka cluster is secured with Kerberos, this might influence your choice. A quick way to smoke-test the syslog path end to end is sketched below.
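If you go the syslog route, a quick test is to hand-craft a syslog line, send it to the NiFi ListenSyslog port, and watch it arrive in Kafka. This is only a sketch: the hostnames, the listener port 5140, and the script paths are assumptions for a typical HDP-style install and will differ in your environment.

# send a fake syslog message to the NiFi ListenSyslog processor (UDP, port assumed to be 5140)
echo '<13>Dec 31 12:00:00 winserver01 app: test message' | nc -u -w1 nifi-host.example.com 5140

# verify that NiFi published it to the parser topic (broker host/port assumed)
/usr/hdp/current/kafka-broker/bin/kafka-console-consumer.sh \
  --bootstrap-server kafkabroker.example.com:6667 \
  --topic windows-event-log --from-beginning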
12-31-2018
08:44 AM
1 Kudo
Hi @haco fayik, as a starting point you need to push data into a parser-specific Kafka topic (you can call the topic "windows-event-log"), then configure a parser in the Metron Management UI and start it. In the parser configuration you tell Metron from which Kafka topic the messages are picked up ("windows-event-log" in our case) and how to parse the incoming messages. NiFi is a great tool to collect data from various sources and push it into Kafka. Maybe my article helps you: https://datahovel.com/2018/07/18/how-to-onboard-a-new-data-source-in-apache-metron/ If you have more specific questions, don't hesitate to ask!
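A minimal sketch of creating the parser topic and pushing a test message from the command line; the ZooKeeper/broker hostnames, ports, partition counts, and script paths are assumptions for an HDP-style install and need to be adapted.

# create the parser-specific topic
/usr/hdp/current/kafka-broker/bin/kafka-topics.sh \
  --zookeeper zookeeper.example.com:2181 \
  --create --topic windows-event-log --partitions 1 --replication-factor 1

# push a single test event so you can watch it flow through the parser
echo 'test windows event' | /usr/hdp/current/kafka-broker/bin/kafka-console-producer.sh \
  --broker-list kafkabroker.example.com:6667 --topic windows-event-log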
12-31-2018
08:31 AM
Hi @Amirul Seems like you got it working, since you are seeing events in Elasticsearch. 🙂 You mention an "Indexing error". Do you have a log snippet that shows the error message?
12-18-2018
08:39 PM
Hi @Amirul Yes, you need to create a template. The best way to create one is to take an existing template from the example parsers delivered with HCP/Metron and modify it to fit your new parser. You'd need at least a section in that template with:
"metron_alert" : {
  "type" : "nested"
}
I've written a small blog post about what you need to take care of when you create a template: https://datahovel.com/2018/11/27/how-to-define-elastic-search-templates-for-apache-metron/ Here is the official documentation that describes that you need a template: https://metron.apache.org/current-book/metron-platform/metron-elasticsearch/index.html
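For context, a minimal index template around that snippet could be pushed to Elasticsearch like this. The template name, index pattern, and document type ("windows_event_log_doc") are assumptions that must match your parser, and the rest of the mapping is omitted here.

curl -XPUT 'http://es-host.example.com:9200/_template/windows_event_log_index' \
  -H 'Content-Type: application/json' -d '
{
  "template": "windows_event_log_index*",
  "mappings": {
    "windows_event_log_doc": {
      "properties": {
        "metron_alert": { "type": "nested" }
      }
    }
  }
}'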
12-11-2018
10:55 AM
Which version of HDP are you using? How many workers, spouts, and ackers did you configure for your parser topology?
12-11-2018
10:52 AM
1 Kudo
@Amirul Are you using Ambari, or have you installed Metron manually? If you are using Ambari, it tries to create the table on every restart if it doesn't exist. If that doesn't happen, you might have a permissions problem. You can check the HBase access audit logs if you run into this. A quick manual check from the HBase shell is sketched below.
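A minimal sketch for checking from the HBase shell whether the table exists and whether the metron user has access; the table name "enrichment" and column family "t" are the usual Metron defaults but are assumptions here.

# list existing tables and check whether the expected one is there
echo "list" | hbase shell

# if it is missing and Ambari cannot create it, create it and grant the metron user access manually
echo "create 'enrichment', 't'" | hbase shell
echo "grant 'metron', 'RWC', 'enrichment'" | hbase shell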
08-27-2018
11:38 AM
@mojgan ghasemi You could start by defining it via Grok using the GrokParser, e.g. as SYSLOG_HEADER + MESSAGE, where the SYSLOG_HEADER could look like:
<%{POSINT:syslog_priority}>%{SYSLOGTIMESTAMP:date} %{IPORHOST:device}
Or write your own Java parser: https://metron.apache.org/current-book/metron-platform/metron-parsers/3rdPartyParser.html A built-in syslog parsing capability is planned for future Metron versions: https://issues.apache.org/jira/browse/METRON-1453
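A sketch of how a complete Grok statement for this could look and how to make it available to the parser; the pattern name, the GREEDYDATA capture for the message part, and the HDFS pattern directory are assumptions (adjust to your Metron install).

# write the full Grok statement (header + message) to a pattern file
cat > windows_syslog.grok <<'EOF'
WINDOWS_SYSLOG <%{POSINT:syslog_priority}>%{SYSLOGTIMESTAMP:date} %{IPORHOST:device} %{GREEDYDATA:message}
EOF

# put it where the GrokParser can pick it up (default pattern location assumed)
hdfs dfs -put windows_syslog.grok /apps/metron/patterns/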
08-27-2018
11:29 AM
Hi @Sarvesh Kumar Apache Metron gives you all the tools you need to:
- Extract and parse the information from your event. So if the event's message contains the information about whether the device has shut down, you'll be able to create a rule around it.
- Aggregate data and create profiles of devices in certain time windows. So you could create a small function that evaluates the status of a device in a certain time frame and checks whether the device is up (see the profile sketch below).
- Disk memory full: if the event source contains the current disk space (and ideally also the maximum amount of disk space available), it's just a simple rule to add in order to create an alert.
Regarding your unsupervised learning question: your examples don't require machine learning, because they are rule based. You'd want to use machine learning to train a model that generates alerts based on data rather than on rules (in most cases this is "supervised" learning based on "is alert" or "is not alert"). However, Metron provides a "Model as a Service" capability, which allows you to deploy models to evaluate events and enrich them. That being said, Metron does not provide models for you. Creating features and models is the data scientist's job, and how thoroughly this is done determines how many accurate alerts (ideally all of them) and how many false positives (ideally none) you get. Hope that helped!
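To make the profiling bullet above concrete, here is a minimal sketch of a Metron profiler definition that counts events per device in each time window; the profile name, the ip_src_addr grouping field, and the idea of alerting when the count drops to zero are illustrative assumptions, not part of the original answer.

{
  "profiles": [
    {
      "profile": "device_event_count",
      "foreach": "ip_src_addr",
      "onlyif": "true",
      "init":   { "count": "0" },
      "update": { "count": "count + 1" },
      "result": "count"
    }
  ]
}

A downstream rule (for example in threat triage) could then flag devices whose count for the last window is zero.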
07-27-2018
01:14 PM
@Napoleon Treizieme you need to set your timeFields field like this: ["datetime"], not like this: "[datetime]".
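For context, a sketch of where that setting lives in the sensor's parserConfig; the surrounding values (grok path, pattern label, date format) are illustrative assumptions.

"parserConfig": {
  "grokPath": "/apps/metron/patterns/mysensor.grok",
  "patternLabel": "MYSENSOR",
  "timestampField": "datetime",
  "timeFields": ["datetime"],
  "dateFormat": "yyyy-MM-dd HH:mm:ss"
}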
07-13-2018
11:04 AM
@abu ameen Most probably there are too many open connections to ZooKeeper from that host. ZooKeeper limits the maximum number of connections from a single host; by default this is 60. If there are already 60 open connections to ZooKeeper from a single host, no further connections can be opened, which is why the script fails. To check from where the connections are opened, execute:
netstat -nape | awk '{if ($5 == "ADD_MY_ZOOKEEPER_IP_HERE:2181") print $4, $9;}' | sort | uniq -c
Identify the process ids of the applications opening connections, e.g.:
1 192.168.123.1:45162 577/java
In this case it's 577. Check which application this is. It might be the Metron REST server; in that case you can kill it and restart it.
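Alternatively, ZooKeeper itself can list its open connections via the "cons" four-letter command, and the per-host limit can be raised in zoo.cfg; the hostname below is a placeholder.

# list current connections as seen by ZooKeeper
echo cons | nc zookeeper.example.com 2181

# to raise the per-host limit, set this in zoo.cfg (in Ambari: ZooKeeper configs) and restart ZooKeeper
# maxClientCnxns=100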
07-13-2018
08:10 AM
Summary
Using Apache Solr as the indexing and search engine for Metron requires the Metron REST service to perform queries against multiple collections. If the Ranger plugin is active, there is currently a gotcha (= a Ranger Solr plugin bug). If you don't want to give the Metron user full access to all Solr collections, here is a workaround.
The Problem
- 2+ Solr collections that are being queried: metaalert, cef, ... (and other parser collections)
- 1 user: metron
- 1 Ranger policy: user "metron", access type "Read", "Write", collections "metaalert", "cef"
A query of the metaalert collection returns the content of the metaalert collection as expected and logs the event successfully in the Ranger audit:
curl -k --negotiate -u : "http://solr_url:solr_port/solr/metaalert/search?q=*"
A query of the cef collection returns the content of the cef collection as expected and logs it successfully in the Ranger audit:
curl -k --negotiate -u : "http://solr_url:solr_port/solr/cef/search?q=*"
A query of metaalert and cef together returns a "403 Unauthorized request". This is what the Metron REST server does:
curl -k --negotiate -u : "http://solr_url:solr_port/solr/metaalert/select?q=*&collections=metaalert,cef"
In the Ranger audit we now see 3 lines:
- user: metron, resource: metaalert,cef, Result: Denied
- user: metron, resource: metaalert, Result: Allowed
- user: metron, resource: cef, Result: Allowed
The expectation would be that the query is successful!
Workaround(s)
One workaround would be to give metron access to all collections: "*". We usually don't want that on clusters that are being used by other use cases. Another workaround is to give metron access to the "*metaalert*" collection pattern.
Tags:
- CyberSecurity
- hcp
- Issue Resolution
- Ranger
- ranger-plugins
- Security
- solr
07-12-2018
07:48 AM
2 Kudos
Hi Dobromir, just a short (maybe not satisfying) answer: you shouldn't use the hive user to write data, but only for operations. If you need to use the hive user for some reason, and you use Apache Ranger for authorization: why not give the hive user access to the data at the HDFS level, if it *really* needs to write there? If you are not managing permissions with Ranger (which is recommended), you should look into "umask" to change the default permissions of newly created directories and files, or "HDFS ACLs" to give users and groups access to HDFS without Ranger (see the sketch below). Best, Stefan
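A minimal sketch of the HDFS ACL route, assuming a hypothetical landing directory /data/landing; note that the default umask is usually changed via fs.permissions.umask-mode in core-site.xml rather than on the command line.

# grant the hive user read/write/execute on a specific directory via HDFS ACLs
hdfs dfs -setfacl -m user:hive:rwx /data/landing

# verify the resulting ACL
hdfs dfs -getfacl /data/landing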
03-30-2018
09:43 AM
1 Kudo
@Prashant Band @Robert Fidler @William Ardianto
I assume you are running on a secure cluster (Kerberos is active), right?
- Cleanly delete (or let your admin delete) the /tmp/${user.name} directory.
- Make sure that the Kerberos principal you are using to run the jobs matches the OS user:
myusername@mymachine $ kinit myusername
- Make sure the OS user exists on all (worker) nodes of the cluster.
It's bad practice to submit a job as an OS user that is different from the Kerberos principal used to authenticate against the service.
12-01-2017
09:44 PM
@nshelke Thanks, worked fine. I tried to configure it in HA mode analogously to the Hive service, but it didn't work out. Did you try it in HA mode as well?
11-12-2017
08:20 PM
@vishwa ".... end end user in hive CLI" That explains your issue 🙂 . You shouldn't try to use Hive CLI to connect to Hiveserver2. You should be using beeline. Hope that solves this issue for you 😉
11-09-2017
07:57 PM
@vishwa You shouldn't set "hive.server2.enable.doAs=true" in the hive-interactive section; that doesn't make sense from a resources point of view. However, you can set it on the main config page. These two settings are independent, even though they have the same name. Either way, you access tables as the user you are authenticated as against HiveServer2 and can use fine-grained authorization on your tables. The only difference is the user of the system processes running the queries: with hive.server2.enable.doAs=true the query runs in a container owned by the authenticated user, while "...=false" runs it as the hive user.
11-09-2017
08:05 AM
Hi @vishwa LLAP doesn't have any issue with it; the setting is simply ignored. So you can run your batch Hive instances in RunAs mode, while your Hive interactive (LLAP) server runs your jobs as the hive user. Your actual issue seems to be: "No running LLAP daemons!". In order to run a job, you first need to bring up the LLAP daemons cleanly. If that fails, have a look at the LLAP daemon logs in YARN and check why they are not coming up or are crashing.
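A sketch of how to pull those daemon logs; the application id is a placeholder you can look up in the YARN ResourceManager UI or via the list command.

# find the LLAP YARN application
yarn application -list

# fetch its aggregated container logs (replace with your real application id)
yarn logs -applicationId application_1234567890123_0042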
10-12-2017
08:35 PM
@Srikanth Gorripati
On the infrastructure side you need to have the following in place:
- A running HBase cluster
- At least one HBase REST server up and running. If you have more, you can configure Knox in HA mode.
- Knox configured to point to the HBase REST server URL and port(s). You can get some help troubleshooting Knox here.
On the application side you need to:
- Use a library that can perform HTTP requests. (There are many, and you probably have a favourite one.)
- Direct your call to the Knox url:port endpoint, using the path of the defined topology and the HBase service: /gateway/default/hbase
- Use the correct path of the HBase REST API as defined in the HBase book (see the example below).
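A minimal end-to-end sketch with curl; the Knox host, port, topology name "default", credentials, and table name are placeholders.

# ask the HBase REST server (through Knox) for its version
curl -ik -u myuser:mypassword \
  "https://knox-host.example.com:8443/gateway/default/hbase/version"

# read a row from a table: /<table>/<rowkey>
curl -ik -u myuser:mypassword -H "Accept: application/json" \
  "https://knox-host.example.com:8443/gateway/default/hbase/my_table/row1"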
09-01-2017
12:55 PM
Yes, if you want to be sure that users are who they say they are, you have to use Kerberos. Kerberos *is* for internal authentication. People/services who come from outside can avoid accessing the cluster via Kerberos by using Knox as the secure gateway to Hadoop. Ranger handles authorization via plugins; it can also be used as a centralized authorization platform in a cluster that is not kerberized. (Ranger = authorization, Kerberos = authentication.) Phoenix tables are actually HBase tables, so you'd need to create Ranger policies for the HBase plugin to authorize users on Phoenix tables. Start by reading an article I posted earlier; it explains the security concepts: https://community.hortonworks.com/content/kbentry/102957/hadoop-security-concepts.html
08-06-2017
02:57 PM
1 Kudo
Hi @Rishabh Oberoi The Kerberos principal and the OS user don't have much in common. Each OS user can authenticate as multiple Kerberos principals. The Kerberos principal is stored in a file called the "ticket cache". You can see which principal you are at the moment using the "klist" command. Just type "klist". In this example I am authenticated as "jimmy.page" in the Kerberos REALM "FIELD.HORTONWORKS.COM". $ klist
Ticket cache: FILE:/tmp/krb5cc_1960402946
Default principal: jimmy.page@FIELD.HORTONWORKS.COM
Valid starting Expires Service principal
08/06/2017 14:47:12 08/07/2017 00:47:12 krbtgt/FIELD.HORTONWORKS.COM@FIELD.HORTONWORKS.COM
renew until 08/13/2017 14:47:12
Without kinit you shouldn't have a ticket in the ticket cache and should therefore see something like:
$ klist
klist: No credentials cache found (filename: /tmp/krb5cc_1960402946)
Before you run any "hadoop" or "hdfs" commands, check with klist whether you are authenticated, and whether you are authenticated as the user you want to be. Thus, independently of which OS user you are, you can authenticate as the hduser by simply doing:
kinit hduser
You will be prompted for the password of the hduser. Now you should be able to use HDFS as hduser. Note 1: Be prepared that you will not have any permissions to create directories or write data unless you grant these permissions using the HDFS-internal POSIX system or a corresponding policy in Apache Ranger. Note 2: If you use keytabs instead of passwords (and for the sake of clarity), it makes sense to create an OS user AND a Kerberos principal with the same name, and to give only that OS user permissions on the keytab.
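For the keytab case mentioned in Note 2, a short sketch; the keytab path and realm are placeholders.

# authenticate non-interactively from a keytab
kinit -kt /etc/security/keytabs/hduser.keytab hduser@EXAMPLE.COM

# confirm which principal is now in the ticket cache
klist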
07-13-2017
10:27 PM
4 Kudos
This article is based on one of my blog posts. It is specifically about how to troubleshoot and debug an application behind Knox and ultimately get it up and running.
Start Small
First try to access the service directly before you go over Knox. In many cases, there's nothing wrong with your Knox setup, but rather with the way you set up and configured the service behind Knox, or with the way you try to access that service.
Once you are familiar with how to access your service directly and you have verified that it works as intended, try to do the same call over Knox. Example:
You want to check if WebHDFS is reachable, so you first verify directly at the service and try to get the home directory of the service.
curl --negotiate -u : http://webhdfs-host.field.hortonworks.com:50070/webhdfs/v1/?op=GETHOMEDIRECTORY
If the above request gives a valid 200 response and a meaningful answer, you can move on and check your Knox setup.
curl -k -u myUsername:myPassword https://knox-host.field.hortonworks.com:8443/gateway/default/webhdfs/v1/?op=GETHOMEDIRECTORY
Note: Direct access of WebHDFS and access of WebHDFS over Knox use two different authentication mechanisms: The first one uses SPNEGO which requires a valid Kerberos TGT in a secure cluster, if you don’t want to receive a “401 – Unauthorized” response. The latter one uses HTTP basic authentication against an LDAP, which is why you need to provide username and password on the command line.
Note 2: For the sake of completeness, I mention that here: Obviously, you direct the first request directly to the service host and port, while you direct your second request to the Knox host and port and specify which service.
The next section answers the question: what to do if the second command fails? (If the first command fails, go set up your service correctly and return later.)
Security Related Issues
So what do the HTTP response codes mean for a Knox application? Where to start?
Very common is "401 – Unauthorized". This can be misleading, since 401 is always tied to authentication – not authorization. That means you probably need to check one of the following items. Which of these items causes the error can be found in the Knox log (by default /var/log/knox/gateway.log):
Is your username password combination correct (LDAP)?
Is your username password combination in the LDAP you used?
Is your LDAP server running?
Is your LDAP configuration in the Knox topology correct (hostname, port, binduser, binduser password,…)?
Is your LDAP controller accessible through the firewall (ports 389 or 636 open from the Knox host)?
Note: Currently (in HDP 2.6), you can specify an alias for the binduser password. Make sure that this alias is all lowercase; otherwise you will get a 401 response as well.
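For reference, a sketch of creating such an alias with the Knox CLI; the alias name, the topology name "default", and the install path are placeholders.

# store the LDAP bind user password under a (lowercase!) alias for the "default" topology
/usr/hdp/current/knox-server/bin/knoxcli.sh create-alias ldapbindpassword \
  --cluster default --value 'myBindUserPassword'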
If you got past the 401s, another popular response code is "403 – Forbidden". This one really does have something to do with authorization. Depending on whether you use ACL authorization or Ranger authorization (which is recommended), you proceed differently. If you use ACLs, make sure that the user/group is authorized in your topology definition. If you use Ranger, check the Ranger audit log dashboard and you will immediately notice two possible error sources:
Your user/group is not allowed to use Knox.
Your user/group is not allowed to use the service that you want to access behind Knox.
Well, we came a long way, and with respect to security we are almost done. One possible problem you could still encounter is with impersonation. Knox needs to be allowed to impersonate any user who accesses a service through Knox. This is a configuration in core-site.xml: hadoop.proxyuser.knox.groups and hadoop.proxyuser.knox.hosts. Enter a comma-separated list of groups and hosts that should be able to access a service over Knox, or set a wildcard *.
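A sketch of what those two properties could look like; the group and host values here are illustrative only.

# core-site.xml (values are examples; "*" is the permissive wildcard)
hadoop.proxyuser.knox.groups=hadoop-users,analysts
hadoop.proxyuser.knox.hosts=knox-host.example.com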
This is what you get in the Knox log, when your Ranger Admin server is not running and policies cannot be refreshed.
2017-07-05 21:11:53,700 ERROR util.PolicyRefresher (PolicyRefresher.java:loadPolicyfromPolicyAdmin(288)) - PolicyRefresher(serviceName=condlahdp_knox): failed to refresh policies. Will continue to use last known version of policies (3)
javax.ws.rs.ProcessingException: java.net.ConnectException: Connection refused (Connection refused)
This is also a nice example of Ranger's design not to interfere with services when it is down: policies will not be refreshed, but the service can still operate as intended with the set of policies from before Ranger crashed.
Application Specific Issues
Once you are past the authentication and authorization issues, there might be issues with how Knox interacts with its applications. This section might grow with time. If you have more examples of application-specific issues, leave a comment or send me an email.
Hive:
To get Hive working with Knox, you need to change the transport mode from binary to http. In rare cases it might be necessary to restart not only HiveServer2 after this configuration change, but also the Knox gateway.
This is what you get when you don’t switch the transport mode from “binary” to “http”. Binary runs on port 10000, http runs on port 10001. When binary transport mode is still active Knox will try to connect to port 10001 which is not available and thus fails with “Connection refused”.
2017-07-05 08:24:31,508 WARN hadoop.gateway (DefaultDispatch.java:executeOutboundRequest(146)) - Connection exception dispatching request: http://condla0.field.hortonworks.com:10001/cliservice?doAs=user org.apache.http.conn.HttpHostConnectException: Connect to condla0.field.hortonworks.com:10001 [condla0.field.hortonworks.com/172.26.201.30] failed: Connection refused (Connection refused)
org.apache.http.conn.HttpHostConnectException: Connect to condla0.field.hortonworks.com:10001 [condla0.field.hortonworks.com/172.26.201.30] failed: Connection refused (Connection refused)
at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:151)
at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:353)
When you have fixed all possible HTTP 401 errors for other services but still get one for Hive, you might have forgotten to pass username and password to beeline:
beeline -u "<jdbc-connection-string>" -n <username> -p <password>
The correct jdbc-connection-string should have a format as in the example below:
jdbc:hive2://$KNOX_HOSTNAME:$KNOX_PORT/default;ssl=true;sslTrustStore=$TRUSTSTORE_PATH;trustStorePassword=$TRUSTSTORE_SECRET;transportMode=http;httpPath=gateway/default/hive
- $TRUSTSTORE_PATH is the path to the truststore containing the Knox server certificate; on the server, with root access, you could e.g. use /usr/hdp/current/knox-server/data/security/keystores/gateway.jks
- $KNOX_HOSTNAME is the hostname where the Knox instance is running
- $KNOX_PORT is the port exposed by Knox
- $TRUSTSTORE_SECRET is the secret you are using for your truststore
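Putting the pieces together, a fully substituted example call could look like this; all hostnames, ports, and secrets are placeholders for your environment.

beeline -n myuser -p mypassword \
  -u "jdbc:hive2://knox-host.example.com:8443/default;ssl=true;sslTrustStore=/usr/hdp/current/knox-server/data/security/keystores/gateway.jks;trustStorePassword=changeit;transportMode=http;httpPath=gateway/default/hive"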
Now, this is what you get when you connect via beeline and try to talk to Knox using a different (e.g. internal) hostname than the one configured in the SSL certificate of the server. Just change the hostname and everything will work fine. While this error is not specifically Hive related, you will most of the time encounter it in combination with Hive, since most of the other services don't require you to check your certificates.
Connecting to jdbc:hive2://knoxserver-internal.field.hortonworks.com:8443/;ssl=true;sslTrustStore=truststore.jks;trustStorePassword=myPassword;transportMode=http;httpPath=gateway/default/hive
17/07/06 12:13:37 [main]: ERROR jdbc.HiveConnection: Error opening session
org.apache.thrift.transport.TTransportException: javax.net.ssl.SSLPeerUnverifiedException: Host name 'knoxserver-internal.field.hortonworks.com' does not match the certificate subject provided by the peer (CN=knoxserver.field.hortonworks.com, OU=Test, O=Hadoop, L=Test, ST=Test, C=US)
HBase:
WEBHBASE is the service in a Knox topology to access HBase via the HBase REST server. Of course, a prerequisite is that the HBase REST server is up and running.
Even if it is up and running, you may receive an error with HTTP code 503 (Service Unavailable). This is not related to Knox. You can track it down to an HBase REST server issue, in which the authenticated user does not have the privileges to e.g. scan the data. Give the user the correct permissions to solve this error, as sketched below.
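A sketch of granting such permissions from the HBase shell; the username and table are placeholders, and in a Ranger-managed cluster you would create an equivalent Ranger HBase policy instead.

# grant read and execute permissions on a specific table to the authenticated end user
echo "grant 'myuser', 'RX', 'my_table'" | hbase shell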
Tags:
- Cloud & Operations
- Issue Resolution
- issue-resolution
- Knox
- operations
- Security
06-22-2017
11:28 AM
Thanks @Jay SenSharma
06-22-2017
10:30 AM
@Jay SenSharma thanks. Yeah... I forgot that I upgraded yesterday. I have ambari-agent and ambari-server 2.5.1; ambari-infra-solr, ambari-metrics-collector, etc. are still 2.5.0.3. As a workaround I uncommented the piece of code in the master.py file. Is there another possible solution?
06-22-2017
08:30 AM
Trying to start the Zeppelin server from Ambari. This worked fine until, at some point, Ambari started failing to start it with: Traceback (most recent call last):
File "/var/lib/ambari-agent/cache/common-services/ZEPPELIN/0.6.0.2.5/package/scripts/master.py", line 450, in <module>
Master().execute()
File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 329, in execute
method(env)
File "/var/lib/ambari-agent/cache/common-services/ZEPPELIN/0.6.0.2.5/package/scripts/master.py", line 227, in start
self.update_kerberos_properties()
File "/var/lib/ambari-agent/cache/common-services/ZEPPELIN/0.6.0.2.5/package/scripts/master.py", line 302, in update_kerberos_properties
and params.zookeeper_znode_parent not in interpreter['properties']['phoenix.url']:
KeyError: 'phoenix.url'
I didn't change any configs. Restarting the Ambari server/agents does not help. Ambari 2.5.0, HDP 2.6.0.3.
Labels:
- Apache Ambari
- Apache Zeppelin
05-30-2017
08:25 PM
To answer your question regarding ZooKeeper: HBase needs ZooKeeper. If you didn't set up ZooKeeper yourself, HBase spins up an "internal" ZooKeeper server, which is great for testing but shouldn't be used in production scenarios.
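A sketch of the switch to an external ensemble for a standard Apache HBase install; the property names are the standard ones, the quorum hosts are placeholders.

# hbase-env.sh: tell HBase not to manage its own ZooKeeper
export HBASE_MANAGES_ZK=false

# hbase-site.xml: point HBase at the external ensemble
# hbase.zookeeper.quorum = zk1.example.com,zk2.example.com,zk3.example.com
# hbase.zookeeper.property.clientPort = 2181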
05-30-2017
07:52 PM
@Sebastien Chausson Cool, glad to see that you got it up and running yourself! If my answer was helpful you can vote it up or mark it as best answer. 🙂
05-29-2017
05:56 AM
1 Kudo
Hi,
regarding your first bunch of questions: The answer depends on which distribution and versions you use or if you are using vanilla HBase. When you, e.g., install HDP 2.4, here is a guide to start the thrift server: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.3/bk_installing_manually_book/content/ref-2a6efe32-d0e1-4e84-9068-4361b8c36dc8.1.html
regarding your last question: the error message indicates that you don't have the thrift module installed, which you need on the client side to execute your Python program.
Depending on how you manage packages, e.g. using pip, you would need to install the thrift module (see below). Doing so, at least this error message will disappear.
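A minimal sketch of the client-side fix; pip installs into whichever Python environment is active, so adjust this to your virtualenv if you use one.

# install the Thrift client library for Python
pip install thrift

# quick check that the module can be imported
python -c "import thrift; print('thrift module available')"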
05-17-2017
06:52 AM
Thanks Mayank. Just wanted to add one thing to clarify for others who might have this problem, because I wasted some time on this myself: to solve the "SSLContext must not be null" error, you correctly stated "distribute keystore and truststore file to all machines". I happened to only distribute them to the HBase Master nodes, but it's important to also deploy the same keystores to all RegionServer machines.