Member since 01-09-2017
33 Posts
0 Kudos Received
3 Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 5805 | 03-22-2019 01:45 PM |
| | 1490 | 08-22-2018 03:00 PM |
| | 1228 | 03-12-2018 04:45 PM |
07-09-2018
01:53 PM
The solution we use is not perfect. Every project (tenant group) gets an input port that admins create on the root flow, which we route into their process group (PG). To prevent one tenant from accidentally writing to someone else's input port, we recommend they add a secret value attribute to their outgoing flowfiles and check for it with RouteOnAttribute upon receiving flowfiles in their PG.
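The check described above boils down to comparing a shared secret when a flowfile arrives. Here is a minimal Python sketch of that routing decision. It is not NiFi configuration: the attribute name `tenant.secret` and the function `route_flowfile` are hypothetical, and the return values are just labels loosely modeled on RouteOnAttribute's matched/unmatched outcome.

```python
import hmac

SECRET_ATTRIBUTE = "tenant.secret"  # hypothetical attribute name, pick your own


def route_flowfile(attributes: dict, expected_secret: str) -> str:
    """Return 'matched' if the flowfile carries the tenant's secret, else 'unmatched'."""
    presented = attributes.get(SECRET_ATTRIBUTE, "")
    # compare_digest does a constant-time comparison of the two byte strings
    if hmac.compare_digest(presented.encode(), expected_secret.encode()):
        return "matched"
    return "unmatched"
```

In NiFi itself, the equivalent would presumably be a RouteOnAttribute property with an Expression Language test along the lines of `${tenant.secret:equals('...')}`, with anything unmatched dropped or sent to an alerting path.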
05-29-2018
08:13 PM
The Background: I have multitenant clusters. My organization has delivered me a combined truststore and keystore JKS file. I will have users who need to reach various services external to NiFi over SSL/TLS. It is tempting to create an SSL context controller service at the NiFi root and let all my users use it whenever they need SSL. One problem I see with this approach is that if I let all my users use the hosts' certificates (keystore and truststore), they could just use a GetHTTP processor and talk to the NiFi REST API with full privileges (or at least whatever privileges the node has). To prevent this, I figure I should put my root CA certificates into a separate truststore that is not password protected, and use only that.

The question: Should I create an SSL context service at the root flow with only a truststore and ask all my users to share that controller service? Or should I just tell my users the path to the truststore and have them create controller services within their own process groups as needed? What are the pros and cons of each approach?
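To illustrate why a truststore-only context limits what tenants can do, here is a small Python analogy. It is not NiFi's SSL context service, and Python's `ssl` module reads PEM files rather than JKS, so the `ca_file` parameter and the function name `build_trust_only_context` are assumptions made for the sketch. The point is that a context holding only CA certificates can verify servers but has no client identity to present.

```python
import ssl


def build_trust_only_context(ca_file=None):
    """Build a client-side TLS context that can verify servers but holds no client key."""
    # PROTOCOL_TLS_CLIENT enables certificate and hostname verification by default
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    if ca_file is not None:
        # Trust material only: the root CA certificates (hypothetical PEM path)
        ctx.load_verify_locations(cafile=ca_file)
    # Deliberately no load_cert_chain(): without a keystore there is no
    # client certificate, so the caller cannot authenticate as the node.
    return ctx
```

Whether the context lives at the root flow or inside each process group, the same property holds as long as no keystore is configured on it.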
Labels: Apache NiFi
05-09-2018
05:31 PM
Thanks @Matt Clarke, what would we do without you!?
05-09-2018
04:59 PM
Thanks @Matt Clarke, but if processor state alone cannot be used to handle primary node changes, how do processors like GenerateTableFetch work without a DistributedMapCache service? Both ListSFTP and GenerateTableFetch mention in their docs that they store cluster-scoped state, but only ListSFTP can also make use of a cache service. What am I missing here?
05-09-2018
04:36 PM
I just noticed that ListSFTP can use a distributed cache controller service. This is confusing to me because I thought we were supposed to run ListSFTP only on the primary node and rebalance filenames via a site-to-site (S2S) remote process group (RPG). In addition to the distributed cache, it also seems to store state. If we use a distributed cache controller service, why would ListSFTP need to store state as well? What is the current best practice for resilient, parallelized SFTP? If I use a distributed cache, does that mean I can just schedule my ListSFTP to run on all nodes? Can someone help me understand what is going on here? Thanks!
Labels: Apache NiFi
03-12-2018
04:45 PM
@MattClarke answered this question in https://community.hortonworks.com/questions/176292/how-to-configure-managed-ranger-authorizer-for-nif.html
02-20-2018
03:45 PM
HDF 3.1 includes NiFi 1.5, and the release notes mention that external LDAP groups can now be used in NiFi security policies in Ranger. It seems that we need to use org.apache.nifi.authorization.ManagedRangerAuthorizer in authorizers.xml, but I cannot find any documentation on this. Has anyone successfully used LDAP groups for Ranger NiFi policies? And is there any documentation? Thanks. PS: I see @Yolanda M. Davis in the NiFi git history for this feature; perhaps she can help?
Labels: Apache NiFi, Apache Ranger
11-27-2017
04:27 PM
I will accept the answer because it seems this might be an issue on my side. Thank you. I will open a ticket with Hortonworks support if further troubleshooting is needed after I take a look at the logs. Thanks again.
11-20-2017
06:30 PM
@Ashutosh Mestry Thanks Ashutosh. When I run the query "hive_table where db.name like '*_final'", I get an error in the web UI: Gremlin script execution failed: L:{def r=(([]) as Set);def f1={GremlinPipeline x->x.as('a0').out('__hive_table.db').as('__res') [0..<25].select(['a0', '__res']).fill(r)};f1(g.V().has('__typeName','hive_table'));f1(g.V().has('__superTypeNames','hive_table'));r._().as('__tmp').transform({((Row)it).getColumn('a0')}).as('a0').back('__tmp').transform({((Row)it).getColumn('__res')}).as('__res').filter({it.'Asset.name'.matches('.*_final')}).back('a0') [0..<25].toList()} We are running Atlas 0.8.0.2; perhaps like clauses are unsupported on our version? I can use an equals sign and successfully retrieve the tables in a specific database. Do you know of a way to get the same information with a basic query?
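Since equality on db.name works here, one possible workaround is to pick the matching database names client-side and issue one equality query per database. The Python sketch below only builds the request URLs and makes no network calls. The `query` parameter name, the double-quoted string literal in the DSL, the host and port in the usage note, and the helper names are all assumptions; the list of database names would have to come from elsewhere, for example a separate `hive_db` query. It does not filter out deleted tables.

```python
from urllib.parse import urlencode

DSL_ENDPOINT = "/api/atlas/discovery/search/dsl"  # endpoint named in the original question


def select_databases(db_names, suffix="_final"):
    """Keep only the database names that end with the given suffix."""
    return [name for name in db_names if name.endswith(suffix)]


def build_table_queries(base_url, db_names, suffix="_final"):
    """Return one DSL search URL per matching database, using equality instead of 'like'."""
    urls = []
    for name in select_databases(db_names, suffix):
        # Equality on db.name is the form reported to work above
        dsl = 'hive_table where db.name = "{}"'.format(name)
        urls.append(base_url.rstrip("/") + DSL_ENDPOINT + "?" + urlencode({"query": dsl}))
    return urls
```

For example, `build_table_queries("http://atlas:21000", ["sales_final", "sales_temp"])` yields a single URL for `sales_final`; each URL can then be fetched with whatever authenticated HTTP client is in use and the results concatenated.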
11-17-2017
04:19 PM
I wish to programmatically query Atlas for a list of Hive tables that are in certain Hive databases. I only want to see Hive tables in databases whose names contain a certain string. In the hive_table Atlas type, the db property is a reference to an entity of type hive_db, so I cannot use a simple where clause. For example, suppose I have many Hive databases, some ending with '_temp' and some ending with '_final'. Each database may have several tables. I want to generate a list of all Hive tables in databases that end with '_final'. I would also like to exclude Hive tables that have been deleted. I have been experimenting with the /api/atlas/discovery/search/dsl REST endpoint, but I have had no success. There is documentation for the DSL at http://atlas.apache.org/Search.html, but it is very esoteric, and I cannot figure out how to use it. Does anyone have examples of returning lists of entities in Atlas based on properties of referred-to entities? Is there a more user-friendly or complete source of documentation for the Atlas query DSL? Also note that I do not wish to query the Hive metastore directly; I wish to use Atlas. Thank you for any help!
Labels: Apache Atlas