Member since: 12-21-2015
Posts: 57
Kudos Received: 7
Solutions: 1

My Accepted Solutions

Title | Views | Posted |
---|---|---|
 | 4060 | 08-25-2016 09:31 AM |
03-16-2017
08:34 AM
Have you resolved this issue? If so, could you share your solution?
03-15-2017
06:27 AM
We are using the InvokeHTTP processor in NiFi to call the Salesforce REST API (POST actions that update customer data). Salesforce requires authentication on its REST endpoints. Is InvokeHTTP the right processor for this requirement? How do we maintain an authenticated connection/session to the REST API?
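For context, a minimal sketch of Salesforce's OAuth 2.0 username-password token request as we'd issue it outside NiFi; every angle-bracketed value is a placeholder:

curl -s https://login.salesforce.com/services/oauth2/token \
  -d 'grant_type=password' \
  -d 'client_id=<consumer-key>' \
  -d 'client_secret=<consumer-secret>' \
  -d 'username=<sf-username>' \
  -d 'password=<sf-password-plus-security-token>'
# The JSON response carries an "access_token" that must then be sent on each
# REST call as an "Authorization: Bearer <access_token>" header.

Our working assumption is that InvokeHTTP can send that header through a dynamic property (dynamic properties on InvokeHTTP are added as request headers), with an upstream part of the flow fetching and refreshing the token into an attribute, but we'd like confirmation that this is the intended pattern.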
Labels:
- Apache NiFi
03-14-2017
08:53 AM
1 Kudo
We have a number of databases, mostly Oracle and MS SQL Server, that were designed without timestamp fields, so Sqoop cannot be used for incremental batch loads. In addition, some real-time use-case requirements pushed us to look into streaming solutions for Change Data Capture (CDC). We ran a POC against one database of each type. To interface with the databases and capture the changed data (the delta), we used XStream for Oracle (11g and before) and enabled CDC tables for MS SQL Server. We then wrote custom Java code that transforms the delta units to JSON strings (using GSON) and uses them as the payload for a Kafka producer; the messages are eventually consumed into our HDP cluster as changes to the corresponding Hive databases/tables.

To make our solution maintainable, we are switching to NiFi, but as we are new to this technology we are still in the research stage. Can anyone propose a NiFi-based version of the pipeline above (interface with the CDC mechanism of Oracle, convert the delta units to JSON, produce to Kafka, then update the Hive tables)? Which processors should be used? A sketch of the Kafka leg of our current pipeline follows for reference.
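This is a minimal sketch of the Kafka leg only, i.e. publishing one JSON delta unit to the topic our cluster consumes from; the broker address, topic name, and JSON shape are hypothetical placeholders:

echo '{"table":"CUSTOMERS","op":"UPDATE","key":{"ID":42},"after":{"STATUS":"ACTIVE"}}' | \
  /usr/hdp/current/kafka-broker/bin/kafka-console-producer.sh \
    --broker-list <kafka-broker>:6667 \
    --topic cdc-deltas
# Each message is one delta unit serialized to a JSON string (GSON output in
# our Java code); a NiFi flow would need to produce equivalent payloads.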
Labels:
- Apache Kafka
- Apache NiFi
02-01-2017
07:50 AM
It works now when I use "jdbc:hive2://<my-vm-hostname>:8443/;ssl=true...". However, when I use 'localhost', '127.0.0.1', or even <my-vm-ip>, I get this message:

17/02/01 15:38:51 [main]: ERROR jdbc.HiveConnection: Error opening session
org.apache.thrift.transport.TTransportException: javax.net.ssl.SSLPeerUnverifiedException: Host name '<my-vm-ip>' does not match the certificate subject provided by the peer (CN=<my-vm-hostname>, OU=Test, O=Hadoop, L=Test, ST=Test, C=US)
at org.apache.thrift.transport.THttpClient.flushUsingHttpClient(THttpClient.java:297)
...
Caused by: javax.net.ssl.SSLPeerUnverifiedException: Host name '<my-vm-ip>' does not match the certificate subject provided by the peer (CN=<my-vm-hostname>, OU=Test, O=Hadoop, L=Test, ST=Test, C=US)
at org.apache.http.conn.ssl.SSLConnectionSocketFactory.verifyHostname(SSLConnectionSocketFactory.java:465)
...
... 30 more
Error: Could not establish connection to jdbc:hive2://<my-vm-ip>:8443/;ssl=true;sslTrustStore=/var/lib/knox/data-2.5.0.0-1245/security/keystores/gateway.jks;trustStorePassword=knox_123;transportMode=http;httpPath=gateway/default/hive: javax.net.ssl.SSLPeerUnverifiedException: Host name '<my-vm-ip>' does not match the certificate subject provided by the peer (CN=<my-vm-hostname>, OU=Test, O=Hadoop, L=Test, ST=Test, C=US) (state=08S01,code=0)
Is there any way to connect using <my-vm-ip>, so that a remote machine can access it? <my-vm-hostname> is not visible on the network; it is only known locally through the VM's /etc/hosts. Thanks for all your help, I appreciate it. (The workarounds I'm considering are sketched below.)
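For anyone hitting the same SSLPeerUnverifiedException, here is a sketch of the two workarounds I'm considering; paths and names are placeholders, and the knoxcli step assumes the default self-signed gateway identity:

# Option 1 (client side): make <my-vm-hostname> resolvable from the remote
# machine, so the JDBC URL can use the name the certificate was issued to.
echo '<my-vm-public-ip>  <my-vm-hostname>' | sudo tee -a /etc/hosts

# Option 2 (server side): regenerate Knox's self-signed gateway certificate
# with a CN matching the name clients will actually use, then restart Knox
# (clients must re-import the new certificate into their truststore).
/usr/hdp/current/knox-server/bin/knoxcli.sh create-cert --hostname <name-clients-use>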
01-30-2017
11:36 AM
Alright, the HDFS part seems to be working, but I have to use guest:guest-password. I thought it was a user I could add in Unix and Ranger (I created a guest user there as well, but with a different password, which is why it didn't work). If I understand correctly, this guest is a Knox-specific username, right? But what about Hive? It still won't work even though I used guest:guest-password; I get the same result as in the original post. Thanks for your help.
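If that understanding is right, the guest user should be defined in the users.ldif shipped with Knox's demo LDAP rather than in Unix or Ranger; a check I plan to run, assuming the HDP layout from my ps output earlier:

# guest should appear as an LDAP entry with a userPassword attribute here
grep -B 1 -A 6 'uid=guest' /usr/hdp/current/knox-server/conf/users.ldif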
01-30-2017
11:07 AM
I already pasted the default.xml above. I tried using admin:admin-password (earlier I didn't, as I was confused):

curl -iku admin:admin-password -X GET 'https://localhost:8443/gateway/admin/api/v1/version'

and I now get:

HTTP/1.1 200 OK
Date: Mon, 30 Jan 2017 11:04:24 GMT
Set-Cookie: JSESSIONID=1qfuugnqxqf1hf4exm5kpvkmd;Path=/gateway/admin;Secure;HttpOnly
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Set-Cookie: rememberMe=deleteMe; Path=/gateway/admin; Max-Age=0; Expires=Sun, 29-Jan-2017 11:04:24 GMT
Content-Type: application/xml
Content-Length: 170
Server: Jetty(9.2.15.v20160210)
<?xml version="1.0" encoding="UTF-8"?>
<ServerVersion>
<version>0.9.0.2.5.0.0-1245</version>
<hash>09990487b383298f8e1c9e72dceb0a8e3ff33d17</hash>
</ServerVersion>
01-30-2017
10:55 AM
Then there is something wrong with the Hortonworks Knox tutorial. The knox_sample.xml file does not exist even in the Sandbox VM, and using touch simply creates an empty file. What would the admin username and password be here? Is it Ranger's admin or Knox's? Here's the output of ps aux | grep ldap:

knox 562103 0.0 0.2 5400580 33376 ? Sl Jan25 2:59 /usr/lib/jvm/java-1.7.0-oracle/bin/java -jar /usr/hdp/current/knox-server/bin/ldap.jar /usr/hdp/current/knox-server/conf
root 675001 0.0 0.0 103320 868 pts/0 S+ 18:32 0:00 grep ldap
And here's the default.xml:

<topology>
  <gateway>
    <provider>
      <role>authentication</role>
      <name>ShiroProvider</name>
      <enabled>true</enabled>
      <param>
        <name>sessionTimeout</name>
        <value>30</value>
      </param>
      <param>
        <name>main.ldapRealm</name>
        <value>org.apache.hadoop.gateway.shirorealm.KnoxLdapRealm</value>
      </param>
      <param>
        <name>main.ldapRealm.userDnTemplate</name>
        <value>uid={0},ou=people,dc=hadoop,dc=apache,dc=org</value>
      </param>
      <param>
        <name>main.ldapRealm.contextFactory.url</name>
        <value>ldap://<my-vm-hostname>.com:33389</value>
      </param>
      <param>
        <name>main.ldapRealm.contextFactory.authenticationMechanism</name>
        <value>simple</value>
      </param>
      <param>
        <name>urls./**</name>
        <value>authcBasic</value>
      </param>
    </provider>
    <provider>
      <role>identity-assertion</role>
      <name>Default</name>
      <enabled>true</enabled>
    </provider>
    <provider>
      <role>authorization</role>
      <name>XASecurePDPKnox</name>
      <enabled>true</enabled>
    </provider>
  </gateway>
  <service>
    <role>NAMENODE</role>
    <url>hdfs://<my-vm-hostname>:8020</url>
  </service>
  <service>
    <role>JOBTRACKER</role>
    <url>rpc://<my-vm-hostname>:8050</url>
  </service>
  <service>
    <role>WEBHDFS</role>
    <url>http://<my-vm-hostname>:50070/webhdfs</url>
  </service>
  <service>
    <role>WEBHCAT</role>
    <url>http://<my-vm-hostname>:50111/templeton</url>
  </service>
  <service>
    <role>OOZIE</role>
    <url>http://<my-vm-hostname>:11000/oozie</url>
  </service>
  <service>
    <role>WEBHBASE</role>
    <url>http://<my-vm-hostname>:8080</url>
  </service>
  <service>
    <role>HIVE</role>
    <url>http://<my-vm-hostname>:10001/cliservice</url>
  </service>
  <service>
    <role>RESOURCEMANAGER</role>
    <url>http://<my-vm-hostname>:8088/ws</url>
  </service>
</topology>
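As a sanity check of the demo LDAP this topology points at (host and port taken from main.ldapRealm.contextFactory.url above; that a guest entry exists is an assumption until verified in users.ldif):

# bind with the same DN template the ShiroProvider uses; a successful bind
# means these are credentials Knox will accept
ldapsearch -x -H ldap://<my-vm-hostname>.com:33389 \
  -D 'uid=guest,ou=people,dc=hadoop,dc=apache,dc=org' -w guest-password \
  -b 'ou=people,dc=hadoop,dc=apache,dc=org' 'uid=guest'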
01-30-2017
10:04 AM
The knox_sample topology seems to be empty, based on the tutorial; the guide just instructs the user to run the touch command. The result of the command you gave is:

HTTP/1.1 401 Unauthorized
Date: Mon, 30 Jan 2017 09:56:48 GMT
Set-Cookie: rememberMe=deleteMe; Path=/gateway/admin; Max-Age=0; Expires=Sun, 29-Jan-2017 09:56:48 GMT
WWW-Authenticate: BASIC realm="application"
Content-Length: 0
Server: Jetty(9.2.15.v20160210)
01-30-2017
08:38 AM
I've been following the Hortonworks tutorials on all things related to Knox:

tutorial-420: http://hortonworks.com/hadoop-tutorial/securing-hadoop-infrastructure-apache-knox/
tutorial-560: http://hortonworks.com/hadoop-tutorial/secure-jdbc-odbc-clients-access-hiveserver2-using-apache-knox/

Our goal is to use Knox as a gateway, a single point of entry for microservices that connect to the cluster using REST API calls. However, my Hadoop environment is not the HDP 2.5 Sandbox but an HDP 2.5 stack built through Ambari on a single-node Azure VM, so the configurations may differ, and that should partially explain why the tutorials may not work. (This is a POC for a multi-node cluster build.) I'm testing the WebHDFS and Hive parts. In Ranger, I created a temporary guest account with global access to HDFS and Hive, and I copied the GlobalKnoxAllow policy configured in the Sandbox VM (I'll take care of more fine-grained Ranger ACLs later). We didn't set up this security in LDAP mode, though, just plain Unix ACLs. I also created a sample database named microservice, which I can connect to and query via beeline.

From tutorial-420:

Step 1: I started the Demo LDAP.

Step 2:

touch /usr/hdp/current/knox-server/conf/topologies/knox_sample.xml
curl -iku guest:<guestpw> -X GET 'http://<my-vm-ip>:50070/webhdfs/v1/?op=LISTSTATUS'

This is what I get:

HTTP/1.1 404 Not Found
Cache-Control: no-cache
Expires: Mon, 30 Jan 2017 07:23:17 GMT
Date: Mon, 30 Jan 2017 07:23:17 GMT
Pragma: no-cache
Expires: Mon, 30 Jan 2017 07:23:17 GMT
Date: Mon, 30 Jan 2017 07:23:17 GMT
Pragma: no-cache
Content-Type: application/json
X-FRAME-OPTIONS: SAMEORIGIN
Transfer-Encoding: chunked
Server: Jetty(6.1.26.hwx)
{"RemoteException":{"exception":"FileNotFoundException","javaClassName":"java.io.FileNotFoundException","message":"File /webhdfs/v1 does not exist."}}
Step 3:

curl -iku guest:<guestpw> -X GET 'https://<my-vm-ip>:8443/gateway/default/webhdfs/v1/?op=LISTSTATUS'

This is what I get:

HTTP/1.1 401 Unauthorized
Date: Mon, 30 Jan 2017 07:25:51 GMT
Set-Cookie: rememberMe=deleteMe; Path=/gateway/default; Max-Age=0; Expires=Sun, 29-Jan-2017 07:25:51 GMT
WWW-Authenticate: BASIC realm="application"
Content-Length: 0
Server: Jetty(9.2.15.v20160210)
I jumped to tutorial-560:

Step 1: Knox is started.

Step 2: In Ambari, hive.server2.transport.mode is changed from binary to http, and HiveServer2 is restarted.

Step 3: SSH into my Azure VM.

Step 4: Connect to HiveServer2 using beeline via Knox:

[root@poc2 hive]# beeline
!connect jdbc:hive2://<my-vm-ip>:8443/microservice;ssl=true;sslTrustStore=/var/lib/knox/data-2.5.0.0-1245/security/keystores/gateway.jks;trustStorePassword=knox?hive.server2.transport.mode=http;hive.server2.thrift.http.path=gateway/default/hive

This is the result:

Connecting to jdbc:hive2://<my-vm-ip>:8443/microservice;ssl=true;sslTrustStore=/var/lib/knox/data-2.5.0.0-1245/security/keystores/gateway.jks;trustStorePassword=knox?hive.server2.transport.mode=http;hive.server2.thrift.http.path=gateway/default/hive
Enter username for jdbc:hive2://... : guest
Enter password for jdbc:hive2://... : ********
17/01/30 15:28:21 [main]: WARN jdbc.Utils: ***** JDBC param deprecation *****
17/01/30 15:28:21 [main]: WARN jdbc.Utils: The use of hive.server2.transport.mode is deprecated.
17/01/30 15:28:21 [main]: WARN jdbc.Utils: Please use transportMode like so: jdbc:hive2://<host>:<port>/dbName;transportMode=<transport_mode_value>
17/01/30 15:28:21 [main]: WARN jdbc.Utils: ***** JDBC param deprecation *****
17/01/30 15:28:21 [main]: WARN jdbc.Utils: The use of hive.server2.thrift.http.path is deprecated.
17/01/30 15:28:21 [main]: WARN jdbc.Utils: Please use httpPath like so: jdbc:hive2://<host>:<port>/dbName;httpPath=<http_path_value>
Error: Could not create an https connection to jdbc:hive2://<my-vm-ip>:8443/microservice;ssl=true;sslTrustStore=/var/lib/knox/data-2.5.0.0-1245/security/keystores/gateway.jks;trustStorePassword=knox?hive.server2.transport.mode=http;hive.server2.thrift.http.path=gateway/default/hive. Keystore was tampered with, or password was incorrect (state=08S01,code=0)
0: jdbc:hive2://<my-vm-ip>:8443/microservi (closed)>
I get the same thing even if I replace it with hive.server2.thrift.http.path=cliserver (based on hive-site.xml). In hive-site.xml, ssl=false, so I tried substituting that in the JDBC connection URL:

[root@poc2 hive]# beeline
!connect jdbc:hive2://<my-vm-ip>:8443/microservice;ssl=false;sslTrustStore=/var/lib/knox/data-2.5.0.0-1245/security/keystores/gateway.jks;trustStorePassword=knox?hive.server2.transport.mode=http;hive.server2.thrift.http.path=gateway/default/hive

This is the result:

Connecting to jdbc:hive2://<my-vm-ip>:8443/microservice;ssl=false;sslTrustStore=/var/lib/knox/data-2.5.0.0-1245/security/keystores/gateway.jks;trustStorePassword=knox?hive.server2.transport.mode=http;hive.server2.thrift.http.path=gateway/default/hive
Enter username for jdbc:hive2://... : guest
Enter password for jdbc:hive2://... : ********
17/01/30 15:33:29 [main]: WARN jdbc.Utils: ***** JDBC param deprecation *****
17/01/30 15:33:29 [main]: WARN jdbc.Utils: The use of hive.server2.transport.mode is deprecated.
17/01/30 15:33:29 [main]: WARN jdbc.Utils: Please use transportMode like so: jdbc:hive2://<host>:<port>/dbName;transportMode=<transport_mode_value>
17/01/30 15:33:29 [main]: WARN jdbc.Utils: ***** JDBC param deprecation *****
17/01/30 15:33:29 [main]: WARN jdbc.Utils: The use of hive.server2.thrift.http.path is deprecated.
17/01/30 15:33:29 [main]: WARN jdbc.Utils: Please use httpPath like so: jdbc:hive2://<host>:<port>/dbName;httpPath=<http_path_value>
17/01/30 15:34:32 [main]: ERROR jdbc.HiveConnection: Error opening session
org.apache.thrift.transport.TTransportException: org.apache.http.conn.ConnectTimeoutException: Connect to <my-vm-ip>:8443 [/<my-vm-ip>] failed: Connection timed out
at org.apache.thrift.transport.THttpClient.flushUsingHttpClient(THttpClient.java:297)
at org.apache.thrift.transport.THttpClient.flush(THttpClient.java:313)
at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:73)
at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:62)
at org.apache.hive.service.cli.thrift.TCLIService$Client.send_OpenSession(TCLIService.java:154)
at org.apache.hive.service.cli.thrift.TCLIService$Client.OpenSession(TCLIService.java:146)
at org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:552)
at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:170)
at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
at java.sql.DriverManager.getConnection(DriverManager.java:571)
at java.sql.DriverManager.getConnection(DriverManager.java:187)
at org.apache.hive.beeline.DatabaseConnection.connect(DatabaseConnection.java:146)
at org.apache.hive.beeline.DatabaseConnection.getConnection(DatabaseConnection.java:211)
at org.apache.hive.beeline.Commands.connect(Commands.java:1190)
at org.apache.hive.beeline.Commands.connect(Commands.java:1086)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hive.beeline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:52)
at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:989)
at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:832)
at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:790)
at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:490)
at org.apache.hive.beeline.BeeLine.main(BeeLine.java:473)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
Caused by: org.apache.http.conn.ConnectTimeoutException: Connect to <my-vm-ip>:8443 [/<my-vm-ip>] failed: Connection timed out
at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:156)
at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:353)
at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:380)
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236)
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:184)
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:88)
at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
at org.apache.http.impl.execchain.ServiceUnavailableRetryExec.execute(ServiceUnavailableRetryExec.java:84)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:117)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:55)
at org.apache.thrift.transport.THttpClient.flushUsingHttpClient(THttpClient.java:251)
... 30 more
Caused by: java.net.ConnectException: Connection timed out
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at org.apache.http.conn.socket.PlainConnectionSocketFactory.connectSocket(PlainConnectionSocketFactory.java:74)
at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:141)
... 41 more
Error: Could not establish connection to jdbc:hive2://<my-vm-ip>:8443/microservice;ssl=false;sslTrustStore=/var/lib/knox/data-2.5.0.0-1245/security/keystores/gateway.jks;trustStorePassword=knox?hive.server2.transport.mode=http;hive.server2.thrift.http.path=gateway/default/hive: org.apache.http.conn.ConnectTimeoutException: Connect to <my-vm-ip>:8443 [/<my-vm-ip>] failed: Connection timed out (state=08S01,code=0)
0: jdbc:hive2://<my-vm-ip>:8443/microservi (closed)>
Again, I tried replacing hive.server2.thrift.http.path=cliserver and got the same result. Has anyone here been able to configure Knox correctly and get it working? (The connection string I plan to try next is sketched below.)
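For completeness, here is the connection string I plan to try next, rewritten per the JDBC deprecation warnings in the beeline output above; the truststore password is a placeholder:

# same endpoint, but with the non-deprecated transportMode/httpPath parameters
beeline -u 'jdbc:hive2://<my-vm-ip>:8443/microservice;ssl=true;sslTrustStore=/var/lib/knox/data-2.5.0.0-1245/security/keystores/gateway.jks;trustStorePassword=<gateway-jks-password>;transportMode=http;httpPath=gateway/default/hive' -n guest -p <guestpw>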
12-06-2016
08:35 AM
We're planning to add Apache HAWQ / Pivotal HDB to our Hortonworks stack. Can it be secured with Ranger? If not, how do we get ACLs or security auditing in HAWQ / HDB? (The native fallback we'd consider is sketched below.)
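In case Ranger integration turns out not to be available, a minimal sketch of the fallback we'd consider, assuming HAWQ's PostgreSQL-derived native ACLs (host, database, and object names are placeholders):

# HAWQ is PostgreSQL-derived, so native access control can be managed with
# standard SQL GRANT/REVOKE through psql against the HAWQ master
psql -h <hawq-master> -p 5432 -d <database> -c 'GRANT SELECT ON TABLE customers TO guest;'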
Labels:
- Apache Ranger