Member since 09-14-2015
79 Posts
91 Kudos Received
22 Solutions
My Accepted Solutions
Views | Posted |
---|---|
2313 | 01-25-2017 04:43 PM |
1786 | 11-23-2016 05:56 PM |
5735 | 11-11-2016 02:44 AM |
1540 | 10-26-2016 01:50 AM |
9438 | 10-19-2016 10:22 PM |
08-18-2016
02:40 PM
Like I said, I do not believe you can connect from Linux to SQL Server using Integrated Windows Authentication (IWA) with the SQL Server JDBC driver. I recommend that you drop the jTDS driver into your Sqoop lib directory and try using that driver instead.
08-18-2016
12:09 PM
Based on the error, it seems that you do not have a valid Kerberos ticket. Is the machine you are initiating the Sqoop job from integrated with your Windows AD via Kerberos?
08-17-2016
08:40 PM
1 Kudo
Hi @Andy Max, This should certainly be doable and should be relatively straightforward. I would recommend that you stand up a small sandbox environment that mimics the current one, test this process there, and develop a concrete playbook. The rough steps that I would recommend you try are:
1. Stop all services on the existing cluster.
2. Use Apache Ambari to install a "dummy" HDP 2.4.x cluster on the current cluster: http://docs.hortonworks.com/HDPDocuments/Ambari-2.2.2.0/bk_Installing_HDP_AMB/content/ch_Getting_Ready.html
3. Install a barebones cluster with only HDFS, ZK, etc. services. Make sure that the HDFS namenode and datanode directories are dummy directories that do not point to your existing data and namenode directories.
4. Stop all services via Ambari.
5. In Ambari, change the data and namenode directories for HDFS to point to your old directories.
6. Start the services back up and verify that the data is available.
This should work smoothly with HDP 2.4 because 2.4 also includes Apache Hadoop 2.7.1, so the file system version is identical. This all assumes that you are only using HDFS. It could all get a bit hairier if you have Hive tables sitting on top with some metadata that needs to be migrated.
Cheers, Brandon
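Before the dummy install, it may be worth double-checking that the placeholder directories cannot collide with the real ones. The following is a minimal Python sketch of that check, not part of Ambari; the function names and all paths are hypothetical and should be replaced with the values from your own HDFS configuration.

```python
from pathlib import PurePosixPath


def overlaps(a: str, b: str) -> bool:
    """Return True if the two paths are equal or one is nested inside the other."""
    pa, pb = PurePosixPath(a), PurePosixPath(b)
    return pa == pb or pa in pb.parents or pb in pa.parents


def conflicting_dirs(dummy_dirs, existing_dirs):
    """List every (dummy, existing) pair that would touch the same location."""
    return [(d, e) for d in dummy_dirs for e in existing_dirs if overlaps(d, e)]


# Hypothetical paths: substitute your real namenode/datanode directories.
existing = ["/hadoop/hdfs/namenode", "/hadoop/hdfs/data"]
dummy = ["/tmp/dummy/namenode", "/tmp/dummy/data"]

conflicts = conflicting_dirs(dummy, existing)
print("conflicts:", conflicts)
```

An empty list means the dummy install should not touch the existing data; any pair that is reported needs a different dummy directory before you proceed.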
08-17-2016
08:19 PM
1 Kudo
Hi @Khera, What JDBC driver are you using to connect to SQL Server? The one provided by MS does not support Windows authentication. That said, you can grab another driver that does support it. You have a couple of options:
- Both Simba and Data Direct have drivers that support this authentication method. These have free trials but will ultimately require a license for repeated use.
- There is also jTDS, which is free and open source and claims to support Windows Authentication, so we can take it for a spin if you would like. The rough JDBC URL that you would need is:
jdbc:jtds:sqlserver://123.123.123;instance=server1;databaseName=students;integratedSecurity=true;authenticationScheme=JavaKerberos
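Since the URL is just a prefix, a host, and a list of semicolon-separated properties, it can help to assemble it programmatically while experimenting with different settings. The sketch below is a minimal illustration in Python; `build_jdbc_url` is a hypothetical helper, and the property names are copied from the rough URL in this post rather than verified against the jTDS documentation.

```python
def build_jdbc_url(prefix: str, host: str, props: dict) -> str:
    """Join a JDBC prefix, a host, and key=value properties with semicolons."""
    parts = [f"{prefix}{host}"]
    parts += [f"{key}={value}" for key, value in props.items()]
    return ";".join(parts)


# Values taken from the rough URL in the post; treat them as placeholders.
url = build_jdbc_url(
    "jdbc:jtds:sqlserver://",
    "123.123.123",
    {
        "instance": "server1",
        "databaseName": "students",
        "integratedSecurity": "true",
        "authenticationScheme": "JavaKerberos",
    },
)
print(url)
```

Keeping the properties in a dictionary makes it easy to add or drop one setting at a time while testing which combination the driver actually accepts.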
08-09-2016
03:26 PM
1 Kudo
Looks like you are missing the rpm-python package. If yum is working you can try re-installing rpm-python with yum. Alternatively, depending on the version of CentOS or RHEL that you are using, you can find the appropriate RPM in the OS archives. For example, CentOS 6 RPMs are here. Ctrl-f and search for rpm-python to find the package.
08-08-2016
03:14 PM
Hi @Berk Ardıç, You can achieve this type of functionality by modifying a couple of additional pieces of the flow.
First, you can set GetSFTP to search recursively from your mounted directory. This will traverse the entire path rooted at your target location, so it will pick up files from the Store1 and Store2 directories. You can then limit what is picked up by using the regex filter properties for the path and the file. This handles the pickup side of the flow.
Then, on the delivery side, you can use the path attribute from the flowfile to construct a new destination in HDFS that mirrors the structure of the pickup directory. You can use NiFi expression language in the destination for PutHDFS to construct the appropriate path. Hope this helps.
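To make the mirroring idea concrete, here is a minimal Python sketch of the path arithmetic only; it is not NiFi configuration. The function name, the root directories, and the assumption that the flowfile's path attribute holds the file's source directory are all illustrative, so check what the attribute actually contains in your flow.

```python
import posixpath


def mirror_destination(hdfs_root: str, source_root: str, file_dir: str) -> str:
    """Map a source directory under source_root to the same relative spot under hdfs_root."""
    rel = posixpath.relpath(file_dir, source_root)
    if rel == ".." or rel.startswith("../"):
        raise ValueError(f"{file_dir} is not under {source_root}")
    return posixpath.normpath(posixpath.join(hdfs_root, rel))


# Hypothetical roots; Store1 and Store2 come from the question.
print(mirror_destination("/data/stores", "/mnt/stores", "/mnt/stores/Store1"))
print(mirror_destination("/data/stores", "/mnt/stores", "/mnt/stores/Store2/2016"))
```

In the flow itself, the equivalent would be a PutHDFS destination built from a fixed HDFS root plus the flowfile's path attribute via expression language.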
07-21-2016
01:34 PM
I am aware that the NiFi bootstrap process can be configured to provide notifications in the event of a NiFi failure but I am wondering if we can configure the bootstrap process to not only detect and notify users of a failure but to also attempt to bring the NiFi process back to life? Am I missing an obvious config parameter in the guide?
Labels: Apache NiFi
07-15-2016
06:18 PM
If I install a colocated instance of HBase with a YARN cluster (i.e., workers running the NodeManager, DataNode, and RegionServer daemons), does Ambari automatically adjust YARN memory to account for the HBase RegionServers, or should I still tune the memory using the companion scripts?
Labels: Apache Ambari, Apache HBase, Apache YARN
05-23-2016
09:43 PM
@Ancil McBarnett, I'll try to paraphrase but these guys do a better job explaining it than I ever will. Essentially, since you do not have direct access to port 10000 on the HS2 machine, you need to tunnel to a machine that does have access (i.e., any of the machines in the cluster) and then have that machine push your request to port 10000 on the HS2 machine. So, the command I wrote above creates a tunnel between my machine and "an_azure_node". Then, any connections to my localhost port 10000 will go across this tunnel and, on the other side, will be forwarded to port 10000 on the hs2 node. I hope that helps and clears things up.
05-23-2016
07:32 PM
1 Kudo
Hi @Ancil McBarnett, One option that I often use is to disable access to all ports from the outside world except 22 for SSH. Then set up a secure tunnel via SSH (requiring authentication at this stage) that forwards to port 10000 of the Hive Server. For example:
ssh -L 10000:HS2_server_address:10000 user@an_azure_node
Then you can point SQL Workbench or any other tool to localhost:10000 and get forwarded across the tunnel to port 10000 on the HS2 instance. I can provide more detail if you need it. Note that this really just puts a brick wall around the cluster that requires authentication. If you also want authorization, then we'd need to address it another way.
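If you want to script the tunnel rather than type it each time, the command can be assembled from its parts. This is a minimal Python sketch; `tunnel_command` is a hypothetical helper, and the host names are the same placeholders used in the command above.

```python
def tunnel_command(local_port: int, target_host: str, target_port: int, gateway: str) -> list:
    """Build the ssh argument list for a local port forward through a gateway host."""
    return ["ssh", "-L", f"{local_port}:{target_host}:{target_port}", gateway]


# Placeholders from the post: replace with your HS2 host and a reachable cluster node.
cmd = tunnel_command(10000, "HS2_server_address", 10000, "user@an_azure_node")
print(" ".join(cmd))

# To actually open the tunnel, pass the list to subprocess, for example:
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Adding ssh's `-N` flag to the list keeps the session open purely for forwarding, without starting a remote shell.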