Member since: 09-14-2015
Posts: 79
Kudos Received: 91
Solutions: 22
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2224 | 01-25-2017 04:43 PM |
| | 1713 | 11-23-2016 05:56 PM |
| | 5602 | 11-11-2016 02:44 AM |
| | 1496 | 10-26-2016 01:50 AM |
| | 9201 | 10-19-2016 10:22 PM |
08-18-2016
02:40 PM
As I said, I do not believe you can connect from Linux to SQL Server using IWA with Microsoft's SQL Server JDBC driver. I recommend dropping the jTDS driver into your Sqoop lib directory and trying that driver instead.
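As a minimal sketch of that step, assuming an HDP-style layout (the jar version and the Sqoop lib path below are placeholders, not confirmed values; adjust for your install):

```bash
# Hypothetical: copy the jTDS jar into Sqoop's lib directory so Sqoop can load the driver.
cp jtds-1.3.1.jar /usr/hdp/current/sqoop-client/lib/
```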
08-18-2016
12:09 PM
Based on the error, it seems that you do not have a valid Kerberos ticket. Is the machine you are initiating the Sqoop job from integrated with your Windows AD via Kerberos?
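As a quick sanity check before launching the Sqoop job, something along these lines will show whether a valid ticket exists (the principal and realm are placeholders):

```bash
# List any cached Kerberos tickets; empty or expired output means there is no valid ticket.
klist

# Obtain a ticket from your AD KDC if needed.
kinit myuser@EXAMPLE.COM
```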
08-17-2016
08:40 PM
1 Kudo
Hi @Andy Max, This should certainly be doable and should be relatively straightforward. In the end, I would recommend you stand up a small sandbox environment that mimics the current one, test out this process, and develop a concrete playbook. The rough steps that I would recommend you try are (see the verification sketch after the list):
1. Stop all services on the existing cluster.
2. Use Apache Ambari to install a "dummy" HDP 2.4.x cluster on the current cluster: http://docs.hortonworks.com/HDPDocuments/Ambari-2.2.2.0/bk_Installing_HDP_AMB/content/ch_Getting_Ready.html
3. Install barebones with only HDFS, ZooKeeper, etc. Make sure that the HDFS NameNode and DataNode directories are dummy directories that do not point to your existing data and NameNode directories.
4. Stop all services via Ambari.
5. In Ambari, change the data and NameNode directories for HDFS to point to your old directories.
6. Start the services back up and verify that the data is available.
This should work smoothly with HDP 2.4 because 2.4 also includes Apache Hadoop 2.7.1, so the file system version is identical. This all assumes that you are only using HDFS; it could get a bit hairier if you have Hive tables sitting on top with metadata that needs to be migrated. Cheers, Brandon
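For that last verification step, here is a minimal sketch of the kind of checks I would run once the services are back up; the paths are examples, not values from your cluster:

```bash
# Confirm the existing data is visible through the newly Ambari-managed HDFS services.
hdfs dfs -ls /

# Check filesystem health: files, blocks, and any missing or under-replicated blocks.
hdfs fsck / -files -blocks
```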
08-17-2016
08:19 PM
1 Kudo
Hi @Khera, What JDBC driver are you using to connect to SQL Server? The one provided by MS does not support Windows authentication. That said, you can grab another driver that does support it. You have a couple of options:
Both Simba and Data Direct have drivers to support this authentication method. These have free trials but are ultimately going to require a license for repeated use.
There is also jTDS, which is free and open source and claims to support Windows Authentication, so we can take it for a spin if you would like. The rough JDBC URL you would need looks like: `jdbc:jtds:sqlserver://123.123.123;instance=server1;databaseName=students;integratedSecurity=true;authenticationScheme=JavaKerberos`
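To give a feel for how it would be wired up, here is a rough, untested Sqoop invocation using that URL; the driver class is jTDS's standard class name, and the exact connection properties may need tuning for your environment:

```bash
# Sketch only: list tables over the jTDS connection using the rough URL above.
sqoop list-tables \
  --driver net.sourceforge.jtds.jdbc.Driver \
  --connect "jdbc:jtds:sqlserver://123.123.123;instance=server1;databaseName=students;integratedSecurity=true;authenticationScheme=JavaKerberos"
```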
08-09-2016
03:26 PM
1 Kudo
Looks like you are missing the rpm-python package. If yum is still working, you can try reinstalling rpm-python with yum. Alternatively, depending on the version of CentOS or RHEL you are using, you can find the appropriate RPM in the OS archives. For example, CentOS 6 RPMs are here; Ctrl-F and search for rpm-python to find the package.
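Roughly, either of these should get the package back (the RPM file name is a placeholder for whatever matches your release):

```bash
# If yum still works, reinstall the package directly.
yum reinstall -y rpm-python

# Otherwise, download the matching RPM from the OS archive and install it with rpm.
rpm -Uvh rpm-python-<version>.el6.x86_64.rpm
```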
08-08-2016
03:14 PM
Hi @Berk Ardıç, You can achieve this type of functionality by modifying a couple of additional pieces of the flow. First, set GetSFTP to search recursively from your mounted directory. This will traverse the entire tree rooted at your target location, so it will pick up files from both the Store1 and Store2 directories. You can then limit this by leveraging the regex filter properties for the path and the file. That handles the pickup side of the flow. On the delivery side, you can leverage the path attribute on the flowfile to construct a destination in HDFS that mirrors the structure of the pickup directory; use NiFi Expression Language in the PutHDFS destination to construct the appropriate path (see the sketch below). Hope this helps.
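As a rough sketch of the relevant processor properties (the regex and the /data/landing base path are made-up examples, not values from your flow):

```
GetSFTP:
  Search Recursively = true
  Path Filter Regex  = Store.*        # limit pickup to the Store* subdirectories
PutHDFS:
  Directory = /data/landing/${path}   # Expression Language mirrors the source directory structure
```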
07-21-2016
01:34 PM
I am aware that the NiFi bootstrap process can be configured to provide notifications in the event of a NiFi failure, but I am wondering whether we can configure the bootstrap process to not only detect failures and notify users but also attempt to bring the NiFi process back to life. Am I missing an obvious config parameter in the guide?
Labels:
- Apache NiFi
07-15-2016
06:18 PM
If I install a colocated instance of HBase on a YARN cluster (i.e., workers running NodeManager, DataNode, and RegionServer daemons), does Ambari automatically adjust YARN memory to account for the HBase RegionServers, or should I still tune the memory using the companion scripts?
Labels:
- Apache Ambari
- Apache HBase
- Apache YARN
05-23-2016
09:43 PM
@Ancil McBarnett, I'll try to paraphrase, but these guys do a better job explaining it than I ever will. Essentially, since you do not have direct access to port 10000 on the HS2 machine, you need to tunnel to a machine that does have access (i.e., any of the machines in the cluster) and then have that machine push your request to port 10000 on the HS2 machine. So, the command I wrote above creates a tunnel between my machine and "an_azure_node". Any connections to my localhost port 10000 will then go across this tunnel and, on the other side, be forwarded to port 10000 on the HS2 node. I hope that helps and clears things up.
05-23-2016
07:32 PM
1 Kudo
Hi @Ancil McBarnett, One option that I often use is to disable access to all ports from the outside world except 22 for SSH. Then set up a secure SSH tunnel (requiring authentication at this stage) that forwards to port 10000 of the Hive Server. For example: `ssh -L 10000:HS2_server_address:10000 user@an_azure_node`. Then you can point SQL Workbench or any other tool at localhost:10000 and get forwarded across the tunnel to port 10000 on the HS2 instance (see the sketch below). I can provide more detail if you need. Note that this is really just putting a brick wall around the cluster requiring authentication; if you also want authorization, we'd need to address it another way.
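For example, once the tunnel is up, connecting from your laptop would look something like this (beeline shown as one option; the username is a placeholder, and SQL Workbench would use the same localhost URL):

```bash
# Open the tunnel: local port 10000 is forwarded through an_azure_node to the HS2 host.
ssh -L 10000:HS2_server_address:10000 user@an_azure_node

# In another terminal, point any Hive client at localhost; traffic rides the tunnel to HS2.
beeline -u "jdbc:hive2://localhost:10000/default" -n myuser
```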