Member since
03-17-2016
15
Posts
6
Kudos Received
0
Solutions
05-29-2021
02:37 PM
@Onedile wrote: Yes this is possible. You need to kinit with the username that has been granted access to the SQL Server DB and tables. Integrated security passes your credentials to SQL Server using Kerberos: "jdbc:sqlserver://sername.domain.co.za:1433;integratedSecurity=true;databaseName=SCHEMA;authenticationScheme=JavaKerberos;" This worked for me.

It doesn't work: it still fails with the latest MSSQL JDBC driver, because the Kerberos tokens are lost when the mappers spawn (YARN transitions the job to its internal security subsystem).

21/05/29 19:00:40 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1616335290043_2743822
21/05/29 19:00:40 INFO mapreduce.JobSubmitter: Executing with tokens: [Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:nameservice1, Ident: (token for c795701: HDFS_DELEGATION_TOKEN owner=c795701@XX.XXXX.XXXXXXX.COM, renewer=yarn, realUser=, issueDate=1622314832608, maxDate=1622919632608, sequenceNumber=29194128, masterKeyId=1856)]
21/05/29 19:01:15 INFO mapreduce.Job: Task Id : attempt_1616335290043_2743822_m_000000_0, Status : FAILED
Error: java.lang.RuntimeException: java.lang.RuntimeException: com.microsoft.sqlserver.jdbc.SQLServerException: Integrated authentication failed. ClientConnectionId:53879236-81e7-4fc6-88b9-c7118c02e7be
Caused by: java.security.PrivilegedActionException: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)

Use the jTDS driver as recommended here.
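As a sketch of what the quoted reply describes, the JavaKerberos connection URL can be assembled like this (the host, port, and database name below are placeholders, not values from this thread; run kinit first so a TGT exists):

```java
// Sketch: building a SQL Server JDBC URL for JavaKerberos authentication.
// Host, port, and database are hypothetical placeholders.
public class KerberosJdbcUrl {
    static String buildUrl(String host, int port, String database) {
        return "jdbc:sqlserver://" + host + ":" + port
                + ";databaseName=" + database
                + ";integratedSecurity=true"             // pass the caller's Kerberos credentials
                + ";authenticationScheme=JavaKerberos;"; // pure-Java Kerberos path (no sqljdbc_auth.dll)
    }

    public static void main(String[] args) {
        // Run `kinit <user>` on the gateway node first so a valid TGT is in the cache.
        System.out.println(buildUrl("sqlhost.example.com", 1433, "SCHEMA"));
    }
}
```

Note that this only covers the client side; in a Sqoop/MapReduce job the mapper JVMs do not inherit the gateway node's ticket cache, which is exactly the failure shown in the log above.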
07-25-2016
01:14 PM
1 Kudo
There are literally a dozen different options here:

a) Did you enable SQL optimization in SPSS (requires the Modeler Server licence)? After that it can push tasks down into the Hive data source. I am not sure whether Hive is a supported data source, but I would assume so; you can check the documentation: https://www.ibm.com/support/knowledgecenter/SS3RA7_15.0.0/com.ibm.spss.modeler.help/sql_overview.htm

b) SPSS also supports a set of UDFs for in-database scoring, but that is not what you want.

c) Finally, there is the SPSS Analytic Server, which can essentially run most functions as a MapReduce job on the cluster: ftp://public.dhe.ibm.com/software/analytics/spss/documentation/analyticserver/1.0/English/IBM_SPSS_Analytic_Server_1_Users_Guide.pdf

Unfortunately, if you have neither the Modeler Server licence nor Analytic Server, there is not much you can do besides manually pushing pre-filters into the Hive database or optimizing your SPSS jobs further.
10-13-2017
03:41 PM
When we ingest the following data:

2015-11-01 21:10:00.1
2015-11-01 21:10:00.1190011
2015-11-01 21:10:00.12
2015-11-01 21:10:00.123
2015-11-01 21:10:00.1234
2015-11-01 21:10:00.12345
2015-11-01 21:10:00.123456789
2015-11-01 21:10:00.490155
2015-11-01 21:10:00.1234567890
2015-11-01 21:10:00.1234567890123456789
When I then do a "select", I get NULL for the last two rows instead of just having the additional digits truncated. This is HDP 2.6.1 with Hive 1.2.1000.

select * from test_timestamp;
+--------------------------------+--+
| test_timestamp.col |
+--------------------------------+--+
| 2015-11-01 21:10:00.1 |
| 2015-11-01 21:10:00.1190011 |
| 2015-11-01 21:10:00.12 |
| 2015-11-01 21:10:00.123 |
| 2015-11-01 21:10:00.1234 |
| 2015-11-01 21:10:00.12345 |
| 2015-11-01 21:10:00.123456789 |
| 2015-11-01 21:10:00.490155 |
| NULL |
|           NULL                 |
+--------------------------------+--+
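Hive's TIMESTAMP supports at most 9 fractional digits (nanoseconds), and in Hive 1.x longer fractions parse to NULL rather than being truncated. One workaround (a sketch of my own, not from this thread) is to clamp the fraction before ingest:

```java
// Sketch: clamp fractional seconds to 9 digits (nanoseconds), the maximum
// precision Hive's TIMESTAMP accepts; longer fractions come back as NULL in Hive 1.x.
public class TimestampClamp {
    static String clampFraction(String ts) {
        int dot = ts.indexOf('.');
        if (dot < 0) return ts;                  // no fractional part: leave as-is
        String frac = ts.substring(dot + 1);
        if (frac.length() <= 9) return ts;       // already within Hive's precision
        return ts.substring(0, dot + 1) + frac.substring(0, 9);
    }

    public static void main(String[] args) {
        System.out.println(clampFraction("2015-11-01 21:10:00.1234567890123456789"));
        // -> 2015-11-01 21:10:00.123456789
    }
}
```

Running the last two input rows through this before loading makes them parse instead of returning NULL.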
03-29-2016
09:16 AM
You need to use the HBase client API. It has changed a bit in HBase 1.x, but the gist of it is the same: https://www.slideshare.net/mobile/martyhall/hadoop-tutorial-hbase-part-3-java-client-api

It is only good for a small number of mutations. When you want asynchronous, batched mutations, go with MapReduce as Predrag noted, or use BufferedMutator: https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/BufferedMutator.html An example is available in the examples shipped with the HBase source code.
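A rough sketch of the BufferedMutator approach (it assumes the hbase-client dependency on the classpath, an hbase-site.xml pointing at a reachable cluster, and a made-up table "demo_table" with column family "cf"):

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.BufferedMutator;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class BufferedWriteSketch {
    public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create(); // reads hbase-site.xml from the classpath
        try (Connection conn = ConnectionFactory.createConnection(conf);
             BufferedMutator mutator =
                     conn.getBufferedMutator(TableName.valueOf("demo_table"))) {
            for (int i = 0; i < 10_000; i++) {
                Put put = new Put(Bytes.toBytes("row-" + i));
                put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("col"),
                              Bytes.toBytes("value-" + i));
                mutator.mutate(put); // buffered client-side, flushed to the cluster in batches
            }
            mutator.flush(); // push any remaining buffered mutations before closing
        }
    }
}
```

The point of BufferedMutator over Table.put() is that writes are accumulated client-side and sent in batches, which is what makes it suitable for the larger mutation volumes mentioned above.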