Member since: 03-06-2017
Posts: 11
Kudos Received: 1
Solutions: 1

My Accepted Solutions

Title | Views | Posted
---|---|---
 | 19934 | 10-22-2018 02:44 PM
10-02-2019 06:18 AM
@lvazquez maybe you can directly execute a "kinit" to submit your user's credentials to your LDAP. I managed to authenticate users from AD while the cluster is kerberized through a FreeIPA server. Here is a command sample:

%sh
echo "password" | kinit foo@hortonworks.local
hdfs dfs -ls /
Found 12 items
drwxrwxrwt - yarn hadoop 0 2019-10-02 13:53 /app-logs
drwxr-xr-x - hdfs hdfs 0 2019-10-01 15:27 /apps
drwxr-xr-x - yarn hadoop 0 2019-10-01 14:06 /ats
drwxr-xr-x - hdfs hdfs 0 2019-10-01 14:08 /atsv2
drwxr-xr-x - hdfs hdfs 0 2019-10-01 14:06 /hdp
drwx------ - livy hdfs 0 2019-10-02 11:35 /livy2-recovery
drwxr-xr-x - mapred hdfs 0 2019-10-01 14:06 /mapred
drwxrwxrwx - mapred hadoop 0 2019-10-01 14:08 /mr-history
drwxrwxrwx - spark hadoop 0 2019-10-02 15:08 /spark2-history
drwxrwxrwx - hdfs hdfs 0 2019-10-01 15:31 /tmp
drwxr-xr-x - hdfs hdfs 0 2019-10-02 14:23 /user
drwxr-xr-x - hdfs hdfs 0 2019-10-01 15:14 /warehouse

I think this way is really ugly, but at least it is possible. Do not forget to update the auth_to_local rules (hadoop.security.auth_to_local) in your HDFS configuration:

RULE:[1:$1@$0](.*@HORTONWORKS.LOCAL)s/@.*//
RULE:[1:$1@$0](.*@IPA.HORTONWORKS.LOCAL)s/@.*//
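If echoing the password bothers you, the same kinit can be done with a keytab instead of a clear-text password. This is only a sketch, assuming a keytab for foo@hortonworks.local has already been exported from your KDC and copied to the node (the path below is hypothetical):

%sh
# keytab path is an assumption; adapt it to your environment
kinit -kt /home/foo/foo.keytab foo@hortonworks.local
klist
hdfs dfs -ls /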
04-24-2019 08:42 AM
I managed to retrieve the group named "ad_sshaccess_users" from the LDAP directory into Ambari, but the group shows "0 members". However, in Active Directory I created 2 users under this group, which is mapped in FreeIPA. Do you know whether Ambari can retrieve AD users through a FreeIPA server that handles the LDAP part? I'm not sure about that.
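For reference, Ambari's LDAP group synchronization is triggered with ambari-server sync-ldap. This is only a sketch of the kind of command involved, and the groups file name is an assumption rather than my exact setup:

# groups.txt contains one group name per line, e.g. ad_sshaccess_users
ambari-server sync-ldap --groups groups.txt
# or synchronize all users and groups exposed by the LDAP server
ambari-server sync-ldap --all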
10-22-2018 02:44 PM
1 Kudo
A solution to import your data as Parquet files and handle the TIMESTAMP and DATE types coming from an RDBMS such as IBM DB2 or MySQL is to run sqoop import with --as-parquetfile and map each TIMESTAMP and DATE field to a Java String type with --map-column-java (a command sketch is included at the end of this post). After that, you should be able to query the Hive database through a SparkSession by changing the configuration of the current Spark session and setting spark.sql.hive.convertMetastoreParquet to false, so that Spark SQL uses the Hive SerDe for reading Parquet tables instead of the built-in support.

spark.sql.hive.convertMetastoreParquet false

import org.apache.spark.sql.SparkSession
val spark = SparkSession.builder()
.appName("test interrogate Hive parquet file using Spark")
.config("spark.sql.parquet.compression.codec", "snappy")
.config("spark.sql.warehouse.dir","/apps/hive/warehouse")
.config("hive.metastore.uris","thrift://sdsl-hdp-01.mycluster:9083")
.config("spark.sql.hive.convertMetastoreParquet", false)
.enableHiveSupport()
.getOrCreate()
import spark.implicits._
import spark.sql
val df = sql("SELECT CAST(COL1 AS TIMESTAMP), COL2, COL3, CAST(COL4 AS TIMESTAMP), COL5 FROM db.mytable")
df.printSchema
root
|-- COL1: timestamp (nullable = true)
|-- COL2: string (nullable = true)
|-- COL3: string (nullable = true)
|-- COL4: timestamp (nullable = true)
|-- COL5: integer (nullable = true)
df.show(5, false)
+--------------------------+--------+--------+--------------------------+------+
|COL1                      |COL2    |COL3    |COL4                      |COL5  |
+--------------------------+--------+--------+--------------------------+------+
|2003-01-01 00:00:00.100001|        |00001   |2003-01-01 00:00:00.10361 |1     |
|2003-01-01 00:00:00.100002|        |00002   |2003-01-01 00:00:00.100002|2     |
|2003-01-01 00:00:00.100003|        |00003   |2003-01-01 00:00:00.100003|3     |
|2003-01-01 00:00:00.100004|        |00004   |2003-01-01 00:00:00.100004|4     |
|2003-01-01 00:00:00.100005|        |00005   |2003-01-01 00:00:00.100005|5     |
+--------------------------+--------+--------+--------------------------+------+
only showing top 5 rows
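For reference, the sqoop command sketch mentioned above would look something like this. It is only a sketch: the JDBC URL, credentials, table name, target directory, and the assumption that COL1 and COL4 are the TIMESTAMP columns are all placeholders to adapt to your environment:

sqoop import \
  --connect jdbc:db2://db2host:50000/MYDB \
  --username myuser -P \
  --table MYTABLE \
  --as-parquetfile \
  --map-column-java COL1=String,COL4=String \
  --target-dir /user/foo/mytable_parquet \
  --num-mappers 1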