Member since: 01-19-2017
Posts: 3676
Kudos Received: 632
Solutions: 372
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 557 | 06-04-2025 11:36 PM |
|  | 1111 | 03-23-2025 05:23 AM |
|  | 561 | 03-17-2025 10:18 AM |
|  | 2110 | 03-05-2025 01:34 PM |
|  | 1319 | 03-03-2025 01:09 PM |
09-25-2019
10:59 AM
1 Kudo
@parthk You can definitely use Sentry for RBAC-style authorization in Impala, and you don't strictly need Kerberos, but Kerberos is highly advised. Why? Historically, Sentry has been the weakest link in Cloudera's security architecture, which is the reason it was dropped in favour of Ranger in the upcoming new offering, CDP.

Having said that, Sentry's role-based access control (RBAC) is an approach to restricting system access to authorized users, whereas Kerberos with keytabs is like a biometric passport: the password is known only to the keytab and principal, and it allows a process (a client) running on behalf of a principal (a user) to prove its identity to a verifier (an application server, or just a server) without sending data across the network that might allow an attacker or the verifier to subsequently impersonate the principal. Kerberos optionally provides integrity and confidentiality for data sent between the client and the server.

You can safely build your cluster without Kerberos, especially for self-study and development, but not for production. There are two types of Kerberos setup: MIT and AD. Active Directory is a directory services implementation that provides all sorts of functionality like authentication, group and user management, policy administration and more in a centralized manner. LDAP (Lightweight Directory Access Protocol) is an open, cross-platform protocol used for directory services authentication, hence the pointer in the Cloudera documentation to use LDAP/LDAPS. A small Sentry grant example is sketched below.

HTH Happy hadooping
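To make the RBAC side concrete, here is a minimal sketch of Sentry-style grants issued from impala-shell (the role, group, and database names are illustrative assumptions, not from this thread):

```sql
-- Hypothetical names, purely for illustration
CREATE ROLE analyst_role;

-- Sentry maps roles to groups, not to individual users
GRANT ROLE analyst_role TO GROUP analysts;

-- Give the role read-only access to one database
GRANT SELECT ON DATABASE sales_db TO ROLE analyst_role;

-- Verify what the role has been granted
SHOW GRANT ROLE analyst_role;
```

Sentry answers "what is this user allowed to do"; Kerberos is still what proves "this user really is who they claim to be", which is why the two are usually deployed together.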
09-25-2019
10:29 AM
@elmismo999 Sqoop uses MapReduce, so first make sure MapReduce and YARN are running. Secondly, validate that the database and table exist by following the steps below:

# mysql -u root -p[root_password]
mysql> show databases;

If the sqoop database exists, then run:

mysql> use sqoop;
mysql> show tables;

This MUST show the table result; if it doesn't, your job cannot work. Also, in your command I don't see the MySQL database port (default 3306) or the root password placeholder -P (or simply -p[root_password]):

# sqoop import --connect jdbc:mysql://127.0.0.1:3306/sqoop --username root -P --table result --target-dir /user/results10/

One extra sanity check is sketched below. Can you confirm the above and revert?
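As that extra sanity check before the import (a minimal sketch; the sqoop database and result table names are taken from your command), confirm the table actually holds rows:

```sql
-- Run inside the mysql client
USE sqoop;
SELECT COUNT(*) AS row_count FROM result;  -- should be greater than 0 before importing
SELECT * FROM result LIMIT 5;              -- eyeball a few rows
```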
09-21-2019
10:53 AM
1 Kudo
@rvillanueva To add on to what @jsensharma commented, it's always a good idea to have separate databases for Druid and Superset! If you run into issues, then only one component's data is in jeopardy 🙂
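For example, if MySQL is the backing metadata store, separating the two could look roughly like the sketch below (the database names, users, and passwords are illustrative assumptions, not from this thread):

```sql
-- One metadata database per component, so a problem in one does not touch the other
CREATE DATABASE druid CHARACTER SET utf8;
CREATE DATABASE superset CHARACTER SET utf8;

-- Dedicated users per component (hypothetical credentials)
CREATE USER 'druid'@'%' IDENTIFIED BY 'druid_password';
CREATE USER 'superset'@'%' IDENTIFIED BY 'superset_password';

GRANT ALL PRIVILEGES ON druid.* TO 'druid'@'%';
GRANT ALL PRIVILEGES ON superset.* TO 'superset'@'%';
FLUSH PRIVILEGES;
```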
09-21-2019
03:12 AM
@jhc I have downloaded an HDP 3.0 sandbox to try to get around your problem. After successful deployment on VirtualBox, when you access DAS the default user is hive (see screenshot "DAS default user"), and the same applies on beeline:

[root@sandbox-hdp ~]# su - hive
Last login: Sat Sep 21 08:07:40 UTC 2019
[hive@sandbox-hdp ~]$ hive
Connecting to jdbc:hive2://sandbox-hdp.hortonworks.com:2181/default;password=hive;serviceDiscoveryMode=zooKeeper;user=hive;zooKeeperNamespace=hiveserver2
19/09/21 08:38:06 [main]: INFO jdbc.HiveConnection: Connected to sandbox-hdp.hortonworks.com:10000
Connected to: Apache Hive (version 3.1.0.3.0.1.0-187)
Driver: Hive JDBC (version 3.1.0.3.0.1.0-187)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 3.1.0.3.0.1.0-187 by Apache Hive
0: jdbc:hive2://sandbox-hdp.hortonworks.com:2> show databases;
INFO : Completed executing command(queryId=hive_20190921083816_c231488f-8a6f-4fb1-bdfb-48493a3cb98e); Time taken: 0.064 seconds
INFO : OK
+---------------------+
|    database_name    |
+---------------------+
| default             |
| foodmart            |
| information_schema  |
| sys                 |
+---------------------+
4 rows selected (0.443 seconds)

Now, to demonstrate, I will create a new database jhc using DAS as the default user hive [see Create DB with DAS]. You will see a "waiting for the database to be created" warning; it should succeed, and the database should then be available in the drop-down list [see JHC database]. Now choose this database and populate it with a sample table cloudera [see create_table_cloudera]. The DAS view should update with the table [cloudera] in the [jhc] database, while beeline also proves the successful creation of the database and the table therein:

0: jdbc:hive2://sandbox-hdp.hortonworks.com:2> show databases;
INFO : OK
+---------------------+
|    database_name    |
+---------------------+
| default             |
| foodmart            |
| information_schema  |
| jhc                 |
| sys                 |
+---------------------+
5 rows selected (0.092 seconds)
0: jdbc:hive2://sandbox-hdp.hortonworks.com:2> use jhc;
0: jdbc:hive2://sandbox-hdp.hortonworks.com:2> show tables;
+-----------+
| tab_name  |
+-----------+
| cloudera  |
+-----------+
0: jdbc:hive2://sandbox-hdp.hortonworks.com:2> describe cloudera;
INFO : OK
+-----------+------------+----------+
| col_name  | data_type  | comment  |
+-----------+------------+----------+
| id        | tinyint    |          |
| username  | tinyint    |          |
| position  | tinyint    |          |
| dept      | tinyint    |          |
+-----------+------------+----------+
4 rows selected (0.173 seconds)

The DAS view now shows the new table cloudera in the jhc database. Now getting back to your question, "When you query the databases the table doesn't appear in the drop-down list": are you sure you selected the same DB where you created the table? I realized I had to wait for the refresh, so maybe you should refresh manually 🙂 A beeline-only version of the same steps is sketched after this post. Hope that helps, please revert.
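If you want to script the same steps from beeline instead of DAS, a minimal sketch would be as follows (the column types below are illustrative; adjust them to whatever your table really needs):

```sql
-- Equivalent of the DAS walkthrough, run from the beeline prompt
CREATE DATABASE IF NOT EXISTS jhc;
USE jhc;

CREATE TABLE IF NOT EXISTS cloudera (
  id       INT,
  username STRING,
  position STRING,
  dept     STRING
);

SHOW TABLES;        -- should list cloudera
DESCRIBE cloudera;  -- confirms the columns
```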
09-17-2019
02:11 PM
1 Kudo
@psilvarochagome In this community we share knowledge to advance the Cloudera community and don't get cash for that, even though some of the questions are real production issues. Having said that, it's unfortunate that people like you get a solution to a problem being faced by a member and then don't want to share it, as requested by @slim_abderrahim. It's very unfortunate; I hope members see this and tag you ... we are open source as opposed to proprietary code. 🙂 Happy hadooping
09-17-2019
01:45 PM
@ranger What is your HDP version? I think you have hit this bug, despite your Ranger version not matching exactly. Try the workaround and revert: https://issues.apache.org/jira/browse/RANGER-1342
09-17-2019
11:43 AM
4 Kudos
@mike_bronson7 For sure that is the last major version of HDP, but there could be minor releases to correct bugs. Cloudera is keen to release CDP sometime before December, according to insiders. Here is a link to a webinar available on-demand; you will need to register to view it. It gives a preview of what you should expect: the best of the two worlds (Hortonworks & Cloudera). Cloudera Streams Management is GA, so CDP should be around the corner. This is my personal view and not that of Cloudera.
09-14-2019
10:46 AM
1 Kudo
@ThriftTran I don't know how you expect any member to help on a subject with no context. The least you could do is provide some logs, screenshots, a description of the environment or components, etc.
09-14-2019
05:37 AM
@ranger Can you try something like this? It shows how to connect to Hive running on a remote host (HiveServer2) using a commonly used Python package, PyHive. There are a lot of other Python packages available to connect to remote Hive, but PyHive is one of the easiest, best-maintained and supported packages. Here I am assuming you have already installed the PyHive package; if not, please do that first!

from pyhive import hive
import re, os, time

host_name = "localhost"
port = 10001
user = "hive"
password = "hive"
database = "employeeDB"

def hiveconnection(host_name, port, user, password, database):
    conn = hive.Connection(host=host_name, port=port, username=user,
                           password=password, database=database, auth='CUSTOM')
    cur = conn.cursor()
    cur.execute('select * from employees limit 5')
    result = cur.fetchall()
    return result

# Call the above function
output = hiveconnection(host_name, port, user, password, database)
print(output)

Before you attempt to connect using PyHive, you should execute the steps below to install the PyHive package. These are the steps on an Ubuntu machine, as PyHive depends on these modules:

Install gcc:
sudo apt-get install gcc

Install Thrift:
pip install thrift

Install SASL:
pip install sasl

Install thrift_sasl:
pip install thrift_sasl

After the above steps have run successfully, you can go ahead and install PyHive using pip:
pip install pyhive

Should you encounter a PyHive SASL fatal error, install the dependency below:
sudo apt-get install libsasl2-dev

Now you can re-test your Hive database connection. Please let me know.
09-13-2019
05:13 PM
@budati There is a good response by Burgess that should work out for you as well: CSV with duplicate headers