Member since: 01-19-2017
Posts: 3679
Kudos Received: 632
Solutions: 372
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 869 | 06-04-2025 11:36 PM |
| | 1443 | 03-23-2025 05:23 AM |
| | 720 | 03-17-2025 10:18 AM |
| | 2595 | 03-05-2025 01:34 PM |
| | 1718 | 03-03-2025 01:09 PM |
04-20-2018
08:02 AM
@Christian Lunesa Yes, you can store images in binary format; see below. Retrieval, however, is another process altogether.

Create the table:

beeline> ! connect jdbc:hive2://texas.us.com:10000/default
Enter username for jdbc:hive2://texas.us.com:10000/default: hive
Enter password for jdbc:hive2://texas.us.com:10000/default: ****
Connected to: Apache Hive (version 1.2.1000.2.6.2.0-205)
Driver: Hive JDBC (version 1.2.1000.2.6.2.0-205)
Transaction isolation: TRANSACTION_REPEATABLE_READ
1: jdbc:hive2://texas.us.com:10000/default> show databases;
+----------------+--+
| database_name |
+----------------+--+
| default |
| geolocation |
+----------------+--+
4 rows selected (2.397 seconds)
use geolocation;
Create table image(picture binary);
show tables;

Now loading an image into it is as simple as a LOAD DATA statement:

hive> show databases;
OK
default
geolocation
Time taken: 1.955 seconds, Fetched: 4 row(s)
hive> use geolocation;
hive> load data local inpath '/tmp/photo.jpg' into table image;

Now check the image:

hive> select count(*) from image;
Query ID = geolocation_20180420094947_79e8e1fb-dfb3-40c6-949e-3fb61e8bc7d1
Total jobs = 1
Launching Job 1 out of 1
Status: Running (Executing on YARN cluster with App id application_1524208851011_0003)
--------------------------------------------------------------------------------
VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
--------------------------------------------------------------------------------
Map 1 .......... SUCCEEDED 1 1 0 0 0 0
Reducer 2 ...... SUCCEEDED 1 1 0 0 0 0
--------------------------------------------------------------------------------
VERTICES: 02/02 [==========================>>] 100% ELAPSED TIME: 5.87 s
--------------------------------------------------------------------------------
OK
19038
Time taken: 10.114 seconds, Fetched: 1 row(s)

A SELECT will return garbled output, but the image is loaded.

To store images/videos directly into Hadoop HDFS:

hdfs dfs -put /src_image_file /dst_image_file

And if your intent is more than just storing the files, you might find HIPI useful. HIPI is a library for Hadoop's MapReduce framework that provides an API for performing image processing tasks in a distributed computing environment.

http://hipi.cs.virginia.edu/
http://www.tothenew.com/blog/how-to-manage-and-analyze-video-data-using-hadoop/
https://content.pivotal.io/blog/using-hadoop-mapreduce-for-distributed-video-transcoding

Hope that helps
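If you later need the raw file back out of Hive, the simplest route is to pull it straight from the table's backing directory in HDFS. A minimal sketch, assuming the default HDP warehouse location /apps/hive/warehouse and the geolocation database used above (adjust the path to your hive.metastore.warehouse.dir):

# list the files backing the image table
hdfs dfs -ls /apps/hive/warehouse/geolocation.db/image/
# copy the original file back to the local filesystem
hdfs dfs -get /apps/hive/warehouse/geolocation.db/image/photo.jpg /tmp/photo_restored.jpg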
04-19-2018
07:55 AM
@venkata ramireddy DistCp (distributed copy) is a tool used for large inter/intra-cluster copying. It uses MapReduce to effect its distribution, error handling and recovery, and reporting. It expands a list of files and directories into input for map tasks, each of which copies a partition of the files specified in the source list.

Copying data from Cluster1 to Cluster2:

hadoop distcp hdfs://cluster1:8020/data/in/hdfs/ hdfs://cluster2:8020/new/path/in/hdfs/

Copying between two HA clusters: the simplest approach is to identify the current active NameNode on each cluster and run DistCp as you would between two clusters without HA:

hadoop distcp hdfs://active1:8020/path hdfs://active2:8020/path

The Apache DistCp documentation covers the full option list.
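A couple of hedged variations that often come in handy (the host names and paths below are placeholders):

# copy only missing or changed files and preserve file attributes (ownership, permissions, ...)
hadoop distcp -update -p hdfs://cluster1:8020/data/in/hdfs/ hdfs://cluster2:8020/new/path/in/hdfs/
# between HA clusters you can also address each cluster by its nameservice ID, provided the
# remote cluster's HA settings (dfs.nameservices, dfs.ha.namenodes.*, ...) are merged into the
# local client hdfs-site.xml, so the copy survives a NameNode failover
hadoop distcp hdfs://nameservice1/path hdfs://nameservice2/path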
04-19-2018
07:31 AM
@Praveen Patel Have a look at this response from @Sonu Sahi on HCC: https://community.hortonworks.com/answers/105850/view.html. That could be a solution.
04-18-2018
08:36 PM
@Christian Lunesa Unfortunately, no, but passing the option --map-column-hive Date=Timestamp to Sqoop will definitely work.
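For reference, a minimal sketch of where the option goes in a Sqoop import; the connection string, table and column names below are placeholders, not taken from your job:

sqoop import \
  --connect jdbc:mysql://dbhost:3306/sourcedb \
  --username sqoopuser -P \
  --table orders \
  --hive-import \
  --hive-table orders \
  --map-column-hive ORDER_DATE=timestamp    # force the Hive column type to TIMESTAMP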
04-17-2018
06:54 PM
1 Kudo
@Kumar Veerappan Unfortunately, yes: your cluster won't function. You will have to shut down your cluster gracefully and wait for the patching to end. In an HDP HA setup, the master nodes (NameNode, ResourceManager) should be on two distinct racks/switches. Here are some considerations:
- Machines should be on a network isolated from the rest of the data center. No other applications or nodes should share network I/O with the Hadoop infrastructure; Hadoop is I/O intensive, and all other interference should be removed for a performant cluster.
- Machines should have static IPs to keep the network configuration stable. With dynamic IPs, a machine reboot or an expired DHCP lease would change the machine's IP address and cause the Hadoop services to malfunction.
- Reverse DNS should be set up, so that a node's hostname can be looked up from its IP address. Certain Hadoop functionalities utilize and require reverse DNS (a quick check is sketched after this list).
- Dedicate "Top of Rack" (TOR) switches to Hadoop.
- Use dedicated core switching blades or switches.
- Ensure application servers are "close" to Hadoop.
- Consider Ethernet bonding for increased capacity.
- All clients and cluster nodes require network access and open firewall ports for each of the services to communicate between the servers.
- If deployed to a cloud environment, make certain all Hadoop cluster master and data nodes are in the same network zone (this is especially important when utilizing cloud services such as AWS and Azure).
- If deployed to a physical environment, make certain to place the cluster in a VLAN.
- The data nodes and client nodes should have at minimum 2 x 1 Gb Ethernet; a typically recommended network controller is 1 x 10 Gb Ethernet.
- For the switches communicating between the racks, establish the fastest Ethernet connections possible with the most capacity.

Hope that helps
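As mentioned in the reverse DNS item above, a quick way to sanity-check forward and reverse resolution on each node (the IP address below is a placeholder):

# forward lookup: the FQDN should resolve to the node's IP
hostname -f
nslookup $(hostname -f)
# reverse lookup: the IP should resolve back to the same FQDN
nslookup 10.0.0.11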
04-16-2018
09:21 PM
@Liana Napalkova Good to know, happy hadooping...
04-16-2018
08:37 PM
@Liana Napalkova Is "eureambarislave1.local.eurecat.org" the valid hostname for your MySQL database server java.sql.SQLException: Access denied for user 'hive'@'eureambarislave1.local.eurecat.org' (using password: YES) There was a typo error in my previous command it should be GRANT ALL PRIVILEGES ON hive.* TO 'hive'@'FQDN_MYSQL_HOST' IDENTIFIED BY 'hivepwd' WITH GRANT OPTION; instead of GRANT ALL PRIVILEGES ON hive.* TO 'hive'@'FQDN_MYSQL_HOST' IDENTIFIED BY 'hive' WITH GRANT OPTION; Run that a root # mysql -u root -pwelcome1
mysql> use hivedb;
mysql> GRANT ALL PRIVILEGES ON hive.* TO 'hive'@'FQDN_MYSQL_HOST' IDENTIFIED BY 'hivepwd' WITH GRANT OPTION;
mysql> FLUSH PRIVILEGES;

That should correct the access-denied issue. Please revert!
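To confirm the grant took effect, a quick hedged check; FQDN_MYSQL_HOST stands for your actual MySQL host name as above:

mysql> SHOW GRANTS FOR 'hive'@'FQDN_MYSQL_HOST';

Then test the connection end to end from the HiveServer2/metastore host:

# should list the databases without an access-denied error
mysql -u hive -phivepwd -h FQDN_MYSQL_HOST -e "SHOW DATABASES;"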
04-16-2018
02:51 PM
@Liana Napalkova You are trying to add the Hive service without a database to host the metastore catalog. Can you access the MySQL database as root? If so, proceed with the steps below; if not, execute /usr/bin/mysql_secure_installation and provide the inputs for the prompts. Here I am assuming the root password is welcome1 and the hive password is hivepwd.

mysql -u root -pwelcome1
CREATE USER 'hive'@'localhost' IDENTIFIED BY 'hivepwd';
GRANT ALL PRIVILEGES ON *.* TO 'hive'@'localhost';
GRANT ALL PRIVILEGES ON *.* TO 'hive'@'%';
GRANT ALL PRIVILEGES ON *.* TO 'hive'@'FQDN_MYSQL_HOST';
GRANT ALL PRIVILEGES ON hive.* TO 'hive'@'localhost' IDENTIFIED BY 'hivepwd' WITH GRANT OPTION;
GRANT ALL PRIVILEGES ON hive.* TO 'hive'@'FQDN_MYSQL_HOST' IDENTIFIED BY 'hive' WITH GRANT OPTION;
FLUSH PRIVILEGES;
quit;

Then, as the hive user created above, create the Hive database:

mysql -u hive -phivepwd
CREATE DATABASE hive;
quit;

Now run the Ambari Hive setup with the above credentials.
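One related step that often trips up the Hive service check: Ambari must also know where the MySQL JDBC driver is. A hedged sketch, assuming the connector jar landed in /usr/share/java/mysql-connector-java.jar:

# run on the Ambari server host, then retry "Test Connection" in the Hive setup screen
ambari-server setup --jdbc-db=mysql --jdbc-driver=/usr/share/java/mysql-connector-java.jar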
04-16-2018
01:39 PM
@Liana Napalkova Did you install MySQL manually? When you install HDP, MySQL is not automatically installed. You will need to run

# yum install -y mysql-server

Then also install the connector:

# yum install -y mysql-connector-java*

Remember to make MySQL autostart on boot with

# chkconfig mysqld on

Then log in to MySQL and create the hive database and user.
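A minimal end-to-end sketch of that install on a RHEL/CentOS 6 style system (adjust the service commands for systemd):

# install the server and the JDBC connector
yum install -y mysql-server mysql-connector-java
# start it now and on every boot
service mysqld start
chkconfig mysqld on
# set the root password and remove anonymous users
/usr/bin/mysql_secure_installation
# then log in and create the hive database and user as described earlier
mysql -u root -p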
04-16-2018
12:42 PM
@ASIF Khan If the cluster is managed by Ambari, this should be added in Ambari > HDFS > Configurations > Advanced core-site > Add Property:

hadoop.http.staticuser.user=yarn

Please let me know if that worked.
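After adding the property and restarting the affected HDFS services from Ambari, a quick check that the value is actually picked up on a cluster node:

# should print: yarn
hdfs getconf -confKey hadoop.http.staticuser.user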