Member since: 01-19-2017
Posts: 3679
Kudos Received: 632
Solutions: 372
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 869 | 06-04-2025 11:36 PM |
| | 1443 | 03-23-2025 05:23 AM |
| | 720 | 03-17-2025 10:18 AM |
| | 2595 | 03-05-2025 01:34 PM |
| | 1718 | 03-03-2025 01:09 PM |
04-20-2018
08:02 AM
@Christian Lunesa Yes, you can store images in binary format; see below. Retrieval, however, is another process altogether.

Create the table:

beeline> ! connect jdbc:hive2://texas.us.com:10000/default
Enter username for jdbc:hive2://texas.us.com:10000/default: hive
Enter password for jdbc:hive2://texas.us.com:10000/default: ****
Connected to: Apache Hive (version 1.2.1000.2.6.2.0-205)
Driver: Hive JDBC (version 1.2.1000.2.6.2.0-205)
Transaction isolation: TRANSACTION_REPEATABLE_READ
1: jdbc:hive2://texas.us.com:10000/default> show databases;
+----------------+--+
| database_name |
+----------------+--+
| default |
| geolocation |
+----------------+--+
4 rows selected (2.397 seconds)
use geolocation;
Create table image(picture binary);
show tables;

Now loading an image into it is as simple as a LOAD DATA statement:

hive> show databases;
OK
default
geolocation
Time taken: 1.955 seconds, Fetched: 4 row(s)
hive> use geolocation;
hive> load data local inpath '/tmp/photo.jpg' into table image;

Now check the image:

hive> select count(*) from image;
Query ID = geolocation_20180420094947_79e8e1fb-dfb3-40c6-949e-3fb61e8bc7d1
Total jobs = 1
Launching Job 1 out of 1
Status: Running (Executing on YARN cluster with App id application_1524208851011_0003)
--------------------------------------------------------------------------------
VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
--------------------------------------------------------------------------------
Map 1 .......... SUCCEEDED 1 1 0 0 0 0
Reducer 2 ...... SUCCEEDED 1 1 0 0 0 0
--------------------------------------------------------------------------------
VERTICES: 02/02 [==========================>>] 100% ELAPSED TIME: 5.87 s
--------------------------------------------------------------------------------
OK
19038
Time taken: 10.114 seconds, Fetched: 1 row(s)

A SELECT will return garbled output, but the image is loaded.

To store images/videos directly into Hadoop HDFS:

hdfs dfs -put /src_image_file /dst_image_file

And if your intent is more than just storing the files, you might find HIPI useful. HIPI is a library for Hadoop's MapReduce framework that provides an API for performing image processing tasks in a distributed computing environment.

http://hipi.cs.virginia.edu/
http://www.tothenew.com/blog/how-to-manage-and-analyze-video-data-using-hadoop/
https://content.pivotal.io/blog/using-hadoop-mapreduce-for-distributed-video-transcoding

Hope that helps
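If you later need the raw file back out of Hive, the simplest route is to pull it straight from the table's backing directory in HDFS. A minimal sketch, assuming the default HDP warehouse location /apps/hive/warehouse and the geolocation database used above (adjust the path to your hive.metastore.warehouse.dir):

# list the files backing the image table
hdfs dfs -ls /apps/hive/warehouse/geolocation.db/image/
# copy the original file back to the local filesystem
hdfs dfs -get /apps/hive/warehouse/geolocation.db/image/photo.jpg /tmp/photo_restored.jpg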
04-19-2018
07:55 AM
@venkata ramireddy DistCp (distributed copy) is a tool used for large inter/intra-cluster copying. It uses MapReduce to effect its distribution, error handling and recovery, and reporting. It expands a list of files and directories into input for map tasks, each of which copies a partition of the files specified in the source list.

Copying data from Cluster1 to Cluster2:

hadoop distcp hdfs://cluster1:8020/data/in/hdfs/ hdfs://cluster2:8020/new/path/in/hdfs/

Copying between two HA clusters: the simplest approach is to identify the current active NameNode on each cluster and run DistCp as you would between two clusters without HA:

hadoop distcp hdfs://active1:8020/path hdfs://active2:8020/path

The Apache DistCp documentation covers the full option list.
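A couple of hedged variations that often come in handy (the host names and paths below are placeholders):

# copy only missing or changed files and preserve file attributes (ownership, permissions, ...)
hadoop distcp -update -p hdfs://cluster1:8020/data/in/hdfs/ hdfs://cluster2:8020/new/path/in/hdfs/
# between HA clusters you can also address each cluster by its nameservice ID, provided the
# remote cluster's HA settings (dfs.nameservices, dfs.ha.namenodes.*, ...) are merged into the
# local client hdfs-site.xml, so the copy survives a NameNode failover
hadoop distcp hdfs://nameservice1/path hdfs://nameservice2/path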
04-19-2018
07:31 AM
@Praveen Patel Have a look at this response from @Sonu Sahi on HCC: https://community.hortonworks.com/answers/105850/view.html. That could be a solution.
04-18-2018
08:36 PM
@Christian Lunesa Unfortunately, no, but passing the option --map-column-hive Date=Timestamp to Sqoop will definitely work.
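For reference, a minimal sketch of where the option goes in a Sqoop import; the connection string, table and column names below are placeholders, not taken from your job:

sqoop import \
  --connect jdbc:mysql://dbhost:3306/sourcedb \
  --username sqoopuser -P \
  --table orders \
  --hive-import \
  --hive-table orders \
  --map-column-hive ORDER_DATE=timestamp    # force the Hive column type to TIMESTAMP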
04-17-2018
06:54 PM
1 Kudo
@Kumar Veerappan Unfortunately, yes: your cluster won't function. You will have to shut down your cluster gracefully and wait for the patching to end. In an HDP HA setup, the master nodes (NameNode, ResourceManager) should be on two distinct racks/switches. Here are some considerations:
- Machines should be on a network isolated from the rest of the data center. No other applications or nodes should share network I/O with the Hadoop infrastructure; Hadoop is I/O intensive, and all other interference should be removed for a performant cluster.
- Machines should have static IPs to keep the network configuration stable. With dynamic IPs, a machine reboot or an expired DHCP lease would change the machine's IP address and cause the Hadoop services to malfunction.
- Reverse DNS should be set up, so that a node's hostname can be looked up from its IP address. Certain Hadoop functionalities utilize and require reverse DNS (a quick check is sketched after this list).
- Dedicate "Top of Rack" (TOR) switches to Hadoop.
- Use dedicated core switching blades or switches.
- Ensure application servers are "close" to Hadoop.
- Consider Ethernet bonding for increased capacity.
- All clients and cluster nodes require network access and open firewall ports for each of the services to communicate between the servers.
- If deployed to a cloud environment, make certain all Hadoop cluster master and data nodes are in the same network zone (this is especially important when utilizing cloud services such as AWS and Azure).
- If deployed to a physical environment, make certain to place the cluster in a VLAN.
- The data nodes and client nodes should have at minimum 2 x 1 Gb Ethernet; a typically recommended network controller is 1 x 10 Gb Ethernet.
- For the switches communicating between the racks, establish the fastest Ethernet connections possible with the most capacity.

Hope that helps
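As mentioned in the reverse DNS item above, a quick way to sanity-check forward and reverse resolution on each node (the IP address below is a placeholder):

# forward lookup: the FQDN should resolve to the node's IP
hostname -f
nslookup $(hostname -f)
# reverse lookup: the IP should resolve back to the same FQDN
nslookup 10.0.0.11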
04-16-2018
09:21 PM
@Liana Napalkova Good to know, happy hadooping...
04-16-2018
08:37 PM
@Liana Napalkova Is "eureambarislave1.local.eurecat.org" the valid hostname for your MySQL database server java.sql.SQLException: Access denied for user 'hive'@'eureambarislave1.local.eurecat.org' (using password: YES) There was a typo error in my previous command it should be GRANT ALL PRIVILEGES ON hive.* TO 'hive'@'FQDN_MYSQL_HOST' IDENTIFIED BY 'hivepwd' WITH GRANT OPTION; instead of GRANT ALL PRIVILEGES ON hive.* TO 'hive'@'FQDN_MYSQL_HOST' IDENTIFIED BY 'hive' WITH GRANT OPTION; Run that a root # mysql -u root -pwelcome1
mysql> use hivedb;
mysql> GRANT ALL PRIVILEGES ON hive.* TO 'hive'@'FQDN_MYSQL_HOST' IDENTIFIED BY 'hivepwd' WITH GRANT OPTION;
mysql> FLUSH PRIVILEGES;

That should correct the access-denied issue. Please revert!
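To confirm the grant took effect, a quick hedged check; FQDN_MYSQL_HOST stands for your actual MySQL host name as above:

mysql> SHOW GRANTS FOR 'hive'@'FQDN_MYSQL_HOST';

Then test the connection end to end from the HiveServer2/metastore host:

# should list the databases without an access-denied error
mysql -u hive -phivepwd -h FQDN_MYSQL_HOST -e "SHOW DATABASES;"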
04-16-2018
02:51 PM
@Liana Napalkova You are trying to add the Hive service without a database to host the metastore catalog. Can you access the MySQL database as root? If so, proceed with the steps below; if not, execute /usr/bin/mysql_secure_installation and provide the inputs for the prompts. Here I am assuming the root password is welcome1 and the hive password is hivepwd.

mysql -u root -pwelcome1
CREATE USER 'hive'@'localhost' IDENTIFIED BY 'hivepwd';
GRANT ALL PRIVILEGES ON *.* TO 'hive'@'localhost';
GRANT ALL PRIVILEGES ON *.* TO 'hive'@'%';
GRANT ALL PRIVILEGES ON *.* TO 'hive'@'FQDN_MYSQL_HOST';
GRANT ALL PRIVILEGES ON hive.* TO 'hive'@'localhost' IDENTIFIED BY 'hivepwd' WITH GRANT OPTION;
GRANT ALL PRIVILEGES ON hive.* TO 'hive'@'FQDN_MYSQL_HOST' IDENTIFIED BY 'hive' WITH GRANT OPTION;
FLUSH PRIVILEGES;
quit;

Then, as the hive user created above, create the Hive database:

mysql -u hive -phivepwd
CREATE DATABASE hive;
quit;

Now run the Ambari Hive setup with the above credentials.
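One related step that often trips up the Hive service check: Ambari must also know where the MySQL JDBC driver is. A hedged sketch, assuming the connector jar landed in /usr/share/java/mysql-connector-java.jar:

# run on the Ambari server host, then retry "Test Connection" in the Hive setup screen
ambari-server setup --jdbc-db=mysql --jdbc-driver=/usr/share/java/mysql-connector-java.jar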
04-16-2018
01:39 PM
@Liana Napalkova Did you install MySQL manually? When you install HDP, MySQL is not automatically installed. You will need to run

# yum install -y mysql-server

Then also install the connector:

# yum install -y mysql-connector-java*

Remember to make MySQL autostart on boot with

# chkconfig mysqld on

Then log in to MySQL and create the hive database and user.
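A minimal end-to-end sketch of that install on a RHEL/CentOS 6 style system (adjust the service commands for systemd):

# install the server and the JDBC connector
yum install -y mysql-server mysql-connector-java
# start it now and on every boot
service mysqld start
chkconfig mysqld on
# set the root password and remove anonymous users
/usr/bin/mysql_secure_installation
# then log in and create the hive database and user as described earlier
mysql -u root -p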
04-16-2018
12:42 PM
@ASIF Khan If the cluster is managed by Ambari, this should be added in Ambari > HDFS > Configurations > Advanced core-site > Add Property:

hadoop.http.staticuser.user=yarn

Please let me know if that worked.
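After adding the property and restarting the affected HDFS services from Ambari, a quick check that the value is actually picked up on a cluster node:

# should print: yarn
hdfs getconf -confKey hadoop.http.staticuser.user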