Created 05-20-2024 05:35 AM
I created two external tables in beeline (Hive), and that worked quickly.
-- Movies table
CREATE EXTERNAL TABLE movies (
movieId INT,
title STRING,
genres STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
"separatorChar" = ",",
"quoteChar" = "\"",
"escapeChar" = "\\"
)
STORED AS TEXTFILE
LOCATION '/user/hive/warehouse/movielens/movies'
TBLPROPERTIES ("skip.header.line.count"="1");
-- Ratings table
CREATE EXTERNAL TABLE ratings (
userId INT,
movieId INT,
rating DOUBLE,
rating_timestamp BIGINT
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
"separatorChar" = ",",
"quoteChar" = "\"",
"escapeChar" = "\\"
)
STORED AS TEXTFILE
LOCATION '/user/hive/warehouse/movielens/ratings'
TBLPROPERTIES ("skip.header.line.count"="1");
I am attempting:
CREATE TABLE avg_movie_ratings AS
SELECT movieId, AVG(rating) AS avg_rating
FROM ratings
GROUP BY movieId;
This starts a MapReduce job, which gets stuck.
Hadoop and Hive are both running.
However, the URL to track the job, http://anushkahp14:8088/proxy/application_1716189650320_0005/, returns ERR_CONNECTION_REFUSED.
Please help.
Created 05-20-2024 06:53 AM
@adsejnf, Welcome to the Cloudera community!
Do you see any issues in the Hive logs?
You can also check the application logs via the CLI:
yarn logs -applicationId <application ID> -appOwner <AppOwner>
Created on 05-22-2024 04:01 AM - edited 05-22-2024 04:05 AM
@tj2007, Thanks!
No logs were recorded:
anushkakundu@AnushkaHP14:~$ /opt/hadoop/bin/yarn logs -applicationId application_1716374810626_0001 -appOwner anushkakundu
2024-05-22 16:29:07,059 INFO client.DefaultNoHARMFailoverProxyProvider: Connecting to ResourceManager at /0.0.0.0:8032
Can not find the logs for the application: application_1716374810626_0001 with the appOwner: anushkakundu
Please also see what my localhost:8088 page shows. It says:
Log Aggregation Status: NOT_START
Could this be the issue?
Created 05-28-2024 03:39 AM
The diagnostics message in YARN RM UI indicates that the application has been added to the scheduler but has not yet been activated. The message provides details about the reason for skipping the ApplicationMaster (AM) assignment. Let's break down the components of the message for a better understanding:
Application is added to the scheduler and is not yet activated.
Skipping AM assignment as cluster resource is empty.
Details:
AM Partition = <DEFAULT_PARTITION>;
AM Resource Request = <memory:2048, vCores:1>;
Queue Resource Limit for AM = <memory:0, vCores:0>;
User AM Resource Limit of the queue
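The key comparison in the details above is the AM request against the queue's AM limit. A minimal sketch of that comparison, assuming only the memory figure matters (the real scheduler also accounts for AM resources already in use and, depending on configuration, vCores); `mem_of` and `am_fits` are hypothetical helper names, not YARN commands:

```shell
# Extract the memory figure (in MB) from a YARN resource string
# such as "<memory:2048, vCores:1>".
mem_of() {
  printf '%s\n' "$1" | sed -n 's/.*memory:\([0-9][0-9]*\).*/\1/p'
}

# Compare the AM request with the queue's AM limit (both resource strings).
# Simplified assumption: memory only, nothing else running in the queue.
am_fits() {
  request_mb=$(mem_of "$1")
  limit_mb=$(mem_of "$2")
  if [ "$request_mb" -le "$limit_mb" ]; then
    echo "fits"
  else
    echo "blocked: AM needs ${request_mb} MB, queue allows ${limit_mb} MB"
  fi
}

# Values copied from the diagnostics above.
am_fits "<memory:2048, vCores:1>" "<memory:0, vCores:0>"
```

With the values from this thread, the call prints `blocked: AM needs 2048 MB, queue allows 0 MB`, which is why the application stays in the ACCEPTED state and never starts.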
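The diagnostic message suggests that:
Cluster Resource Constraints: The scheduler sees no resources in the cluster ("cluster resource is empty"). This typically happens when no NodeManager is registered with the ResourceManager.
Queue Configuration Issues: The queue's resource limit for AMs is <memory:0, vCores:0>, so the AM request of 2048 MB cannot be satisfied.
User Resource Limits: The per-user AM resource limit of the queue can also keep the application from being activated.
To troubleshoot:
Check Cluster Resource Utilization: Confirm in the RM UI that at least one NodeManager is active and that total memory and vCores are non-zero.
Review Queue Configurations: Check the capacity and AM resource limits of the queue the job was submitted to.
Inspect Application Logs: Once the application has been activated, review its logs for errors.
Consult YARN ResourceManager Logs: Look for errors related to node registration or resource allocation.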
By understanding and addressing the issues highlighted in this diagnostic message, you can ensure that your YARN applications get the necessary resources to run effectively.
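The cluster and queue checks can be run from the CLI. The following is a sketch, not a definitive procedure: `check_yarn_nodes` and the `YARN_BIN` variable are hypothetical names introduced here, and the queue name `default` is an assumption (the thread does not say which queue the job used).

```shell
# Run basic YARN checks for an application that is stuck before activation.
# YARN_BIN is a hypothetical override for the yarn executable; in this
# thread the binary lives at /opt/hadoop/bin/yarn.
check_yarn_nodes() {
  yarn_bin="${YARN_BIN:-yarn}"
  if ! command -v "$yarn_bin" >/dev/null 2>&1; then
    echo "cannot find yarn at: $yarn_bin"
    return 1
  fi
  # List every NodeManager known to the ResourceManager, in any state.
  # An empty list would match "cluster resource is empty".
  "$yarn_bin" node -list -all
  # Show the state and capacity of the queue (assumed to be "default").
  "$yarn_bin" queue -status default
}

# Example invocation:
#   YARN_BIN=/opt/hadoop/bin/yarn check_yarn_nodes
```

If no nodes are listed, running `jps` on the worker host shows whether a NodeManager process is running at all.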
Created 05-31-2024 02:17 PM
@adsejnf Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future. Thanks.
Regards,
Diana Torres,