Member since: 12-11-2015
Posts: 244
Kudos Received: 31
Solutions: 32

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 338 | 07-22-2025 07:58 AM |
| | 946 | 01-02-2025 06:28 AM |
| | 1578 | 08-14-2024 06:24 AM |
| | 3115 | 10-02-2023 06:26 AM |
| | 2385 | 07-28-2023 06:28 AM |
02-19-2020 09:56 PM
You can set hive.mapred.mode=strict. Quoting from the doc (https://blog.cloudera.com/improving-query-performance-using-partitioning-in-apache-hive/): "If your partitioned table is very large, you could block any full table scan queries by putting Hive into strict mode using the set hive.mapred.mode=strict command. In this mode, when users submit a query that would result in a full table scan (i.e. queries without any partitioned columns) an error is issued."
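A minimal sketch of the behaviour in a Hive session (the table name sales_partitioned and its partition column ds are hypothetical):

hive> SET hive.mapred.mode=strict;
hive> SELECT * FROM sales_partitioned;
-- rejected with a SemanticException: no predicate on a partition column
hive> SELECT * FROM sales_partitioned WHERE ds = '2020-02-19';
-- allowed: the partition column is constrained, so only one partition is scanned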
02-19-2020 08:53 PM
1 Kudo
@JeffEvans You are right. In CDH we cherry-pick JIRAs to include in our Spark, so not all features available upstream are expected to be present in CDH Spark. The line you quoted was added in this JIRA, https://issues.apache.org/jira/browse/SPARK-1087, and has not been back-ported to our Spark code base. This is one of the reasons we state the following in our documentation: "Although this document makes some references to the external Spark site, not all the features, components, recommendations, and so on are applicable to Spark when used on CDH. Always cross-check the Cloudera documentation before building a reliance on some aspect of Spark that might not be supported or recommended by Cloudera." Hope this clarifies.
02-19-2020 01:01 AM
ERROR 2020Feb19 02:01:21,086 main com.client.engineering.group.JOB.main.JOBMain: org.apache.hadoop.hbase.client.RetriesExhaustedException thrown: Can't get the location

Which particular table is this application trying to access? Did you validate that the user mcaf has permission to access the concerned table? (https://docs.cloudera.com/documentation/enterprise/5-14-x/topics/cdh_sg_hbase_authorization.html#topic_8_3_2 has the commands.) If the concerned user has no permission, grant them the required privileges. If you find that mcaf already has the required privileges, then checking the HBase Master logs during the issue timeframe would give further clues.

Qn: And do we need to execute 'kinit mcaf' every time before submitting the job? And how can we configure scheduled jobs?
Ans: Yes, a valid TGT is needed. How are you scheduling the jobs? If it's a shell script, you can include a kinit command with mcaf's keytab, which avoids prompting for a password (see the sketch below).
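A hedged sketch of both steps (the table name, keytab path, and realm below are assumptions, not values from your cluster):

# in the HBase shell, as an admin, grant mcaf access to the table
hbase> grant 'mcaf', 'RWX', 'my_table'

# in the scheduled job's wrapper script: non-interactive kinit via mcaf's keytab
$ kinit -kt /etc/security/keytabs/mcaf.keytab mcaf@EXAMPLE.COM
$ klist    # confirm the mcaf TGT is present before submitting the job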
02-18-2020 10:12 PM
Yes, you are headed in the right direction. You can set min.user.id to a lower value, such as 500, and then re-submit the job.
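For reference, a sketch of where this setting lives when YARN uses the LinuxContainerExecutor (the file path is an assumption; on Cloudera Manager clusters the same value is exposed as a YARN configuration property):

# /etc/hadoop/conf/container-executor.cfg
# lower the minimum UID that is allowed to launch YARN containers
min.user.id=500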
02-18-2020 07:42 PM
"Actually we use mcaf as a user to execute the jobs, but why is the HTTP user coming into the picture?" --> By this do you mean you switch to the mcaf unix user (su - mcaf) and then run the job? If yes, that alone is not enough. After enabling Kerberos, HDFS and YARN recognise the user by the TGT, not by the unix user ID. So even if you su to mcaf but hold a TGT for a different user (say HTTP), YARN/HDFS recognise you as that TGT user. Can you kinit as mcaf, run klist (to ensure you have the mcaf TGT), and then submit the job, as in the sketch below?
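Concretely, the sequence would look like this (the final submission command is only an illustration):

$ su - mcaf          # switching the unix user alone is not enough
$ kinit mcaf         # obtain a Kerberos TGT as mcaf; prompts for the password
$ klist              # verify the default principal is mcaf@<REALM>, not HTTP/...
$ yarn jar /path/to/job.jar ...    # hypothetical job submission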
02-17-2020 11:55 PM
The klist result shows you are submitting the job as the HTTP user:

hostname.org:~:HADOOP QA]$ klist
Ticket cache: FILE:/tmp/krb5cc_251473
Default principal: HTTP/hostname.org@FQDN.COM

WARN security.UserGroupInformation: PriviledgedActionException as:HTTP/hostname.org@FQDN.COM (auth:KERBEROS) cause:org.apache.hadoop.security.AccessControlException: Permission denied: user=HTTP, access=WRITE, inode="/user":mcaf:supergroup:drwxr-xr-x

The error simply means the HTTP user has no write permission on the /user directory. So you can either grant write permission for "others" on /user in HDFS so that the HTTP user can write, or run the job after a kinit as the user mcaf, which does have write permission.
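A sketch of the two options (option 1 is preferable; the chmod in option 2 broadens access for every user and needs HDFS superuser rights):

# option 1: obtain mcaf's TGT, then re-run the job
$ kinit mcaf
$ klist    # should now show mcaf as the default principal

# option 2: allow "others" to write under /user (run as an HDFS superuser)
$ hdfs dfs -chmod o+w /user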
02-11-2020 02:30 AM
You will need to further isolate the issue to understand the root cause. There are 4 tables involved: rmt_demo.resume_convert, rmt_demo.job_description_convert, rmt_demo.skill_count and rmt_demo.education. Do you see results when you select from each of these tables individually? If yes, but the join returns nothing, it implies there are no rows matching the join criteria. If you cannot retrieve results from any of the tables, you need to inspect the table locations: run describe formatted <table_name> to get each table's location, then run hdfs dfs -ls <table_location> to check whether there is any data underneath it. A sketch follows below.
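For example, the isolation steps for one of the tables would look like this (repeat for all four; substitute the Location value that describe formatted actually prints):

hive> SELECT COUNT(*) FROM rmt_demo.resume_convert;
hive> DESCRIBE FORMATTED rmt_demo.resume_convert;   -- note the "Location:" field

$ hdfs dfs -ls <table_location>    # any files listed here mean the table has data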
02-10-2020 10:49 PM
What version of CDH/HDP are you trying this on? Per the query you shared, you are running an insert query, and an insert won't show any result rows on the console (see the sketch after the query below). Are you trying to run another select query after this insert query which is not showing results? What results do you get for this query?

SELECT n.Id,
t.job_id,
t.job_title,
n.Name,
n.Email,
n.Mobile_Number,
n.Education,
n.Total_Experiance,
n.project_id,
((count(n.new_skills)*100)/s.skill_count) Average
FROM
rmt_demo.resume_convert n
JOIN
rmt_demo.job_description_convert t ON n.new_skills = t.skills and n.job_position = t.job_title
JOIN
rmt_demo.skill_count s ON n.job_position = s.job_title
JOIN
rmt_demo.education e ON n.education = e.education
GROUP BY
n.Id,t.job_id,t.job_title, n.Name, n.Email, n.Mobile_Number, n.Education, n.Total_Experiance,n.project_id,s.skill_count;
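To illustrate the earlier point about inserts (the scratch table below is hypothetical): an INSERT prints only job progress on the console, so a follow-up SELECT is needed to see the rows:

hive> CREATE TABLE t_demo (id INT);
hive> INSERT INTO TABLE t_demo VALUES (1);   -- shows only job progress, no result rows
hive> SELECT * FROM t_demo;                  -- the follow-up SELECT returns the data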
02-10-2020 06:45 AM
1 Kudo
You can pass that as a command-line argument. Example:

hbase org.apache.hadoop.hbase.mapreduce.RowCounter -Dmapreduce.job.cache.files=/test 'hbase_table_t10'

In general:

hbase org.apache.hadoop.hbase.mapreduce.RowCounter -Dmapreduce.job.cache.files=/test '<table_name>'
02-09-2020 07:19 PM
Sorry, I've not come across any such scripts yet. For observability, the Cluster Utilization Report is something you can review to understand how the configured weights influenced the load. More details are at https://docs.cloudera.com/documentation/enterprise/5-14-x/topics/admin_cluster_util_report.html#concept_edr_ntt_2v