Member since: 01-24-2017
Posts: 12
Kudos Received: 3
Solutions: 2
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 2313 | 07-20-2017 08:26 AM
 | 4429 | 05-23-2017 07:38 AM
11-09-2017
11:48 AM
Folks,

My production CDH cluster is configured to use dedicated data disks. However, at times something pounds the root volume on the data nodes.
- Is there any CDH service that writes heavily to the root volume?
- Could it be the CM agent?

Thanks in advance.
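For anyone investigating the same thing, a rough way to see what is writing to the root volume from the OS side (a sketch only; it assumes iotop is installed and you have root, and note that the CM agent and role logs usually live under /var/log, which typically sits on the root volume):

sudo iotop -oaP                                   # -o: only processes doing I/O, -a: accumulated totals, -P: per process
sudo du -xh / 2>/dev/null | sort -h | tail -20    # largest paths on the root filesystem only (-x stays on one filesystem)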
09-19-2017
08:35 AM
I find the issue occurs when the table is written and then read with very little time in between. The workaround I used was to add a sleep of a few seconds between the write and the read.
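In script form the workaround looks roughly like the sketch below (the host, the statements, and the 10-second pause are placeholders):

impala-shell -i impalad-host:21000 -q "INSERT INTO db.tbl SELECT * FROM db.staging"
sleep 10
impala-shell -i impalad-host:21000 -q "SELECT count(*) FROM db.tbl"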
09-06-2017
07:42 AM
Thanks everyone for their input. 🙂 My peer, Rohit Narra, suggested that we *may* get "WARNINGS: Unknown disk id" when the stats on the partitions are NOT computed incrementally, and that it is reproducible even if we limit to 10 rows. However, when we compute stats incrementally, the warning goes away. Strange as it may seem, I was able to reproduce both the problem and the solution, as shared below -- I masked some info for anonymity.

It is a scary but really non-specific warning -- it could be anything from catalogd being unable to communicate with impalad to host-level problems. Cloudera might as well say "something may be wrong" 😛 Anyway, in this case the warning was a false alarm. @Bharathv, thanks for confirming the warning has improved in CDH 5.12.x.

= process =

I simplified the query so that it generates "WARNINGS: Unknown disk id" while reading a single partition.

[prodserver.ca:21000] > select dt_skey from serv_video_esd.cdn_integrated where dt_skey=20170824 and org_nm='media' and service='media' and cdn_nm='nakami' limit 3;
Query: select dt_skey from serv_video_esd.cdn_integrated where dt_skey=20170824 and org_nm='media' and service='media' and cdn_nm='nakami' limit 3
+----------+
| dt_skey  |
+----------+
| 20170824 |
| 20170824 |
| 20170824 |
+----------+
WARNINGS: Unknown disk id. This will negatively affect performance. Check your hdfs settings to enable block location metadata.

The table is partitioned on four attributes:

| PARTITIONED BY (
|   dt_skey INT,
|   org_nm STRING,
|   service STRING,
|   cdn_nm STRING

Notice that "Incremental stats" is already "true" on this partition:

| dt_skey | org_nm | service | cdn_nm | #Rows | #Files | Size | Bytes Cached | Cache Replication | Format | Incremental stats | Location
| 20170824 | media | media | nakami | 257006421 | 1549 | 12.91GB | NOT CACHED | NOT CACHED | PARQUET | true | hdfs://nameservice1/user/hive/warehouse/serv_video_esd.db/cdn_integrated/dt_skey=20170824/org_nm=media/service=media/cdn_nm=nakami

I recomputed stats anyway:

COMPUTE INCREMENTAL STATS serv_video_esd.cdn_integrated PARTITION (dt_skey=20170824,org_nm='media',service='media',cdn_nm='nakami');
Query: compute INCREMENTAL STATS serv_video_esd.cdn_integrated PARTITION (dt_skey=20170824,org_nm='media',service='media',cdn_nm='nakami')
+------------------------------------------+
| summary                                  |
+------------------------------------------+
| Updated 1 partition(s) and 26 column(s). |
+------------------------------------------+
Fetched 1 row(s) in 27.21s

[prodserver.ca:21000] > select dt_skey from serv_video_esd.cdn_integrated where dt_skey=20170824 and org_nm='media' and service='media' and cdn_nm='nakami' limit 3;
Query: select dt_skey from serv_video_esd.cdn_integrated where dt_skey=20170824 and org_nm='media' and service='media' and cdn_nm='nakami' limit 3
+----------+
| dt_skey  |
+----------+
| 20170824 |
| 20170824 |
| 20170824 |
+----------+
Fetched 3 row(s) in 0.12s   <================================ "WARNINGS: Unknown disk id." disappears!
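For anyone whose warning is not a false alarm, the other hedged thing to check is the HDFS setting the warning text points at: on CDH 5.x, Impala's disk-id feature depends on block location metadata being enabled (dfs.datanode.hdfs-blocks-metadata.enabled, per the CDH 5 docs; adjust for your version). The command below only reads the value from the local client configuration:

hdfs getconf -confKey dfs.datanode.hdfs-blocks-metadata.enabled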
08-30-2017
07:50 AM
Thanks everyone for their replies.

@Tim The HDFS rebalance process finishes in under a minute, so the data seems to be balanced.

@Bharathv Regarding the DN being busy: do you mean the DataNode daemon or the machine? For the coordinator host:
- JVM heap of the DataNode process on the coordinator is about 50% occupied.
- CPU of the DataNode process is close to zero.
- Host CPU is around 30%, and uptime shows a low load average for a 24-core machine: 10:47:49 up 364 days, 22:34, 1 user, load average: 3.38, 5.46, 5.20
- Host physical memory used is around 130 GB out of 250 GB.

I am inclined to rule out the possibility of the DN being busy.
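For reference, a hedged way to separate daemon-level load from host-level load at the shell (a sketch; it assumes the DataNode JVM carries the usual -Dproc_datanode flag on its command line):

DN_PID=$(pgrep -f proc_datanode | head -1)      # DataNode daemon PID (assumes -Dproc_datanode is present)
ps -o pid,pcpu,pmem,rss,etime -p "$DN_PID"      # daemon-level CPU and memory
uptime                                          # host-level load average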
08-29-2017
08:51 AM
1 Kudo
We perform "compute stats" fairly regularly but still get this message. Besides, don't seems a very relevant messaged for dated stats. 😛 Has this jira been resolved? https://issues.apache.org/jira/browse/IMPALA-1427 Need to confirm if this not a "rogue" message. We have impala daemon versions. [root@hostname~]# impalad --version impalad version 2.5.0-cdh5.7.2 RELEASE (build 1140f8289dc0d2b1517bcf70454bb4575eb8cc70) Built on Fri, 22 Jul 2016 12:30:57 PST [root@hostname~]# catalogd --version catalogd version 2.5.0-cdh5.7.2 RELEASE (build 1140f8289dc0d2b1517bcf70454bb4575eb8cc70) Built on Fri, 22 Jul 2016 12:30:57 PST
07-20-2017
08:26 AM
Figured it out. I needed both:
- a username/password to connect to the CM API
- the user launching the script to have a valid Kerberos ticket (a surprising find)

Thanks everyone. 🙂
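For completeness, the working sequence looked roughly like this (a sketch; the principal and keytab path are placeholders for my actual ones):

kinit your_principal@YOUR.REALM     # or: kinit -kt /path/to/user.keytab your_principal@YOUR.REALM
klist                               # confirm a valid ticket exists
python cdh.py                       # the cm_api script, now constructed with the CM admin username/password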
07-20-2017
06:56 AM
Hi friends,

I am getting the following exception:

cm_api.api_client.ApiException: HTTP Error 401: basic auth failed (error 401)

CM is on the same host where I am executing the Python script. It seems like a very basic error; I am passing the admin CM credentials. I am seeking pointers on:
- which port is the API using?
- do I have to enable the CM REST API?
- anything else?

Complete stack trace:

[rizwmian@w0575oslshcea01 cluster_diff]$ python --version
Python 2.6.6
[rizwmian@w0575oslshcea01 cluster_diff]$ cdh.py
Traceback (most recent call last):
  File "./cdh.py", line 67, in <module>
    main()
  File "./cdh.py", line 62, in main
    cluster = find_cluster(api, None)
  File "./cdh.py", line 12, in find_cluster
    all_clusters = api.get_all_clusters()
  File "/usr/lib/python2.6/site-packages/cm_api/api_client.py", line 128, in get_all_clusters
    return clusters.get_all_clusters(self, view)
  File "/usr/lib/python2.6/site-packages/cm_api/endpoints/clusters.py", line 66, in get_all_clusters
    params=view and dict(view=view) or None)
  File "/usr/lib/python2.6/site-packages/cm_api/endpoints/types.py", line 139, in call
    ret = method(path, params=params)
  File "/usr/lib/python2.6/site-packages/cm_api/resource.py", line 110, in get
    return self.invoke("GET", relpath, params)
  File "/usr/lib/python2.6/site-packages/cm_api/resource.py", line 73, in invoke
    headers=headers)
  File "/usr/lib/python2.6/site-packages/cm_api/http_client.py", line 174, in execute
    raise self._exc_class(ex)
cm_api.api_client.ApiException: HTTP Error 401: basic auth failed (error 401)
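One sanity check that takes the script out of the picture (a hedged sketch; 7180 is CM's default non-TLS web/API port, 7183 with TLS, the REST API does not need to be enabled separately, and the credentials below are placeholders):

curl -u admin:'admin_password' http://localhost:7180/api/v1/clusters

If this also returns a 401, the problem is the credentials or CM's external authentication setup rather than the cm_api script itself.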
06-27-2017
11:11 AM
Folks, I am trying to import a single column from Vertica (v8.1.0-2) into HDFS or Hive using Sqoop, but it fails with the following exception:

Caused by: java.sql.SQLFeatureNotSupportedException: [Vertica][JDBC](10220) Driver not capable.
at com.vertica.exceptions.ExceptionConverter.toSQLException(Unknown Source)
at com.vertica.jdbc.common.SForwardResultSet.getBlob(Unknown Source)
at org.apache.sqoop.lib.LargeObjectLoader.readBlobRef(LargeObjectLoader.java:238)

The column consists of a JSON object:

Data_type | Type_name | Column_size | Buffer_length
-4 | Long Varbinary | 130000 | 130000

Questions:
1. Does vertica-jdbc-8.1.0-3.jar support getBlob()? It appears to, when I disassemble the class:
javap SForwardResultSet.class | grep -i getblob
public java.sql.Blob getBlob(int) throws java.sql.SQLException;
public java.sql.Blob getBlob(java.lang.String) throws java.sql.SQLException;
2. What is the parameter passed to getBlob()? I looked at the actual Java code generated by Sqoop:
public void loadLargeObjects(LargeObjectLoader __loader) throws SQLException, IOException, InterruptedException {
  this.___raw__ = __loader.readBlobRef(1, this.__cur_result_set);
}
where __cur_result_set is a (non-string) ResultSet.
3. What could be a workaround to import the column?

Other things I have tried:
- able to "select <column>" on the Vertica table using Sqoop eval with no problems
- able to import another Vertica table into HDFS using Sqoop
- able to view the column with DbVisualizer using the Vertica JDBC driver

Thanks in advance. Cross-posted on the Cloudera forums, Stack Exchange, and the Vertica forums.
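One workaround worth trying (a hedged sketch only; I have not verified it against the Vertica driver) is to tell Sqoop to map the LONG VARBINARY column to a Java String so the generated code never calls getBlob(). The connection details, table name, and the column name "raw" below are placeholders:

sqoop import \
  --connect jdbc:vertica://vertica-host:5433/dbname \
  --driver com.vertica.jdbc.Driver \
  --username someuser --password-file /user/someuser/.vertica.pw \
  --table source_table \
  --columns raw \
  --map-column-java raw=String \
  --target-dir /user/someuser/vertica_import \
  -m 1

Whether the JSON payload survives the binary-to-string conversion would still need to be checked against the Vertica side.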
05-23-2017
07:38 AM
2 Kudos
Thanks everyone for their input. I have done some research on the topic and will share my findings.

1. Any static number is a magic number. I propose the block-count threshold to be: heap memory (in GB) x 1 million x comfort factor (say 50%).
Why? Rule of thumb: 1 GB of heap for 1M blocks, per Cloudera [1].
The actual amount of heap memory required by the namenode turns out to be much lower:
heap needed = (number of blocks + inodes (files + folders)) x object size (150-300 bytes [1,2])
For 1 million *small* files: heap needed = (1M + 1M) x 300 B = 572 MB <== much smaller than the rule of thumb.

2. A high block count may indicate both. The namenode UI states the heap capacity used. For example, http://namenode:50070/dfshealth.html#tab-overview shows:
9,847,555 files and directories, 6,827,152 blocks = 16,674,707 total filesystem object(s).
Heap Memory used 5.82 GB of 15.85 GB Heap Memory. Max Heap Memory is 15.85 GB.
** Note, the heap memory used is still higher than 16,674,707 objects x 300 bytes ≈ 4.66 GB.
To find small files, run:
hdfs fsck <path> -blocks | grep "Total blocks (validated):"
It returns something like:
Total blocks (validated): 2402 (avg. block size 325594 B) <== which is smaller than 1 MB

3. Yes, a file is small if its size < dfs.blocksize.

4.
* Each file takes a new data block on disk, though the block size is close to the file size, so the block is small.
* For every new file, an inode-type object is created (~150 B), which stresses the heap memory of the namenode.

Small files pose problems for both the namenode and the datanodes:
namenode:
- They pull the ceiling on the number of files down, as it needs to keep metadata for each file in memory.
- Restarts take a long time, as it must read the metadata of every file from a cache on local disk.
datanodes:
- A large number of small files means a large amount of random disk IO. HDFS is designed for large files and benefits from sequential reads.

[1] https://www.cloudera.com/documentation/enterprise/5-8-x/topics/admin_nn_memory_config.html
[2] https://martin.atlassian.net/wiki/pages/viewpage.action?pageId=26148906
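As a worked instance of the proposed formula (a sketch; the 16 GB heap and 50% comfort factor are illustrative values, not a recommendation):

heap_gb=16
comfort_pct=50
echo $(( heap_gb * 1000000 * comfort_pct / 100 ))   # => 8000000 blocks as a threshold for a 16 GB namenode heap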
01-25-2017
07:50 AM
1. A threshold of 500,000 or 1M seems like a "magic" number. Shouldn't it be a function of the memory of the node (Java Heap Size of DataNode in Bytes)?

Other interesting related questions:
2. What does a high block count indicate?
a. too many small files?
b. running out of capacity?
Is it (a) or (b), and how do we differentiate between the two?
3. What is a small file? A file whose size is smaller than the block size (dfs.blocksize)?
4. Does each file take a new data block on disk? Or is it the metadata associated with each new file that is the problem?
5. The effects are more GC, declining execution speed, etc. How do we "quantify" the effects of a high block count?
01-24-2017
09:28 AM
Users are getting *non-deterministic* "Memory limit exceeded" errors for Impala queries.

Impala Daemon Memory Limit: 100 GB; spill to disk is enabled. However, a query failed with the above memory error even though its Aggregate Peak Memory Usage was 125 MiB.

I explored the query profile via CM -> Impala -> Queries -> {failed query "oom=true AND stats_missing=false"}.

I would like help narrowing down the cause of the failure: inaccurate stats, congestion, HDFS disk rebalancing, or something else? Where can I find the details of the failure? /var/log/impalad and catalogd state the "Query ID" but not the failure details. For example, the impala logs stated the Query ID only: a24fb2eae077b513:45c8d35936a35e9e

impalad.w0575oslphcda11.bell.corp.bce.ca.impala.log.INFO.20170124-031735.13521:I0124 03:57:06.889070 44232 plan-fragment-executor.cc:92] Prepare(): query_id=a24fb2eae077b513:45c8d35936a35e93 instance_id=a24fb2eae077b513:45c8d35936a35e9e
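In the meantime, one hedged way to collect everything the daemons did log about a single query is to grep the impalad log directory for the shared prefix of the query id and its fragment instance ids (the id below is the one from my logs; the log path is the default):

QUERY_ID_PREFIX="a24fb2eae077b513"          # hi part of the query id, shared by all fragment instance ids
grep -R "$QUERY_ID_PREFIX" /var/log/impalad/ | less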