Member since: 01-16-2018
Posts: 613
Kudos Received: 48
Solutions: 109
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 275 | 04-08-2025 06:48 AM |
| | 429 | 04-01-2025 07:20 AM |
| | 348 | 04-01-2025 07:15 AM |
| | 758 | 05-06-2024 06:09 AM |
| | 1133 | 05-06-2024 06:00 AM |
03-13-2021
11:57 PM
Hello @CaptainJa, thanks for the update. Based on your review, audit events from the "hadoop-acl" enforcer show up in the Ranger Audit UI with a delay, while other audit events appear immediately. As far as I know, the audit framework from any service to Solr is the same, which supports your suspicion that the "hadoop-acl" events are being buffered before they are sent to Solr for indexing. I am not currently aware of any configuration that controls this. Could you confirm whether the HDFS audit logs or the Infra Solr logs report any issues that might point to the cause? I had assumed Solr was the bottleneck behind the Ranger audit lag, yet the delay appears to affect "hadoop-acl" events alone. - Smarak
02-11-2021
04:43 AM
Hi @Aco, yes, you can create it manually. Check the ZooKeeper documentation on how to create those directories. This happens because of insufficient permissions. You need to create the directories from the ZooKeeper CLI and set the appropriate permissions; ZooKeeper will take care of creating the rest of the data and files. I will see if I can find the exact steps to share with you.
02-09-2021
08:16 PM
@smdas Yes, we are checking the size of the table within the table-level directory:

hdfs dfs -du -h -s /apps/hbase/data/mobdir/data/OBST/DOCUMENT_CONTENT/b8e8a5fee4eXX/cfDocContent

This is the table structure:

create 'MOB:TEST',
{NAME => 'cfDocContent', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'FAST_DIFF', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', IS_MOB => 'true', COMPRESSION => 'SNAPPY', BLOCKCACHE => 'true', BLOCKSIZE => '65536'},
{NAME => 'cfMetadata', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'FAST_DIFF', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'SNAPPY', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}

Satya
02-03-2021
04:53 PM
One thing I noticed today, in case it helps with this issue.

Today I tried the sqoop from MSQL -> HBase again on a new table, with compression set and pre-split, in both the Cloudera 5.15.1 and Cloudera 6.2.1 environments. The HBase configuration (and the HDFS configuration, for that matter) is almost identical.

In the Cloudera 6.2.1 (i.e. HBase 2.1.2) environment, I see the flush to the HStoreFile happen fairly quickly (only about 32,000 new entries), and the logs mention 'Adding a new HDFS file' of size 321 KB.

In the Cloudera 5.15.1 (i.e. HBase 1.2.x) environment, the flush to the HStoreFile takes longer, about 700,000 entries are flushed, and the 'Adding a new HDFS file' is of size 6.5 MB.

The memstore flush size is set to 128 MB in both environments and the region servers have 24 GB available, so I think it is hitting the 0.4 heap factor for memstores and then flushing in both cases. Also, only a few tables have heavy writes and most of the other tables are fairly idle, so I don't think they take up much memstore space.

In the Cloudera 6.2.1 environment each server holds about 240 regions. In the Cloudera 5.15.1 environment each server holds about 120 regions.

My thinking is that if I can get the Cloudera 6.2.1/HBase 2.1.2 memstore flush to happen with a similar size and number of entries as in the Cloudera 5.15.1 environment, the performance issue for large writes would be solved. I am just not sure how to make that happen.

I also noticed that minor compactions happen in both environments and take a similar amount of time, so I think that's not an issue.

Richard
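The memstore arithmetic behind the reasoning above can be sketched roughly as follows. This is a minimal illustration, not a diagnosis: it assumes the 24 GB figure is the RegionServer heap, that the 0.4 factor is the global memstore fraction (hbase.regionserver.global.memstore.size), and that writes are spread evenly across all regions, which the post itself says is not the case. The function name memstore_budget is hypothetical.

```python
# Sketch only: assumes 24 GB is the RegionServer heap and that writes are
# spread evenly over all regions (the post says they are not).

def memstore_budget(heap_gb, global_fraction, regions):
    """Return (global memstore limit in MB, even per-region share in MB)."""
    global_mb = heap_gb * 1024 * global_fraction
    return global_mb, global_mb / regions

FLUSH_SIZE_MB = 128  # memstore flush size quoted in the post

for label, regions in (("CDH 6.2.1", 240), ("CDH 5.15.1", 120)):
    global_mb, share_mb = memstore_budget(24, 0.4, regions)
    print(f"{label}: global limit {global_mb:.0f} MB, "
          f"even share per region {share_mb:.1f} MB "
          f"(flush size {FLUSH_SIZE_MB} MB)")
```

Under these assumptions the even per-region share is about 41 MB with 240 regions and about 82 MB with 120 regions, both below the 128 MB flush size. That would be consistent with flushes being driven by global memstore pressure rather than by the per-region flush size, although it does not by itself explain the much smaller flushes observed on 6.2.1.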
02-03-2021
08:41 AM
The disk space occupied by a deleted row is reclaimable only via compaction. Given that you have deleted some data and the space has not been reclaimed, you are probably hitting the bug https://issues.apache.org/jira/browse/KUDU-1625. The JIRA remains unresolved. However, if the goal is to delete the data and reclaim disk space, you can drop the partition (if the table is range partitioned). Tombstoned tablets have all their data removed from disk and don't consume significant resources. These tablets are necessary for the correct operation of Kudu. See https://docs.cloudera.com/runtime/7.1.0/troubleshooting-kudu/topics/kudu-tombstoned-or-stopped-tablet-replicas.html
02-02-2021
07:26 PM
Hey @smdas, thanks for your feedback. I will rephrase it for better understanding.

Hive table name = hive_t1
External Hive HBase table name = hb_t2
Both currently hold identical data, 100 rows each.

Scenario 1:
select t1.c1, t1.c2 from hive_t1 t1 left join (select KEY, c1 from hb_t2 where KEY like 'abc%') t2 on (t1.c1 = t2.c1);
Here the values I get from the HBase external table are null, but the expected result should have matching values.

While analysing, I also found the following.

Scenario 2:
select KEY, c1 from hb_t2 where KEY like 'abc%'; => returns 0 rows (no result)
But if I run:
select * from hb_t2 where KEY like 'abc%'; => I can see all 100 rows.
I am not able to understand this behaviour.

-Shivam
02-02-2021
12:05 PM
Hello @rajatsachan, thanks for sharing the steps you used to resolve the issue. This will definitely help fellow Community members facing similar issues. If you have no further concerns, kindly mark the post as Solved as well. Thanks, Smarak
01-27-2021
10:14 PM
1 Kudo
Hello @snm1523, as the post was resolved, I am marking it as Solved. In future, kindly mark the post as Solved so that other Community users can reference it for similar issues. Thanks for using Cloudera Community. - Smarak
01-18-2021
08:50 AM
Solved. Thank you
01-10-2021
10:22 PM
Hello @ShamsN, kindly update the post if you have solved the issue. If you continue to face the issue, let us know and we can assist you. We requested additional details based on your post on 12/16. - Smarak