Member since: 11-03-2014
Posts: 46
Kudos Received: 8
Solutions: 7
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 6074 | 11-01-2016 11:02 PM
 | 2061 | 01-05-2016 07:27 PM
 | 3664 | 05-27-2015 09:15 PM
 | 1683 | 04-01-2015 11:39 PM
 | 3678 | 03-17-2015 09:31 PM
08-02-2017
01:41 AM
1 Kudo
CDH 5.10.1, CentOS 7.3. I need to move the cluster to new servers and change the IP addresses (all nodes including CM). Is there any known procedure to do so? I thought of two possible methods:

1. Clone the servers, hunt for the old IP addresses in configuration files / the SCM DB, and update them to the new ones (see the sketch below).
2. Reinstall CM on a new server, add new CDH nodes and remove the old ones, and let HDFS copy the data (our data volume is still small enough for this to be feasible).

Any better suggestion?
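For method 1, a minimal sketch of the hunt-and-update step on each cloned node (the addresses 10.0.0.1 / 10.0.1.1 are placeholders for the old and new CM host; only the agent's server_host setting is shown here, and the SCM DB would need its own pass):

# find leftover references to the old CM address
grep -rn '10.0.0.1' /etc/cloudera-scm-agent/config.ini /etc/hadoop/conf 2>/dev/null
# point the agent at the relocated CM server
sed -i 's/^server_host=.*/server_host=10.0.1.1/' /etc/cloudera-scm-agent/config.ini
systemctl restart cloudera-scm-agent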
Labels: Cloudera Manager
07-02-2017
07:21 PM
We tried the CDH 5.11 impala-udf-devel RPM and it seemed to work. Is it OK to use it with 5.10? Or is it recommended to revert to the 5.8 RPM?
06-29-2017
09:18 PM
CentOS 7.3 (C++ compiler: gcc-c++-4.8.5-11.el7.x86_64), CDH 5.10.1 installed with parcels. I am trying to compile the sample UDF and getting the following error:

Linking CXX executable build/udf-sample-test
/bin/ld: /opt/cloudera/parcels/CDH/lib64/libImpalaUdf-retail.a(udf.cc.o)(.text+0x3): unresolvable R_X86_64_NONE relocation against symbol `_ZNSs4_Rep20_S_empty_rep_storageE@@GLIBCXX_3.4'
/bin/ld: final link failed: Nonrepresentable section on output
collect2: error: ld returned 1 exit status
make[2]: *** [build/udf-sample-test] Error 1
make[1]: *** [CMakeFiles/udf-sample-test.dir/all] Error 2
make: *** [all] Error 2

(I don't have libImpalaUdf.a. For my parcel-based installation, I have libImpalaUdf-retail.a and libImpalaUdf-debug.a. If I use the debug one, the unresolvable symbol becomes `_ZSt4cerr@@GLIBCXX_3.4'.)

In a question against CDH 5.9.0, it was suggested to use an older UDF SDK for CDH 5.8.4. This seems a bit old now. Is there any pre-compiled libImpalaUdf-retail.a for CDH 5.10.1 which fixes this problem?
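For anyone comparing environments, one way to narrow this down is to dump the failing symbol from the shipped archive and note the local toolchain versions (a sketch; the archive path is the one from the error above):

# look for the unresolvable libstdc++ symbol inside the static archive
nm /opt/cloudera/parcels/CDH/lib64/libImpalaUdf-retail.a | grep _ZNSs4_Rep20_S_empty_rep_storageE
# record the local compiler and linker versions for comparison
g++ --version
ld --version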
Labels: Apache Impala
04-19-2017
09:35 PM
CDH 5.2.0 and 5.10.1. It seems partition pruning is not working when using "or" with conditions on both partitioned and un-partitioned columns. Create a partitioned table with 3 partitions, 1 file for each partition:

create table part_table (
strcol1 string,
intcol1 int
)
PARTITIONED BY (
keycol STRING
);
insert into part_table partition (keycol='k1') values ('str1', 1);
insert into part_table partition (keycol='k2') values ('str2', 2);
insert into part_table partition (keycol='k3') values ('str3', 3);

Partition pruning worked normally when no un-partitioned columns were used:

explain select count(*)
from part_table
where (
(keycol='k1')
or (keycol='k2' )
)
+-----------------------------------------------------+
| Explain String |
+-----------------------------------------------------+
| Estimated Per-Host Requirements: Memory=0B VCores=0 |
| |
| PLAN-ROOT SINK |
| | |
| 01:AGGREGATE [FINALIZE] |
| | output: count(*) |
| | |
| 00:SCAN HDFS [cdragg.part_table] |
| partitions=2/3 files=2 size=14B |
+-----------------------------------------------------+

When un-partitioned columns are used in the conditions, no partition pruning is performed:

explain select count(*)
from part_table
where (
(keycol='k0' and intcol1=0)
or (keycol='k2' and intcol1=2)
)
+-------------------------------------------------------------------------------------+
| Explain String |
+-------------------------------------------------------------------------------------+
| Estimated Per-Host Requirements: Memory=0B VCores=0 |
| |
| PLAN-ROOT SINK |
| | |
| 01:AGGREGATE [FINALIZE] |
| | output: count(*) |
| | |
| 00:SCAN HDFS [cdragg.part_table] |
| partitions=3/3 files=3 size=21B |
| predicates: ((keycol = 'k0' AND intcol1 = 0) OR (keycol = 'k2' AND intcol1 = 2)) |
+-------------------------------------------------------------------------------------+

I can work around the situation with another condition on the partitioned column. Is this the proper way to do it?

explain select count(*)
from part_table
where (
(keycol='k0' and intcol1=0)
or (keycol='k2' and intcol1=2)
)
and keycol in ('k0', 'k2')
+-------------------------------------------------------------------------------------+
| Explain String |
+-------------------------------------------------------------------------------------+
| Estimated Per-Host Requirements: Memory=0B VCores=0 |
| |
| PLAN-ROOT SINK |
| | |
| 01:AGGREGATE [FINALIZE] |
| | output: count(*) |
| | |
| 00:SCAN HDFS [cdragg.part_table] |
| partitions=1/3 files=1 size=7B |
| predicates: ((keycol = 'k0' AND intcol1 = 0) OR (keycol = 'k2' AND intcol1 = 2)) |
+-------------------------------------------------------------------------------------+
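For quick iteration while testing such workarounds, the pruning result can be checked from the shell by grepping the plan (a sketch; impala-shell is the standard CDH client, and the grep target matches the "partitions=" lines in the plans above):

impala-shell -q "explain select count(*) from part_table where ((keycol='k0' and intcol1=0) or (keycol='k2' and intcol1=2)) and keycol in ('k0','k2')" | grep partitions=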
Labels: Apache Impala
04-05-2017
07:28 PM
1 Kudo
The cause of my case was described in Messages 4-5 of the thread. Here are some possible solutions:

- Set spark.local.dir to somewhere outside /tmp. Refer to Spark Configuration for how to configure the value (a sketch follows this list).
- Disable housekeeping of /tmp/spark-...
- Periodically restart your Spark Streaming job.
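A minimal sketch of the first option, assuming /var/spark/local is an example path you would substitute with a suitably sized volume:

# in spark-defaults.conf on each node
spark.local.dir /var/spark/local

(Note that in some deployment modes the cluster manager's own setting, e.g. YARN's local dirs, takes precedence over spark.local.dir.)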
11-01-2016
11:02 PM
Answering my own question... The source code of org.apache.hadoop.hdfs.server.blockmanagement.BlockManager says:

...
if (numCurrentReplica > expectedReplication) {
if (num.replicasOnStaleNodes() > 0) {
// If any of the replicas of this block are on nodes that are
// considered "stale", then these replicas may in fact have
// already been deleted. So, we cannot safely act on the
// over-replication until a later point in time, when
// the "stale" nodes have block reported.
return MisReplicationResult.POSTPONE;
}
...

So the key point is whether the DataNodes are "stale". I don't know how to force the nodes to send a block report besides restarting, so I restarted all DataNodes and the over-replicated blocks were gone.
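For reference, later Hadoop releases (2.7+, so newer than the CDH 5.1.2 used here) added an admin command to trigger a block report without a restart; to the best of my knowledge the invocation is:

hdfs dfsadmin -triggerBlockReport <datanode_host:ipc_port>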
11-01-2016
09:20 PM
The setrep command has just completed. However, fsck is still showing over-replication.
11-01-2016
09:09 PM
CentOS 6.6, CDH 5.1.2. Due to space pressure, I need to reduce the replication factor of existing files from 3 to 2. A command like the following was executed:

[hdfs]$ hdfs dfs -setrep -R -w 2 /path/of/files

A warning that "waiting time may be long for DECREASING the number of replications" appeared. I am still waiting after tens of minutes, and fsck is still showing over-replication.

[hdfs]$ hdfs fsck /path/of/files
16/11/02 12:04:42 WARN ssl.FileBasedKeyStoresFactory: The property 'ssl.client.truststore.location' has not been set, no TrustStore will be loaded
Connecting to namenode via http://namenode1:50070
FSCK started by hdfs (auth:SIMPLE) from /192.168.88.38 for path /path/of/files at Wed Nov 02 12:04:43 HKT 2016
....Status: HEALTHY
Total size: 129643323 B
Total dirs: 1
Total files: 4
Total symlinks: 0
Total blocks (validated): 4 (avg. block size 32410830 B)
Minimally replicated blocks: 4 (100.0 %)
Over-replicated blocks: 4 (75.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 3
Average block replication: 3
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 6
Number of racks: 1
FSCK ended at Wed Nov 02 12:04:43 HKT 2016 in 1 milliseconds
The filesystem under path '/path/of/files' is HEALTHY

Is this normal? How long should the wait be?
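In case it helps reproduce the check, the target replication factor versus the actual per-block replica count can be compared like this (a sketch using the same example path):

[hdfs]$ hdfs dfs -ls /path/of/files               # the second column is each file's target replication factor
[hdfs]$ hdfs fsck /path/of/files -files -blocks   # prints repl=... for every block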
Labels: HDFS
10-04-2016
11:58 PM
CentOS 6.6, CDH 5.1.2. I would like to take down a DataNode temporarily (say, for 24 hours). Some questions:

1. For normally replicated blocks (target replication factor = 3), can I stop HDFS from automatically re-replicating those blocks?
2. For un-replicated blocks (replication factor = 1), can I do anything to pre-relocate those blocks in case they are on the DataNode to be taken down? (A sketch of one idea follows.)

I understand I risk data loss, but those were not critical data anyway. Thanks.
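For the second point, one approach that stays within plain HDFS semantics is to give those files a second replica before the downtime, so a copy survives on another node (a sketch; the path is an example):

[hdfs]$ hdfs dfs -setrep -w 2 /path/of/replication-factor-1/files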
Labels: HDFS
05-30-2016
07:18 PM
I can't find spark.local.dir in either the Job History UI (it got an OutOfMemoryException and all job history was gone after a restart) or the Application UI. However, according to the documentation, spark.local.dir is /tmp by default, and the jar files are found in /tmp/spark-.../. So the FileNotFoundException is likely caused by housekeeping of /tmp.
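A quick check on a worker node (a sketch; the glob matches the per-application directories mentioned above, and tmpwatch is the usual /tmp cleaner on CentOS 6):

$ ls -ld /tmp/spark-*/
$ cat /etc/cron.daily/tmpwatch   # see how aggressively /tmp is being cleaned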