Member since
01-31-2019
26
Posts
7
Kudos Received
4
Solutions
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 4503 | 01-30-2020 08:10 AM |
 | 1976 | 08-02-2019 02:35 AM |
 | 770 | 04-24-2019 10:07 AM |
 | 3701 | 04-24-2019 02:27 AM |
12-16-2021
05:53 AM
2 Kudos
Hello Becky, you can either reduce the "split max size" to get more mappers:
SET mapreduce.input.fileinputformat.split.maxsize;
or you can try:
SET mapreduce.job.maps=XX;
For the second option, you may need to disable the merging of map files: hive.merge.mapfiles=false
Let me know if either solution works for you. Good luck!
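A quick sketch of both options in a Hive session; the split size, map count, table name, and query are placeholder values to adapt:

```sql
-- Option 1: a smaller max split size means more input splits, hence more mappers
-- (64 MB here, purely as an example value)
SET mapreduce.input.fileinputformat.split.maxsize=67108864;

-- Option 2: request a map count directly; merging of map files may need to be off
SET mapreduce.job.maps=20;
SET hive.merge.mapfiles=false;

-- then run your query as usual
SELECT count(*) FROM your_table;   -- your_table is a placeholder
```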
01-18-2021
02:50 AM
Hello, try to find your log4j.properties file (in my case /etc/hadoop/conf.cloudera.hdfs/log4j.properties) and add these two lines: log4j.appender.RFA=org.apache.log4j.ConsoleAppender and log4j.appender.RFA.layout=org.apache.log4j.PatternLayout. Good luck!
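A minimal sketch of that change from the shell, assuming the file path mentioned above; back the file up first and adjust the path to your own deployment:

```bash
# Keep a backup of the original configuration
cp /etc/hadoop/conf.cloudera.hdfs/log4j.properties /etc/hadoop/conf.cloudera.hdfs/log4j.properties.bak

# Append the two appender lines from the post
cat >> /etc/hadoop/conf.cloudera.hdfs/log4j.properties <<'EOF'
log4j.appender.RFA=org.apache.log4j.ConsoleAppender
log4j.appender.RFA.layout=org.apache.log4j.PatternLayout
EOF
```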
01-18-2021
02:33 AM
1 Kudo
Hello Michael, I had a similar issue with my CDH-based cluster and solved it with a surprisingly simple workaround. First I changed the replication factor from 3 to 2 (an "under-replicated blocks" notice should appear), then ran the rebalance (by Blockpool, then by DataNode, to shuffle blocks between DataNodes), and finally set the replication factor back to 3; after that I noticed some major changes. Not sure whether that will work for you, but I wanted to share my experience in case you want to try it. Good luck!
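A hedged CLI alternative to the Cloudera Manager steps above; the path and balancer threshold are placeholders, and the commands run as the hdfs superuser:

```bash
sudo -u hdfs hdfs dfs -setrep 2 /        # temporarily lower replication to 2
sudo -u hdfs hdfs balancer -threshold 5  # shuffle blocks between DataNodes
sudo -u hdfs hdfs dfs -setrep 3 /        # restore the replication factor to 3
```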
01-18-2021
02:16 AM
Hello, to get a clearer picture of what you're after: do you want to connect your host to an already installed cluster, or do you just want to install Hadoop on a single machine (standalone)? On another note, your server needs repositories to install the components from (you have to configure a local repository on your server, or on any other server it can reach, for example over a private IP).
09-21-2020
08:44 AM
I have tested the backup/restore solution and it seems to work like a charm with Spark (see the consolidated sketch after this list):
- First, check and record the table names as listed by the Kudu master (or by the elected leader master in a multi-master setup): http://Master1:8051/tables
- Download the kudu-backupX.X.jar if you can't find it in /opt/cloudera/parcels/CDH-X.Xcdh.XX/lib/, and put it there.
- In kuduMasterAddresses, put the name of your Kudu master, or the names of your three masters separated by ','.
- Backup: sudo -u hdfs spark2-submit --class org.apache.kudu.backup.KuduBackup /opt/cloudera/parcels/CDH-X.Xcdh.XX/lib/kudu-backup2_2.11-1.13.0.jar --kuduMasterAddresses MASTER1(,MASTER2,..) --rootPath hdfs:///PATH_HDFS impala::DB.TABLE
- Copy: sudo -u hdfs hadoop distcp -i hdfs:///PATH_HDFS/DB.TABLE hdfs://XXX:8020/kudu_backups/
- Restore: sudo -u hdfs spark2-submit --class org.apache.kudu.backup.KuduRestore /opt/cloudera/parcels/CDH-X.Xcdh.XX/lib/kudu-backup2_2.11-1.13.0.jar --kuduMasterAddresses MASTER1(,MASTER2,..) --rootPath hdfs:///PATH_HDFS impala::DB.TABLE
- Finally, run INVALIDATE METADATA.
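The same steps, collected into one hedged shell sketch; the master names, HDFS paths, parcel/jar versions, destination NameNode, and DB.TABLE are placeholders you need to adapt:

```bash
# 1. Back up the Kudu table to HDFS (run on the source cluster)
sudo -u hdfs spark2-submit --class org.apache.kudu.backup.KuduBackup \
  /opt/cloudera/parcels/CDH-X.Xcdh.XX/lib/kudu-backup2_2.11-1.13.0.jar \
  --kuduMasterAddresses MASTER1,MASTER2,MASTER3 \
  --rootPath hdfs:///PATH_HDFS \
  impala::DB.TABLE

# 2. Copy the backup to the destination cluster with distcp
sudo -u hdfs hadoop distcp -i hdfs:///PATH_HDFS/DB.TABLE hdfs://XXX:8020/kudu_backups/

# 3. Restore the table on the destination cluster
sudo -u hdfs spark2-submit --class org.apache.kudu.backup.KuduRestore \
  /opt/cloudera/parcels/CDH-X.Xcdh.XX/lib/kudu-backup2_2.11-1.13.0.jar \
  --kuduMasterAddresses MASTER1,MASTER2,MASTER3 \
  --rootPath hdfs:///PATH_HDFS \
  impala::DB.TABLE

# 4. Refresh Impala's view of the restored table
impala-shell -q "INVALIDATE METADATA DB.TABLE;"
```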
09-19-2020
12:42 AM
Hi @Harish19, there is a solution I'm going to test, described in https://kudu.apache.org/docs/administration.html and https://docs.cloudera.com/cdp/latest/data-migration/topics/cdp-data-migration-restoring-kudu-data.html. The main idea is to create a backup with Spark, move it with distcp, then restore the backup. Good luck!
03-24-2020
07:17 AM
1 Kudo
Hi Rosa, sorry, I gave up at the time because it was an urgent matter, so I took the short way and used Hive to hash my data and write it to a table, which I can then query with Impala. I'll come back to it later for sure, since Hive is a bit slow with Java-based functions. I'd recommend you try it in C; that is suitable for Impala and will run faster. So if you come up with anything, please share it with us; otherwise I'll post my solution once it's done. Best of luck, Bilal
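A minimal sketch of that Hive-side workaround; the database, table, and column names are hypothetical:

```sql
-- Hash the sensitive column with Hive's built-in sha2(), write the result to a new
-- table, then query that table from Impala (after an INVALIDATE METADATA).
CREATE TABLE db.customers_hashed STORED AS PARQUET AS
SELECT
  sha2(customer_id, 256) AS customer_id_sha2,  -- SHA-256 hash
  purchase_date,
  amount
FROM db.customers;
```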
03-11-2020
02:50 AM
Hello, how about replacing "http://ip-10-189-107-50.eu-west-1.compute.internal" with the IP address/domain of the host that runs the ResourceManager role?
01-30-2020
08:10 AM
1 Kudo
After some research, it seems that Impala does not support GenericUDFs yet: https://issues.apache.org/jira/browse/IMPALA-7877 https://issues.apache.org/jira/browse/IMPALA-8369 So I'll just create my own function for Impala.
01-29-2020
08:35 AM
Hi all, I'm trying to create a function to use in Impala. My function simply reuses Hive's sha2() function. The creation of the function goes smoothly: create function to_sha2(string,int) returns string location 'user/hive/hive.jar' symbol='org.apache.hadoop.hive.ql.udf.generic.GenericUDFSha2';
But when I try to use it, it doesn't work and raises this warning: select to_sha2('test',256);
Query State: EXCEPTION
Query Status: ClassCastException: class org.apache.hadoop.hive.ql.udf.generic.GenericUDFSha2
I have tried to search Hive's jar for a UDFSha2 class that doesn't contain the Generic prefix, but I couldn't find one. The original built-in function in Hive: sha2(string/binary, len) - Calculates the SHA-2 family of hash functions (SHA-224, SHA-256, SHA-384, and SHA-512). The other functions work normally in Impala (for example, I created a UDF from Hive's MD5 function and it worked). So my question is: do I have to create my own SHA-2 UDF, or is there a way out of this situation? Any help would be appreciated. Impala version: 2.9, Hive: 1.1.0, CDH: 5.12
Labels:
- Apache Impala
01-23-2020
10:34 AM
Works perfectly, thanks!
08-06-2019
03:22 AM
Hi @eMazarakis, as far as I know, Impala does not support HDFS impersonation (for security reasons, I guess), which means you can't delete HDFS files as the Hue user. Cheers!
08-05-2019
04:34 AM
Hi @Harish19, check here: https://community.cloudera.com/t5/forums/forumtopicprintpage/board-id/Questions/message-id/13133/print-single-message/true/page/1
08-05-2019
03:17 AM
@Amritha, from what I've checked in the Kudu manuals, Chrony is not fully tested with Kudu for network time synchronization: "Kudu releases are only tested with NTP. Other time synchronization providers like Chrony may or may not work." So you can either install NTP, or try this tip I've found: "In order to use chrony for synchronization (for KUDU), chrony.conf must be configured with the rtcsync option". Good luck!
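A minimal sketch of that tip; the config path and service name follow CentOS defaults and may differ on your system:

```bash
# Add the rtcsync option to chrony.conf if it isn't already there, then restart chrony
grep -q '^rtcsync' /etc/chrony.conf || echo 'rtcsync' >> /etc/chrony.conf
service chronyd restart
```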
08-02-2019
02:35 AM
Hello @GopiG, have you tried setting the executor's and the driver's params in spark-defaults.conf? spark.driver.extraJavaOptions -Duser.timezone=UTC spark.executor.extraJavaOptions -Duser.timezone=UTC You can set the default time zone to UTC or to anything else you want, like GMT+8, etc. Cheers.
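If you'd rather not touch spark-defaults.conf, the same options can be passed per job; a hedged sketch, where the class name and application jar are placeholders:

```bash
spark2-submit \
  --conf "spark.driver.extraJavaOptions=-Duser.timezone=UTC" \
  --conf "spark.executor.extraJavaOptions=-Duser.timezone=UTC" \
  --class com.example.MyApp my-app.jar   # class and jar are placeholders
```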
08-02-2019
02:20 AM
Hi @Amritha, when you run the command ntpq -np (NTP should already be installed, I guess), what result does it give? Greetings.
05-06-2019
06:06 AM
Hi @MartinP, one way to do it is to build a new cluster on your CentOS platform, then copy all your data to the new cluster as described here: https://www.cloudera.com/documentation/enterprise/5-5-x/topics/cdh_admin_distcp_data_cluster_migrate.html Just pay attention to the source and destination CDH distributions, i.e. whether they are supported for the migration or not. Cheers,
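A hedged sketch of the copy step from that guide; the NameNode hostnames and path are placeholders, and the command is typically run from the destination cluster:

```bash
sudo -u hdfs hadoop distcp hdfs://source-nn:8020/user/data hdfs://dest-nn:8020/user/data
```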
04-25-2019
01:40 AM
Hi @Harish19, can you tell us which versions of Java, Spark, and CDH you're running? Otherwise, try copying spark-assembly-XXX.jar to HDFS (copy from local to HDFS), then add this parameter to spark-defaults.conf: spark.yarn.jars hdfs://IP/spark/spark-assembly-XXX.jar and don't forget to restart YARN. I hope this will work for you,
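A hedged sketch of the upload step; the local jar path below is an assumption, so adjust it to your CDH parcel layout and Spark version:

```bash
# Put the Spark assembly on HDFS so YARN containers can fetch it
sudo -u hdfs hdfs dfs -mkdir -p /spark
sudo -u hdfs hdfs dfs -put /opt/cloudera/parcels/CDH/lib/spark/lib/spark-assembly.jar /spark/
# Then add the spark-defaults.conf line from the post above and restart YARN.
```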
04-24-2019
10:07 AM
1 Kudo
Hi @DataMike, the Balancer role is normally added by default when the HDFS service is installed, so the Balancer usually resides on your NameNode. To make sure where it's assigned, go to HDFS -> Instances and check the Role Type column; you'll find the 'Balancer' role assigned to a host (usually the NameNode). For your second question, I think it's better to use the NameNode, just to keep the architecture simple, since we're talking about checking all the other DataNodes, moving blocks, and so on.
04-24-2019
02:27 AM
@bgooley, thank you for your feedback and your clear explanation. In fact, the problem was resolved by removing the contents of the /var/lib/hadoop-yarn/yarn-nm-recovery/ directory, after which the NodeManager role started successfully. The solution I found came from: https://community.cloudera.com/t5/Batch-Processing-and-Workflow/Yarn-NodeManager-fails-to-start-and-crashing-with-SIGBUS/m-p/66590#M3611
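A minimal sketch of that fix for the affected host; the directory ownership shown is an assumption, so match whatever the original directory used:

```bash
# Move the NodeManager recovery state aside instead of deleting it outright
mv /var/lib/hadoop-yarn/yarn-nm-recovery /var/lib/hadoop-yarn/yarn-nm-recovery.bak
mkdir -p /var/lib/hadoop-yarn/yarn-nm-recovery
chown yarn:hadoop /var/lib/hadoop-yarn/yarn-nm-recovery
# Then start the NodeManager role again from Cloudera Manager.
```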
04-23-2019
10:57 AM
Hello folks, the NodeManager has suddenly stopped on one instance (while it is still running on the other nodes/instances). When I try to start/restart it via Cloudera Manager, an error is shown at the first step: "Failed to start role." I'm using CentOS release 6.10 (Final). What do you suggest I look at or check in order to resolve this problem? Here's my stdout log:
Tue Apr 23 10:18:56 PDT 2019
JAVA_HOME=/usr/java/jdk.1.8.0_144
using /usr/java/jdk.1.8.0_144 as JAVA_HOME
using 5 as CDH_VERSION
using /opt/cloudera/parcels/CDH-5.14.2-1.cdh5.14.2.p0.3/lib/hadoop-yarn as CDH_YARN_HOME
using /opt/cloudera/parcels/CDH-5.14.2-1.cdh5.14.2.p0.3/lib/hadoop-mapreduce as CDH_MR2_HOME
using /var/run/cloudera-scm-agent/process/23960-yarn-NODEMANAGER as CONF_DIR
CONF_DIR=/var/run/cloudera-scm-agent/process/23960-yarn-NODEMANAGER
CMF_CONF_DIR=/etc/cloudera-scm-agent
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGBUS (0x7) at pc=0x00007f8c1fde51a1, pid=3004, tid=0x00007f8c4f44c700
#
# JRE version: Java(TM) SE Runtime Environment (8.0_144-b01) (build 1.8.0_144-b01)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.144-b01 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C [libleveldbjni-64-1-8170950501904951615.8+0x491a1] leveldb::ReadBlock(leveldb::RandomAccessFile*, leveldb::ReadOptions const&, leveldb::BlockHandle const&, leveldb::BlockContents*)+0x191
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
And this is my log.out error:
NodeManager Node Manager health check script is not available or doesn't have execute permission, so not starting the node health script runner.
AsyncDispatcher Registering class org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerEventType for class org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ContainerEventDispatcher
AsyncDispatcher Registering class org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationEventType for class org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher
AsyncDispatcher Registering class org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.event.LocalizationEventType for class org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService
AsyncDispatcher Registering class org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServicesEventType for class org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices
AsyncDispatcher Registering class org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorEventType for class org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl
AsyncDispatcher Registering class org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncherEventType for class org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher
Labels:
- Apache YARN
02-07-2019
03:55 AM
In my case, I was just working on the Cloudera VM and had to configure the node IP: ifconfig eth1:2 LOCAL_NODE_IP netmask XXXX After that, pinging that IP works fine. Thanks a lot.