Member since: 01-31-2019
Posts: 26
Kudos Received: 7
Solutions: 4
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 7076 | 01-30-2020 08:10 AM |
 | 3605 | 08-02-2019 02:35 AM |
 | 1271 | 04-24-2019 10:07 AM |
 | 5688 | 04-24-2019 02:27 AM |
12-16-2021
05:53 AM
2 Kudos
Hello Becky, you can either reduce the "split max size" to get more mappers:
SET mapreduce.input.fileinputformat.split.maxsize=XX;
Or you can try setting the number of map tasks directly:
SET mapreduce.job.maps=XX;
For the second option, you may also need to disable the merging of map output files:
SET hive.merge.mapfiles=false;
Let me know if either solution works for you. Good luck
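As a concrete illustration (the 64 MB figure below is hypothetical, not from the original reply): with a default split size of 128 MB, halving the maximum split size roughly doubles the number of mappers over the same input:
SET mapreduce.input.fileinputformat.split.maxsize=67108864; -- 64 MB, example value only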
01-18-2021
02:50 AM
Hello, try to find your log4j.properties file (in my case /etc/hadoop/conf.cloudera.hdfs/log4j.properties) and add these two lines:
log4j.appender.RFA=org.apache.log4j.ConsoleAppender
log4j.appender.RFA.layout=org.apache.log4j.PatternLayout
Good luck
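One caveat worth adding: PatternLayout falls back to a bare "%m%n" format unless you give it a conversion pattern, so you may also want a third line along these lines (the exact pattern is just a common convention, not from the original reply):
log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n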
01-18-2021
02:33 AM
1 Kudo
Hello Michael, I had a similar issue on my CDH-based cluster, solved by a surprisingly simple workaround. First, I set the replication factor to 2 instead of 3 (an under-replicated-blocks notice should appear), ran the rebalance (by BlockPool, then by DataNode, to shuffle some blocks between DataNodes), then set the replication factor back to 3. After that I noticed some major changes. Not sure whether that will work for you, but I wanted to share my experience in case you want to try it. Good luck
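For anyone who wants to reproduce this from the command line, a minimal sketch of the same sequence (the /user/data path and the temporary factor of 2 are illustrative; on a CM-managed cluster you would normally change dfs.replication in the HDFS service configuration instead):
sudo -u hdfs hdfs dfs -setrep -w 2 /user/data   # lower existing files to 2 replicas
sudo -u hdfs hdfs balancer                      # rebalance blocks across DataNodes
sudo -u hdfs hdfs dfs -setrep -w 3 /user/data   # restore 3 replicas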
01-18-2021
02:16 AM
Hello, to get a clearer picture of what you're after, can you tell us whether you want to connect your host to an already installed cluster, or just install Hadoop on a single machine (standalone)? Separately, your server needs repositories from which to install the components: you have to configure a local repository, either on the server itself or on any other server it can reach, via a private IP for example. A sketch of such a repository file follows below.
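On a RHEL/CentOS host the repository part usually comes down to dropping a file under /etc/yum.repos.d/ that points at wherever you mirrored the packages. A minimal sketch (the host name and path are made up for illustration):
[cloudera-local]
name=Cloudera local repository
baseurl=http://repo-host.internal/cloudera/
enabled=1
gpgcheck=0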
09-21-2020
08:44 AM
I have tested the backup/restore solution, and it seems to work like a charm with Spark:
-First, check and record the table names as listed by the Kudu master (or the elected leader master in a multi-master setup) at http://Master1:8051/tables
-Download the kudu-backupX.X.jar in case you can't find it in /opt/cloudera/parcels/CDH-X.Xcdh.XX/lib/, and put it there
-For kuduMasterAddresses, put the name of your Kudu master, or the names of your three masters separated by ','
-Backup: sudo -u hdfs spark2-submit --class org.apache.kudu.backup.KuduBackup /opt/cloudera/parcels/CDH-X.Xcdh.XX/lib/kudu-backup2_2.11-1.13.0.jar --kuduMasterAddresses MASTER1(,MASTER2,..) --rootPath hdfs:///PATH_HDFS impala::DB.TABLE
-Copy: sudo -u hdfs hadoop distcp -i hdfs:///PATH_HDFS/DB.TABLE hdfs://XXX:8020/kudu_backups/
-Restore: sudo -u hdfs spark2-submit --class org.apache.kudu.backup.KuduRestore /opt/cloudera/parcels/CDH-X.Xcdh.XX/lib/kudu-backup2_2.11-1.13.0.jar --kuduMasterAddresses MASTER1(,MASTER2,..) --rootPath hdfs:///PATH_HDFS impala::DB.TABLE
-Finally, run INVALIDATE METADATA so Impala sees the restored table
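For that last step, the metadata refresh can be issued from impala-shell; a one-liner sketch (DB.TABLE stands in for your real table name):
impala-shell -q "INVALIDATE METADATA DB.TABLE"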
09-19-2020
12:42 AM
Hi @Harish19, there is a solution I'm going to test, mentioned in https://kudu.apache.org/docs/administration.html and https://docs.cloudera.com/cdp/latest/data-migration/topics/cdp-data-migration-restoring-kudu-data.html. The main idea is to create a backup with Spark, move it with distcp, then restore your backup. Good luck
03-24-2020
07:17 AM
1 Kudo
Hi Rosa, sorry, I gave up that time because it was an urgent matter, so I just took the short way and used Hive to hash my data and put it in a table, where I can run my queries later with Impala. I'll come back to it later for sure, since Hive is a bit slow with Java-based functions. I'd recommend you try C++, which is what Impala's native UDFs are written in, so it will run faster. If you come up with anything, please share it with us; otherwise I'll post my solution for sure once it's done. Best of luck, Bilal
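For reference, the Hive stopgap I describe looks roughly like this (the table and column names are invented for the example):
CREATE TABLE hashed_data AS SELECT sha2(sensitive_col, 256) AS hashed_col FROM source_data;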
01-30-2020
08:10 AM
1 Kudo
After some research, it seems that Impala does not support Hive GenericUDFs yet: https://issues.apache.org/jira/browse/IMPALA-7877 https://issues.apache.org/jira/browse/IMPALA-8369 So I'll just try to create my own function for Impala.
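For anyone who hits the same wall, a native Impala UDF is the usual way around it. Below is a minimal, untested sketch assuming OpenSSL is available and the Impala UDF development headers are installed; the file, function, and path names are mine, not from any official example:

// sha2_udf.cc - hypothetical sketch of a native Impala UDF wrapping OpenSSL's SHA-256
#include <impala_udf/udf.h>
#include <openssl/sha.h>

using namespace impala_udf;

// Returns the hex-encoded SHA-256 digest of the input, or NULL for NULL input.
StringVal ToSha256(FunctionContext* ctx, const StringVal& input) {
  if (input.is_null) return StringVal::null();
  unsigned char digest[SHA256_DIGEST_LENGTH];
  SHA256(input.ptr, input.len, digest);
  StringVal result(ctx, SHA256_DIGEST_LENGTH * 2);  // 64-character hex output
  static const char hex[] = "0123456789abcdef";
  for (int i = 0; i < SHA256_DIGEST_LENGTH; ++i) {
    result.ptr[2 * i] = hex[digest[i] >> 4];
    result.ptr[2 * i + 1] = hex[digest[i] & 0x0f];
  }
  return result;
}

Once compiled into a shared object and copied to HDFS, it would be registered with something like (the HDFS path is hypothetical):
CREATE FUNCTION my_sha256(string) RETURNS string LOCATION '/user/impala/udfs/libsha2udf.so' SYMBOL='ToSha256';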
01-29-2020
08:35 AM
Hi all, I'm trying to create a function to use in Impala. My function simply re-uses Hive's sha2() function. The creation of the function goes smoothly: create function to_sha2(string,int) returns string location 'user/hive/hive.jar' symbol='org.apache.hadoop.hive.ql.udf.generic.GenericUDFSha2' ;
But when I try to use it, it doesn't work and throws this error: select to_sha2('test',256);
Query State: EXCEPTION
Query Status: ClassCastException: class org.apache.hadoop.hive.ql.udf.generic.GenericUDFSha2
I have tried to search Hive's jar for a UDFSha2 class that doesn't contain the Generic word, but I couldn't find one. The original built-in function in Hive: sha2(string/binary, len) - Calculates the SHA-2 family of hash functions (SHA-224, SHA-256, SHA-384, and SHA-512). Other functions work normally in Impala (for example, I created a UDF from Hive's MD5 function and it worked). So my question is: do I have to create my own SHA-2 UDF, or is there a way out of my situation? Any help will be appreciated. Impala version: 2.9, Hive: 1.1.0, CDH: 5.12
Labels:
- Apache Impala
01-23-2020
10:34 AM
Works perfectly, thanks