Member since: 04-21-2015
Posts: 49
Kudos Received: 2
Solutions: 1
My Accepted Solutions
Title | Views | Posted |
---|---|---|
| 6345 | 08-24-2015 04:51 AM |
05-09-2016
08:59 AM
1 Kudo
Hello, sorry for the delayed update. Running invalidate metadata tablename and then refresh tablename solved my problem. The source parquet tables and the gzipped target tables now have the same record counts in their partitions. I am still getting the "split into multiple hdfs-blocks" warnings, but it looks like they have no impact on my record count issue. BTW: the link that you provided is very good. Thanks for your response
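For reference, a minimal sketch of the two commands mentioned above, with tablename as a placeholder for the real table name:

-- tell Impala to discard and reload its cached metadata for the table
INVALIDATE METADATA tablename;
-- pick up data files and partitions added outside of Impala
REFRESH tablename;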
05-09-2016
07:49 AM
Hello, I am sorry but I could not find the change password section. How can I change my password? Actually, I also want to change my security question. Thanks
04-11-2016
12:17 AM
Hello, I am trying to import parquet tables from another Cloudera Impala implementation into my Cloudera Impala:

--> I receive the parquet tables via sftp.
--> I copy all parquet files into the proper Impala table directory, like /grid1/hive/warehouse/<database>/<importedTable>, without any error/warning.
--> I create the required partition structure with alter table <importedTable> add partition (..) without any error/warning.
--> I run the refresh <importedTable> command without any error/warning.
--> I can see the new partitions in the show partitions <importedTable> output without any error/warning.
--> I apply the above procedure for all tables.
--> When I try to access records in the table, I get the following warning: "WARNINGS: Parquet files should not be split into multiple hdfs-blocks"

I am using gzip compression on my tables, but the imported tables have default settings. So I have another database with gzipped data, and I copy data from the imported table to the gzipped table with the following commands, both without any error/warning:

set compression_codec=gzip
insert into <gzippedTable> partition (part1=value1, part2=value2) select field1, field3, field4 ... from <importedTable> where partitionedColumn1=value1 and partitionedColumn2=value2

When I compare record counts for the same partition in the gzipped table and the imported table, there is a difference, as in the following output:

[host03:21000] > select count (*) from importedTable where logdate=20160401;
Query: select count (*) from importedTable where logdate=20160401
+-----------+
| count(*)  |
+-----------+
| 101565867 |
+-----------+
WARNINGS: Parquet files should not be split into multiple hdfs-blocks. file=hdfs://host01:8020/grid1/hive/warehouse/<database>/importedTable/partitionedColumn=value1/logdate=20160401/51464233716089fd-295e6694028850a0_1358598818_data.0.parq (1 of 94 similar)
Fetched 1 row(s) in 0.96s

[host03:21000] > select count (*) from gzippedTable where logdate=20160401;
Query: select count (*) from gzippedTable where logdate=20160401
+-----------+
| count(*)  |
+-----------+
| 123736525 |
+-----------+
Fetched 1 row(s) in 0.92s

So how can I fix the "WARNINGS: Parquet files should not be split into multiple hdfs-blocks" warning, and why am I getting different record counts after applying the above procedure? Are the record count differences related to the multiple hdfs-blocks warning? Thanks
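A compact sketch of the copy step described above, with hypothetical names (imported_table, gzipped_table, and logdate=20160401 as the example partition) standing in for the real ones:

-- write the target partition with gzip compression
set compression_codec=gzip;
insert into gzipped_table partition (logdate=20160401)
select field1, field3, field4
from imported_table
where logdate=20160401;
-- compare record counts for the same partition afterwards
select count(*) from imported_table where logdate=20160401;
select count(*) from gzipped_table where logdate=20160401;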
Labels:
- Apache Hive
- Apache Impala
- HDFS
01-14-2016
12:20 AM
Hello, my colleague runs big SQL statements through the HUE interface on Impala (he knows only SQL 🙂 ). Sometimes we get the following error in the HUE interface. Is this error related to Impala? If yes, is there any way to fix it?

AnalysisException: Exceeded the maximum number of child expressions (10000). Expression has 12061 children: CASE WHEN (longitude BETWEEN 28.4360479702 AND 28.4480394711) AND (latitude BETW...

The above error occurred while he was running about 12000 lines of SQL 🙂 Best regards
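For illustration only, a hypothetical fragment of the kind of geofencing CASE expression that hits this limit once it grows to thousands of WHEN branches (the locations table, columns, and zone labels are made up):

select
  case
    when (longitude between 28.4360 and 28.4480) and (latitude between 40.9800 and 40.9900) then 'zone_0001'
    when (longitude between 28.4480 and 28.4600) and (latitude between 40.9900 and 41.0000) then 'zone_0002'
    -- ...thousands of similar branches push the expression past the 10000-child limit...
    else 'unknown'
  end as zone
from locations;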
Labels:
- Apache Impala
- Cloudera Hue
10-20-2015
09:29 AM
Hello, is there any reason not to use JDBC? I could connect from SAP BO to Impala with the 32-bit Impala JDBC drivers. Thanks
09-21-2015
10:49 PM
Hello, I fixed the db connection problem. What kind of error did you get? Did you check the logs? Regards
08-24-2015
04:51 AM
Solved. I don't know whether it is the exact solution or not, but I could successfully deploy the service configuration to the clients after fixing the Hive Metastore Canary database connection problem. Thanks
08-24-2015
01:02 AM
Hello, I have a 3-node Cloudera cluster. All nodes have the following versions of the CM components:

cloudera-manager-agent.x86_64
cloudera-manager-daemons.x86_64
cloudera-manager-server.x86_64
Version: 5.4.3
Release: 1.cm543.p0.258.el6

I always get the same error for every CM cluster service when I try to deploy the client configuration:

Failed to execute command Deploy Client Configuration on service HDFS
Completed only 0/3 steps. First failure: Client configuration (id=7) on host hadoop1 (id=3) exited with 1 and expected 0.

Thanks msuluhan
Labels:
- Cloudera Manager
- HDFS
08-21-2015
02:34 AM
Hello, I changed the directory permission to 755 and then changed its ACL, and it is working now. I will restart the whole environment and then check again. Thanks for your inputs. msuluhan
08-14-2015
05:48 AM
Hello, I already tried that :S, no change, same error. I just changed the owner of the files in the /var/lib/zookeper/version2 directory to zookeeper:zookeeper, while the permission of /var/lib/zookeper is 755, but I got the same error. Any other idea? Thanks