Member since: 07-17-2017
Posts: 143
Kudos Received: 16
Solutions: 17
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1465 | 07-03-2019 02:49 AM |
| | 1657 | 04-22-2019 03:13 PM |
| | 1390 | 01-30-2019 10:21 AM |
| | 8047 | 07-25-2018 09:45 AM |
| | 7146 | 05-31-2018 10:21 AM |
04-24-2018
05:41 AM
Hi, 1- Yes, you can do it, as I described: create an external text table directly in Impala, then create a Parquet table and SELECT from the text one (the conversion is done automatically). 2- I think you can; try searching for parquet-tools. Good luck.
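As a hedged sketch of the conversion described above (the table, column, and path names here are hypothetical, not from the original thread):

```sql
-- External table over the existing text files in HDFS (hypothetical schema/path)
CREATE EXTERNAL TABLE sales_text (id INT, amount DOUBLE, sold_at TIMESTAMP)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/data/sales_text';

-- Parquet copy: Impala converts the text rows to Parquet automatically
CREATE TABLE sales_parquet STORED AS PARQUET
AS SELECT * FROM sales_text;
```

The CTAS step is what performs the TEXT => PARQUET conversion; the external text table is just a schema overlay on the files already in HDFS.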
04-24-2018
03:05 AM
1 Kudo
Hi @toamitjain, First, I advise you to create the external TEXTFILE table in Impala as well (it is faster to create it there than in Hive!): CREATE EXTERNAL TABLE table1 (columns def)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION 'hdfs_path'; 1- Your initial HDFS file is probably in text format while Parquet tables use the Parquet format, so you have to do this recreation to ensure the TEXT => PARQUET conversion (but if your HDFS files are already in Parquet format, you can create an external Parquet table directly on top of them). 2- Apart from the slowness of your queries (a Parquet table is much faster than a text table) there is no issue; the table and the HDFS files will simply be managed separately. 3- I think storing dates as timestamps is more professional, since you can then benefit from the timestamp functions. Good luck.
04-20-2018
08:33 AM
Hi all, Since my first post I have noticed that a post's view count in the forums increases on every open or refresh of the post. Sometimes a user presses F5 repeatedly to inflate the view count of his own post!! I think you need a check (by member, by IP address and/or by cookies...) before incrementing the counter. Thanks for your understanding.
04-20-2018
04:49 AM
1 Kudo
Hi all, In the CDH 5.12.0 and 5.14.2 releases (CentOS 6.9) the YARN NodeManager fails to start, crashing with SIGBUS. Here is the error message: #
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGBUS (0x7) at pc=0x00007f4d5b1aff4f, pid=20067, tid=0x00007f4d869dd700
#
# JRE version: Java(TM) SE Runtime Environment (8.0_144-b01) (build 1.8.0_144-b01)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.144-b01 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C [libleveldbjni-64-1-5336493915245210176.8+0x4af4f] snappy::RawUncompress(snappy::Source*, char*)+0x31f
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /var/run/cloudera-scm-agent/process/14104-yarn-NODEMANAGER/hs_err_pid20067.log
#
# If you would like to submit a bug report, please visit:
# http://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#
Here is the hs_err_pid20067.log file: https://ufile.io/dl8lu
JIRA link: https://issues.apache.org/jira/browse/YARN-8190
04-19-2018
04:27 AM
Take a look at this: http://community.cloudera.com/t5/Interactive-Short-cycle-SQL/Impala-ODBC-JDBC-bad-performance-rows-fetch-is-very-slow-from-a/m-p/61152#M3751 Good luck.
04-19-2018
02:07 AM
Hi @Tim Armstrong, Thank you for your reply. Here is the JVM error dump file: https://ufile.io/j0zat I have reformatted 2 servers and reinstalled CentOS 6.9 (kernel 2.6.32-696.23.1.el6.x86_64) on them, but the problem is always the same! I hope we can resolve this bug ASAP. Good luck.
04-18-2018
02:27 AM
2 Kudos
Hi all, Finally, after almost 6 months, I have found the solution! It was indeed about my 1024-row limitation remark: the row batch limitation came from the maximum value of BATCH_SIZE (1024); in the latest versions (CDH 5.14 / Impala 2.11) the effective range is now 1-65536. 1-1024: https://www.cloudera.com/documentation/enterprise/5-12-x/topics/impala_batch_size.html 1-65536: https://www.cloudera.com/documentation/enterprise/5-14-x/topics/impala_batch_size.html So when I increase it through odbc.ini with SSP_BATCH_SIZE, I can benefit from increasing the other ODBC parameters (RowsFetchedPerBlock / TSaslTransportBufSize), and the rows are fetched in seconds (~45 secs) instead of tens of minutes. Remark: I recreated the cluster at 3 different server providers and tested the connections from almost 5 others with different ODBC/JDBC releases, etc., and I always had the same slowness until this update came. I cannot understand why I am the only one who reported this big issue and why no one could answer me, knowing that it is really depressing to have a good query engine but veeeery slow row fetching! Anyway, thanks all for your replies.
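For reference, a hedged sketch of the odbc.ini settings the post above describes (the DSN name, driver path, host, and exact values are hypothetical and depend on your installation; SSP_-prefixed keys are passed through to Impala as server-side query options, so SSP_BATCH_SIZE sets BATCH_SIZE):

```ini
[Impala-DSN]
; Hypothetical DSN entry for the Cloudera Impala ODBC driver
Driver=/opt/cloudera/impalaodbc/lib/64/libclouderaimpalaodbc64.so
Host=impalad-host
Port=21050

; Server-side query option: row batch size (range 1-65536 in Impala 2.11+)
SSP_BATCH_SIZE=65536

; Client-side fetch tuning mentioned in the post
RowsFetchedPerBlock=65536
TSaslTransportBufSize=1000000
```

The key point of the post is that raising RowsFetchedPerBlock alone has no effect while the server still returns batches of at most 1024 rows; the two settings have to be increased together.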
04-18-2018
01:52 AM
Hi @JJH, I think you have to rename it using the Impala statement below: ALTER TABLE your_table RENAME TO new_table_name; Good luck.
04-18-2018
01:46 AM
Hi @Tim Armstrong, Here is the CPU info from impalad.INFO: I0417 20:54:12.845438 13375 init.cc:230] Cpu Info:
Model: Intel(R) Xeon(R) CPU E5405 @ 2.00GHz
Cores: 8
Max Possible Cores: 8
L1 Cache: 32.00 KB (Line: 64.00 B)
L2 Cache: 6.00 MB (Line: 64.00 B)
L3 Cache: 0 (Line: 0)
Hardware Supports:
ssse3
sse4_1
Numa Nodes: 1
Numa Nodes of Cores: 0->0 | 1->0 | 2->0 | 3->0 | 4->0 | 5->0 | 6->0 | 7->0 |