Member since: 07-17-2017
Posts: 143
Kudos Received: 16
Solutions: 17
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1465 | 07-03-2019 02:49 AM |
| | 1657 | 04-22-2019 03:13 PM |
| | 1390 | 01-30-2019 10:21 AM |
| | 8047 | 07-25-2018 09:45 AM |
| | 7146 | 05-31-2018 10:21 AM |
04-24-2018
05:41 AM
Hi, 1- Yes, you can do it, as I described: create an external text table directly in Impala, then create a Parquet table and SELECT from the text one (the conversion is done automatically). 2- I think you can; try searching for parquet-tools. Good luck.
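As a hedged sketch of the conversion described above (the table, column, and path names here are hypothetical, not from the original thread):

```sql
-- External table over the existing text files in HDFS (hypothetical schema/path)
CREATE EXTERNAL TABLE sales_text (id INT, amount DOUBLE, sold_at TIMESTAMP)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/data/sales_text';

-- Parquet copy: Impala converts the text rows to Parquet automatically
CREATE TABLE sales_parquet STORED AS PARQUET
AS SELECT * FROM sales_text;
```

The CTAS step is what performs the TEXT => PARQUET conversion; the external text table is just a schema overlay on the files already in HDFS.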
04-24-2018
03:05 AM
1 Kudo
Hi @toamitjain, First, I advise you to create the external TEXTFILE table in Impala as well (it is faster to create it there than in Hive!): CREATE EXTERNAL TABLE table1 (columns def)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION 'hdfs_path'; 1- Your initial HDFS file is probably in text format while Parquet tables use the Parquet format, so you have to do this recreation to ensure the TEXT => PARQUET conversion (but if your HDFS files are already in Parquet format, you can create an external Parquet table directly on top of them). 2- Apart from the slowness of your queries (a Parquet table is much faster than a text table) there is no issue; the table and the HDFS files will simply be managed separately. 3- I think storing dates as timestamps is more professional, since you can then benefit from the timestamp functions. Good luck.
04-20-2018
08:33 AM
Hi all, Since my first post I have noticed that a post's view count in the forums increases on every open or refresh of the post. Sometimes a user presses F5 repeatedly to inflate the view count of his own post!! I think you need a check (by member, by IP address and/or by cookies...) before incrementing the counter. Thanks for your understanding.
04-20-2018
04:49 AM
1 Kudo
Hi all, In the CDH 5.12.0 and 5.14.2 releases (CentOS 6.9) the YARN NodeManager fails to start, crashing with SIGBUS. Here is the error message: #
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGBUS (0x7) at pc=0x00007f4d5b1aff4f, pid=20067, tid=0x00007f4d869dd700
#
# JRE version: Java(TM) SE Runtime Environment (8.0_144-b01) (build 1.8.0_144-b01)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.144-b01 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C [libleveldbjni-64-1-5336493915245210176.8+0x4af4f] snappy::RawUncompress(snappy::Source*, char*)+0x31f
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /var/run/cloudera-scm-agent/process/14104-yarn-NODEMANAGER/hs_err_pid20067.log
#
# If you would like to submit a bug report, please visit:
# http://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#
Here is the hs_err_pid20067.log file: https://ufile.io/dl8lu
JIRA link: https://issues.apache.org/jira/browse/YARN-8190
04-19-2018
04:27 AM
Take a look at this: http://community.cloudera.com/t5/Interactive-Short-cycle-SQL/Impala-ODBC-JDBC-bad-performance-rows-fetch-is-very-slow-from-a/m-p/61152#M3751 Good luck.
04-19-2018
02:07 AM
Hi @Tim Armstrong, Thank you for your reply. Here is the JVM error dump file: https://ufile.io/j0zat I have reformatted 2 servers and reinstalled CentOS 6.9 (kernel 2.6.32-696.23.1.el6.x86_64) on them, but the problem is always the same! I hope we can resolve this bug ASAP. Good luck.
04-18-2018
02:27 AM
2 Kudos
Hi all, Finally, after almost 6 months, I have found the solution! It was indeed about my 1024-row limitation remark: the row batch limitation came from the maximum value of BATCH_SIZE (1024); in the latest versions (CDH 5.14 / Impala 2.11) the effective range is now 1-65536. 1-1024: https://www.cloudera.com/documentation/enterprise/5-12-x/topics/impala_batch_size.html 1-65536: https://www.cloudera.com/documentation/enterprise/5-14-x/topics/impala_batch_size.html So when I increase it through odbc.ini with SSP_BATCH_SIZE, I can benefit from increasing the other ODBC parameters (RowsFetchedPerBlock / TSaslTransportBufSize), and the rows are fetched in seconds (~45 secs) instead of tens of minutes. Remark: I recreated the cluster at 3 different server providers and tested the connections from almost 5 others with different ODBC/JDBC releases, etc., and I always had the same slowness until this update came. I cannot understand why I am the only one who reported this big issue and why no one could answer me, knowing that it is really depressing to have a good query engine but veeeery slow row fetching! Anyway, thanks all for your replies.
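For reference, a hedged sketch of the odbc.ini settings the post above describes (the DSN name, driver path, host, and exact values are hypothetical and depend on your installation; SSP_-prefixed keys are passed through to Impala as server-side query options, so SSP_BATCH_SIZE sets BATCH_SIZE):

```ini
[Impala-DSN]
; Hypothetical DSN entry for the Cloudera Impala ODBC driver
Driver=/opt/cloudera/impalaodbc/lib/64/libclouderaimpalaodbc64.so
Host=impalad-host
Port=21050

; Server-side query option: row batch size (range 1-65536 in Impala 2.11+)
SSP_BATCH_SIZE=65536

; Client-side fetch tuning mentioned in the post
RowsFetchedPerBlock=65536
TSaslTransportBufSize=1000000
```

The key point of the post is that raising RowsFetchedPerBlock alone has no effect while the server still returns batches of at most 1024 rows; the two settings have to be increased together.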
04-18-2018
01:52 AM
Hi @JJH, I think you have to rename it using the Impala statement below: ALTER TABLE your_table RENAME TO new_table_name; Good luck.
04-18-2018
01:46 AM
Hi @Tim Armstrong, Here is the CPU info from impalad.INFO: I0417 20:54:12.845438 13375 init.cc:230] Cpu Info:
Model: Intel(R) Xeon(R) CPU E5405 @ 2.00GHz
Cores: 8
Max Possible Cores: 8
L1 Cache: 32.00 KB (Line: 64.00 B)
L2 Cache: 6.00 MB (Line: 64.00 B)
L3 Cache: 0 (Line: 0)
Hardware Supports:
ssse3
sse4_1
Numa Nodes: 1
Numa Nodes of Cores: 0->0 | 1->0 | 2->0 | 3->0 | 4->0 | 5->0 | 6->0 | 7->0 |