About AcharkiMed

AcharkiMed · ‎07-14-2018

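Hi @AkhilD, try this indirect method: SELECT CAST(split_part(input, ':', 1) AS INT)*3600 + CAST(split_part(input, ':', 2) AS INT)*60 + CAST(split_part(input, ':', 3) AS INT) FROM your_table; (split_part returns a STRING, so each part is cast to INT before the arithmetic.) Good luck.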
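
The arithmetic behind the query can be checked outside Impala. Here is a minimal Python sketch, assuming the input is always a well-formed hh:mm:ss string; the function name hms_to_seconds is hypothetical and is not part of any Impala API:

```python
def hms_to_seconds(value: str) -> int:
    """Convert an 'hh:mm:ss' string to a total number of seconds."""
    # Split on ':' the same way split_part(input, ':', n) does for n = 1, 2, 3,
    # and convert each piece to an integer before doing arithmetic.
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return hours * 3600 + minutes * 60 + seconds


if __name__ == "__main__":
    print(hms_to_seconds("01:02:03"))  # 1*3600 + 2*60 + 3 = 3723
```

Comparing a few values from this function against the query output is a quick way to confirm the SQL expression behaves as expected.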

Tim Armstrong · ‎07-13-2018

@mcalnd It's available from CDH 5.8 onwards.

alexm · ‎07-12-2018

Synchronizing data between clusters can be accomplished via distcp, BDR, or by ingesting data into both clusters simultaneously using third-party tools. The best tool depends on your use case, risk tolerance, and budget. We don't recommend spanning clusters across large geographic regions (e.g. US to EU); network latency and bandwidth are usually not suitable and could easily result in the slow query times you're experiencing. We DO support spanning clusters across AWS Availability Zones if certain conditions are met; see Appendix A of Cloudera Enterprise Reference Architecture for AWS Deployments (PDF) for details. For comparison, the latency between AWS AZs is typically sub-millisecond. Spanning bare metal clusters across multiple data centers will be addressed in the next release of Cloudera Enterprise Reference Architecture for Bare Metal Deployments (PDF), to coincide with C6. It will look similar to the AWS guidance, but with the additional caveat that network latency between sides should not exceed 10ms. Kudu does not support rack awareness. Not all services provide HA.

maziyar · ‎06-20-2018

As you correctly mentioned, Apache Spark offers MLlib (or ML), which comes with a set of features for basic NLP, the most popular clustering and classification algorithms, and so on. But that is not all! You can use many libraries that have been released to complement Spark in the machine learning and deep learning domains. These libraries build on Spark's APIs and engine. You can have a look here (or at other lists): https://github.com/awesome-spark/awesome-spark#machine-learning-extension

Tim Armstrong · ‎06-12-2018

@mauricio that's great news! Thanks for the update. We do need to get this documented though.

Harsh J · ‎06-09-2018

It is a recommendation based on the fact that active and standby are merely states of the NameNode and not different daemons. The NameNode doesn't check its own hardware to be the same as other NameNodes, if that's what you are worried about.

AcharkiMed · ‎06-07-2018

Hi @Pettax, thanks, your solution works for me too.

Locarno · ‎05-31-2018

The question mark is a parameter placeholder in ODBC.
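
To illustrate what a positional `?` placeholder does, here is a minimal sketch using Python's built-in sqlite3 module, which happens to use the same question-mark style. It stands in for an ODBC connection purely as an assumption for illustration; the table and column names are hypothetical.

```python
import sqlite3

# In-memory database used only as a stand-in for an ODBC data source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, label TEXT)")
conn.executemany(
    "INSERT INTO events (id, label) VALUES (?, ?)",
    [(1, "why?"), (2, "plain")],
)

# A '?' outside quotes is a placeholder bound to the value in the tuple.
rows = conn.execute("SELECT label FROM events WHERE id = ?", (1,)).fetchall()

# A '?' inside a string literal is ordinary text, not a placeholder.
literal_rows = conn.execute(
    "SELECT id FROM events WHERE label = 'why?'"
).fetchall()
conn.close()
```

Because an unquoted `?` is read as a parameter marker, a query that contains one without a bound value may be rejected before it ever reaches the server.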

AcharkiMed · ‎05-18-2018

Hi @Harsh J, it's only on one NodeManager. It happened suddenly, without any upgrade, on CDH 5.12.0, and the issue persisted even after I upgraded to 5.14.2. Anyway, your solution resolved the issue. Thank you.

AcharkiMed · ‎04-24-2018

Hi, 1- Yes, you can do it as I described: create an external text table directly in Impala, then create a Parquet table and select from the text one (the conversion is done automatically). 2- I think you can; try searching for parquet-tools. Good luck.

Last Visited	‎05-25-2022 11:41 AM

Member Since	‎07-17-2017 07:15 AM
Posts	143
Kudos received	16

Cloudera Community

Re: What performance to expect from Cloudera VM ?

Re: Impala date

Re: Error 1107

Re: Cannot connect to Impala via ODBC

Re: Getting improper "Unexpected character" using ...

Re: How to convert hh:mm:ss to seconds in impala?

Re: Impalad logs diskspace full

Re: When I add a new rack some Impala queries beca...

Re: Why the Apache Mahout is deprecated and what's...

Re: Very slow CodeGen taking 80% of runtime

Re: HDFS High Availability - is the Active

Re: How to configure Sentry to allow creating a Ku...

Re: Getting improper "Unexpected character" using ...

Re: Yarn NodeManager fails to start and crashing w...

Re: Impala table definition