Member since: 07-02-2018
Posts: 5
Kudos Received: 0
Solutions: 0
08-27-2019
03:07 PM
1 Kudo
Hello @EranK, here is an example of using a combination of HDFS and HBase to manage geospatial data. You may find their architecture of interest: https://www.slideshare.net/Hadoop_Summit/grailer-hochmuth-june27515pmroom212v3
08-20-2019
06:33 AM
1 Kudo
Hi @EranK, can you please double-check the Windows Server version with your Active Directory team? The Windows Server releases include 2012 and 2012 R2, but there is no 2013. You are therefore probably using Windows Server 2012 (or 2012 R2), which matches the referenced documentation page. While I cannot provide you with a support matrix, I know from personal experience that a Windows Server 2012 KDC does work with Cloudera. However, pay close attention to the encryption types you choose, and pick ones that are supported and enabled in your specific Active Directory. Regards, Benjamin
03-21-2019
02:11 AM
Hello @EranK, these are the default users and groups created when setting up CDH and CM. You can review "Hadoop Users in Cloudera Manager and CDH" for more details. Hope that helps.
03-07-2019
08:42 PM
Yes, the individual components (such as Apache NiFi) are free to use and are provided under an open source license (the Apache License, Version 2.0). Are you asking specifically about its deployment integration with Cloudera Manager (Express)?
07-05-2018
01:17 PM
Hi, Kudu can certainly scale to tens of thousands of point queries per second, similar to other NoSQL systems. For example, in preparing the slides posted on https://kudu.apache.org/2017/10/23/nosql-kudu-spanner-slides.html I ran a random-read benchmark using five 16-core GCE machines and got 12k reads/second. Since then we've made significant improvements in random read performance, and I expect you'd get much better than that if you were to re-run the benchmark on the latest versions. In a more recent benchmark on a 6-node physical cluster I was able to achieve over 100k reads/second. Keep in mind that such numbers are only achievable through direct use of the Kudu API (i.e., Java, C++, or Python) and not via SQL queries through an engine like Impala or Spark. Typically those engines are better suited to longer (>100ms) analytic queries than to high-concurrency point lookups. -Todd
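To put the figures quoted above in perspective, here is a rough back-of-the-envelope sketch in Python. It only uses the numbers from the post (12k reads/second on 5 machines, 100k reads/second on 6 nodes, and the 100 ms query duration mentioned for SQL engines). The helper names are made up for illustration, the two clusters differ in hardware and software version so the per-node comparison is only indicative, and treating 100 ms as the per-lookup latency of a SQL engine is an assumption. The concurrency estimate applies Little's law (requests in flight = throughput x latency).

```python
def per_node_throughput(total_reads_per_sec, nodes):
    """Average reads/second handled by each node (hypothetical helper)."""
    return total_reads_per_sec / nodes


def in_flight_requests(target_reads_per_sec, latency_sec):
    """Little's law: average requests in flight = throughput * latency."""
    return target_reads_per_sec * latency_sec


# Figures quoted in the post; "over 100k" is treated as exactly 100k here.
older = per_node_throughput(12_000, 5)    # 5 x 16-core GCE machines
newer = per_node_throughput(100_000, 6)   # 6-node physical cluster
speedup = newer / older

# Assumption: each lookup through a SQL engine takes about 100 ms.
needed_concurrency = in_flight_requests(100_000, 0.100)

print(f"older benchmark: {older:.0f} reads/s per node")
print(f"newer benchmark: {newer:.0f} reads/s per node (~{speedup:.1f}x)")
print(f"queries in flight at 100 ms each: {needed_concurrency:.0f}")
```

Under these assumptions, sustaining 100k lookups per second at 100 ms each would need on the order of 10,000 queries in flight at once, which illustrates why direct use of the Kudu API is recommended for this kind of workload.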