2209
Posts
230
Kudos Received
82
Solutions
About
My expertise is not in hadoop but rather online communities, support and social media. Interests include: photography, travel, movies and watching sports.
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 502 | 05-07-2025 11:41 AM |
| | 1007 | 02-27-2025 12:49 PM |
| | 2887 | 06-29-2023 05:42 AM |
| | 2427 | 05-22-2023 07:03 AM |
| | 1797 | 05-22-2023 05:42 AM |
04-11-2018
05:57 PM
1 Kudo
There was dirty data in the roles table; I deleted it.
03-28-2018
11:22 AM
All I'm seeing are messages like the following when restarting haproxy:

2018-03-28T11:32:41-04:00 172.28.2.234 haproxy[26642]: Proxy main started.
2018-03-28T11:32:41-04:00 172.28.2.234 haproxy[26642]: Proxy static started.
2018-03-28T11:32:41-04:00 172.28.2.234 haproxy[26642]: Proxy impala started.
2018-03-28T11:32:41-04:00 172.28.2.234 haproxy[26642]: Proxy impalajdbc started.
2018-03-28T11:32:41-04:00 172.28.2.234 haproxy[26642]: Proxy main started.
2018-03-28T11:32:41-04:00 172.28.2.234 haproxy[26642]: Proxy impalajdbc started.
2018-03-28T11:32:41-04:00 172.28.2.234 haproxy[26642]: Proxy app started.

Sometimes other messages like the following appear in the haproxy log, but they don't come in immediately after running an Impala query over JDBC, such as from Hue or beeline:

2018-03-27T15:51:47-04:00 172.28.2.234 haproxy[4286]: 172.28.6.234:37768 [27/Mar/2018:15:51:47.621] impalajdbc impalajdbc/impalajdbc1 0/0/+0 +0 -- 2/2/2/1/0 0/0
2018-03-27T18:08:08-04:00 172.28.2.234 haproxy[4286]: 172.28.2.20:39978 [27/Mar/2018:18:08:08.888] impala impala/impalad1 0/0/+0 +0 -- 3/1/1/1/0 0/0

However, no logs come in when accessing from a BI tool such as Tableau using its native Cloudera Impala connector or the Cloudera Impala ODBC driver. Is there a way to increase the logging for haproxy so that we can tell which Impala Daemon a query is executing on? The goal is to debug potential issues when someone accesses Impala from a BI application. I already have logging set to debug under the listen section for impalajdbc:

listen impalajdbc :21051
mode tcp
option tcplog
balance roundrobin
log 172.28.xx.xx local2 debug
server impalajdbc1 hdp104v.cmssvc.local:21050
server impalajdbc2 hdp105v.cmssvc.local:21050

Thanks, Braz
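For the connection lines that do reach the log, the field after the frontend name in haproxy's TCP log format is backend/server, so the Impala Daemon that took a connection can be read straight from it (impalajdbc1 and impalad1 in the samples above). The following is a minimal sketch in Python, assuming the syslog prefix looks exactly like the lines quoted in this post; the helper names parse_tcplog_line and count_by_server are hypothetical, not part of haproxy or any Cloudera tool.

```python
import re
from collections import Counter
from typing import Dict, Iterable, Optional

# Matches the connection lines quoted in the post, e.g.
# "... haproxy[4286]: 172.28.6.234:37768 [27/Mar/2018:15:51:47.621] impalajdbc impalajdbc/impalajdbc1 ..."
# Assumption: an IPv4 client address and the syslog prefix shown in the post.
TCPLOG_RE = re.compile(
    r"haproxy\[(?P<pid>\d+)\]: "
    r"(?P<client_ip>[\d.]+):(?P<client_port>\d+) "
    r"\[(?P<accept_date>[^\]]+)\] "
    r"(?P<frontend>\S+) "
    r"(?P<backend>[^/\s]+)/(?P<server>\S+)"
)


def parse_tcplog_line(line: str) -> Optional[Dict[str, str]]:
    """Return the parsed fields of a connection line, or None for other lines
    (for example the "Proxy ... started." startup messages)."""
    match = TCPLOG_RE.search(line)
    return match.groupdict() if match else None


def count_by_server(lines: Iterable[str]) -> Counter:
    """Count connections per (backend, server) pair across many log lines."""
    counts: Counter = Counter()
    for line in lines:
        record = parse_tcplog_line(line)
        if record is not None:
            counts[(record["backend"], record["server"])] += 1
    return counts
```

Feeding the haproxy log file to count_by_server line by line would give a per-daemon connection count. This only helps once the BI tool's connections are actually being logged, so it does not address the missing Tableau/ODBC entries themselves.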
03-15-2018
11:24 AM
From the community knowledge article on how to set up the QuickStart VM:
4. Some users have reported problems running CentOS 6.4 in VirtualBox. If a kernel panic occurs while the VirtualBox VM is booting, you can try working around this problem by opening the Settings > System > Motherboard tab and selecting ICH9 instead of PIIX3 for the chipset. If you have not already done so, you must also enable I/O APIC on the same tab.
03-13-2018
08:38 AM
1 Kudo
Got it. I'm pretty sure this link will get you to what you are looking for.
03-09-2018
02:36 AM
It's much friendlier than this blog... you can give it a try anytime... it's free.
02-14-2018
05:20 AM
Thanks for following up, @skamalj, and congratulations on passing your certification. As for the email address issue, I want to be sure: the certification team was made aware of it, correct? I'm assuming so since your issue was resolved. 🙂
02-13-2018
09:28 PM
Just a quick note: you can run Pig in local mode as well as in MapReduce mode. By default, LOAD looks for your data on HDFS in a tab-delimited file, using the default load function PigStorage. If you start Pig with pig -x local, which selects local mode, it will look for the data on the local file system instead. Nice that you found the fix, @SGeorge.
02-12-2018
07:22 PM
I am working through some practice questions, and some of them have more than one possible solution. For example, I can use RDD operations to do filtering, sorting, and grouping, but with DataFrames and Spark SQL it is even easier for me to get the same result. My question is: will the exam require that some questions be solved using RDDs rather than DataFrames and Spark SQL, or vice versa? Thank you.
01-17-2018
05:43 AM
1 Kudo
Summary: Due to a kernel security exploit on system CPUs (Spectre and Meltdown), OS updates are required for all systems. The OS updates have performance implications. Cloudera is in the process of testing the impact across various Cloudera software services and workloads. Initial test results, based on the limited testing (a subset of services and workloads) we have completed as of January 12, indicate that the performance impact on Cloudera software is minor.
Users affected: While we are not aware of customers who have been impacted by these CPU vulnerabilities, we understand and expect that all Cloudera customers will apply Spectre and Meltdown patches as they become available from their OS suppliers. As such, Cloudera is performing basic functional testing to ensure that CDH works with the patches as well as determining performance impact on typical workloads.
Impact: Cloudera’s initial testing has focused on a subset of commonly used CDH services. Based on the limited testing (a subset of services and workloads) we have completed as of January 12, these CDH services continue to function on patched systems and the performance impact on Cloudera software is minor. For example:
MapReduce jobs run with 3 - 9% slowdown
Impala queries run with 5 - 10% slowdown
Hive on Spark queries run with up to 5% slowdown
Spark jobs run with 0 - 12% slowdown
HBase queries run with up to 5% slowdown
Cloudera understands that the performance of your data clusters can affect important applications and may have business impact. As such, we are taking this situation seriously and treating this with urgency. Cloudera will provide additional performance results as soon as they are available.
Action required: While we are not aware of customers who have been impacted by these CPU vulnerabilities, we understand and expect that all Cloudera customers will apply Spectre and Meltdown patches as they become available from their OS suppliers.