Member since: 08-17-2018
Posts: 39
Kudos Received: 3
Solutions: 1
My Accepted Solutions
Title | Views | Posted
---|---|---
| 3804 | 08-28-2018 07:24 AM
08-26-2019
08:56 AM
1 Kudo
Just a note about the CM API: from what I can tell, the API doesn't bring back any more parameters than what you can see in the Configuration tab for each app/service. Being able to see them in the INFO logs was exactly what was needed. However, it would be nice to be able to use the API to get the same info that the INFO logs provide. I could see some automation opportunities in the future...
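For what it's worth, here's a rough sketch of the kind of automation I had in mind, using the cm_api Python client for CM 5.x. The host name and credentials are placeholders, and I'm going from memory on the 'full' view exposing defaults, so treat it as untested:
from cm_api.api_client import ApiResource

# Connect to Cloudera Manager; host and credentials below are placeholders.
api = ApiResource('cm-host.example.com', username='admin', password='admin')
cluster = api.get_all_clusters()[0]

for service in cluster.get_all_services():
    print('=== %s ===' % service.name)
    # view='full' should return ApiConfig objects rather than bare values.
    svc_config, role_type_configs = service.get_config(view='full')
    for name, cfg in sorted(svc_config.items()):
        # When a parameter is still at its default, value is typically None
        # and the default attribute holds the effective setting.
        print('%s = %s (default: %s)' % (name, cfg.value, cfg.default))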
... View more
08-20-2019
10:37 AM
Not yet, but I have a good bit of experience writing Python code using the CM API. I'll look into it, and if I find anything useful, I'll post my findings here...
... View more
08-20-2019
07:44 AM
1 Kudo
Hey Ben, one very key thing you mentioned was the fact that the INFO logs write out the parameter settings. I'm not sure if that includes defaults, as I haven't researched it yet, but this is extremely useful: I may not have enough time to figure out why the servlets/services are not running or are not available, but being able to see the parameter settings this way is more than sufficient.
For anyone needing a way to quickly see the parameters separated from the rest of the log information, you can use this egrep pattern I put together. It hasn't been thoroughly tested, but it works so far:
egrep '^--\w+=\w?+' /var/log/impalad/impalad.INFO
This command extracted 229 parameters from our INFO log file. Sounds like a pretty complete list???
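If anyone wants the same thing in Python (for diffing between nodes, for example), here's a quick sketch that does roughly what the egrep above does; the log path is the same one I used and may differ on your hosts:
import re

# Collect every "--flag=value" line from the impalad INFO log into a dict.
flag_pattern = re.compile(r'^--(\w+)=(.*)$')
flags = {}
with open('/var/log/impalad/impalad.INFO') as log:
    for line in log:
        match = flag_pattern.match(line.rstrip('\n'))
        if match:
            flags[match.group(1)] = match.group(2)

for name in sorted(flags):
    print('%s=%s' % (name, flags[name]))
print('%d flags found' % len(flags))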
... View more
08-19-2019
12:24 PM
Hey Ben,
> most services expose a "/conf" servlet in their WebUI which gives you the most complete set of actually used parameters. This should be the most promising source of truth. Impala has a similar servlet with path "/varz".
I've been looking for this /varz path for Impala and cannot find the correct path. Can you give me the port and path it should be on? Unless it's a name that is given during setup and is different than "/varz"???
> The instance process view of Cloudera Manager shows you the actually distributed config files - which often helps a lot but does not include default values. You can reach it from a service (e.g. Impala) by clicking on "Instances" -> (the instance you want to see, e.g. Impala Catalog Server on a node) -> Processes
I've seen this, but I don't see anything that comes close to what I need. This is the list of Impala-related config files:
impala-conf/fair-scheduler.xml
impala-conf/llama-site.xml
impala-conf/sentry-site.xml
impala-conf/log4j.properties
impala.keytab
impala-conf/.htpasswd
impala-conf/impalad_flags
The only thing that comes close is the impala-conf/impalad_flags file, and that has no link. Thanks!
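In case it helps anyone following along, this is how I plan to check the /varz page once I know where it lives. I'm assuming the default impalad debug web UI port of 25000 and no TLS/auth on it, which may not match every setup:
import requests

# Placeholder host; 25000 is the default impalad debug web UI port (statestored
# and catalogd use their own ports), so adjust if yours is non-default.
resp = requests.get('http://impalad-host.example.com:25000/varz', timeout=10)
resp.raise_for_status()
print(resp.text)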
... View more
08-19-2019
07:05 AM
Hi, as there seem to be thousands of settings for HDFS, Hive, CM, Hue, Spark, etc., I was wondering if there is a way to see the actual settings that are being used for all applications in the CM/Hadoop suite. I know you can see some of them in config files and some of them in CM's interface, but not all of them are in both. Defaults, for example, don't show up anywhere, as sometimes code sets them. CM does not cover all the parameters, as it has a special location where it stores Safety Valve settings, or so I've been told??? I would like to be able to see what is resident in memory and actually being used, because you may add a setting and it may not take effect, either because you put it in the wrong area or because it is being overridden by the same setting somewhere else. Is there a way to actually see what is being used for any/all parameters in the suite? Thanks!
... View more
Labels:
- Cloudera Manager
08-19-2019
06:22 AM
>" Your cluster does sound unhappy" LOL. I'd say more like pi$$ed. 🙂 > If it's a JVM issue, we've seen in some cases that increasing heap sizes helps. Setting ipc.client.rpc-timeout.ms to 60000 I'd say it's more like a set of overloaded namenodes. And, according to the research I've done so far, for another problem, I have a theory that I can use a more up to date GC and it should increase the performance and reduce the number of Zookeeper failures we have. Our ZK failures are sporadic and happen every 2 to 3 days. Sometimes more sometimes less. Moving ZK to separate nodes is not an option at this point and I'm doing all I can to try to reduce the number of failures short of moving the service. I'll check our settings on this and see if we can do one or both. I suspect we have increased JVM heap already, but not sure? >We've also seen the file handle cache that got enabled by default in CDH5.15 help a lot in reducing namenode load, I assume this is available before this version but was not enabled by default??? I'll look it up and see... > I agree 100%. I think whoever named it was either overly optimistic and assumed there wouldn't be a significant gap in time, or it was named from the point of view of the code rather than the external system So, my question is, is there an indicator in the Query Details that indicates something was returned? I know I get an initial set of results back. Without that "fetch" metric meaning what the word actually says, I don't know what indicates how long it took to get the first set of records back??? Back to the original issue... Given that the issue appears to be the last query issued in Hue tends to show up as still executing 2.2 hours later and has already returned a count almost immediately. Obviously, the parameters for idle timeouts for sessions and queries is not marking the session as closed. Therefore appearing to still be executing: Is this causing resource issues because the session is being held open and appearing to be still executing? I would assume so as it is waiting on a fetch of subsequent sets of records??? What parameter(s) will close the session from the last query executed? Just to let you know, I've come in late to the game and am still learning CM and Cloudera Manager. I understand a lot but with 1000s of parameters for all the apps and an ample supply of problems, it'll take a while. 🙂 Thanks for all your help. It is nice to have a response on this forum. The last couple posts were not very useful. We do have a service contract and although I am one of the 3 admins, they are working on adding me to the support contract so I can put in tickets and get support directly. Until then, I appreciate the help!
... View more
08-16-2019
01:49 PM
Cool. It's hard to believe a count of 1000 records is taking 2.2 hours. So, I closed the session and did not see the "Released admission control resources" value you mentioned. It would be very helpful to know why this did not show up and what needs to be done to release the resources... BTW, it's not holding up queries from getting admitted, as far as I can tell. We ran into a problem where it did not have enough memory to allocate to a query and returned an error; that's what got me started down this road in the first place.
When you say "after the first row was fetched," what constitutes a fetch? When I execute a query, I get results returned and I see the records. I assume the fetch is automatic up to a certain number, as it will page the rest of them. To me, when I see records, a fetch has taken place??? This makes zero sense: "one quirk of the 'first row fetched' event is that it tracks when the row was requested, not when it was returned", as the very word fetch means it got a result and pulled it back for viewing. Why would the word "fetch" be used in place of the word "requested"? I get a result back of 1000. There is no more to fetch from the query. That's it. It appears that the fetch occurred, I got my result, and I'm completely done with the query. That's when the resources should be released. It sounds like you're saying the Impala interface is showing this incorrectly and it has since been fixed? That would be the whole problem, correct? Thanks!
... View more
08-16-2019
12:21 PM
I apologize for missing your request to get the query profile. I can send you a new one, as this is easily duplicatable. I have already re-executed a query to see if the profile was that different...
Now, I re-executed the same query, which is a count and returns 1 row. This query will show in the Impala Query monitor screen as Executing indefinitely. I can close my session or execute another query and it will mark it as Finished. Here's the same one, 45 minutes into waiting on it to close. It will not close unless I do something to make it close. I waited over 2 hours the last time I tried and it still said it was Executing:
Query: select count(*) from web_logs
Query Info
Query ID: 184e512db38f7215:e361d04d00000000
User: <user>
Database: default
Coordinator: <host>
Query Type: QUERY
Query State: FINISHED
Start Time: Aug 16, 2019 1:24:05 PM
Duration: 47m, 43s
Rows Produced: 1
Admission Result: Admitted immediately
Admission Wait Time: 0ms
Bytes Streamed: 64 B
Client Fetch Wait Time: 6.59s
Client Fetch Wait Time Percentage: 0
Connected User: hue/<host>@<domain>
Delegated User: <user>
Estimated per Node Peak Memory: 52.0 MiB
File Formats: TEXT/NONE
HDFS Average Scan Range: 101.1 KiB
HDFS Bytes Read: 404.5 KiB
HDFS Scanner Average Read Throughput: 1.4 GiB/s
Impala Version: impalad version 2.11.0-cdh5.14.2 RELEASE (build ed85dce709da9557aeb28be89e8044947708876c)
Network Address: <IP>:<Port>
Node with Peak Memory Usage: <server>:<port>
Out of Memory: false
Per Node Peak Memory Usage: 196.4 KiB
Planning Wait Time: 18ms
Planning Wait Time Percentage: 0
Pool: root.<user>
Query Status: OK
Session ID: 7744f8b1bd92eb67:c892fbb3fbb81293
Session Type: HIVESERVER2
Statistics Corrupt: false
Statistics Missing: true
Threads: CPU Time: 75ms
Threads: CPU Time Percentage: 50
Threads: Network Receive Wait Time: 32ms
Threads: Network Receive Wait Time Percentage: 22
Threads: Network Send Wait Time: 1ms
Threads: Network Send Wait Time Percentage: 1
Threads: Storage Wait Time: 40ms
Threads: Storage Wait Time Percentage: 27
Threads: Total Time: 150ms
Download Profile...
Query Timeline
Query submitted: 49.51us (49.51us)
Planning finished: 18ms (18ms)
Submit for admission: 19ms (733.33us)
Completed admission: 19ms (134.25us)
Ready to start on 4 backends: 19ms (521.71us)
All 4 execution backends (5 fragment instances) started: 22ms (2ms)
Rows available: 72ms (50ms)
First row fetched: 6.66s (6.59s)
Planner Timeline
Analysis finished: 16ms (16ms)
Value transfer graph computed: 16ms (27.31us)
Single node plan created: 16ms (385.49us)
Runtime filters computed: 16ms (54.11us)
Distributed plan created: 16ms (24.85us)
Lineage info computed: 16ms (26.54us)
Planning finished: 17ms (467.37us)
Is this sufficient?
... View more
08-16-2019
10:47 AM
Hi Tim, this would be a good time to mention we are on this version of CDH: CDH-5.14.2-1.cdh5.14.2.p0.3. Are you telling me that when people execute a query in Hue and monitor it in CM ---> Impala ---> Queries, the query stops showing as running once the idle timeouts are up? Try this: execute a query in Hue, like a count on a table, and monitor the status of the Impala query in CM. Let me know if it actually times out. I discovered that the last statement executed in Hue will stay there forever, or until you close the session or execute another statement. So, neither of these parameters stops the session after the timeout period, as they are supposed to do:
-idle_session_timeout=3600 -idle_query_timeout=3600
Look at the attachment (apparently I can't paste images in, as I exceeded 100,000 characters???). This is actually how fast the results were returned and the query finished:
First row fetched: 838ms (743ms)
I assume you have a test bed where you can set the setting to a much lower number to see if it still shows as running in the Impala query monitor tool in CM. We are having sporadic resource issues because these resources are not being released. The more people that run a query and don't purposely close out the session, the more resources are consumed. So, all the parameters I mentioned before don't help a bit? It's all up to these 2 parameters? Maybe they are not being loaded??? I was looking for a way to hit the site and get all current parameter settings??? I know Hue does not have an API, but is there a way to see if it actually is loading those parameters? Thanks for your help!
... View more
08-16-2019
07:57 AM
It's a bit late in the game, but I'm running into the same problem where the query appears to be running for hours even though the first row was fetched in seconds. This means it is not actually running, although the list of Impala queries says it is. As previously stated by a poster, the user did not close the session. I have just noticed that the last query executed will hold the query in a running state. Once another query is executed or the session is closed, it will release the resources and mark the query as finished. I have another post about this same issue. Neither one of these 2 parameters has helped:
-idle_session_timeout=1500 -idle_query_timeout=1500
So, my conclusion is that either the documentation is not accurate in what it says about these parameters, or there's a bug as of 08/16/2019??? If you find out how to close a session on a query automatically using a parameter, let me know...
... View more
08-16-2019
06:44 AM
Hi Chelsea, I see memory being held (I believe in the CM Impala query interface) along with a supposedly running query. I assume that if a query still says it is executing, even though it's just waiting to send the next page of results, memory is definitely being held. First, the query_timeout_s parameter has not been set, so whether the default is 10 minutes or 5 minutes like our current configuration says, it's still not working:
# Hue will try to close the Impala query when the user leaves the editor page.
# This will free all the query resources in Impala, but also make its results inaccessible.
## close_queries=true
# If > 0, the query will be timed out (i.e. cancelled) if Impala does not do any work
# (compute or send back results) for that query within QUERY_TIMEOUT_S seconds.
## query_timeout_s=600
# If > 0, the session will be timed out (i.e. cancelled) if Impala does not do any work
# (compute or send back results) for that session within QUERY_TIMEOUT_S seconds (default 1 hour).
## session_timeout_s=3600
This parameter is having no effect, correct? Whatever settings are being used are not working, as I can see many related settings but none of them are set to allow a query to be active for 15 hours. Also, what about the close_queries=true option? Will that do what we need? However, I have a question about this parameter... will it kill the query and release the results if we set it to an hour, as we need it to be?
# Users will automatically be logged out after 'n' seconds of inactivity.
# A negative number means that idle sessions will not be timed out.
idle_session_timeout=-1
If not, are there any parameters we haven't talked about that could stop this long retention of resources? BTW, I could not find this command. Where should I look?
build/env/bin/hue close_queries --help
I could theoretically write a script that checks query duration to see if any are long running and kills them if need be (see the sketch at the end of this post). I know Hue has no API, as I have experienced. I wrote a Python app that took a list of users and removed them automatically from CM and Hue. I had to use the Requests module in Python to load the existing users, compare that against the users to be deleted, and then create a POST request that deleted the existing users who have left the company. Quite painful but fun... 🙂
Just so you guys know, I'm here on this forum because all the documentation I've read says the 2 parameters we have set should close the queries and return the resources after the timeout. Neither one of the 2 works, so either the documentation is not accurate or something else is wrong. Thanks!
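Here is a rough sketch of the duration-check-and-kill script I mentioned above, using the cm_api Python client instead of Hue (which has no API). The host, credentials, service name and 1-hour threshold are placeholders, and I'm going from memory on the query attribute names, so it would need testing before cancelling anything for real:
import datetime
from cm_api.api_client import ApiResource

MAX_AGE = datetime.timedelta(hours=1)   # our verbal 1-hour agreement

api = ApiResource('cm-host.example.com', username='admin', password='admin')
cluster = api.get_all_clusters()[0]
impala = cluster.get_service('impala')  # service name as it appears in CM

# Look at the last day of queries; times from the API should be UTC.
now = datetime.datetime.utcnow()
response = impala.get_impala_queries(start_time=now - datetime.timedelta(days=1),
                                     end_time=now, limit=1000)
for query in response.queries:
    # No end time yet means CM still considers the query open/executing.
    if query.endTime is None and now - query.startTime > MAX_AGE:
        print('Cancelling %s (user %s, started %s)' % (query.queryId, query.user, query.startTime))
        impala.cancel_impala_query(query.queryId)
Run from cron, something like that would at least stop the last-query-in-Hue case from holding resources indefinitely.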
... View more
08-15-2019
12:13 PM
I'm going to take a poke at this and hope I'm not wasting your time... This looks like a good start: https://kudu.apache.org/docs/developing.html I have done minimal Python development using PySpark and Kudu. It's not too bad...
... View more
08-15-2019
12:11 PM
I'm going to take a poke at this and hope I'm not wasting your time... This looks like a good start: https://kudu.apache.org/docs/developing.html I have done minimal Python development using PySpark and Kudu. It's not too bad...
... View more
08-15-2019
12:04 PM
Hi,
This is a continuation of a previous post titled "Impala Queries Executing long time" from 2017, in which a Cloudera employee explained why a query would appear to be running but not actually be running. In summary, he said the paging function in Hue will leave the query open and appearing to run.
He also said that to fix this you need to set some parameters that will force a timeout to occur:
"You can also set idle query timeout and idle session timeout in impala advance snippet to force timeout for queries running from hue."
Unfortunately, this does not work, as the query never reaches the exact state required for the timeout to occur.
In our case, we have the following settings in Impala:
-idle_session_timeout=3600 -idle_query_timeout=3600
This is set in a field with the label:
"Impala Daemon Command Line Argument Advanced Configuration Snippet (Safety Valve)"
I just killed a query that appeared to be running for 15 hours. The interesting thing about that query is, it had a LIMIT 50 clause.
I can't imagine 50 records taking 15 hours in most scenarios...
We regularly have queries exceed our verbal agreement of 1 hour before we kill the job.
The question is rather obvious: since I know these settings do not work in this scenario (and maybe in similar ones, I'm not sure), what will stop Hue from holding the resources for so long?
Thanks!
... View more
Labels:
- Apache Impala
- Cloudera Hue
08-02-2019
08:23 AM
Hi LiWang, yes, I have checked the audit logs of Impala. Based on the error I was getting in the log, I ran a grep -i looking for "OPERATION_TEXT" as part of an executed SQL statement and did not find any log entries with that pattern. I used "ignore case" with grep just to make sure. Since this is a Navigator record that Impala is trying to insert, I tried looking into those logs as well. Since I don't have permission to look into those logs, I modified the URL in CM's interface so I could see the log content. Very slow and cumbersome, but I never found anything that was very helpful as far as the actual SQL statement that was failing. Another thing to note is that the impala_audit_event_log_1.0-xxx files contain no INSERT statements at all. It's hard to believe no inserts have happened using Impala at all. The reason I was looking for an INSERT command is my assumption that the error indicated it was trying to insert into the MySQL audit table and was not able to because of the character set issue. The log that showed the error did not show the whole SQL statement, unfortunately, but I did get a few of the offending characters in hex form and the column (OPERATION_TEXT) it was trying to insert into. Any ideas? Thanks!
... View more
07-30-2019
11:24 AM
Hi,
I hope this can be answered quickly. I know there's someone out there that would know this...
I am getting an error from Impala because of non-ASCII character(s) in a SQL statement that the audit service is trying to record.
ERROR is:
[NavigatorServer-4949170]: Incorrect string value: '\xEF\xBF\xBDCOM...' for column 'OPERATION_TEXT' at row 1
I have fixed this issue before by converting the character set of the target audit table (for example,
IMPALA_AUDIT_EVENTS_2019_07_19) to utf8mb4.
Also, converted the OPERATION_TEXT column to utf8mb4 character set as well...
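For reference, this is roughly how I drive that conversion when I have to do it; the connection details and database name are placeholders, and the daily IMPALA_AUDIT_EVENTS_* table name obviously changes:
import mysql.connector

# Placeholders: point this at the Navigator audit database.
conn = mysql.connector.connect(host='navigator-db-host', user='nav',
                               password='secret', database='nav_audit')
cursor = conn.cursor()
# Converting the whole table also converts OPERATION_TEXT, so 4-byte
# characters no longer break the insert.
cursor.execute("ALTER TABLE IMPALA_AUDIT_EVENTS_2019_07_19 "
               "CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci")
cursor.close()
conn.close()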
The question:
Where can I see the full SQL statement?
NOTE: The statement in error does not show up in the CM interface for Impala under the Queries tab.
Thanks!
... View more
Labels:
- Apache Impala
- Cloudera Manager
07-22-2019
09:05 AM
UPDATE: As of Spark 2.4, the context.py code has been changed to require an authentication token. However, I'm not sure how to set this token, as I have looked in Cloudera Manager, on the web, and in the files and cannot find it anywhere. Will someone from Cloudera please help us set up this requirement, as the code clearly requires it? Thanks!
... View more
07-19-2019
08:51 AM
Hi, I have been researching for a few days why we cannot execute any Python code in the PySpark interface inside Hue. PySpark command: from pyspark import SparkContext Error Message: stdout:
stderr:
WARNING: User-defined SPARK_HOME (/opt/cloudera/parcels/SPARK2-2.4.0.cloudera2-1.cdh5.13.3.p0.1041012/lib/spark2) overrides detected (/opt/cloudera/parcels/SPARK2/lib/spark2/).
WARNING: Running spark-class from user-defined location.
19/07/19 07:59:45 WARN spark.SparkConf: The configuration key 'spark.yarn.executor.memoryOverhead' has been deprecated as of Spark 2.3 and may be removed in the future. Please use the new key 'spark.executor.memoryOverhead' instead.
19/07/19 07:59:46 WARN spark.SparkConf: The configuration key 'spark.yarn.executor.memoryOverhead' has been deprecated as of Spark 2.3 and may be removed in the future. Please use the new key 'spark.executor.memoryOverhead' instead.
19/07/19 07:59:46 WARN rsc.RSCConf: Your hostname, usbda04.unix.rgbk.com, resolves to a loopback address, but we couldn't find any external IP address!
19/07/19 07:59:46 WARN rsc.RSCConf: Set livy.rsc.rpc.server.address if you need to bind to another address.
19/07/19 07:59:49 WARN util.Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
19/07/19 07:59:49 WARN util.Utils: Service 'SparkUI' could not bind on port 4041. Attempting port 4042.
YARN Diagnostics:
sys.exit(main())
File "/tmp/2588781570290623481", line 589, in main
sc = SparkContext(jsc=jsc, gateway=gateway, conf=conf)
File "/opt/cloudera/parcels/SPARK2-2.4.0.cloudera2-1.cdh5.13.3.p0.1041012/lib/spark2/python/lib/pyspark.zip/pyspark/context.py", line 121, in __init__
ValueError: You are trying to pass an insecure Py4j gateway to Spark. This is not allowed as it is a security risk.
YARN Diagnostics: We recently updated Spark from 2.3 to 2.4; however, I am not sure if this was working with 2.3. We also recently activated Kerberos. I am not sure what this message is saying, but my guess is a configuration is set up to send requests to a specific server (the gateway), the target server is not SSL encrypted, and there is a rule set up to avoid sending requests to non-SSL services? If this is the case, it's not important that the traffic be encrypted, as this is a development server. Any theories would be most helpful so I can investigate. The problem is, no one else seems to have had this problem according to the research I have done. One other thing to note, which may have no bearing on it at all, is that I cannot execute pyspark from the command line, as I get what appears to be a very old (2016) bug: $ pyspark
Python 3.6.4 |Anaconda, Inc.| (default, Jan 16 2018, 18:10:19)
[GCC 7.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
Traceback (most recent call last):
File "/opt/cloudera/parcels/CDH-5.14.2-1.cdh5.14.2.p0.3/lib/spark/python/pyspark/shell.py", line 30, in <module>
import pyspark
File "/opt/cloudera/parcels/CDH-5.14.2-1.cdh5.14.2.p0.3/lib/spark/python/pyspark/__init__.py", line 41, in <module>
from pyspark.context import SparkContext
File "/opt/cloudera/parcels/CDH-5.14.2-1.cdh5.14.2.p0.3/lib/spark/python/pyspark/context.py", line 33, in <module>
from pyspark.java_gateway import launch_gateway
File "/opt/cloudera/parcels/CDH-5.14.2-1.cdh5.14.2.p0.3/lib/spark/python/pyspark/java_gateway.py", line 31, in <module>
from py4j.java_gateway import java_import, JavaGateway, GatewayClient
File "<frozen importlib._bootstrap>", line 971, in _find_and_load
File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 656, in _load_unlocked
File "<frozen importlib._bootstrap>", line 626, in _load_backward_compatible
File "/opt/cloudera/parcels/CDH-5.14.2-1.cdh5.14.2.p0.3/lib/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 18, in <module>
File "/opt/cloudera/parcels/Anaconda-5.1.0.1/lib/python3.6/pydoc.py", line 59, in <module>
import inspect
File "/opt/cloudera/parcels/Anaconda-5.1.0.1/lib/python3.6/inspect.py", line 361, in <module>
Attribute = namedtuple('Attribute', 'name kind defining_class object')
File "/opt/cloudera/parcels/CDH-5.14.2-1.cdh5.14.2.p0.3/lib/spark/python/pyspark/serializers.py", line 381, in namedtuple
cls = _old_namedtuple(*args, **kwargs)
TypeError: namedtuple() missing 3 required keyword-only arguments: 'verbose', 'rename', and 'module'
Also, this article comes the closest to the possible issue we're having: https://community.cloudera.com/t5/Web-UI-Hue-Beeswax/Issue-with-PySpark-and-Hue/m-p/52792#M2162 However, I don't know how to check whether Kerberos is set up, and set up properly, for this purpose. Any guidance is appreciated on this as well. Any ideas/help would be much appreciated! Thanks!
... View more
Labels:
- Apache Spark
02-27-2019
11:36 AM
Hey pd, unfortunately, I'm not a Java programmer. I do PHP, Python and bash, and I decided not to develop in Java many years ago as it appeared to be way too complex compared to other solutions. Also, at that point in time, Java was still having speed issues being pseudo-compiled; the language I was using at that time was fully compiled. Also, I assumed that Cloudera would get around to it eventually, as it's part of the tools that appear to be under the CM support umbrella, or at least it's installed with the Kafka CSD. I would think some kind of CSD that generically allows scripts to be run based on service/role events would be ideal??? Thanks for finally getting back to me! Have a great day!
... View more
02-19-2019
12:28 PM
I can't name the parameter off the top of my head but I believe there is a parameter in configuration that limits the size of an event??? Hope that helps!
... View more
02-19-2019
12:25 PM
Hi,
I have installed and set up Kafka (KAFKA-3.1.1-1.3.1.1.p0.2) in Cloudera Manager (Cloudera Enterprise 5.14.3) successfully. I have also configured and set up a Splunk connector to allow Splunk to consume Cloudera audit data.
However, I have to manually launch the connect-distributed.sh script and register the Splunk Sink connector if something fails. If the server is restarted, I would have to log into the server and manually run the 2 commands (curl) to get the distributed service (or maybe I should call it a role) running and to register it with the Splunk service.
Is there a way to run scripts automatically when Cloudera Manager is used to restart Kafka?
If not, I'm thinking I will create a Python-based framework that runs in cron, checks the health of the connect-distributed.sh service, and re-runs it if it is down (see the sketch below).
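Something like this is what I have in mind; the Connect REST port (8083), the install paths and the properties file are assumptions for my environment:
import subprocess
import requests

CONNECT_URL = 'http://localhost:8083/connectors'

def connect_is_healthy():
    """Return True if the Kafka Connect REST API answers."""
    try:
        return requests.get(CONNECT_URL, timeout=5).status_code == 200
    except requests.exceptions.RequestException:
        return False

if not connect_is_healthy():
    # Relaunch the distributed worker in the background (paths are placeholders).
    subprocess.Popen(['/opt/kafka/bin/connect-distributed.sh',
                      '/opt/kafka/config/connect-distributed.properties'])
    # Once the worker is back, the Splunk sink still has to be re-registered
    # with the same curl/POST to /connectors mentioned above.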
Thanks!
... View more
Labels:
- Apache Kafka
- Cloudera Manager
02-19-2019
10:59 AM
Although I don't have this problem, I would like to know the answer to this; it would definitely be good to know. Unfortunately, I have posted questions and received nothing but crickets, as it appears you have too. Hopefully, someone will get back to you.
... View more
09-18-2018
11:00 AM
Yeah, they sold Powerschool off to a competitor who took it way past what we had. One of the Java engineers talked the new company into letting him re-write it in Java. It looks pretty cool from the last time I looked at it. I remember that group back in the day. I still have my email address and am using 50GB of iCloud storage for the day my Mac goes down. I bought a 2016 laptop in 2017 and absolutely love it, all except the memory, which they fixed in the newest releases of the MacBook Pro. Dang company! 🙂 Dude, I got some stories of sticking with something. I have about 33 years of Unix experience, starting back in the day with Sun Microsystems servers. I own a Sun Sparc Classic and a Sun Enterprise 250 workgroup server; both of them will still boot. I will let you know if anything pops up, as I am going to do some Django development as a phase II project to manage users automatically. I figured out how to web scrape the Hue page using the older user list URL (/useradmin/users/) that doesn't use Javascript to load the user list, unlike the new one (/hue/useradmin/users/). Although Hue warns that the older URL is deprecated, it still works. I have not figured out yet how to load the user list with the new (current) URL of "/hue/useradmin/users/" but will get back to it later. It would be nice for Hue to have an API like Cloudera Manager has. My project was to auto-delete Cloudera and Hue users that have left the company. I have all the prototype functionality built and am about to bring it all together into a package (rough sketch of the scrape below). Thanks for all your help!
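Since I said I'd let you know if anything pops up, this is the rough shape of the scrape so far. The Hue URL, the login form fields and the CSRF handling are assumptions based on Hue being a Django app, so it may need tweaking for your version:
import requests

HUE = 'http://hue-host.example.com:8888'   # placeholder

session = requests.Session()
# Load the login page first so the session picks up a CSRF cookie.
session.get(HUE + '/accounts/login/')
session.post(HUE + '/accounts/login/',
             data={'username': 'admin',
                   'password': 'secret',
                   'csrfmiddlewaretoken': session.cookies.get('csrftoken')},
             headers={'Referer': HUE + '/accounts/login/'})

# Old-style (non-Javascript) user list page; parse the returned HTML for names.
users_page = session.get(HUE + '/useradmin/users/')
print(users_page.text)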
... View more
09-18-2018
10:40 AM
2005 was close to the time I left. You may have heard of a web-based school administration product called Powerschool that Apple bought and later sold. I was one of the main software engineers on the Powerschool product. U da man! That was it all along... Really weird things were happening, though: logs saying impalad was running OK when it was nowhere to be found. Oh well. Thanks for the hard work. Tell your manager I said to give you a raise. 🙂
... View more
09-18-2018
07:29 AM
Thanks for your response. Glad I could nudge you out of the dark and into the light. 🙂 And glad you're using a Mac! :-). I worked for Apple for a couple of years. I believe this is important, but the Docker image I'm running is: cloudera-quickstart-vm-5.13.0-0-beta-docker. I didn't bother mapping port 80 since I have Apache running on my Mac and it's using port 80. I'll stop it and try to see if there's something to it. Also, I mapped other ports but don't believe I mapped to 7180. Also, how did you run docker stop and then docker run without removing the container? I know if I don't get rid of the container, docker run will error out and say it already exists when it tries to create the container.
-------------------------------------------------------------
So, I stopped Apache, used the port mappings you used, and I seem to get a little more running. I can now see the "impalad" process, but I still get the error with port 21050 when I log in and cannot get to Hive through the interface. Interestingly, I see a listener port of "25010", which leads me to believe there's a typo in a configuration file??? I cannot execute the hdfs command you used successfully, so I believe I've located the 2 hdfs processes by looking at a list of processes returned by filtering for "hdfs". One process, which I believe is the primary hdfs node, has a process listing of:
hdfs 276 4.3 3.7 1546804 77192 ? Sl 13:42 0:50 /usr/java/jdk1.7.0_67-cloudera/bin/java -Dproc_datanode -Xmx1000m -Dhadoop.log.dir=/var/log/hadoop-hdfs -Dhadoop.log.file=hadoop-hdfs-datanode-quickstart.cloudera.log -Dhadoop.home.dir=/usr/lib/hadoop -Dhadoop.id.str=hdfs -Dhadoop.root.logger=INFO,RFA -Djava.library.path=/usr/lib/hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -server -Dhadoop.security.logger=INFO,RFAS org.apache.hadoop.hdfs.server.datanode.DataNode
What I believe is the secondary node has a listing of:
hdfs 610 2.5 3.6 1516348 74976 ? Sl 13:43 0:28 /usr/java/jdk1.7.0_67-cloudera/bin/java -Dproc_secondarynamenode -Xmx1000m -Dhadoop.log.dir=/var/log/hadoop-hdfs -Dhadoop.log.file=hadoop-hdfs-secondarynamenode-quickstart.cloudera.log -Dhadoop.home.dir=/usr/lib/hadoop -Dhadoop.id.str=hdfs -Dhadoop.root.logger=INFO,RFA -Djava.library.path=/usr/lib/hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Dhadoop.security.logger=INFO,RFAS org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode
Mapping the process IDs with the ports from netstat -tulpne, I get:
tcp 0 0 0.0.0.0:50020 0.0.0.0:* LISTEN 497 3274353 276/java
tcp 0 0 0.0.0.0:50090 0.0.0.0:* LISTEN 497 3274497 610/java
These ports are very different from what seems to be needed, and maybe that's why I cannot connect using the Hue interface??? I am currently searching the whole container to see if I can find where they are configured... Also, given that trying to access the hdfs path comes back with this error:
18/09/18 13:57:59 WARN ipc.Client: Failed to connect to server: quickstart.cloudera/172.17.0.2:8020: try once and fail. java.net.ConnectException: Connection refused
I assume that somewhere there is a file that says to try to connect to port 8020 for the "hdfs dfs" command. There is no port listening on port 8020. This is why I'm looking for the configuration file that maps the 2 hdfs processes to 50020 and 50090. Am I thinking correctly? BTW, I'm using the following command from root (/) to find the configuration for the above port: find . 
-type f -exec grep -il '50090' {} \; I would assume there is a configuration file with that port set??? I have seen applications use defaults set inside the code if it doesn't find a configuration parameter in a config file. I changed my host /etc/hosts file to resolve quickstart.cloudera to localhost. I can now use quickstart.cloudera:8888 to load the Hue GUI. I get this error in the log: catalogd.quickstart.cloudera.impala.log.ERROR.20180918-132552.3090 Running on machine: quickstart.cloudera Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg E0918 13:25:52.987486 3090 logging.cc:126] stderr will be logged to this file. E0918 13:26:56.120744 3483 CatalogServiceCatalog.java:248] Error loading cache pools: Java exception follows: java.net.ConnectException: Call From quickstart.cloudera/172.17.0.2 to quickstart.cloudera:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791) at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731) at org.apache.hadoop.ipc.Client.call(Client.java:1508) at org.apache.hadoop.ipc.Client.call(Client.java:1441) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230) at com.sun.proxy.$Proxy17.listCachePools(Unknown Source) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.listCachePools(ClientNamenodeProtocolTranslatorPB.java:1276) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:260) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104) at com.sun.proxy.$Proxy18.listCachePools(Unknown Source) at org.apache.hadoop.hdfs.protocol.CachePoolIterator.makeRequest(CachePoolIterator.java:55) at org.apache.hadoop.hdfs.protocol.CachePoolIterator.makeRequest(CachePoolIterator.java:33) at org.apache.hadoop.fs.BatchedRemoteIterator.makeRequest(BatchedRemoteIterator.java:77) at org.apache.hadoop.fs.BatchedRemoteIterator.makeRequestIfNeeded(BatchedRemoteIterator.java:85) at org.apache.hadoop.fs.BatchedRemoteIterator.hasNext(BatchedRemoteIterator.java:99) at org.apache.impala.catalog.CatalogServiceCatalog$CachePoolReader.run(CatalogServiceCatalog.java:243) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: 
java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739) at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494) at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:648) at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:744) at org.apache.hadoop.ipc.Client$Connection.access$3000(Client.java:396) at org.apache.hadoop.ipc.Client.getConnection(Client.java:1557) at org.apache.hadoop.ipc.Client.call(Client.java:1480) ... 24 more You're way more experienced in this than I, so I would appreciate you letting me know what you think about my trouble-shooting approach and does it seem like I'm headed in the right direction? Also, I don't believe I've told you but when I do a "docker run" the first time and check the process list, there's a huge number of processes that come back compared to after I run "docker start"???
... View more
09-17-2018
11:49 AM
This is what I get when I try to restart:
----------------------------------------------------------------------------------------------------
service cloudera-scm-agent restart
By default, the Cloudera QuickStart VM run Cloudera's Distribution including Apache Hadoop (CDH) under Linux's service and configuration management. If you wish to migrate to Cloudera Manager, you must run one of the following commands.
To use Cloudera Express (free), run:
sudo /home/cloudera/cloudera-manager --express
This requires at least 8 GB of RAM and at least 2 virtual CPUs.
To begin a 60-day trial of Cloudera Enterprise with advanced management features, run:
sudo /home/cloudera/cloudera-manager --enterprise
This requires at least 10 GB or RAM and at least 2 virtual CPUs.
Be aware that after rebooting, it may take several minutes before Cloudera Manager has started all of the services it manages and is ready to accept connections from clients.
----------------------------------------------------------------------------------------------------
Now trying:
/home/cloudera/cloudera-manager --express --force
because I tried the "--express" option alone and it said I needed at least 8GB, and I believe the Docker container image allocates 4GB. That may be the whole problem??? The restart did not work, as it closed out most of the processes that run when the container is started and brought up others, and not nearly as many. This is the command I run to get the container going...
docker run -dit --hostname=quickstart.cloudera --name "cloudera" --privileged=true -p 8888:8888 88ed37152d45 /usr/bin/docker-quickstart
... View more
09-17-2018
06:18 AM
According to the output logs: Using CATALINA_BASE: /var/lib/solr/tomcat-deployment Using CATALINA_HOME: /usr/lib/solr/../bigtop-tomcat Using CATALINA_TMPDIR: /var/lib/solr/ Using JRE_HOME: /usr/java/jdk1.7.0_67-cloudera Using CLASSPATH: /usr/lib/solr/../bigtop-tomcat/bin/bootstrap.jar Using CATALINA_PID: /var/run/solr/solr.pid Started Impala Catalog Server (catalogd) : [ OK ] Started Impala Server (impalad): [ OK ] Yes, impalad is running. But, according to "ps aux" it is not: ps aux | grep -i impala impala 2693 0.1 0.0 357552 612 ? Sl Sep14 6:24 /usr/lib/impala/sbin/statestored -log_dir=/var/log/impala -state_store_port=24000 impala 3077 0.3 3.0 1034932 62260 ? Sl Sep14 13:46 /usr/lib/impala/sbin/catalogd -log_dir=/var/log/impala There's no "impalad" process running. I assumed the log was correct and did not check. The " catalogd.quickstart.cloudera.impala.log.ERROR.20180914-185251.3077 " log contains: -------------------------------------------------------------------------- First error: -------------------------------------------------------------------------- Log file created at: 2018/09/14 18:52:51 Running on machine: quickstart.cloudera Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg E0914 18:52:51.962208 3077 logging.cc:126] stderr will be logged to this file. W0914 18:53:34.936743 3448 Client.java:886] Failed to connect to server: quickstart.cloudera/172.17.0.2:8020: try once and fail. Java exception follows: java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739) at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494) at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:648) at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:744) at org.apache.hadoop.ipc.Client$Connection.access$3000(Client.java:396) at org.apache.hadoop.ipc.Client.getConnection(Client.java:1557) at org.apache.hadoop.ipc.Client.call(Client.java:1480) at org.apache.hadoop.ipc.Client.call(Client.java:1441) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230) at com.sun.proxy.$Proxy17.listCachePools(Unknown Source) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.listCachePools(ClientNamenodeProtocolTranslatorPB.java:1276) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:260) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104) at com.sun.proxy.$Proxy18.listCachePools(Unknown Source) at org.apache.hadoop.hdfs.protocol.CachePoolIterator.makeRequest(CachePoolIterator.java:55) at org.apache.hadoop.hdfs.protocol.CachePoolIterator.makeRequest(CachePoolIterator.java:33) at org.apache.hadoop.fs.BatchedRemoteIterator.makeRequest(BatchedRemoteIterator.java:77) at org.apache.hadoop.fs.BatchedRemoteIterator.makeRequestIfNeeded(BatchedRemoteIterator.java:85) at org.apache.hadoop.fs.BatchedRemoteIterator.hasNext(BatchedRemoteIterator.java:99) at 
org.apache.impala.catalog.CatalogServiceCatalog$CachePoolReader.run(CatalogServiceCatalog.java:243) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) -------------------------------------------------------------------------- Last error: -------------------------------------------------------------------------- E0917 13:07:50.805730 3448 CatalogServiceCatalog.java:248] Error loading cache pools: Java exception follows: java.net.ConnectException: Call From quickstart.cloudera/172.17.0.2 to quickstart.cloudera:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused at sun.reflect.GeneratedConstructorAccessor5.newInstance(Unknown Source) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791) at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731) at org.apache.hadoop.ipc.Client.call(Client.java:1508) at org.apache.hadoop.ipc.Client.call(Client.java:1441) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230) at com.sun.proxy.$Proxy17.listCachePools(Unknown Source) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.listCachePools(ClientNamenodeProtocolTranslatorPB.java:1276) at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:260) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104) at com.sun.proxy.$Proxy18.listCachePools(Unknown Source) at org.apache.hadoop.hdfs.protocol.CachePoolIterator.makeRequest(CachePoolIterator.java:55) at org.apache.hadoop.hdfs.protocol.CachePoolIterator.makeRequest(CachePoolIterator.java:33) at org.apache.hadoop.fs.BatchedRemoteIterator.makeRequest(BatchedRemoteIterator.java:77) at org.apache.hadoop.fs.BatchedRemoteIterator.makeRequestIfNeeded(BatchedRemoteIterator.java:85) at org.apache.hadoop.fs.BatchedRemoteIterator.hasNext(BatchedRemoteIterator.java:99) at org.apache.impala.catalog.CatalogServiceCatalog$CachePoolReader.run(CatalogServiceCatalog.java:243) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at 
java.lang.Thread.run(Thread.java:745) Caused by: java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739) at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494) at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:648) at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:744) at org.apache.hadoop.ipc.Client$Connection.access$3000(Client.java:396) at org.apache.hadoop.ipc.Client.getConnection(Client.java:1557) at org.apache.hadoop.ipc.Client.call(Client.java:1480) -------------------------------------------------------------------------- Also just found this in " impalad.quickstart.cloudera.impala.log.ERROR.20180914-185252.3129 ": -------------------------------------------------------------------------- Log file created at: 2018/09/14 18:52:52 Running on machine: quickstart.cloudera Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg E0914 18:52:52.005654 3129 logging.cc:126] stderr will be logged to this file. E0914 18:53:35.009699 3129 impala-server.cc:282] Could not read the root directory at hdfs://quickstart.cloudera:8020. Error was: Call From quickstart.cloudera/172.17.0.2 to quickstart.cloudera:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused E0914 18:53:35.010474 3129 impala-server.cc:285] Aborting Impala Server startup due to improper configuration. Impalad exiting. -------------------------------------------------------------------------- NOTE: This is the docker image without any modifications at all to the image. I would assume it was tested and worked when it was created??? Any ideas? Thanks!
... View more
09-14-2018
12:03 PM
Hi,
I have just successfully run a Docker container image of Cloudera. I can connect to the instance using Safari on the Mac. However, I get an error message that says it can't connect:
"Could not connect to quickstart.cloudera:21050 (code THRIFTTRANSPORT): TTransportException('Could not connect to quickstart.cloudera:21050',)"
Also, when I do a check configuration it comes up with a bunch of errors:
hadoop.hdfs_clusters.default.webhdfs_url
Current value: http://localhost:50070/webhdfs/v1 Failed to access filesystem root
OOZIE_EMAIL_SERVER
Email notifications is disabled for Workflows and Jobs as SMTP server is localhost.
Hive
Failed to access Hive warehouse: /user/hive/warehouse
HBase Browser
Failed to authenticate to HBase Thrift Server, check authentication configurations.
Impala
No available Impalad to send queries to.
Hadoop Security: Sentry Service
Failed to connect to Sentry API (version 1).
Hadoop Security: Sentry Service
Failed to connect to Sentry API (version 2).
Does anyone have a direction they can point me in on how to resolve all of these issues? I thought the Docker image would work out of the box.
Thanks!
... View more
Labels:
- Cloudera Hue
- Quickstart VM
09-10-2018
11:30 AM
BTW, I should have made this request in my previous post. Please DO NOT remove the /useradmin/users/ URL option until you have come up with an API or an easier way to scrape the web pages. The message indicates that this is the old way of getting the list of users and that the new way uses Javascript to load it; if you remove the old way and we upgrade, I won't be able to load the user list from that URL anymore. So, can someone there make a note to not remove that old URL, please!!!
... View more