Member since: 08-17-2018
Posts: 39
Kudos Received: 3
Solutions: 1
My Accepted Solutions
Title | Views | Posted |
---|---|---|
5977 | 08-28-2018 07:24 AM |
08-16-2019
06:44 AM
Hi Chelsea,

I see memory being held in what I believe is the CM Impala query interface, along with a query that is supposedly still running. I assume that if a query still shows as executing, even though it is only waiting to send the next page of results, its memory is definitely being held.

First, the query_timeout_s parameter has not been set, so whether it's 10 minutes or 5 minutes, as our current configuration says, it's still not working:

# Hue will try to close the Impala query when the user leaves the editor page.
# This will free all the query resources in Impala, but also make its results inaccessible.
## close_queries=true
# If > 0, the query will be timed out (i.e. cancelled) if Impala does not do any work
# (compute or send back results) for that query within QUERY_TIMEOUT_S seconds.
## query_timeout_s=600
# If > 0, the session will be timed out (i.e. cancelled) if Impala does not do any work
# (compute or send back results) for that session within QUERY_TIMEOUT_S seconds (default 1 hour).
## session_timeout_s=3600

This parameter is having no effect, correct? Whatever settings are being used are not working: I can see many related settings, but none of them is set to allow a query to stay active for 15 hours.

Also, what about the # close_queries=true option? Will that do what we need?

I also have a question about this parameter. Will it kill the query and release the results if we set it to an hour, as we need it to be?

# Users will automatically be logged out after 'n' seconds of inactivity.
# A negative number means that idle sessions will not be timed out.
idle_session_timeout=-1

If not, are there any parameters we haven't talked about that could stop this long retention of resources?

BTW, I could not find this command. Where should I look?

build/env/bin/hue close_queries --help

I could theoretically write a script that checks query duration and kills any queries that have been running too long. As I have experienced, Hue has no API.

I wrote a Python app that took a list of users and removed them automatically from CM and Hue. I had to use the Requests module in Python to load the existing users and compare them against the users to be deleted, then create a POST request that deleted the users who have left the company. Quite painful but fun... 🙂

Just so you guys know, I'm here on this forum because all the documentation I read says the 2 parameters we have set should close the queries and return the resources after the timeout. Neither of the two works. So either the documentation is not accurate or something else is wrong.

Thanks!
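A minimal sketch of the duration check mentioned above, assuming the list of queries has already been fetched from somewhere (for example the Cloudera Manager API) as a list of dicts. The field names `queryId`, `startTime` and `endTime`, the timestamp format, and the function name `find_long_running` are assumptions made for illustration, not a confirmed schema. Fetching the list and cancelling the queries are deliberately left out.

```python
from datetime import datetime, timedelta

# Assumed timestamp layout, e.g. "2019-08-15T15:44:00.000Z" (an assumption).
TIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def find_long_running(queries, now, max_age=timedelta(hours=1)):
    """Return the IDs of unfinished queries that started more than max_age before now."""
    stale = []
    for query in queries:
        # A query that already has an end time is finished; skip it.
        if query.get("endTime"):
            continue
        started = datetime.strptime(query["startTime"], TIME_FORMAT)
        if now - started > max_age:
            stale.append(query["queryId"])
    return stale
```

The caller would pass the current UTC time as `now` and decide separately what to do with the returned IDs, for example log them first and only cancel after a dry run looks right.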
08-15-2019
12:13 PM
I'm going to take a poke at this and hope I'm not wasting your time... This looks like a good start: https://kudu.apache.org/docs/developing.html I have done minimal Python development using PySpark and Kudu. It's not too bad...
08-15-2019
12:04 PM
Hi,
This is a continuation of a previous post from 2017 titled "Impala Queries Executing long time", in which a Cloudera employee explained why a query would appear to be running when it actually is not. In summary, he said the paging function in Hue leaves the query open, so it appears to be running.
He also said to fix this you need to set some parameters that will force a timeout to occur:
"You can also set idle query timeout and idle session timeout in impala advance snippet to force timeout for queries running from hue."
Unfortunately, this does not work for us, because the query is not in the exact state required for the timeout to occur.
In our case, we have the following settings in Impala:
-idle_session_timeout=3600 -idle_query_timeout=3600
This is set in a field with the label:
"Impala Daemon Command Line Argument Advanced Configuration Snippet (Safety Valve)"
I just killed a query that appeared to be running for 15 hours. The interesting thing about that query is, it had a LIMIT 50 clause.
I can't imagine 50 records taking 15 hours in most scenarios...
We regularly have queries exceed our verbal agreement of 1 hour before we kill the job.
The question is rather obvious: since I know these settings do not work in this scenario (and maybe in similar ones, I'm not sure), what will stop Hue from holding the resources for so long?
Thanks!
Labels: Apache Impala, Cloudera Hue
07-22-2019
09:05 AM
UPDATE: As of Spark 2.4, the context.py code has been changed to require an authentication token. However, I'm not sure how to set this token: I have looked in Cloudera Manager, on the web and in the files and cannot find it anywhere. Will someone from Cloudera please help us set up this requirement, as the code clearly requires it? Thanks!
07-19-2019
08:51 AM
Hi,

I have been researching for a few days why we cannot execute any Python code in the PySpark interface inside Hue.

PySpark command:
from pyspark import SparkContext

Error Message:
stdout:
stderr:
WARNING: User-defined SPARK_HOME (/opt/cloudera/parcels/SPARK2-2.4.0.cloudera2-1.cdh5.13.3.p0.1041012/lib/spark2) overrides detected (/opt/cloudera/parcels/SPARK2/lib/spark2/).
WARNING: Running spark-class from user-defined location.
19/07/19 07:59:45 WARN spark.SparkConf: The configuration key 'spark.yarn.executor.memoryOverhead' has been deprecated as of Spark 2.3 and may be removed in the future. Please use the new key 'spark.executor.memoryOverhead' instead.
19/07/19 07:59:46 WARN spark.SparkConf: The configuration key 'spark.yarn.executor.memoryOverhead' has been deprecated as of Spark 2.3 and may be removed in the future. Please use the new key 'spark.executor.memoryOverhead' instead.
19/07/19 07:59:46 WARN rsc.RSCConf: Your hostname, usbda04.unix.rgbk.com, resolves to a loopback address, but we couldn't find any external IP address!
19/07/19 07:59:46 WARN rsc.RSCConf: Set livy.rsc.rpc.server.address if you need to bind to another address.
19/07/19 07:59:49 WARN util.Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
19/07/19 07:59:49 WARN util.Utils: Service 'SparkUI' could not bind on port 4041. Attempting port 4042.
YARN Diagnostics:
sys.exit(main())
File "/tmp/2588781570290623481", line 589, in main
sc = SparkContext(jsc=jsc, gateway=gateway, conf=conf)
File "/opt/cloudera/parcels/SPARK2-2.4.0.cloudera2-1.cdh5.13.3.p0.1041012/lib/spark2/python/lib/pyspark.zip/pyspark/context.py", line 121, in __init__
ValueError: You are trying to pass an insecure Py4j gateway to Spark. This is not allowed as it is a security risk.
YARN Diagnostics:

We recently updated Spark from 2.3 to 2.4. However, I am not sure if it was working with 2.3. We also recently activated Kerberos.

I am not sure what this message is saying, but my guess is that a configuration is set up to send requests to a specific server (the gateway) that is not SSL encrypted, and that there is a rule in place to avoid sending requests to non-SSL services. If this is the case, it's not important that the traffic be encrypted, as this is a development server. Any theories would be most helpful, as I can investigate. The problem is that, according to the research I have done, no one else has had this problem.

One other thing to note, which may have no bearing on it at all: I cannot execute pyspark from the command line, as I get what appears to be a very old (2016) bug:

$ pyspark
Python 3.6.4 |Anaconda, Inc.| (default, Jan 16 2018, 18:10:19)
[GCC 7.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
Traceback (most recent call last):
File "/opt/cloudera/parcels/CDH-5.14.2-1.cdh5.14.2.p0.3/lib/spark/python/pyspark/shell.py", line 30, in <module>
import pyspark
File "/opt/cloudera/parcels/CDH-5.14.2-1.cdh5.14.2.p0.3/lib/spark/python/pyspark/__init__.py", line 41, in <module>
from pyspark.context import SparkContext
File "/opt/cloudera/parcels/CDH-5.14.2-1.cdh5.14.2.p0.3/lib/spark/python/pyspark/context.py", line 33, in <module>
from pyspark.java_gateway import launch_gateway
File "/opt/cloudera/parcels/CDH-5.14.2-1.cdh5.14.2.p0.3/lib/spark/python/pyspark/java_gateway.py", line 31, in <module>
from py4j.java_gateway import java_import, JavaGateway, GatewayClient
File "<frozen importlib._bootstrap>", line 971, in _find_and_load
File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 656, in _load_unlocked
File "<frozen importlib._bootstrap>", line 626, in _load_backward_compatible
File "/opt/cloudera/parcels/CDH-5.14.2-1.cdh5.14.2.p0.3/lib/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 18, in <module>
File "/opt/cloudera/parcels/Anaconda-5.1.0.1/lib/python3.6/pydoc.py", line 59, in <module>
import inspect
File "/opt/cloudera/parcels/Anaconda-5.1.0.1/lib/python3.6/inspect.py", line 361, in <module>
Attribute = namedtuple('Attribute', 'name kind defining_class object')
File "/opt/cloudera/parcels/CDH-5.14.2-1.cdh5.14.2.p0.3/lib/spark/python/pyspark/serializers.py", line 381, in namedtuple
cls = _old_namedtuple(*args, **kwargs)
TypeError: namedtuple() missing 3 required keyword-only arguments: 'verbose', 'rename', and 'module'

Also, this article comes the closest to the possible issue we're having:
https://community.cloudera.com/t5/Web-UI-Hue-Beeswax/Issue-with-PySpark-and-Hue/m-p/52792#M2162

However, I don't know how to check whether Kerberos is set up, and set up properly, for this purpose. Any guidance on this is appreciated as well.

Any ideas/help would be much appreciated! Thanks!
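One detail in the command-line traceback above may be worth a look: its file paths point into the CDH parcel (lib/spark) and the Anaconda parcel, not into the SPARK2 parcel that appears in the Hue error. A minimal sketch for listing which parcels a traceback touches, assuming parcels live under /opt/cloudera/parcels as in the paths shown. The function name `parcels_in_traceback` is hypothetical.

```python
import re

# Matches the directory name directly under /opt/cloudera/parcels/.
PARCEL_RE = re.compile(r"/opt/cloudera/parcels/([^/\s\"]+)")


def parcels_in_traceback(text):
    """Return the distinct parcel names referenced in a traceback, in first-seen order."""
    seen = []
    for name in PARCEL_RE.findall(text):
        if name not in seen:
            seen.append(name)
    return seen
```

If the result does not include the Spark installation you expected, the shell command may be resolving to a different Spark than the one Hue uses, which would make the two errors separate problems.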
Labels: Apache Spark
09-18-2018
11:00 AM
Yeah, they sold Powerschool off to a competitor who took it way past what we had. One of the Java engineers talked the new company into letting him rewrite it in Java. It looked pretty cool the last time I looked at it. I remember that group back in the day. I still have my email address and am using 50GB of iCloud storage for the day my Mac goes down. I bought a 2016 laptop in 2017 and absolutely love it. All except the memory, which they fixed in the newest releases of MacBook Pro. Dang company! 🙂

Dude, I got some stories of sticking with something. I have about 33 years of Unix experience starting back in the day with Sun Microsystems servers. I own a Sun Sparc Classic and a Sun Enterprise 250 workgroup server. Both of them will still boot.

I will let you know if anything pops up, as I am going to do some Django development as a phase II project to manage users automatically. I figured out how to web scrape the Hue page using the older user list URL (/useradmin/users/), which doesn't use JavaScript to load the user list the way the current URL (/hue/useradmin/users/) does. Although Hue warns that the older URL is old, it still works. I have not yet figured out how to load the user list with the new (current) URL of "/hue/useradmin/users/" but will get back to it later. It would be nice to have an API for Hue as Cloudera Manager has.

My project was to auto-delete Cloudera and Hue users that have left the company. I have all the prototype functionality built and am about to bring it all together into a package.

Thanks for all your help!
09-18-2018
10:40 AM
2005 was close to the time I left. You may have heard of a web-based school administration product called Powerschool that Apple bought and later sold. I was one of the main software engineers on the Powerschool product.

U da man! That was it all along... Really weird things were happening though: logs saying impalad was running OK when it was nowhere to be found. Oh well. Thanks for the hard work. Tell your manager I said to give you a raise. 🙂
09-18-2018
07:29 AM
Thanks for your response. Glad I could nudge you out of the dark and into the light. 🙂 And glad you're using a Mac! :-) I worked for Apple for a couple of years.

I believe this is important: the Docker image I'm running is:

cloudera-quickstart-vm-5.13.0-0-beta-docker

I didn't bother mapping port 80 since I have Apache running on my Mac and it's using port 80. I may stop it and try again to see if there's something to it. Also, I mapped other ports but don't believe I mapped 7180.

Also, how did you run docker stop and then docker run without removing the container? I know that if I don't get rid of the container, docker run will error out and say it already exists when it tries to create the container.

-------------------------------------------------------------

So, I stopped Apache, used the port mappings you used, and I seem to get a little more running. I can now see the "impalad" process but still get the error with port 21050 when I log in, and I cannot get to Hive through the interface. Interestingly, I see a listener port of "25010", which leads me to believe there's a typo in a configuration file???

I cannot execute the hdfs command you used successfully, so I believe I've located the 2 hdfs processes by filtering the process list for "hdfs". One process, which I believe is the primary hdfs node, has a process listing of:

hdfs 276 4.3 3.7 1546804 77192 ? Sl 13:42 0:50 /usr/java/jdk1.7.0_67-cloudera/bin/java -Dproc_datanode -Xmx1000m -Dhadoop.log.dir=/var/log/hadoop-hdfs -Dhadoop.log.file=hadoop-hdfs-datanode-quickstart.cloudera.log -Dhadoop.home.dir=/usr/lib/hadoop -Dhadoop.id.str=hdfs -Dhadoop.root.logger=INFO,RFA -Djava.library.path=/usr/lib/hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -server -Dhadoop.security.logger=INFO,RFAS org.apache.hadoop.hdfs.server.datanode.DataNode

What I believe is the secondary node has a listing of:

hdfs 610 2.5 3.6 1516348 74976 ? Sl 13:43 0:28 /usr/java/jdk1.7.0_67-cloudera/bin/java -Dproc_secondarynamenode -Xmx1000m -Dhadoop.log.dir=/var/log/hadoop-hdfs -Dhadoop.log.file=hadoop-hdfs-secondarynamenode-quickstart.cloudera.log -Dhadoop.home.dir=/usr/lib/hadoop -Dhadoop.id.str=hdfs -Dhadoop.root.logger=INFO,RFA -Djava.library.path=/usr/lib/hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Dhadoop.security.logger=INFO,RFAS org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode

Mapping the process IDs to the ports from netstat -tulpne, I get:

tcp 0 0 0.0.0.0:50020 0.0.0.0:* LISTEN 497 3274353 276/java
tcp 0 0 0.0.0.0:50090 0.0.0.0:* LISTEN 497 3274497 610/java

These ports are very different from what seems to be needed, which may be why I cannot connect using the Hue interface??? I am currently searching the whole container to see if I can find where they are configured...

Also, when I try to access the hdfs path, I get this error:

18/09/18 13:57:59 WARN ipc.Client: Failed to connect to server: quickstart.cloudera/172.17.0.2:8020: try once and fail.
java.net.ConnectException: Connection refused

so I assume that somewhere there is a file that says to connect to port 8020 for the "hdfs dfs" command. Nothing is listening on port 8020. This is why I'm looking for the configuration file that maps the 2 hdfs processes to 50020 and 50090. Am I thinking correctly?

BTW, I'm using the following command from root (/) to find the configuration for the above port:

find . -type f -exec grep -il '50090' {} \;

I would assume there is a configuration file with that port set??? I have seen applications use defaults set inside the code if they don't find a configuration parameter in a config file.

I changed my host /etc/hosts file to resolve quickstart.cloudera to localhost. I can now use quickstart.cloudera:8888 to load the Hue GUI.
I get this error in the log:

catalogd.quickstart.cloudera.impala.log.ERROR.20180918-132552.3090

Running on machine: quickstart.cloudera
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
E0918 13:25:52.987486 3090 logging.cc:126] stderr will be logged to this file.
E0918 13:26:56.120744 3483 CatalogServiceCatalog.java:248] Error loading cache pools:
Java exception follows:
java.net.ConnectException: Call From quickstart.cloudera/172.17.0.2 to quickstart.cloudera:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731)
at org.apache.hadoop.ipc.Client.call(Client.java:1508)
at org.apache.hadoop.ipc.Client.call(Client.java:1441)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
at com.sun.proxy.$Proxy17.listCachePools(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.listCachePools(ClientNamenodeProtocolTranslatorPB.java:1276)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:260)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
at com.sun.proxy.$Proxy18.listCachePools(Unknown Source)
at org.apache.hadoop.hdfs.protocol.CachePoolIterator.makeRequest(CachePoolIterator.java:55)
at org.apache.hadoop.hdfs.protocol.CachePoolIterator.makeRequest(CachePoolIterator.java:33)
at org.apache.hadoop.fs.BatchedRemoteIterator.makeRequest(BatchedRemoteIterator.java:77)
at org.apache.hadoop.fs.BatchedRemoteIterator.makeRequestIfNeeded(BatchedRemoteIterator.java:85)
at org.apache.hadoop.fs.BatchedRemoteIterator.hasNext(BatchedRemoteIterator.java:99)
at org.apache.impala.catalog.CatalogServiceCatalog$CachePoolReader.run(CatalogServiceCatalog.java:243)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:648)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:744)
at org.apache.hadoop.ipc.Client$Connection.access$3000(Client.java:396)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1557)
at org.apache.hadoop.ipc.Client.call(Client.java:1480)
... 24 more

You're way more experienced in this than I am, so I would appreciate you letting me know what you think about my troubleshooting approach. Does it seem like I'm headed in the right direction?

Also, I don't believe I've told you, but when I do a "docker run" the first time and check the process list, a huge number of processes come back compared to after I run "docker start"???
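The PID-to-port mapping done by hand from the netstat output in this post can be sketched in a few lines. This is a minimal sketch, assuming the nine-column `netstat -tulpne` layout shown in the two sample lines (proto, recv-q, send-q, local address, foreign address, state, user, inode, pid/program). The function name `pid_ports` is hypothetical.

```python
def pid_ports(netstat_output):
    """Map each PID to the sorted list of TCP ports it is listening on."""
    result = {}
    for line in netstat_output.splitlines():
        fields = line.split()
        # Keep only listening TCP sockets that have all nine expected columns.
        if len(fields) < 9 or not fields[0].startswith("tcp") or fields[5] != "LISTEN":
            continue
        port = int(fields[3].rsplit(":", 1)[1])
        pid = fields[8].split("/", 1)[0]
        # netstat prints "-" when the owning process is not visible to the caller.
        if not pid.isdigit():
            continue
        result.setdefault(int(pid), []).append(port)
    return {pid: sorted(ports) for pid, ports in result.items()}
```

With the two lines from this post, the result shows nothing listening on 8020. In a default CDH 5 setup, 8020 is the NameNode RPC port, and the two process listings are a DataNode and a SecondaryNameNode, so it may be worth checking whether a NameNode process is running at all before hunting for a port setting.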
09-17-2018
11:49 AM
This is what I get when I try to restart:

----------------------------------------------------------------------------------------------------
service cloudera-scm-agent restart
By default, the Cloudera QuickStart VM run Cloudera's Distribution including Apache Hadoop (CDH) under Linux's service and configuration management. If you wish to migrate to Cloudera Manager, you must run one of the following commands.

To use Cloudera Express (free), run:
sudo /home/cloudera/cloudera-manager --express
This requires at least 8 GB of RAM and at least 2 virtual CPUs.

To begin a 60-day trial of Cloudera Enterprise with advanced management features, run:
sudo /home/cloudera/cloudera-manager --enterprise
This requires at least 10 GB or RAM and at least 2 virtual CPUs.

Be aware that after rebooting, it may take several minutes before Cloudera Manager has started all of the services it manages and is ready to accept connections from clients.
----------------------------------------------------------------------------------------------------

Now trying:

/home/cloudera/cloudera-manager --express --force

because I tried the "--express" option alone and it said I needed at least 8GB. I believe the Docker container image allocates 4GB. That may be the whole problem???

The restart did not work: it closed out most of the processes that run when the container is first run and started up others, and not nearly as many.

This is the command I run to get the container going:

docker run -dit --hostname=quickstart.cloudera --name "cloudera" --privileged=true -p 8888:8888 88ed37152d45 /usr/bin/docker-quickstart