Member since: 08-16-2016
Posts: 642
Kudos Received: 131
Solutions: 68
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 3554 | 10-13-2017 09:42 PM |
| | 6530 | 09-14-2017 11:15 AM |
| | 3305 | 09-13-2017 10:35 PM |
| | 5273 | 09-13-2017 10:25 PM |
| | 5950 | 09-13-2017 10:05 PM |
07-25-2017
11:34 AM
Have you looked at the idle query timeout setting in Impala itself? There is a session-level equivalent, QUERY_TIMEOUT_S, that you can try from within your JDBC connection. https://www.cloudera.com/documentation/enterprise/5-7-x/topics/impala_timeouts.html
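As a rough sketch of the session-level approach, the query option can be sent as a plain SET statement on the open connection before running queries. The example below is in Python rather than JDBC and assumes a DB-API-style cursor; the helper name `set_query_timeout` is hypothetical, and whether your driver passes SET through unchanged is something to verify.

```python
def set_query_timeout(cursor, seconds):
    """Send SET QUERY_TIMEOUT_S on an already-open session.

    `cursor` is assumed to be any object with an execute(sql) method,
    for example a DB-API cursor from an Impala client library.
    """
    if not isinstance(seconds, int) or seconds < 0:
        raise ValueError("seconds must be a non-negative integer")
    statement = f"SET QUERY_TIMEOUT_S={seconds}"
    cursor.execute(statement)
    return statement
```

From JDBC the same statement text would go through a regular `Statement.execute` call on the connection that later runs the queries.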
07-25-2017
09:57 AM
Do you have the HDFS Gateway installed on the same host that spark2-shell is running on?
07-21-2017
11:26 PM
Try curl -H "Content-Type: application/json" --upload-file deployment.json -u admin:admin 'http://scmhost:7180/api/v17/cm/deployment?deleteCurrentDeployment=true'
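If you would rather script the same upload, here is a minimal Python sketch that builds the equivalent request. curl's `--upload-file` issues a PUT, so the sketch does too. The function name `build_deployment_request` is hypothetical, and the host, port, API version and credentials are just the placeholder values from the curl command.

```python
import base64
import urllib.request


def build_deployment_request(payload, host="scmhost", port=7180,
                             api_version="v17", user="admin", password="admin"):
    """Build (but do not send) a PUT of a deployment JSON document to CM."""
    url = (f"http://{host}:{port}/api/{api_version}"
           "/cm/deployment?deleteCurrentDeployment=true")
    # HTTP basic auth, the same thing curl's -u flag sends.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=payload,
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )
```

To send it, read the JSON file as bytes, pass it as `payload`, and hand the result to `urllib.request.urlopen`. Keep in mind that `deleteCurrentDeployment=true` replaces the existing deployment.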
07-21-2017
11:17 PM
@Fawze Your other questions have been answered, but I wanted to add this bit regarding "spark streaming." Spark2 comes with Structured Streaming, which is the new version of Spark Streaming. Currently Cloudera doesn't support it because it views it as an experimental API. I haven't looked myself, but if it is, then you run the risk of building apps on it that could break with each upgrade of Spark2. Just a word of caution. I am still in the testing phase, but so far I have had no issues running Spark1 and Spark2 on the same cluster. I have the Spark History servers on different hosts, but that is more to spread the load. They run on different ports and the configurations work out of the box. As mentioned, they are separate services with separate configs. I currently have the gateway on the same host.
07-21-2017
02:52 PM
That warning indicates that something is talking to CM without using SSL. Did you change all of the agent config files to use_tls=1?

As for the truststore questions: first, there is a keystore and a truststore. The keystore stores the key and certificate for a service. This is sensitive, as it is how a service identifies itself to another. The truststore just holds the signing certificates and is used by clients to trust any certs signed by the certs in it.

The path /usr/lib/jvm/java-7-oracle-cloudera/jre/lib/security/cacerts looks similar to the location where you would store a system-wide truststore. I think that location is right and the name would be jssecacerts or something similar. This means that all Java-based programs will use it by default without needing to tell the app or client its location. Now, you don't have to use it; you can create and use your own. And you can have as many as you want, although each app, service, or client can usually only be configured to use one at a time. Plus, since it is only storing the CA certs, why not just have them all in one store to cut down the work? Note: with self-signed certs, the cert itself becomes the certificate signing or CA cert and must be put in the truststore.
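To check the agents quickly, a small script along these lines could read each agent config and report whether use_tls is set. This is only a sketch: the function name `agent_uses_tls` is hypothetical, and it assumes the setting lives in a `[Security]` section of an INI-style file, so adjust it if your config is laid out differently.

```python
import configparser


def agent_uses_tls(config_text, section="Security"):
    """Return True if use_tls=1 appears in the given section.

    `config_text` is the content of an agent config file. The section name
    is an assumption; a missing section or option counts as TLS disabled.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    return parser.get(section, "use_tls", fallback="0").strip() == "1"
```

Run it against the config file from every agent host; any host that returns False is a candidate for the source of the warning.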
07-21-2017
02:39 PM
@Fawze I don't collect specific metrics yet. I make an API call to get all Hive jobs between this time and that time (same for Impala) from... This data is then crunched to provide usage analysis for these specific types of jobs.
/clusters/{clusterName}/services/{serviceName}/yarnApplications
/clusters/{clusterName}/services/{serviceName}/impalaQueries
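For illustration, here is a minimal Python sketch that builds the URL for one of those calls over a time window. The function name `activity_url`, the base URL, and the cluster and service names are hypothetical, and the `from` and `to` query parameter names are my assumption about how the time range is passed, so check them against the API docs for your CM version.

```python
from datetime import datetime, timezone
from urllib.parse import quote, urlencode

RESOURCES = ("yarnApplications", "impalaQueries")


def activity_url(base, cluster, service, resource, start, end):
    """Build the URL for listing jobs or queries between two datetimes."""
    if resource not in RESOURCES:
        raise ValueError(f"resource must be one of {RESOURCES}")
    # Cluster and service names may contain spaces, so percent-encode them.
    path = (f"/clusters/{quote(cluster, safe='')}"
            f"/services/{quote(service, safe='')}/{resource}")
    # Assumed parameter names: 'from' and 'to', as ISO 8601 timestamps.
    query = urlencode({"from": start.isoformat(), "to": end.isoformat()})
    return f"{base.rstrip('/')}{path}?{query}"
```

Calling it with `"http://scmhost:7180/api/v17"`, a cluster name, the YARN service name and `"yarnApplications"` yields a URL you can fetch with any HTTP client using your CM credentials.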
07-21-2017
02:35 PM
I would say Cloudera support, if you have that for your cluster. They can then vet it against existing bugs and patches backported to your version. They can also tell you if a bug exists, when a fix will be available, and in which version. And failing all of that, they can open a new JIRA. You can also open a JIRA account and create a ticket yourself, providing the CDH version, and ask the community how to proceed. They should have some guidelines as well, although I do not know them or have them handy.
07-21-2017
02:32 PM
1 Kudo
Based on some SO posts, the exception is most likely related to some invalid JSON somewhere. This is the Spark History server though, and I cannot think of any JSON files it would be using on a regular basis. On mine I see a redaction-rules.json. Are you using redaction? Oh wow, I think it was staring us in the face. It is trying to read a specific application log which has invalid JSON characters. Read that file and put its output into a JSON validator to see what is invalid. I would save it somewhere so it can be reviewed again if needed. Then remove it and try to run the job again. If it fails again, then something is causing it to create the invalid JSON in the application log.
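If you don't want to paste the log into an online validator, a few lines of Python can point at the bad lines. This sketch assumes the application log holds one JSON object per line, which is how Spark event logs are normally written; the function name `find_invalid_json_lines` is hypothetical.

```python
import json


def find_invalid_json_lines(lines):
    """Return (line_number, error_message) for every line that is not valid JSON.

    Blank lines are skipped. Line numbers start at 1.
    """
    problems = []
    for number, line in enumerate(lines, start=1):
        text = line.strip()
        if not text:
            continue
        try:
            json.loads(text)
        except json.JSONDecodeError as err:
            problems.append((number, err.msg))
    return problems
```

Open the saved copy of the log and pass the file object straight in, since iterating a file yields its lines. An empty result means every line parsed cleanly.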
07-21-2017
02:17 PM
1 Kudo
@MilesYao That may be. On that topic, I don't think that will be anytime soon, as Cloudera does not support as many features for Spark 2 as it does for Spark 1.6. I suspect that sometime post CDH 6 we will see Spark 2.x supplant Spark 1.x as the only version of Spark in CDH. Ah, I checked out HDP and see what you are getting at. The difference is really trivial. Cloudera asks you to put a file on the CM host and configure a separate parcel, while HDP includes both Spark1 and Spark2 packages in the same repo.
07-20-2017
08:27 AM
I would say to add the internal IPs to the hosts file for the Datanodes, as it seems that they are communicating over them, and the external IP for the Namenode. You could possibly even try the internal IP for the Namenode if the internal IPs are reachable by the other cluster.