Member since: 08-16-2016
Posts: 642
Kudos Received: 131
Solutions: 68
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 3432 | 10-13-2017 09:42 PM
 | 6186 | 09-14-2017 11:15 AM
 | 3177 | 09-13-2017 10:35 PM
 | 5100 | 09-13-2017 10:25 PM
 | 5733 | 09-13-2017 10:05 PM
07-31-2017
04:53 PM
impala-shell -i haproxy1:21000 -k --ssl

Are you using the FQDN in the impala-shell command? i.e.

impala-shell -i haproxy.company.local -k --ssl

Is there an SSL certificate for the HAProxy, and is it configured to use it? Is the CA cert for it in the PEM file that Impala is configured to use?
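To confirm what certificate the proxy actually presents, something like this should work (host and port are placeholders for your HAProxy endpoint):

# Hypothetical host/port; prints the cert chain the proxy serves so you can
# check it against the CA in Impala's PEM file.
openssl s_client -connect haproxy.company.local:21000 -showcerts </dev/null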
07-31-2017
04:03 PM
That shouldn't matter. I am using an ELB that is completely separate from the CDH cluster. Did you specify the FQDN in that setting and does the principal contain the FQDN?
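One quick sanity check, assuming the MIT Kerberos client tools and a placeholder principal: kvno requests a service ticket and fails if the SPN is not in the KDC.

# kinit as any user first, then request a ticket for the LB principal.
# Hypothetical principal; substitute your LB's FQDN and realm.
kvno impala/elb.company.local@EXAMPLE.COM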
07-31-2017
03:08 PM
After making this change, did you run Generate Missing Credentials in the CM Security window, or manually create the account and SPN? I haven't done Impala, but for HS2, after adding the LB info in the Hive configs it threw a configuration warning that credentials were missing. I generated them, the warning disappeared, and the LB worked.
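For the manual route, a sketch with MIT kadmin (the admin principal, SPN, and realm below are placeholders; Generate Missing Credentials does the equivalent for you):

# Create a random-key SPN for the load balancer host.
kadmin -p admin/admin@EXAMPLE.COM -q "addprinc -randkey hive/lb.example.com@EXAMPLE.COM"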
07-31-2017
11:33 AM
1 Kudo
I can't seem to find anything, but I thought you could change the prefix. I feel sure you can for MR jobs, but I'm not sure about Hive. If it is an MR property, you could set it in your Hive session. The other thing worth noting here is that *_copy_1 comes from the Hive code for dynamic partitions. It checks beforehand whether 0000_0 already exists, possibly from another reducer or another Hive process, and appends _copy_# to protect the data. This happens regardless of the prefix, so in theory, even if you went down to the millisecond, you could still end up with colliding file names. Changing the prefix should help your case though, so try to find a setting for the output file prefix.
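As a sketch of the session-level idea: mapreduce.output.basename is the MR property behind the default "part" prefix, but I haven't verified that Hive honors it for its final output files, and the table names below are made up.

# Assumption: mapreduce.output.basename drives the output file prefix; Hive
# may still name files itself (e.g. 000000_0), so test on a scratch table first.
hive -e "
SET mapreduce.output.basename=out_$(date +%s);
INSERT OVERWRITE TABLE target_table PARTITION (ds)
SELECT id, name, ds FROM source_table;
"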
07-31-2017
10:57 AM
Look at the --netrc switch for your curl command. You can use a .netrc file to pass the username and password to the command, which keeps them out of the ps output and shell history, but the file's permissions must be maintained. The format should be as below, but check the man page if needed: machine host.domain.com login myself password secret
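For example (hypothetical host, credentials, and URL; the file must be readable only by you):

# ~/.netrc -- create it and lock it down
# chmod 600 ~/.netrc
machine host.domain.com login myself password secret

# Then curl reads the credentials from the file instead of the command line.
curl --netrc https://host.domain.com/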
07-28-2017
07:12 PM
The reason the first query works is that it does not need any MR or Spark jobs to run; HS2 or the Hive client just reads the data directly. The second query requires MR or Spark jobs to be run. This is key to remember when testing or troubleshooting the cluster. Are you able to run Spark jobs outside of Hive? Try the command below, but swap in your jar version.

spark-submit --class org.apache.spark.examples.SparkPi --master yarn --num-executors 3 --driver-memory 512m --executor-memory 512m --executor-cores 1 /opt/cloudera/parcels/SPARK/lib/spark/examples/jars/spark-examples_*.jar

Also access the Spark History Server to get to the driver and executor logs for more details on the failure.
07-28-2017
07:09 PM
Yes. It may cause issues depending on how you are using it when ingesting data into it. What are you trying to do? I wonder if a Hive view would be better than a separate table.
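As a hypothetical illustration of the view approach (connection string, table, and filter are all made up):

# A view exposes a filtered slice without copying or re-ingesting the data.
beeline -u jdbc:hive2://hs2host:10000 -e "
CREATE VIEW recent_sales AS
SELECT * FROM sales WHERE ds >= '2017-07-01';
"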
07-28-2017
07:07 PM
2 Kudos
I haven't set HAProxy up for Impala, but I think you need a service principal for impala/<HAProxyHost>@REALM.COM in your KDC. The error means the server was not found in the Kerberos database.
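You can check whether that SPN exists with something like this (admin principal and realm are placeholders):

# Lists any impala/* principals registered in the KDC.
kadmin -p admin/admin@EXAMPLE.COM -q "listprincs impala/*"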
07-28-2017
12:21 PM
In the Spark2 configs, ensure that the Hive service is enabled. This includes the Hive client configs with the Spark2 service and allows the SparkSession created by spark2-shell to have Hive support for the HMS on the cluster. I haven't tested actual Spark2 applications, but with the above setup it should be as simple as using .enableHiveSupport in the SparkSession builder. Outside of that, you would probably need to include the hive-site.xml or Hive HMS settings in the Spark context configuration object and then use .enableHiveSupport.
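A minimal smoke test, assuming the Spark2 gateway is deployed on the node: if the Hive integration is wired up, the shell's built-in session should see your Hive databases.

# Pipe one statement into spark2-shell; 'spark' is the session it creates.
spark2-shell <<'EOF'
spark.sql("SHOW DATABASES").show()
EOF
# In a standalone app you would build the session yourself with
# SparkSession.builder().enableHiveSupport().getOrCreate().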
07-28-2017
08:03 AM
2 Kudos
I haven't done this yet, but it should do the trick. You need to update the alternatives to make Spark2 the default. Note that this makes it the default across the board, not just for Livy, so make sure you are ready for that. https://www.cloudera.com/documentation/spark2/latest/topics/spark2_admin.html
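A sketch of what that looks like on RHEL-style systems; the link name, parcel path, and priority are assumptions, so verify them against the doc above first.

# Point the spark-shell alternative at the Spark2 parcel with a higher priority.
sudo update-alternatives --install /usr/bin/spark-shell spark-shell \
  /opt/cloudera/parcels/SPARK2/bin/spark2-shell 20
# Confirm which binary now wins.
sudo update-alternatives --display spark-shell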