Member since 04-19-2020
9 Posts
0 Kudos Received
1 Solution
My Accepted Solutions
Title | Views | Posted |
---|---|---|
293 | 07-22-2020 06:17 PM |
10-29-2020
12:56 PM
I am trying to build a Python client (running as an Azure Function App) to access the Cloudera Manager API, which runs as part of Cloudera Public Cloud 7.2.
I am using https://cloudera.github.io/cm_api/docs/python-client-swagger/
With the username/password approach (I created a workload user in Management Console and synced it to the environment), the request is redirected to the SSO page. I only managed to get a 200 response when I sent the token, which I intercepted from the browser UI calls, as a cookie in the request header.
Is there any way to make it work with username/password?
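For comparison, here is a minimal sketch of the two request styles described above, written with only the Python standard library rather than the cm_api client. The URL, cookie name, and credentials are hypothetical placeholders, and nothing here confirms which authentication method the CDP Public Cloud endpoint accepts. It only shows how each header is built and how a redirect to an SSO page could be recognised.

```python
import base64
import urllib.request

# Hypothetical placeholder; substitute the real Cloudera Manager API endpoint.
API_URL = "https://cm.example.invalid/api/version"


def basic_auth_request(url, username, password):
    """Build a request carrying HTTP Basic credentials (username/password)."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


def cookie_request(url, cookie_name, cookie_value):
    """Build a request carrying a session token as a cookie header."""
    return urllib.request.Request(
        url, headers={"Cookie": f"{cookie_name}={cookie_value}"}
    )


def looks_like_sso_redirect(status, location):
    """Heuristic: a 3xx response whose Location header mentions 'sso'."""
    return 300 <= status < 400 and bool(location) and "sso" in location.lower()
```

Neither helper sends anything; the returned objects can be passed to urllib.request.urlopen. Note that urlopen follows redirects by default, so a redirected call may end on the SSO page itself rather than surfacing the 3xx status.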
09-28-2020
01:56 PM
I managed to integrate Airflow with Redis into Cloudera Manager. To run custom DAGs, they need to be uploaded to the Airflow DAG folder on the node where the Airflow scheduler and workers are running.
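The upload step can be sketched as a simple file copy. This is only an illustration: the function name and both paths are hypothetical, and it assumes the script runs directly on the node that hosts the scheduler and workers.

```python
import shutil
from pathlib import Path


def deploy_dag(dag_file, dags_folder):
    """Copy a DAG definition file into the Airflow DAG folder.

    Returns the path of the copied file. Both arguments are caller-supplied;
    the actual DAG folder location depends on the Airflow installation.
    """
    source = Path(dag_file)
    if source.suffix != ".py":
        raise ValueError(f"expected a .py DAG file, got {source.name}")
    target_dir = Path(dags_folder)
    target_dir.mkdir(parents=True, exist_ok=True)
    # copy2 preserves timestamps and returns the destination path
    return Path(shutil.copy2(source, target_dir / source.name))
```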
07-22-2020
06:17 PM
Problem solved: the issue was related to topology.py, which used python as its default interpreter. Despite all the environment variables pointing to python3, python still resolved to Python 2, so I ended up overriding topology with the path to python3.
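The post does not say exactly how the override was done. One possible reading, shown here purely as an assumption, is pointing the script's interpreter line at python3. The helper name and the /usr/bin/python3 path in the usage note are hypothetical.

```python
def rewrite_shebang(script_text, interpreter):
    """Return script_text with its first-line shebang set to `interpreter`.

    If the script has no shebang line, one is inserted at the top.
    """
    lines = script_text.splitlines(keepends=True)
    new_first = f"#!{interpreter}\n"
    if lines and lines[0].startswith("#!"):
        lines[0] = new_first
    else:
        lines.insert(0, new_first)
    return "".join(lines)
```

For example, rewrite_shebang(text, "/usr/bin/python3") replaces a leading "#!/usr/bin/env python" line and leaves the rest of the script untouched.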
07-19-2020
09:25 PM
We are doing spark-submit from Airflow (added as a custom parcel to CDP 7.1). Airflow is built with Python 3; however, the default Python version on CDP is Python 2. As a result, spark-submit fails with this issue:
WARN net.ScriptBasedMapping: Exception running /etc/hadoop/conf.cloudera.yarn/topology.py 10.228.86.42
ExitCodeException exitCode=1: File "/opt/cloudera/parcels/Airflow-1.10.10-python3.7.7_1.2.3/lib/python3.7/site.py", line 177
file=sys.stderr)
^
SyntaxError: invalid syntax
I added PYSPARK_DRIVER_PYTHON and PYSPARK_PYTHON, both pointing to python3, to spark-defaults as well as spark-env.sh. I also added spark.yarn.appMasterEnv.PYTHONHASHSEED = 0; however, the problem remains. As soon as the Python version on the workers is changed to Python 3 (so that the only available Python is Python 3), spark-submit starts working. I was wondering if there is something I am missing. Thanks
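Since the trace shows a Python 3 site.py being parsed by an interpreter that rejects its syntax, a quick diagnostic is to check what the bare name python resolves to in the environment that runs spark-submit. The following is a small sketch with hypothetical helper names; it does not touch any Spark or Hadoop configuration.

```python
import shutil
import subprocess


def resolve_interpreter(name="python"):
    """Return the full path that `name` resolves to on PATH, or None."""
    return shutil.which(name)


def major_version(interpreter_path):
    """Ask the interpreter itself for its major version number."""
    result = subprocess.run(
        [interpreter_path, "-c", "import sys; print(sys.version_info[0])"],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())
```

Calling major_version(resolve_interpreter("python")) from the same user and environment as the failing job shows whether scripts that start with a plain python interpreter line would run under Python 2 or Python 3.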
07-19-2020
03:58 PM
We are using CDP 7.1.1 with Zeppelin 0.8.2 and Cloudera Manager. When running an aggregation on a Parquet file (select count(*) from ...), we get:
java.sql.SQLException: Error while compiling statement: FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.tez.TezTask. /tmp/snappy-1.0.4.1-....-libsnappyjava.so: /tmp/snappy-1.0.4.1-...-libsnappyjava.so: failed to map segment from shared object: Operation not permitted
I tried adding -Dorg.xerial.snappy.tempdir="<other folder>" to HIVE_OPTS as well as org.xerial.snappy.tempdir=<other folder> to hive-site.xml in Cloudera Manager; however, neither seems to have an effect. What would be a fix for this issue? I understand that it is caused by restrictions on execution inside the /tmp folder, but I am not sure where we should provide an alternate location for the Snappy libs. Thanks
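A common reason for "failed to map segment from shared object: Operation not permitted" is a filesystem mounted with the noexec option; whether that applies to this cluster is an assumption. The sketch below parses text in the /proc/mounts format to check whether a directory sits on such a mount, which can help when choosing an alternate temp folder. The helper names and the sample mount table in the usage note are illustrative only.

```python
import posixpath


def find_mount_options(path, mounts_text):
    """Return (mount_point, options) for the mount that contains `path`.

    `mounts_text` uses the /proc/mounts layout:
    device mount_point fs_type options dump pass
    The longest matching mount point wins; returns None if nothing matches.
    """
    path = posixpath.normpath(path)
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        mount_point, options = fields[1], fields[3].split(",")
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # >= so that a later entry for the same mount point overrides
            if best is None or len(mount_point) >= len(best[0]):
                best = (mount_point, options)
    return best


def is_noexec(path, mounts_text):
    """True if the mount containing `path` carries the noexec option."""
    match = find_mount_options(path, mounts_text)
    return match is not None and "noexec" in match[1]
```

On a Linux host the real table can be read with open("/proc/mounts").read() and passed as mounts_text, for example is_noexec("/tmp", mounts_text).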
05-13-2020
09:33 PM
This is a bit of an old post; however, nothing has changed in CDP 7.1. The behavior is still the same. Is the parcel getting cached somewhere?
04-27-2020
08:42 PM
This is a bit of a late reply. I haven't tried it myself yet, but it looks promising: https://blog.clairvoyantsoft.com/apache-airflow-csd-ac5b145d5e2d
04-20-2020
01:37 PM
Hi @StevenOD, I might have misunderstood the hosting details of Management Console. As far as I understand, it is going to be hosted at https://console.cdp.cloudera.com, which is a multi-tenant, Cloudera-managed cloud resource that we don't have much control or visibility over. You are correct that my PoC is in the public cloud; however, all the resources are in a VPC, which is covered by company policies, so more control and visibility are available there. Please correct me if I am wrong on that. My other question still stands: are there any alternatives for creating CDP in an AWS environment apart from provisioning through the public Cloudera-managed cloud? Thank you very much
04-19-2020
08:59 PM
Hi @StevenOD, I have a similar question to @muslihuddin. I am trying to do a quick PoC by spinning up a Cloudera CDP environment in AWS following this doc: https://community.cloudera.com/t5/Community-Articles/How-to-create-a-CDP-environment-in-AWS-with-minimal/ta-p/282916. However, since Management Console is only available in the public cloud, which is not an option for my organisation, I am wondering if there is any other option for trialing CDP in AWS. Thank you