Member since: 01-08-2018
Posts: 133
Kudos Received: 31
Solutions: 21
My Accepted Solutions
Views | Posted
---|---
13388 | 07-18-2018 01:29 AM
2305 | 06-26-2018 06:21 AM
3929 | 06-26-2018 04:33 AM
2040 | 06-21-2018 07:48 AM
1476 | 05-04-2018 04:04 AM
09-13-2018
07:35 AM
First of all, I have not tested CDH6 yet, so my reply is based on CDH5 experience. I assume the log you provided is from cloudera-scm-agent, correct? And that this agent was installed before the hostname was renamed. In that case, check "server_host" in "/etc/cloudera-scm-agent/config.ini" and change it to the new name.
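As a rough sketch of that change (the server name below is a made-up placeholder), you could edit the file in place and restart the agent so it re-registers under the new name:

# point the agent to the Cloudera Manager server's new hostname
sed -i 's/^server_host=.*/server_host=cm-server.example.com/' /etc/cloudera-scm-agent/config.ini
systemctl restart cloudera-scm-agent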
08-24-2018
12:30 AM
1 Kudo
In that case, you can use HDFS ACLs to grant the mapred user read permissions on all existing and any future files under this directory. https://www.cloudera.com/documentation/enterprise/5-14-x/topics/cdh_sg_hdfs_ext_acls.html#concept_hdfs_extended_ACL_example
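A minimal sketch with hdfs setfacl, assuming the directory in question is a hypothetical /data/landing:

# read/execute on the directory and everything already in it
hdfs dfs -setfacl -R -m user:mapred:r-x /data/landing
# default ACL so files created there in the future inherit the same access
hdfs dfs -setfacl -R -m default:user:mapred:r-x /data/landing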
08-23-2018
03:57 AM
1 Kudo
This does not seem to be an issue with the Cloudera platform, but with your DB. According to your logs, MySQL is down. First, notice that you are not using the default "/var/lib/mysql" datadir but "/var/lib/mysql2". If that is intentional, then this directory should be owned by user:group "mysql:mysql". If not, then you probably made a typo in datadir when you modified your /etc/my.cnf file. If neither is the case, you might find this post, regarding AppArmor, interesting: https://dba.stackexchange.com/questions/106085/cant-create-file-var-lib-mysql-user-lower-test
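A quick sketch of those checks, assuming the non-default datadir is intentional (the service name may be mysqld, mysql or mariadb depending on your OS):

# what does my.cnf actually point to?
grep -i datadir /etc/my.cnf
# verify ownership of the datadir
ls -ld /var/lib/mysql2
# fix ownership if needed, then restart the database
chown -R mysql:mysql /var/lib/mysql2
systemctl restart mysqld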
08-23-2018
03:44 AM
If ownership of "/user/history" has not been set to "hdfs:supergroup" on purpose, then this is simply a configuration issue: this directory should be owned by the mapred user. Normally this directory is created and configured automatically by Cloudera Manager; if you are doing a manual installation, you can check https://www.cloudera.com/documentation/enterprise/5-9-x/topics/cdh_ig_yarn_cluster_deploy.html#topic_11_4_9
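As a sketch of the manual fix (group hadoop is the usual choice here, but verify the owner, group and mode against the linked guide and your own setup):

sudo -u hdfs hdfs dfs -chown -R mapred:hadoop /user/history
sudo -u hdfs hdfs dfs -chmod -R 1777 /user/history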
08-23-2018
03:35 AM
1 Kudo
I think what you need first is this: https://www.cloudera.com/documentation/enterprise/5-10-x/topics/cm_mc_decomm_host.html If decommissioning completes successfully (i.e. all blocks are available across the remaining DataNodes according to the replication factor), then you can delete the nodes.
08-22-2018
07:54 AM
If I understand the question, you want to remove a node from your cluster and install Apache Hadoop on it directly, with no requirement to preserve data. Go to "Hosts -> All Hosts" in Cloudera Manager, select the node and, from the Actions menu, run the following sequentially:
1. Stop Roles on Hosts
2. Remove from Cluster
3. Remove from Cloudera Manager
Before you do the last step, stop and disable cloudera-scm-agent:
# systemctl stop cloudera-scm-agent
# systemctl disable cloudera-scm-agent
Then you can uninstall cloudera-scm-agent from this node and install Apache Hadoop.
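For the uninstall step, a minimal sketch on an RHEL/CentOS node (the package names and paths are the usual CDH 5 ones, so double-check them on your system):

# remove the agent packages
yum remove cloudera-manager-agent cloudera-manager-daemons
# optionally clean up leftover agent state
rm -rf /var/lib/cloudera-scm-agent /etc/cloudera-scm-agent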
07-19-2018
03:41 AM
1 Kudo
I think what you need is the following:

hbase_config = {
    'hbase_service_config_safety_valve': '<property><name>regionserver.global.memstore.upperLimit</name><value>0.15</value></property>'
}
hbase.update_config(hbase_config)

I have tested it and it works. There is no need to fetch "m" with get_config and then re-apply the whole configuration; you only have to update the specific safety valve, not all of the HBase config. The only catch is that you have to supply all hbase_service_config_safety_valve parameters at once. The same happens with your approach, because you are not updating only "regionserver.global.memstore.upperLimit", you are applying a value to the whole safety valve. Of course you can write additional code to parse the existing contents of the safety valve (the XML part) and add or update regionserver.global.memstore.upperLimit, but again pass it in the JSON format as per my example.
07-19-2018
02:18 AM
I suppose you have followed Cloudera's instructions and created the temp user with the command below:

mysql> grant all on *.* to 'temp'@'%' identified by 'temp' with grant option;

which is fine. In that case you should use the actual hostname instead of localhost; in MySQL, "%" does not always include "localhost".
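Alternatively, a sketch of granting the same user explicitly for localhost (the password 'temp' is kept from the instructions above):

mysql> grant all on *.* to 'temp'@'localhost' identified by 'temp' with grant option;
mysql> flush privileges;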
07-18-2018
06:30 AM
Can you try the following, just to verify that the MySQL server accepts connections?

mysql -h localhost -u temp -p

Also, can you check whether SELinux is enabled?

getenforce

If the output is "Enforcing", you should disable it. For a temporary change:

setenforce 0
07-18-2018
06:28 AM
You need the truststore. Check this link, where the procedure is documented: https://www.cloudera.com/documentation/enterprise/5-11-x/topics/sg_add_root_ca_explicit_trust.html#id_for_this_topic
07-18-2018
06:25 AM
My mistake, Cloudera Manager works with JSON, not XML. Anyhow, to upload the complete cluster config you can follow https://gist.github.com/gwenshap/7044525 but that is for serious cases. On the other hand, you can push only the HBase configuration with the previous links I sent you; you just have to prepare the config in JSON format.
07-18-2018
04:49 AM
From the command line I don't know of a way to upload XML for a specific service; you can do it for the whole cluster. But you should check https://cloudera.github.io/cm_api/apidocs/v19/path__clusters_-clusterName-_services_-serviceName-_config.html and https://cloudera.github.io/cm_api/apidocs/v19/path__clusters_-clusterName-_services_-serviceName-_roles_-roleName-_config.html You can also modify parameters by using PUT instead of GET.
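A rough curl sketch of such a PUT against the service config endpoint (the host, cluster name, service name, credentials and the parameter shown are all placeholders):

curl -X PUT -u admin:admin \
  -H "Content-Type: application/json" \
  -d '{ "items": [ { "name": "hbase_service_config_safety_valve", "value": "<property><name>regionserver.global.memstore.upperLimit</name><value>0.15</value></property>" } ] }' \
  "http://cm-host.example.com:7180/api/v19/clusters/Cluster1/services/hbase/config"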
07-18-2018
01:44 AM
The error is that user impala does not have write permissions on the "/user/cloudera" directory and its contents (physician.csv). Write permissions are required because Impala will move this file from "/user/cloudera/physician.csv" to "/user/hive/warehouse/test.db/tablename/physician.csv".
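One way to grant that, sketched with an HDFS ACL so the directory's ownership stays untouched (this assumes ACLs are enabled on the cluster):

sudo -u hdfs hdfs dfs -setfacl -R -m user:impala:rwx /user/cloudera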
07-18-2018
01:29 AM
1 Kudo
According to the error, it is looking for the Java 7 installed by Cloudera. You should define JAVA_HOME={path_to_your_jdk8_installation} in your .bashrc.
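For example, a sketch of the lines to add to ~/.bashrc (the JDK path below is hypothetical; use your actual JDK 8 location):

export JAVA_HOME=/usr/java/jdk1.8.0_144   # hypothetical path, adjust to your installation
export PATH=$JAVA_HOME/bin:$PATH

Then run "source ~/.bashrc" or open a new shell so the change takes effect.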
07-18-2018
01:26 AM
Most probably, your system does not accept the certificates of repo1.maven..... You should check it and, if needed, import the certificates of the respective CA.
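A minimal sketch of importing such a CA into the JDK truststore used by the build (the alias and certificate file name are placeholders, and the cacerts path assumes a JDK 8 layout):

keytool -importcert -trustcacerts -alias repo-ca -file ca.crt \
  -keystore $JAVA_HOME/jre/lib/security/cacerts -storepass changeit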
07-18-2018
01:22 AM
You can try using the API https://www.cloudera.com/documentation/enterprise/5-14-x/topics/cm_intro_api.html https://cloudera.github.io/cm_api/apidocs/v19/index.html
07-18-2018
01:19 AM
I cannot see any clear indication in the logs, other than that the connection to MySQL failed for some reason. The MySQL Java connector seems to be in place. Have you installed MySQL with the default options? Port 3306? If, for example, you have used 3307, then you should also specify "-P 3307"; I recommend specifying it even if you are using the default port. Have you granted permissions to the "temp" user according to the instructions? PS: since your MySQL is on localhost, you do not need to define another user and you can use root:

/usr/share/cmf/schema/scm_prepare_database.sh mysql -h localhost -u root -p --verbose scmdb scmuser
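And, as a sketch, the same command with the port given explicitly (3307 only as an example of a non-default port):

/usr/share/cmf/schema/scm_prepare_database.sh mysql -h localhost -P 3307 -u root -p --verbose scmdb scmuser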
07-13-2018
12:20 AM
Regarding Python 2: if your Hive server is configured with SSL, then you should consider installing the "sasl" package in Python. As for Python 3, although this is a Python question and not Hive related, usually the issue is on the previous lines, e.g. quotes or parentheses that are not terminated.
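For the Python 2 case, a sketch of installing the package (thrift_sasl is often needed as well, depending on the client library you use, so treat the exact list as an assumption):

pip install sasl thrift_sasl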
07-12-2018
06:53 AM
Sentry is used as an authorization tool: it defines rules for what users are allowed to do. Sentry is not a tool to edit data, so no redaction can be done with it. If you meant "restrict" and that's a typo, then the answer is yes; you can check https://www.cloudera.com/documentation/enterprise/5-14-x/topics/cm_sg_sentry_service.html#hive_impala_privilege_model The easiest way is to use HUE. If you need to do it through Beeline, then check the syntax in https://www.cloudera.com/documentation/enterprise/5-14-x/topics/sg_hive_sql.html
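A minimal Beeline sketch of that syntax (the role, database and group names below are made up for illustration):

beeline> CREATE ROLE analyst_role;
beeline> GRANT SELECT ON DATABASE sales_db TO ROLE analyst_role;
beeline> GRANT ROLE analyst_role TO GROUP analysts;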
07-12-2018
06:26 AM
If it doesn't work, then probably the e-mail action will not work either. Try to send a mail from the console:

mail -s test_subject user@mail.address << EOF
This is a test e-mail
EOF

There are various scenarios for sending an e-mail. If you need an e-mail when an error is encountered, or when an action takes too much time, then you have to enable SLAs on this action and define the recipient. If you need an e-mail confirming that the workflow executed successfully, then add a mail action just before the end of the workflow; if any previous action fails, the e-mail action will not be executed (unless, of course, you have modified the kill transition in one of your actions and pointed it to this e-mail action). You have multiple options to cover multiple scenarios.
07-12-2018
03:38 AM
You mean how the user can submit the job from HUE? If you save the file in HDFS as "workflow.xml", go to the File Browser in HUE. You will notice that if you select the checkbox of this file, a "Submit" action button appears, so the user can just hit it.
07-12-2018
01:57 AM
"As you said, when it's executed, we don't know on which yarn node the command will be executed. So, with this storage bay mounted on each node, it doesn't matter on which node it's executed (I think :p)" Correct. Regarding the rest: first of all, you don't have to be a Scala developer to schedule a script in Oozie 🙂 Your command should be "./runConsolidator.sh". What is important is that your script has execute permissions and that you define it in "Files". How it works: this shell action runs as a YARN job, so YARN will create a temp folder, e.g. "/yarn/nm/some_id/another_id". All files defined in "Files" of this action will automatically be downloaded into this directory. This directory will be your working directory, so you should run your command with "./" in front, since "./" is not in PATH by default. NOTE: if your script uses jar files etc., then you should define all of them in "Files" as well, so they will be copied to the working directory. I suggest you proceed with this approach; writing the XML by hand will mess things up, and you need some experience to do it and avoid mistakes. Once you create a working job from HUE, you can export the XML and start playing.
07-03-2018
05:09 AM
Hi, sorry if the reply was not very clear; it was written in a hurry and I will try to expand on it later. First of all, HUE provides a very good interface for writing Oozie jobs, and you will hardly ever need to write the job XML on your own. You have a shell script with spark-submit, and spark-submit will execute something (a jar or Python file). When you define a "shell action", all files (used as parameters) should exist on the local filesystem (not in HDFS). If, for example, in a shell action you try "cat /etc/hosts", you will get the /etc/hosts file of the YARN node where the shell action is executed. If you have a file in HDFS (e.g. my_hosts) and you define it as a "File" in the shell action, then Oozie will download this file automatically into the working directory. The working directory is a random directory on the YARN node's filesystem, which lives only while the YARN job is being executed. So, if you use the command "cat ./my_hosts", you will get the contents of the "my_hosts" that was downloaded to the working directory. In general it is not a very good idea to work with files on the slave nodes, because you don't know on which YARN node the command will be executed each time, unless you are sure that you control it and have deployed all required files to all nodes. Of course we are not discussing temporary files that you may create during execution, but files like metadata or configuration, or files with results that you want to use afterwards. IMHO it is always better to have these files in HDFS and send the results back to HDFS, so they will be easily accessible by other actions.
06-29-2018
03:10 AM
Just to understand: you don't want the Spark job to use these files from HDFS but from the local system. If that is the case, then you can create a shell action as you mentioned. First of all, put all required files on HDFS, then define these files in the shell action. Oozie will automatically download them to the working directory of the node where the job will be executed. You don't have to manually distribute anything to the nodes in advance; Oozie will take care of it, you just have to define these files in the job.
06-26-2018
06:27 AM
If you have enabled "HDFS-Sentry synchronization", then your setfacl actions will have no impact; Sentry rules are translated to ACLs. You should either use HUE (Security/Sentry Roles) to fix the group, or connect to Beeline and use the grant/revoke commands.
06-26-2018
06:21 AM
There is an issue with the hostname you have configured on the host. Can you make sure that "hostname -f" resolves to a valid FQDN?
06-26-2018
06:20 AM
Have you run "Create Root Directory" from HBase's available actions in Cloudera Manager? This will create the "/hbase" directory (default value) in HDFS, owned by user hbase and group hbase. If it has been created, check the permissions:

sudo -u hdfs hdfs dfs -ls /

If not, create it (or do it through Cloudera Manager as mentioned above):

sudo -u hdfs hdfs dfs -mkdir /hbase
sudo -u hdfs hdfs dfs -chown hbase:hbase /hbase
06-26-2018
04:33 AM
OK, from the log it is obvious that the issue for Spark is the old JDK. When you tried to upgrade Java, did you define the Java home in "/etc/default/cloudera-scm-server", e.g.:

export JAVA_HOME="/usr/lib/jvm/java-8-oracle/"

Can you send the relevant "/var/log/cloudera-scm-server/cloudera-scm-server.out"?
06-25-2018
06:41 AM
"My CDH version is 5.14 but there is no spark2 parcel for 5.14, there are parcels for 5.13 and 5.12. Is this the problem I faced with?" This is not an issue: Spark 2 is built on CDH 5.13, but it works fine with CDH 5.14. Check the compatibility notes: https://www.cloudera.com/documentation/spark2/latest/topics/spark2_requirements.html#cdh_versions According to the screenshot, the procedure failed while it was distributing the configuration to the NameNode. Can you check the "stdout" and "stderr" output? You can copy it here so we can take a look.
06-21-2018
07:48 AM
You should not worry about compatibility between KTS and CDH. If you check https://www.cloudera.com/documentation/enterprise/latest/topics/encryption_ref_arch.html#concept_npk_rxh_1v CDH connects to KMS, and KMS connects to KTS. So you have to check whether the KMS that is compatible with KTS 3.8 is also compatible with CDH 5.14.2.
... View more