Member since: 01-08-2018
Posts: 133
Kudos Received: 31
Solutions: 21
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 11750 | 07-18-2018 01:29 AM
 | 2148 | 06-26-2018 06:21 AM
 | 3749 | 06-26-2018 04:33 AM
 | 1933 | 06-21-2018 07:48 AM
 | 1365 | 05-04-2018 04:04 AM
04-17-2018
09:14 AM
Sorry, I thought you were asking about a major upgrade. For minor upgrades you can of course do a rolling upgrade; I have personally tested it. I cannot find it now because I am replying from my phone, but I think there is a note in the documentation that the only requirement is to stay on the same major version; it does not mention the minor version. In any case, I have done it several times in production environments and it works without any issue.
04-17-2018
08:30 AM
No, you cannot, according to the third bullet below (from https://www.cloudera.com/documentation/enterprise/latest/topics/cdh_cm_upgrading_to_jdk8.html): Warning:
* Cloudera does not support upgrading to JDK 1.8 while upgrading to Cloudera Manager 5.3 or higher. The Cloudera Manager Server must be upgraded to 5.3 or higher before you start.
* Cloudera does not support upgrading to JDK 1.8 while upgrading a cluster to CDH 5.3 or higher. The cluster must be running CDH 5.3 or higher before you start.
* Cloudera does not support a rolling upgrade to JDK 1.8. You must shut down the entire cluster.
* If you are upgrading from a lower major version of the JDK to JDK 1.8 or from JDK 1.6 to JDK 1.7, and you are using AES-256 bit encryption, you must install new encryption policy files. (In a Cloudera Manager deployment, you automatically install the policy files; for unmanaged deployments, install them manually.) See Using AES-256 Encryption.
For both managed and unmanaged deployments, you must also ensure that the Java Truststores are retained during the upgrade. (See Recommended Keystore and Truststore Configuration.)
04-17-2018
08:27 AM
1 Kudo
I use the same command and have no issues. According to the logs: Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for output/attempt_1523546159827_0013_r_000000_0/map_0.out So, I would guess that your csv is too big and, when the reducer tries to load it, there is not enough space in the local dirs of the YARN NodeManager. Can you try setting more reducers with --reducers 4 or more (based on your partitions and the csv size)? You can also set more mappers, but based on the log it is the reducer that is suffering. More details: https://www.cloudera.com/documentation/enterprise/5-13-x/topics/search_mapreduceindexertool.html#concept_pjs_3sd_3v
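For reference, a rough sketch of where the flag goes in the full command (the jar path, ZooKeeper quorum, collection, morphline file and input path below are placeholders for your own values):
$ hadoop jar /opt/cloudera/parcels/CDH/lib/solr/contrib/mr/search-mr-*-job.jar \
    org.apache.solr.hadoop.MapReduceIndexerTool \
    --morphline-file morphline.conf \
    --output-dir hdfs://nameservice1/tmp/indexer-output \
    --zk-host zk01:2181/solr \
    --collection my_collection \
    --reducers 8 \
    --go-live \
    hdfs://nameservice1/user/me/input.csv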
04-17-2018
08:07 AM
You should not delete anything under the /opt/cloudera and /var/lib directories. If the contents of these directories are too large for your partitions, then you should consider extending them. There is an exception under /var/lib/, but again you should not delete anything manually. The only place where you can delete files without issues is "/var/log/...", but this is a temporary solution. The "proper" way is to change "Max Log Size" and "Maximum Log File Backups" in Cloudera Manager, for each service running on this machine. Edit: I started writing before I saw the reply from @saranvisa. I agree with it.
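If you want to see what is actually consuming the space before touching anything, a quick check like this usually helps (the paths are just examples, adjust to your partitions):
# du -sh /var/log/* /var/lib/* /opt/cloudera/* 2>/dev/null | sort -h | tail -20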
04-17-2018
08:03 AM
1 Kudo
First of all, you have to stop Navigator. Then it depends on what you decide. The easiest approach (not requiring changes in configuration) is to move the directory and create a link:
# mv /var/lib/cloudera-scm-navigator/solr /data1/
# ln -s /data1/solr /var/lib/cloudera-scm-navigator/solr
Although I would prefer to move the whole navigator directory:
# mv /var/lib/cloudera-scm-navigator /data1/
# ln -s /data1/cloudera-scm-navigator /var/lib/cloudera-scm-navigator
The last approach is again to move the directory, but without a link:
# mv /var/lib/cloudera-scm-navigator /data1/
and change "Navigator Metadata Server Storage Dir" (nav.data.dir) in Cloudera Manager to /data1/cloudera-scm-navigator. Hope that helps. As mentioned above, I prefer the second approach, because the whole directory stays in one partition and someone new in your team won't have to check CM for the new location.
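One extra note, assuming a default installation where the Navigator Metadata Server runs as the cloudera-scm user: after moving the data, double check that ownership is preserved before starting Navigator again, for example:
# chown -R cloudera-scm:cloudera-scm /data1/cloudera-scm-navigator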
04-17-2018
05:18 AM
1 Kudo
Great news. I came across the following, which is very helpful to me, and thought I should share it: https://www.cloudera.com/documentation/kafka/latest/topics/kafka_new_features.html#xd_583c10bfdbd326ba-590cb1d1-149e9ca9886--6fc9__kafka_new_features_300 In a few words, Cloudera Kafka 3 supports wildcards in TOPIC and CONSUMERGROUPS with CDH 5.14.1.
04-17-2018
04:55 AM
I believe it would help if you could create additional areas under http://community.cloudera.com/t5/Configuring-and-Managing-the/ct-p/ConfiguringPlatform for CDS, CDK, etc. Currently there are only the following three: Cloudera Manager, Cloudera Director, CDH.
04-17-2018
04:50 AM
I see that repositories have been updated and CDS 2.3 has been released.
04-17-2018
02:31 AM
Based on your prompt "root@ip-172-16-10-10", I see that you have a hostname "ip-172-16-10-10" defined. This hostname/IP is not in your /etc/hosts file. Run "hostname -f" to get the full hostname and add the following to your /etc/hosts file: 172.16.10.10 {full hostname} {hostname alias, probably ip-172-16-10-10}
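To double check afterwards, the entry should resolve locally (the hostname is the one taken from your prompt; the output should be exactly the line you added):
# getent hosts ip-172-16-10-10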
04-17-2018
01:51 AM
How did you check that no process is listening on port 7180? Can you post the command and its output? Also, if your /etc/hosts file is not correct, Cloudera Manager may not start. You said that the logs are ok, but take a second look.
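For example, either of the following (whichever is available on your system) shows whether something is listening on 7180:
# netstat -tlnp | grep 7180
# ss -tlnp | grep 7180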
04-17-2018
01:47 AM
Hi. First of all, sorry for the late reply, I was away for some time. According to this "yyyy-mm-dd hh:mm:ss[.f...]", yes, you have to store it in UTC. In order to be able to store dates in other timezones, the format should include the "Z", which is the offset in hours from UTC.
04-17-2018
01:36 AM
1) User hdfs does not have access to the /home/cloudera directory. 2) and 3) are actually the same, because in both cases you try to upload the file as user cloudera. You have two options: 1) grant read permission to the hdfs user on /home/cloudera and all sub-contents (directory access also requires the execute permission), or 2) grant write permission on the "/inputnew/" directory in HDFS to the "cloudera" user, for example: sudo -u hdfs hdfs dfs -chown cloudera /inputnew There are multiple ways to grant permissions (e.g. using ACLs), but keep it simple.
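For option 1, a minimal sketch on the local filesystem (the file name is just a placeholder; adjust it to your actual file):
$ chmod o+x /home/cloudera
$ chmod o+r /home/cloudera/yourfile.csv
$ sudo -u hdfs hdfs dfs -put /home/cloudera/yourfile.csv /inputnew/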
04-17-2018
01:30 AM
The fact that your server started on port 7183 does not mean that communication between the Agents and the Server is over TLS. If you have enabled "Use TLS Encryption for Agents" and restarted the Cloudera Manager Server, then you should verify that Cloudera Manager started with TLS and that the certificate is correct. You can do it with: $ openssl s_client -connect cm.server:7182 Agents connect to port 7182. If you get a certificate as a response, then you should check the Certificate Authorities you have configured in the Agents' config.
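On the agent side, a quick way to see what the agent is actually configured to use (assuming a default package installation; the config path may differ in your environment):
# grep -E 'use_tls|verify_cert_file|server_host|server_port' /etc/cloudera-scm-agent/config.ini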
04-17-2018
01:19 AM
2 Kudos
This is a misleading part of the "free" output. The first line (starting with "Mem") shows that you have 62G of memory and 56G are used. This memory is used, but not by processes. At the end of the line, you will see a number of 39G cached. In a few words, Linux uses part of the free RAM to cache data from frequently used files, in order to save some interactions with the hard disk. Once an application requests memory and there is no "free" memory left, Linux automatically drops these caches. You cannot turn this feature off. The only thing you can do is drop the currently cached data, but Linux will cache something again the very next second. In any case, when the output of "free" is similar to the one you provided, you should always refer to the second line "-/+ buffers/cache: 16G 49G". This is the real status, which shows 16G used and 49G free. Finally, CM displays the disk and memory usage of the host (in the Hosts view) regardless of which process is using it. It is the same output as "free".
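If you really want to drop the currently cached data yourself (harmless, but usually pointless for the reason above), the standard kernel knob is:
# sync && echo 3 > /proc/sys/vm/drop_caches
# free -g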
04-17-2018
12:57 AM
Have you checked that Cloudera Manager has been started? # systemctl status cloudera-scm-server and # systemctl status cloudera-scm-server-db The last one is for the embedded DB (skip it if you have configured an external DB). The output of these commands will give you a brief description of the status. You can try to start it and check the logs under "/var/log/cloudera-scm-server/".
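For example (the log file name below is the default one, assuming a standard installation):
# systemctl start cloudera-scm-server
# tail -f /var/log/cloudera-scm-server/cloudera-scm-server.log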
04-16-2018
09:38 AM
Although it is not part of CDH, there was no section for Spark, so I am posting it here. There is an issue with the manifest files in http://archive.cloudera.com/spark2/parcels/2/manifest.json and http://archive.cloudera.com/spark2/parcels/latest/manifest.json The parcel files listed in the repository do not match the manifest: the manifest refers to "parcelName": "SPARK2-2.3.0.cloudera1-1.cdh5.12.0.p0.304585-el7.parcel", so if someone uses http://archive.cloudera.com/spark2/parcels/latest as a Remote Parcel Repository, it will fail. The manifest in http://archive.cloudera.com/spark2/parcels/2.2.0.cloudera2/manifest.json is ok, which is strange; I thought that "latest" was a link to the latest version. (Screenshots of the repository listing and of the manifest were attached to the original post.) Please fix it. If someone wants to install Spark 2.2 cloudera2, they should change the parcel repository from http://archive.cloudera.com/spark2/parcels/latest/ to http://archive.cloudera.com/spark2/parcels/2.2.0.cloudera2/
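For anyone who wants to check which parcel a manifest actually points to before adding it as a remote repository, this is enough (using only the URL from the post):
$ curl -s http://archive.cloudera.com/spark2/parcels/latest/manifest.json | grep parcelName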
Labels:
- Spark
03-26-2018
02:11 AM
Logs from HUE and Hive would help; with screenshots we can only guess what is happening. In any case, you can check two things: 1) Check that the Hive Server is running. In my opinion this is not the problem, because otherwise you should get an error in HUE, but just double check it. Impala connects to the Hive Metastore, so it is not affected if the Hive Server is down. 2) Most likely, in my opinion, the users that cannot see databases are not defined on the server where the HiveServer is running, so the Hive Server is not able to retrieve the users' groups and match them with a Sentry rule. If you use LDAP/AD, you have probably configured sssd (or an equivalent) on each host; check that this service is running and that user info is fetched from LDAP/AD. If you have manually created users, then check that the group info is correct.
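For the second check, something like this on the HiveServer host is usually enough (replace the username with one of the affected users):
# id affected_user
# systemctl status sssd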
03-26-2018
12:34 AM
The Hive screenshot you provided does not indicate the problem, because in this screenshot you have the "default" database selected. So the "missing" info below is not about databases; it means that no tables exist in the default database. Can you click on "default" at the top of the red box? It will move you to the upper level.
03-23-2018
05:47 AM
1 Kudo
That's because the "/tmp" directory in your HDFS does not allow access to hive. In general, "/tmp" should have 777 permissions, because most of the services use it as a temp directory to store various info. So, if there is no reason to restrict access, the following command will solve your problem: hdfs dfs -chmod 777 /tmp
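You can check the current permissions first (run as a user that has HDFS access, e.g. hdfs):
$ sudo -u hdfs hdfs dfs -ls -d /tmp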
03-22-2018
09:16 AM
In general, admin groups are added in Sentry's configuration -> https://www.cloudera.com/documentation/enterprise/latest/topics/sg_sentry_service_config.html#concept_z5b_42s_p4__section_vrc_1dk_55 Your issue is that the hive user is not configured properly. Did you create the users manually? If you run "id hive" on your system, the user hive should belong to the group hive. If you fix this, you should probably be able to set Sentry rules as hive, because by default the hive group should be defined as an admin group. If you are not using Cloudera Manager, then set "sentry.service.admin.group".
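For reference, on a correctly configured node the check looks roughly like this (the uid/gid numbers are just an example):
# id hive
uid=987(hive) gid=984(hive) groups=984(hive)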
03-22-2018
02:15 AM
According to the error, CDH is replacing Kudu 1.4 (probably you had installed Kudu in the past). This is because Kudu is now part of CDH, so you don't need to deploy an additional parcel. I am copying from https://www.cloudera.com/documentation/enterprise/latest/topics/install_upgrade_to_cdh5x_parcels.html#concept_rv5_kwq_rx : "If your cluster has Kudu 1.4.0 (or lower) installed, deactivate the existing Kudu parcel. Starting with Kudu 1.5.0 / CDH 5.13, Kudu is part of the CDH parcel and does not need to be installed separately."
03-20-2018
08:11 AM
1 Kudo
The error complains about the value of "hadoop.security.authentication". You have set it to "Kerberos" while the accepted values are "simple" and "kerberos" (all letters in lowercase).
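If you want to verify which value the clients actually pick up after the change, the standard way is:
# hdfs getconf -confKey hadoop.security.authentication
It should print "kerberos" after the fix.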
03-20-2018
08:07 AM
1 Kudo
As a first step, try to restart the "Cloudera Management Service". According to the message, you have upgraded the Cloudera Manager Server and Agents, but the Host Monitor is still running on the old version.
03-20-2018
08:01 AM
I am not sure I understand your request. What I understand: * You have a Java application that writes data to a delimited file. * This file is somehow loaded into Hive (e.g. load data inpath 'file.csv' into table output_table). And the question is what format to use when writing to the file? If this is correct: if you have defined this column as timestamp, then the output should be one of the supported timestamp formats for Hive: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-TimestampstimestampTimestamps If you have defined it as string/varchar, then you can use any format you want and later convert it to seconds from the Unix epoch in Hive using the built-in functions.
03-19-2018
04:55 AM
Hive has built-in functions for these conversions: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF You should give a try to: unix_timestamp(string date, string pattern)
"Convert time string with given pattern (see http://docs.oracle.com/javase/tutorial/i18n/format/simpleDateFormat.html) to Unix time stamp (in seconds), return 0 if fail: unix_timestamp('2009-03-20', 'yyyy-MM-dd') = 1237532400." With the pattern you can describe the format of the timestamp you provide (including the timezone info).
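A rough sketch of how this looks in practice (the input value and the '+0200' offset are made-up examples; the 'Z' in the pattern is what parses the timezone offset):
$ hive -e "SELECT unix_timestamp('2018-03-19 10:00:00 +0200', 'yyyy-MM-dd HH:mm:ss Z');"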
03-16-2018
08:22 AM
1 Kudo
According to the error you posted, your NameNode is in Safe Mode. There should be an active NameNode, out of safe mode, in order to perform any HDFS action.
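To check the current state and, if you are sure the cluster is healthy, to leave safe mode manually (standard HDFS admin commands):
$ sudo -u hdfs hdfs dfsadmin -safemode get
$ sudo -u hdfs hdfs dfsadmin -safemode leave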
03-16-2018
08:17 AM
This is not 100% accurate. First of all, the link you provided is for an old version of CDH, CDH 5.0; the last CDH 5.0 release was in 2015. The current version is CDH 5.14, so the latest list is: https://www.cloudera.com/documentation/enterprise/release-notes/topics/cdh_vd_cdh_package_tarball_514.html#cm_vd_cdh_package_tarball_514 Moreover, Cloudera selects a base version of each component and adds selected patches from Apache, as you can see in the link below (I provide the CDH 5.13 one because the latest returns an empty page): https://archive.cloudera.com/cdh5/cdh/5/hive-1.1.0-cdh5.13.0.releasenotes.html As you can see, there is "hive-1.1.0+cdh5.14.0+1330", which means that CDH's Hive is based on version 1.1.0 plus 1330 patches on top of that. For more details, refer to: https://www.cloudera.com/documentation/enterprise/release-notes/topics/cdh_vd_cdh_package_tarball.html#concept_tm5_gx3_dp
03-15-2018
03:27 AM
Have you tried using "-h clouderam" in your command? Permissions in MySQL differ depending on whether you are connected as 'user'@'localhost' or 'user'@'clouderam'. By using "-h clouderam" you connect as 'user'@'clouderam', and you should grant permissions to that user.
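A rough sketch, with the user, database and password as placeholders for your own values: grant to the host-qualified user and then test the connection:
$ mysql -u root -p -e "GRANT ALL PRIVILEGES ON scm.* TO 'scm'@'clouderam' IDENTIFIED BY 'scm_password'; FLUSH PRIVILEGES;"
$ mysql -h clouderam -u scm -p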
03-15-2018
01:01 AM
This is a MySQL issue. Have you checked that MySQL is running and that you are able to connect?
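For example (the service name may be mysql instead of mysqld, depending on your distribution):
# systemctl status mysqld
# mysql -u root -p -e "SELECT 1;"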
03-14-2018
09:15 AM
Check this out: https://www.cloudera.com/documentation/enterprise/5-13-x/topics/cm_ig_create_local_package_repo.html#concept_qqw_ww3_gp It allows you to create a temporary repository without installing any software, assuming that you have Python installed.
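If I remember correctly, the temporary repository on that page is essentially Python's built-in HTTP server serving the packages directory (the directory and port below are just examples):
$ cd /var/www/html/cloudera-repos
$ python -m SimpleHTTPServer 8900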