Member since: 07-30-2013
Posts: 509
Kudos Received: 113
Solutions: 124
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2638 | 07-09-2018 11:54 AM |
| | 2216 | 05-03-2017 11:03 AM |
| | 5155 | 03-28-2017 02:27 PM |
| | 1983 | 03-27-2017 03:17 PM |
| | 1755 | 03-13-2017 04:30 PM |
09-05-2014
01:41 PM
Use the API command to create the HDFS temp dir. Were you not able to find it in your version of the API?
09-05-2014
01:11 PM
If you click on the Hive Metastore and look in the Processes tab, you should be able to find the stderr log. You do not need to manually modify hive-site.xml to add the connection URL. You are probably looking at the wrong copy of hive-site.xml; the one used by the metastore is also shown on the Processes tab. The Java heap size charts should be visible if you click on the Hive Metastore Server, then on Charts Library, and look for the Resident Memory chart.
09-05-2014
01:04 PM
Did you do the step to create the HDFS /tmp directory? The blog post you linked describes doing this from the command line, but there is also an API command for it. Those instructions are fairly dated, so I suggest setting up a cluster manually, looking at every step that First Run performs when initially setting up the cluster, and making sure your procedure covers all of them.
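As a rough illustration of the API route mentioned above, here is a minimal Python sketch that builds (but does not send) the REST request for the HDFS service's `hdfsCreateTmpDir` command. The host name, cluster name, service name, credentials, port 7180 and the `v2` API version are all assumptions for the example, and `build_create_tmp_request` is a hypothetical helper, not part of any Cloudera library. Check the API documentation for your CM version before relying on the exact path.

```python
import base64
import urllib.request
from urllib.parse import quote


def build_create_tmp_request(host, cluster, service, user, password,
                             port=7180, api_version="v2"):
    """Build a POST request for the HDFS 'create /tmp' service command.

    Assumptions: CM listens on plain HTTP at `port`, and the server
    supports the given API version. Adjust both for your deployment.
    """
    # Cluster and service names may contain spaces, so URL-encode them.
    url = (
        f"http://{host}:{port}/api/{api_version}"
        f"/clusters/{quote(cluster, safe='')}"
        f"/services/{quote(service, safe='')}"
        "/commands/hdfsCreateTmpDir"
    )
    # The CM API uses HTTP basic authentication.
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req = urllib.request.Request(url, data=b"", method="POST")
    req.add_header("Authorization", f"Basic {token}")
    return req


if __name__ == "__main__":
    # Hypothetical values; replace with your own before sending.
    req = build_create_tmp_request("cm.example.com", "Cluster 1", "hdfs",
                                   "admin", "admin")
    print(req.get_method(), req.full_url)
    # To actually issue the command, uncomment the next line:
    # urllib.request.urlopen(req)
```

If you use the `cm_api` Python client instead, the HDFS service object exposes the same command as a method; the raw request is shown here only because it needs nothing beyond the standard library.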
09-05-2014
09:51 AM
You should not have needed to modify hive-site.xml. What instructions did you follow? Is there any information in the role's stderr log? Do the Java heap size charts look troubling around the time of the failure?
09-05-2014
07:47 AM
Glad you solved this! Keep in mind that when using the symlink, you may need to re-create it whenever you upgrade your cloudera-scm-server-db package in the future, since the symlink confuses the packaging code. Thanks, Darren
08-27-2014
04:41 PM
Hi, The Sentry service stores policy information in a relational database, whereas the Policy File implementation stores it in a file. You should never use both at the same time, as that would be redundant. When using the Sentry service, you issue grants and revokes through Beeline, the HiveServer2 client. The descriptions for the Sentry configuration in the CM UI have links to documentation explaining the usage, which should answer all of your questions. Thanks, Darren
08-26-2014
01:32 PM
Hi, You need to be using CDH 5.1 or higher, and you need to make sure CM knows that the cluster is CDH 5.1 or higher. On the home page, CM reports what it believes the cluster's version is. If that doesn't say 5.1 or higher, you need to fix it by either installing the correct CDH version or configuring the version in CM, as described here: http://www.cloudera.com/content/cloudera-content/cloudera-docs/CM5/latest/Cloudera-Manager-Managing-Clusters/cm5mc_config_package_version.html
08-25-2014
01:53 PM
See example here: http://cloudera.github.io/cm_api/docs/python-client/#managing_parcels
08-19-2014
12:57 PM
Yes, that's basically the reason. You also wouldn't have to leave the data directory in the hardcoded location that CM uses, which doesn't always work well for folks as the database grows large.
08-19-2014
12:01 PM
3 Kudos
Hi, In general, we suggest that you use an external database for production. The embedded database is just handy for getting started. It is a regular PostgreSQL instance that is started by custom init scripts on a custom port in a custom data directory.

When creating the Hive service, the wizard will ask whether you'd like to use the embedded database or an external one. If you use the embedded database, a user role and a database will be created with the correct permissions for you automatically. If you use an external one, you must do these steps yourself and provide the host / port / database name / username / password.

The CM documentation does not say to do a yum install of PostgreSQL. You just use the Cloudera Manager UI and click the Add Service option in the dropdown menu by your cluster name, same as adding any other service.

For smaller clusters, it's fine to consolidate onto a single database. As your load grows, you'll want to migrate some databases and possibly their roles to different hosts. I wouldn't run two PostgreSQL instances on the same host, as that will just consume more RAM than is really needed. It would be better to consolidate onto the external PostgreSQL, as the embedded one is not intended for production. Thanks, Darren
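For the external-database case described above, the manual steps amount to creating a login role and a database owned by that role. The following Python sketch only generates the SQL statements you might then run with `psql`; it does not connect to anything. The function `metastore_setup_sql` is a hypothetical helper, and the database name, user name and password are placeholder assumptions, not values that CM requires.

```python
from typing import List


def metastore_setup_sql(db_name: str, user: str, password: str) -> List[str]:
    """Return the SQL an administrator could run on an external PostgreSQL
    to prepare a role and database for the Hive metastore.

    This is an illustrative sketch; review the statements against your
    site's conventions before running them.
    """
    def ident(name: str) -> str:
        # Double-quote identifiers and escape embedded double quotes.
        return '"' + name.replace('"', '""') + '"'

    def literal(value: str) -> str:
        # Single-quote string literals and escape embedded single quotes.
        return "'" + value.replace("'", "''") + "'"

    return [
        f"CREATE ROLE {ident(user)} LOGIN PASSWORD {literal(password)};",
        f"CREATE DATABASE {ident(db_name)} OWNER {ident(user)} ENCODING 'UTF8';",
    ]


if __name__ == "__main__":
    # Hypothetical names; use whatever you will enter in the CM wizard.
    for statement in metastore_setup_sql("metastore", "hive", "change-me"):
        print(statement)
```

Whatever names you choose here are the ones to enter as database name, username and password when the wizard asks for the external database details. Remote access rules such as `pg_hba.conf` entries are outside the scope of this sketch.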