Member since: 07-06-2018
Posts: 59
Kudos Received: 1
Solutions: 4
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2602 | 03-05-2019 07:20 AM |
| | 2772 | 01-16-2019 09:15 AM |
| | 1506 | 10-25-2018 01:46 PM |
| | 1686 | 08-02-2018 12:34 PM |
07-22-2020
04:31 AM
@Prav You can leverage the CM API to track parcel distribution status:

- /api/v19/clusters/{clusterName}/parcels - notes the parcel names and versions the cluster has access to
- /api/v19/clusters/{clusterName}/parcels/products/{product}/versions/{version} - tracks the distribution status of a specific parcel

Refer to the link below for more details:
http://cloudera.github.io/cm_api/apidocs/v19/path__clusters_-clusterName-_parcels_products_-product-_versions_-version-.html

Hope this helps,
Paras

Was your question answered? Make sure to mark the answer as the accepted solution. If you find a reply useful, say thanks by clicking on the thumbs up button.
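For reference, a minimal sketch of both calls using Python's requests library; the host, credentials, cluster name, and parcel product/version below are all placeholders:
-----------------
# Minimal sketch, assuming the requests library and placeholder
# host, credentials, cluster name, and parcel identifiers.
import requests

CM = "http://cm-host.example.com:7180/api/v19"  # hypothetical CM host
AUTH = ("admin", "admin")                       # placeholder credentials
CLUSTER = "Cluster1"                            # placeholder cluster name

# Note the parcel names and versions the cluster has access to.
parcels = requests.get(f"{CM}/clusters/{CLUSTER}/parcels", auth=AUTH).json()
for parcel in parcels.get("items", []):
    print(parcel["product"], parcel["version"], parcel["stage"])

# Track the distribution status of one specific parcel (placeholder version).
status = requests.get(
    f"{CM}/clusters/{CLUSTER}/parcels/products/CDH/versions/5.16.2-1.cdh5.16.2.p0.8",
    auth=AUTH,
).json()
print(status["stage"])  # e.g. DOWNLOADED, DISTRIBUTED, ACTIVATED
-----------------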
09-24-2019
12:10 PM
Queries are in the "waiting to be closed" stage if they are in the EXCEPTION state or if all the rows from the query have been read. In either case, the query needs to be explicitly closed for it to be "completed". https://community.cloudera.com/t5/Support-Questions/Query-Cancel-and-idle-query-timeout-is-not-working/td-p/58104 might be useful as well.
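When results are read through a client, it is that explicit close that moves the query to "completed"; here is a minimal sketch with the impyla client (host and query are placeholders):
-----------------
# Minimal sketch, assuming the impyla client; host and query are placeholders.
from impala.dbapi import connect

conn = connect(host="impalad.example.com", port=21050)  # hypothetical host
cur = conn.cursor()
try:
    cur.execute("SELECT * FROM some_table")  # placeholder query
    rows = cur.fetchall()  # all rows read; query now waits to be closed
finally:
    cur.close()   # explicit close: the query transitions to "completed"
    conn.close()
-----------------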
08-21-2019
11:28 AM
Thanks, that does show more information. What I find weird, though, is that the same query has run with a large load earlier (with the same config params) and has now failed (from the logs: java.lang.OutOfMemoryError: Java heap space). Regards
08-09-2019
08:45 AM
Thanks for confirming that. We'll enable it for Impala as well, but only after a week or so; in the meantime, we wanted to know whether it would still work.
08-02-2019
09:05 AM
Thanks. To make sure we are on the same page, consider the following scenario: the HDFS snapshottable location /a/b/ has a file c that is captured in a snapshot. Then c is deleted from HDFS via the CLI with hdfs dfs -rm -r -skipTrash (the NN transaction happens and the HDFS CLI no longer shows the file), and a new file with the same name, content, and size is created.

- What gets stored in HDFS, and what delta does the snapshot add in this case? Is it just that the snapshot still holds c's blocks in HDFS in addition to the newly created file?
- Are NameNode resources used to maintain the metadata of both in the heap? Is that all, or is there more to it?

Regards
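To make the scenario concrete, the steps would look roughly like this (a sketch assuming the hdfs CLI is on PATH, /a/b is already snapshottable, and the snapshot name is a placeholder):
-----------------
# Minimal sketch of the scenario; paths and names are placeholders.
import subprocess

def hdfs(*args):
    # Run one hdfs dfs command and fail loudly on errors.
    subprocess.run(["hdfs", "dfs", *args], check=True)

hdfs("-createSnapshot", "/a/b", "s1")       # snapshot captures file c
hdfs("-rm", "-r", "-skipTrash", "/a/b/c")   # live namespace drops c
hdfs("-put", "c_local_copy", "/a/b/c")      # new file, same name/content/size
# The old blocks remain reachable via /a/b/.snapshot/s1/c, while the new
# file gets its own blocks, so the NameNode keeps metadata for both.
-----------------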
07-30-2019
11:52 PM
1 Kudo
Since the "list" commands gets the apps from the ResourceManager and doesn't set any explicit filters and limits (except those provided with it) on the request, technically it returns all the applications which are present with RM at the moment. That number is controlled by "yarn.resourcemanager.max-completed-applications" config. Hope that clarifies.
06-06-2019
07:56 AM
1 Kudo
@Prav, this appears to have been listed as a bug of Hive since version 0.12, though it is actually a longstanding limitation: files and directories whose names start with _ or . are treated as "hidden" by FileInputFormat in Hadoop:
https://issues.apache.org/jira/browse/HIVE-6431
https://stackoverflow.com/questions/19830264/which-files-are-ignored-as-input-by-mapper
If these files need to be visible, please consider using a pre-process script to rename them after loading.
Thanks,
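As an illustration, such a pre-process rename pass might look like the sketch below; the table directory and the renaming convention are assumptions, and it shells out to the hdfs CLI:
-----------------
# Minimal sketch of a pre-process rename step; the directory and the
# "renamed_" convention are hypothetical.
import subprocess

SRC = "/user/hive/warehouse/mytable"  # hypothetical table directory

ls = subprocess.run(["hdfs", "dfs", "-ls", SRC],
                    capture_output=True, text=True, check=True)
for line in ls.stdout.splitlines():
    parts = line.split()
    if not parts or not parts[-1].startswith("/"):
        continue  # skip the "Found N items" header line
    path = parts[-1]
    name = path.rsplit("/", 1)[-1]
    if name.startswith(("_", ".")):   # hidden to FileInputFormat
        visible = f"{SRC}/renamed_{name.lstrip('_.')}"
        subprocess.run(["hdfs", "dfs", "-mv", path, visible], check=True)
-----------------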
03-05-2019
07:20 AM
Ways to change the pools via the API today: use a PUT call to http://$HOSTNAME:7180/api/v19/clusters/<cluster>/services/<yarn>/config to change yarn_fs_scheduled_allocations, followed by a POST to refresh the pools (http://$HOSTNAME:7180/api/v19/clusters/<cluster>/commands/poolsRefresh).

Pros:
- It updates the pools, as desired.
- It does NOT affect the web UI.

Cons:
- The JSON is complex and prone to typos. A typo could mess up all pools and cause issues on the cluster.
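For reference, a minimal sketch of the two calls with Python's requests library; the host, credentials, service name, and allocations JSON are placeholders:
-----------------
# Minimal sketch, assuming the requests library; host, credentials,
# names, and the allocations document are placeholders.
import json
import requests

CM = "http://cm-host.example.com:7180/api/v19"  # hypothetical CM host
AUTH = ("admin", "admin")                       # placeholder credentials
CLUSTER, YARN = "Cluster1", "yarn"              # placeholder cluster/service names

# The full Fair Scheduler allocations document, serialized as one JSON string.
allocations = json.dumps({"defaultQueueSchedulingPolicy": "drf", "queues": []})

# PUT the new yarn_fs_scheduled_allocations value...
requests.put(
    f"{CM}/clusters/{CLUSTER}/services/{YARN}/config",
    auth=AUTH,
    json={"items": [{"name": "yarn_fs_scheduled_allocations",
                     "value": allocations}]},
)
# ...then POST to refresh the pools.
requests.post(f"{CM}/clusters/{CLUSTER}/commands/poolsRefresh", auth=AUTH)
-----------------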
12-13-2018
12:08 PM
Hi @Prav,
Unfortunately, there is no officially supported way to increase the number of tables loaded in Hue. However, we do currently have a feature request to improve on this behavior.
In the meantime, you can workaround this by:
- Distribute the tables across multiple DBs (recommended)
- Manually adjust the 'max_rows' limit in hive_server2_lib.py as shown below. However, keep these implications in mind before you do that:
The more the limit is increased, the more it will impact the performance of Hue.
If something goes wrong with Hue, this change would potentially make troubleshooting difficult.
The next time CDH is upgraded (even to a maintenance release), a new copy of hive_server2_lib.py will be installed, and the change will have to be made again.
Before making the change, hive_server2_lib.py should be backed up.
Here is the sample code reference from /opt/cloudera/parcels/CDH/lib/hue/apps/beeswax/src/beeswax/server/hive_server2_lib.py:
-----------------
def get_tables(self, database, table_names, table_types=None):
    if not table_types:
        table_types = self.DEFAULT_TABLE_TYPES
    req = TGetTablesReq(schemaName=database, tableName=table_names, tableTypes=table_types)
    res = self.call(self._client.GetTables, req)

    # max_rows caps how many tables Hue fetches per call; this is the limit to adjust.
    results, schema = self.fetch_result(res.operationHandle, orientation=TFetchOrientation.FETCH_NEXT, max_rows=5000)
    self.close_operation(res.operationHandle)

    return HiveServerTRowSet(results.results, schema.schema).cols(('TABLE_NAME',))
-----------------
Hope this helps,
Li
Cloudera Employee
10-25-2018
01:46 PM
Found a solution to this: I had to get the configuration at the role level, which prints everything set in CM. https://hostname:7183/api/v19/clusters/cluster/services/sentry/roles/role-name/config?view=full
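For reference, a minimal sketch of that call with Python's requests library; the host, credentials, and role name are placeholders:
-----------------
# Minimal sketch, assuming the requests library; host, credentials, and
# role name are placeholders.
import requests

url = ("https://cm-host.example.com:7183/api/v19/clusters/cluster/"
       "services/sentry/roles/sentry-SENTRY_SERVER-1/config")  # hypothetical role name
resp = requests.get(url, params={"view": "full"},
                    auth=("admin", "admin"), verify=False)  # TLS check skipped in sketch
for item in resp.json().get("items", []):
    print(item.get("name"), "=", item.get("value", item.get("default")))
-----------------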