Member since
07-06-2018
59
Posts
1
Kudos Received
4
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1537 | 03-05-2019 07:20 AM |
| | 1707 | 01-16-2019 09:15 AM |
| | 1050 | 10-25-2018 01:46 PM |
| | 1217 | 08-02-2018 12:34 PM |
07-14-2020
11:52 AM
Is there a way to poll parcel distribution progress/completion using the CM API?
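A minimal polling sketch, assuming the CM REST endpoint GET /api/v19/clusters/&lt;cluster&gt;/parcels: each parcel entry carries a `stage` field and a `state` object with progress counters. Host, cluster, and parcel names are placeholders; field names follow the ApiParcel model, but verify against your CM version.

```python
# Sketch only: interpret one parcel entry returned by
# GET /api/v19/clusters/<cluster>/parcels (ApiParcel model assumed).

def parcel_progress(parcel):
    """Return (stage, percent complete) for one parcel entry."""
    stage = parcel["stage"]  # e.g. DOWNLOADING, DISTRIBUTING, DISTRIBUTED, ACTIVATED
    state = parcel.get("state", {})
    total = state.get("totalProgress", 0)
    done = state.get("progress", 0)
    if total:
        pct = 100.0 * done / total
    else:
        # Terminal stages report no counters; treat them as complete.
        pct = 100.0 if stage in ("DISTRIBUTED", "ACTIVATED") else 0.0
    return stage, pct

def is_distributed(parcel):
    """True once distribution has finished (or the parcel is already active)."""
    return parcel["stage"] in ("DISTRIBUTED", "ACTIVATED")
```

To poll, fetch that URL in a loop (e.g. with `requests` and HTTP basic auth) and sleep until `is_distributed` returns True for the target parcel.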
Labels:
- Cloudera Essentials
- Cloudera Manager
08-21-2019
11:48 AM
Hi network, Can anyone list scenarios where Impala queries go into the "query waiting to be closed" stage? I was checking one of the impalad web UIs and found a bunch of queries listed under that section. 1. Why do queries end up in that category? I assume a timeout controls it (yet to do a bit of reading on that). 2. Is it safe to close them, or does that lead to any problems? Regards
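On question 1: a query that has finished or been cancelled typically sits in "waiting to be closed" until the client closes its handle or the session times out. The impalad startup flags below are what usually bound how long that lingers; this is a config fragment with illustrative values only, so verify the flag names and defaults against your Impala version:

```shell
# Illustrative values; real impalad startup flags, but confirm for your version.
--idle_query_timeout=600      # cancel queries idle for 10 minutes
--idle_session_timeout=1800   # close sessions idle for 30 minutes
```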
Tags:
- apache-impala
Labels:
- Apache Impala
08-21-2019
11:28 AM
Thanks, that does show more information. Though what I find weird is that the same query has run with a larger load earlier (with the same config params) and has now failed (from the logs: java.lang.OutOfMemoryError: Java heap space). Regards
Tags:
- Spark
08-20-2019
01:35 PM
Hi,
Has anyone come across the error below and can share a common cause? The error message looks very generic:
ERROR Lost executor 12 on host123: Container marked as failed: container_e218_1564435356568_349499_01_000013 on host: host123. Exit status: 143. Diagnostics: Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
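Exit code 143 corresponds to SIGTERM: YARN itself killed the container, most often for exceeding its memory allocation (the diagnostic is generic, so the NodeManager log for that container usually has the real reason). A common first mitigation is raising executor memory and the off-heap overhead; this is a sketch with illustrative sizes, and the overhead key shown is the older Spark-on-YARN spelling (newer Spark uses spark.executor.memoryOverhead):

```shell
# Illustrative sizes; adjust to your workload. 'your_app.jar' is a placeholder.
spark-submit \
  --executor-memory 8g \
  --conf spark.yarn.executor.memoryOverhead=2048 \
  your_app.jar
```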
Tags:
- Spark
Labels:
- Apache Spark
08-14-2019
06:14 PM
On an unrelated side note: is there no delete-post option in this comment section? Also, it allows an empty post 😄 Peace
08-14-2019
06:12 PM
Thanks for the descriptive response. -- "Are there any specific reasons you need the insecure HS2 endpoint?" The reason is that we are moving to enable TLS on HS2. As soon as we do, our larger user base will immediately be impacted (since they'll have to change/update their beeline connection strings, for example). To avoid making them go through a hard cutover, we wanted to try a soft cutover: spin up another replica of the Hive service, encrypt the current/original Hive service, and let users use the new one while they slowly move to the encrypted one. But based on the description below, among other blockers to this approach, the one that really stops me is the HDFS location point. I guess it's not possible for them to be exact replicas, i.e. Hive1 and Hive2 having exactly the same database/table data? Regards
08-12-2019
04:37 PM
I'm afraid it doesn't. TLS is a service-wide property in Hive; it's not specific/bound to particular roles, so the above can't be done per role.
08-12-2019
12:23 PM
Thanks, can you share details of the process? Some of the questions/problems that I have:
1. If a new Hive instance is pointed to the same backend database (of an existing Hive service), will it overwrite any tables that already exist there? (When a new instance is spun up, it creates tables under the selected DB.)
2. A new instance, when spun up, requires a default Hive location in HDFS, and it can't use the existing Hive's default location /user/hive/warehouse; it has to be something else.
3. A new Hive instance running in parallel with the existing Hive service can't use Sentry; Sentry supports only a single Hive service.
The above is what I noticed, and it defeats the purpose: the intention is to use the other Hive service to cater to the same customers that the original service supports. The different default HDFS location, the absence of Sentry, and maybe more scenarios I haven't yet come across are blocking me from going ahead with this setup. Do you have a way to bypass all of this such that both Hive services are identical (except for encryption status)? Regards
08-12-2019
08:13 AM
Hi, Can the forum please address/answer: Is it possible to have two Hive services installed in one Hadoop cluster such that one of the Hive services (HS2) is encrypted and the other is not? Also, is it possible to have both services pointing to the same HMS? Regards
Labels:
- Apache Hive
08-09-2019
08:45 AM
Thanks for confirming that. We'll enable it for Impala as well, but after a week or so; I wanted to know whether it would still work in the meantime.
08-09-2019
07:52 AM
Hi Team, We are in the process of enabling TLS on HS2, and I wanted to clarify whether Impala will be affected when that happens. Impala depends on HMS, and TLS here is limited to HiveServer2; given that, should running Impala without TLS while Hive has it enabled be OK, or do you see a scenario where this combination may not work? Regards
Labels:
- Apache Hive
- Apache Impala
08-08-2019
08:36 AM
Hi @bgooley While we are discussing this topic, is there a way (an API call) to get the health of the services running on each host? For example, like how Cloudera Manager lets me list hosts from the Hosts tab and shows the services running on a particular host with a health status (green, amber, or red). Regards
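A sketch under the assumption that GET /api/v19/hosts?view=full is available: each returned host carries a healthSummary (e.g. GOOD/CONCERNING/BAD) plus references to the role instances on it. Field names follow the ApiHost model; everything else (names, values) is illustrative.

```python
# Sketch: summarize per-host health from a GET /api/v19/hosts?view=full payload
# (ApiHost model assumed; verify field names against your CM version).

def hosts_by_health(api_response):
    """Map hostname -> (healthSummary, [role names])."""
    out = {}
    for host in api_response.get("items", []):
        roles = [r.get("roleName", "") for r in host.get("roleRefs", [])]
        out[host["hostname"]] = (host.get("healthSummary", "NOT_AVAILABLE"), roles)
    return out
```

Fetch the URL with HTTP basic auth (e.g. via `requests`) and feed the decoded JSON to `hosts_by_health`.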
08-08-2019
07:03 AM
Thanks @bgooley Not sure I'll be able to integrate that part, but that's good to know. Can you please share the JIRA number for my reference? Regards
08-07-2019
10:42 AM
Hi All, Is there an in-house CM API call to pull config history details, like what you can fetch using the "History and Rollback" tab under a service's Configuration page in Cloudera Manager? Regards
Labels:
- Cloudera Manager
08-02-2019
09:05 AM
Thanks. To be on the same page, take the scenario below: the HDFS snapshottable location /a/b/ has a file c which is snapshotted. Now c is deleted from HDFS via the CLI (hdfs dfs -rm -r -skipTrash; the NN transaction happens and the HDFS CLI no longer shows the file), and then a new file is created with the same content/size and name. What gets stored in HDFS, and what is the delta that the snapshot adds in this case? Is it just that the snapshot still holds c's blocks in HDFS in addition to the newly created file, with NN resources used to maintain both of their metadata in heap? Is that all, or is there more to it? Regards
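The scenario can be observed directly with the snapshot CLI. The commands below use the paths from the example above and a made-up snapshot name, and assume /a/b is already snapshottable:

```shell
# Placeholder paths/snapshot name; assumes /a/b is already snapshottable.
hdfs dfs -createSnapshot /a/b s0       # pin the current state of /a/b, including c
hdfs dfs -rm -r -skipTrash /a/b/c      # c leaves the live namespace, but its blocks
                                       # survive because snapshot s0 still references them
hdfs dfs -put c /a/b/c                 # new file with new blocks, even if content is identical
hdfs snapshotDiff /a/b s0 .            # reports c as deleted and re-created relative to s0
```

So both the old blocks (held by s0) and the new file's blocks occupy space, and the NameNode tracks metadata for both in heap.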
08-01-2019
09:56 AM
Thanks Arpit. To clarify, what is the magnitude of the size we are talking about here when you mention "directory will need to be tracked as deltas and that can result in both higher disk space and NameNode heap usage"? I'm assuming you mean just the metadata of the changed snapshot, which isn't significant given the actual size of the data held (in reference to my example above); if not, please clarify. Regards
08-01-2019
08:59 AM
Hi,
Can someone confirm whether enabling a snapshot on a location occupies space in HDFS?
For example: the HDFS location /a/b/c is the only location in HDFS and occupies 9 TB post replication (3x).
It looks like:
3.0 9.0T /a/b/c
Question: after enabling a snapshot on this location, will total HDFS utilization at the cluster level increase to 18.0 TB?
Regards
Tags:
- HDFS
Labels:
- Apache Hadoop
07-30-2019
01:59 PM
Hi everyone, I'm using the command in the subject to fetch all YARN applications and grep the output to filter out a specific application. I wanted to find out the limit of this command: how far back in history does it go to get these states? Is this bounded by the Job History Server's retention, or does it return something like the last 1000 jobs? Regards
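`yarn application -list` queries the ResourceManager, not the Job History Server, so completed applications drop out of the listing once the RM evicts them from its completed-application store. The bound is the yarn-site.xml property below; the value shown is the stock default, worth verifying for your Hadoop version:

```xml
<!-- yarn-site.xml: how many completed applications the RM retains -->
<property>
  <name>yarn.resourcemanager.max-completed-applications</name>
  <value>10000</value>
</property>
```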
Labels:
- Apache YARN
06-26-2019
10:10 AM
Thanks, I was using that endpoint for GET so far; I didn't realise I could use POST calls on it as well. I'll give it a go and get back.
06-13-2019
11:57 AM
Hi Team, I have been working with the CM API lately to do some automation. One of the tasks I have now is to automate role-instance addition on an existing cluster host, for example spinning up a NodeManager on a cluster host. I tried searching for CM API calls for this but couldn't find anything directly related; maybe you know? Thanks in advance. Regards
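A sketch of what this can look like, assuming the POST /api/v19/clusters/&lt;cluster&gt;/services/&lt;service&gt;/roles endpoint with an ApiRoleList body; the cluster, service, host, and role identifiers below are placeholders, so verify the endpoint against your CM version:

```python
# Sketch: build the URL and JSON body for adding a role instance
# (e.g. a NODEMANAGER) on an existing host via the CM API.

def add_role_payload(role_type, host_id, role_name=None):
    """ApiRoleList body for POST .../services/<service>/roles."""
    role = {"type": role_type, "hostRef": {"hostId": host_id}}
    if role_name:  # optional; CM auto-generates a name if omitted
        role["name"] = role_name
    return {"items": [role]}

def roles_endpoint(base, cluster, service):
    return "%s/api/v19/clusters/%s/services/%s/roles" % (base, cluster, service)
```

POST the payload with basic auth, then start the new role instance (there is a separate roleCommands/start call for that).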
Labels:
- Cloudera Manager
06-05-2019
07:15 AM
Hey Network, Has anyone had this issue, or can the Cloudera team in this community share whether this is a known bug? Regards
06-03-2019
01:55 PM
I have a scenario where I'm trying to create a table that points to an HDFS location whose path contains a directory name starting with "_". Table creation goes through, but if I try to read data out of the table it throws an error. Below is what I get:

create external table `ingest.workgroup__views2`
row format serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
location 'hdfs://nameservice1/user/data/ingest/mdm/workgroup_i/workgroup/_views'
tblproperties ('avro.schema.url'='hdfs://nameservice1/user/data/ingest/mdm/workgroup_i/workgroup/_views/_gen/_views.avsc');
No rows affected (0.232 seconds)
0: jdbc:hive2://t-hive.sys.cigna.com:25006/de> select * from ingest.workgroup__views2;
Error: java.io.IOException: org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: hdfs://nameservice1/user/data/ingest/mdm/workgroup_i/workgroup/_views (state=,code=0)
0: jdbc:hive2://t-hive.sys.cigna.com:25006/de> drop table ingest.workgroup__views2;

So I escape the special character "_" in the location; the table gets created and I'm able to run a select to see data, as below:

create external table `ingest.workgroup__views2`
row format serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
location 'hdfs://nameservice1/user/data/ingest/mdm/workgroup_i/workgroup/\_views'
tblproperties ('avro.schema.url'='hdfs://nameservice1/user/data/ingest/mdm/workgroup_i/workgroup/_views/_gen/_views.avsc');
No rows affected (0.19 seconds)
0: jdbc:hive2://t-hive.sys.cigna.com:25006/de> select * from ingest.workgroup__views2;
(empty result set; columns: id, name, view_url, created_at, owner_id, owner_name, workbook_id, index, title, caption, site_id)
No rows selected (0.139 seconds)

Now the weird part is that it's only the location clause which has this issue; parsing of the URI mentioned under tblproperties goes through, as you can see above, and if I explicitly try to escape "_" in tblproperties it doesn't work. Any comments or suggestions on the above observation will be helpful. Regards
Labels:
- Apache Hive
- HDFS
03-05-2019
07:20 AM
Ways to change the pools via the API today: use a PUT call to http://$HOSTNAME:7180/api/v19/clusters/<cluster>/services/<yarn>/config to change yarn_fs_scheduled_allocations, followed by a POST to refresh the pools (http://$HOSTNAME:7180/api/v19/clusters/<cluster>/commands/poolsRefresh). Pros: it does update the pools, as desired, and it does NOT affect the web UI. Cons: the JSON is complex and prone to typos; a typo could mess up all pools and cause issues on the cluster.
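The two calls above can be sketched as follows. Host, cluster, and service names are placeholders; the one structural assumption is that the config item's value is the allocations JSON document serialized to a string:

```python
# Sketch of the two-step flow: PUT yarn_fs_scheduled_allocations, then
# POST the poolsRefresh command.
import json

def pools_config_body(allocations_dict):
    """PUT body that sets the yarn_fs_scheduled_allocations config item."""
    return {"items": [{"name": "yarn_fs_scheduled_allocations",
                       "value": json.dumps(allocations_dict)}]}

def pools_urls(base, cluster, yarn_service):
    put_url = "%s/api/v19/clusters/%s/services/%s/config" % (base, cluster, yarn_service)
    refresh_url = "%s/api/v19/clusters/%s/commands/poolsRefresh" % (base, cluster)
    return put_url, refresh_url
```

Given the typo risk called out above, a sensible pattern is to GET the current value first, mutate the parsed JSON, and PUT it back rather than hand-writing the document.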
01-16-2019
09:15 AM
Turns out it's a limitation as of now. Updates made to resource pools using the CM API are known to break the DRP UI. There are improvement tickets open internally with Cloudera to address that.
01-09-2019
08:20 AM
Specifically looking for: Is this doable through the CM API? Can I adjust the weights, memory, and CPU of a YARN resource pool? Can I create and delete pools? Can I do the same for Admission Control allocations in Impala?
01-08-2019
02:19 PM
1 Kudo
Hi Team,
Can anyone share whether they have updated resource pool configurations in YARN using the CM API, and if so, which endpoints were used? Thanks in advance.
Labels:
- Apache Impala
- Apache YARN
- Cloudera Manager
12-07-2018
12:11 PM
Hi, can anyone confirm whether the post from HDP support can be applied in a CDH environment as well? dbms.py is available under /opt/cloudera/parcels/CDH/lib/hue/apps/beeswax/src/beeswax/server and has:

def get_indexes(self, db_name, table_name):
    hql = 'SHOW FORMATTED INDEXES ON `%(table)s` IN `%(database)s`' % {'table': table_name, 'database': db_name}
    query = hql_query(hql)
    handle = self.execute_and_wait(query, timeout_sec=15.0)
    if handle:
        result = self.fetch(handle, rows=5000)
        self.close(handle)
    return result

def get_functions(self, prefix=None):
    filter = '"%s.*"' % prefix if prefix else '".*"'
    hql = 'SHOW FUNCTIONS %s' % filter
    query = hql_query(hql)
    handle = self.execute_and_wait(query, timeout_sec=15.0)
    if handle:
        result = self.fetch(handle, rows=5000)
        self.close(handle)
    return result
11-29-2018
09:48 AM
Hi All, Is there a way to list more than 5000 tables in a database? By default Hue shows the first 5000 tables from a database; is there a configuration change, supplied through a snippet, by which we can override this? I found a related article but am not sure whether it applies to CDH 5.14 as well: https://community.hortonworks.com/articles/75938/hue-does-not-list-the-tables-post-5000-in-number.html Regards
Labels:
- Cloudera Hue
10-25-2018
01:46 PM
Found a solution to this: I had to get the configuration at the role level, which prints everything set in CM. https://hostname:7183/api/v19/clusters/cluster/services/sentry/roles/role-name/config?view=full
10-25-2018
12:49 PM
Hi Team, I was looking for a way to change some cluster configs (CDH 5.14) and realized that the CM API calls for viewing the configuration of a particular service don't return all the configurable items. For example, I ran: https://hostname:7183/api/v19/clusters/cluster/services/sentry/config?view=full But this didn't capture the Java heap set for the Sentry process. What's the best way to see all configuration items for a particular role/service? Thanks in advance.
Labels:
- Cloudera Manager