03-25-2020
11:06 PM
These app cache directories are auto-generated on job submission. Can you remove them from the NodeManagers (so that they get created fresh with the required ACLs) and then resubmit the job? The directories are /disk{1,2,3,4,5}/yarn/nm/usercache/mcaf
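A minimal sketch of that cleanup, to be run on each NodeManager host. The function name clear_usercache and the DRY_RUN switch are hypothetical and only for illustration; the paths and the user name mcaf come from the post. By default it only prints what it would remove.

```shell
# Hypothetical helper: list (default) or remove the per-user cache directories
# for user "mcaf" on disks 1 to 5 of a NodeManager host.
user="mcaf"

clear_usercache() {
  for n in 1 2 3 4 5; do
    dir="/disk$n/yarn/nm/usercache/$user"
    if [ "${DRY_RUN:-1}" = "1" ]; then
      # Dry run: only show the directory that would be deleted
      echo "would remove $dir"
    else
      rm -rf -- "$dir"
    fi
  done
}

clear_usercache
```

Set DRY_RUN=0 only after checking the printed paths. YARN should recreate the directories on the next job submission, as described above.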
03-23-2020
03:59 AM
"although same property (dfs.datanode.balance.max.concurrent.moves) already exists in Cloudera Manager." --> Okay, I assume you are referring to the one highlighted in the screenshot below.
Yes, it's unnecessary to add dfs.datanode.balance.max.concurrent.moves to the "Balancer Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml" if you have already used the "Maximum Concurrent Moves" setting. Also note that "Maximum Concurrent Moves" is scoped only to the balancer and not to the DataNodes, so for DataNodes you have to set it explicitly using the "DataNode Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml".
The reason for adding this property on both the balancer and the DataNodes is explained in my previous comment. Hope that clarifies things; let me know if there are further questions.
I will raise an internal JIRA to correct the document so that it avoids the duplicate entry in the balancer safety valve.
03-22-2020
11:32 PM
Yes, you can install CM offline after downloading the packages. It's documented at this link: https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/cm_ig_create_local_package_repo.html#internal_package_repo
Once the repo is ready, you can install the binaries using the steps at https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/install_cloudera_packages.html#id_z2h_pnm_25
03-22-2020
10:09 PM
This error usually happens if the client doesn't match the QOP configured on the server. Can you share the connection string used in your code snippet? Is your HiveServer2 kerberized? Can you also share the value set for the property hive.server2.thrift.sasl.qop in your HiveServer2's hive-site.xml? An example connection string is at https://github.com/dropbox/PyHive/pull/135/files/ec5270c4b6556bcd20f0f81afbced4a69ca9eff0
03-22-2020
08:48 PM
You would need to tune your NameNode heap in accordance with the number of files. The tuning guideline is in this document: https://docs.cloudera.com/HDPDocuments/HDP2/HDP-2.6.5/bk_command-line-installation/content/configuring-namenode-heap-size.html
If you would like to get the count of files, you can run hdfs dfs -count /
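As a rough sketch of reading that output: hdfs dfs -count prints the directory count, file count, content size in bytes, and path, in that order. The helper name summarize_count and the sample numbers below are made up for illustration; only the column order is taken from the command's documented output.

```shell
# Made-up sample line in the column order printed by `hdfs dfs -count /`:
# DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME
sample="      12345     6789012   987654321098 /"

summarize_count() {
  # awk splits on whitespace, so the leading padding is ignored
  awk '{ printf "dirs=%d files=%d bytes=%s path=%s\n", $1, $2, $3, $4 }'
}

echo "$sample" | summarize_count
# On a cluster you would pipe the real command instead:
#   hdfs dfs -count / | summarize_count
```

The files figure is the number to compare against the heap sizing table in the linked guide.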
03-22-2020
08:16 PM
Just a correction: the document suggests tuning the property dfs.datanode.balance.max.concurrent.moves, not dfs.datanode.ec.reconstruction.xmits.weight.
Regarding the question of why to add dfs.datanode.balance.max.concurrent.moves again when it is already present for the DataNode and the balancer: the doc says "Add the following code to the configuration field, for example, setting the value to 50." In other words, 50 is just an example number, and the document doesn't mandate setting the value to 50. You can tune it to whatever value you require.
Then why add it on both the balancer and the DataNode? Setting it on the HDFS Balancer (client) gives you the flexibility to change this value on the client side at runtime, i.e. you can set this property to a value less than or equal to what you have configured on the DataNode side. The reason we set it on the server side is to impose an upper limit on the value the property can be configured to. If you configure a value greater than what you have set on the DataNode (server), the DataNodes fail it
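The client/server relationship described above can be sketched as a small pre-check before starting the balancer. The two numbers and the helper name balancer_cmd are hypothetical; the -D option is the generic Hadoop way to pass a property on the command line, and the sketch only prints the command rather than running it.

```shell
datanode_limit=50   # hypothetical value configured in the DataNode safety valve
balancer_moves=20   # hypothetical value to use for this balancer run

balancer_cmd() {
  # $1 = value for the balancer (client), $2 = limit set on the DataNodes (server)
  if [ "$1" -gt "$2" ]; then
    echo "balancer value $1 exceeds DataNode limit $2" >&2
    return 1
  fi
  # Print the command instead of executing it
  echo "hdfs balancer -Ddfs.datanode.balance.max.concurrent.moves=$1"
}

balancer_cmd "$balancer_moves" "$datanode_limit"
```

Any value up to the DataNode limit passes the check; a larger one is refused before the balancer is started.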
03-22-2020
06:32 AM
The error suggests the DFSClient is unable to read the blocks because of a connection failure. Either the ports are blocked or they are unreachable from the node.
From the node where you are running the code snippet (or the node where the executor ran), try reading the file using hdfs commands in debug mode. That can give further clues about which node/service the client was trying to reach before the connect timeout:
export HADOOP_ROOT_LOGGER=DEBUG,console
hdfs dfs -cat hdfs://ec2-18-234-71-106.compute-1.amazonaws.com:8020/dataset/Tech.csv
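If the debug output points at a specific host and port, a quick reachability check can confirm whether it is blocked. This is only a sketch under assumptions: it assumes nc (netcat) is installed, that a DataNode also runs on the NameNode host (the post does not say so), and that the DataNode uses the default transfer port 9866 (Hadoop 3; 50010 in Hadoop 2). The helper name probe_list and the RUN_PROBE switch are hypothetical. Without RUN_PROBE=1 it only prints the targets.

```shell
namenode_host="ec2-18-234-71-106.compute-1.amazonaws.com"  # host from the URI above

probe_list() {
  # One "host port" pair per line.
  # 8020 is the NameNode RPC port from the URI; 9866 is an assumed DataNode port.
  # Replace these with the hosts/ports reported in the debug output.
  echo "$namenode_host 8020"
  echo "$namenode_host 9866"
}

probe_list | while read -r host port; do
  if [ "${RUN_PROBE:-0}" = "1" ]; then
    # -z: only test the connection, -w 5: give up after 5 seconds
    if nc -z -w 5 "$host" "$port"; then
      echo "open $host:$port"
    else
      echo "unreachable $host:$port"
    fi
  else
    echo "would probe $host:$port"
  fi
done
```

Run it with RUN_PROBE=1 from the same node where the read fails, since the point is to test the path from that client.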
03-22-2020
06:09 AM
@erkansirin78 Let me make sure I understand the issue correctly. By "Before restart, I saw totally different properties added." did you mean the property dfs.datanode.ec.reconstruction.xmits.weight getting added? If yes, then it is not actually being added. The preview page is just showing the extra lines that precede the property you added; only the lines with a + sign matter.
03-16-2020
01:44 AM
Yeah, that's right. Unfortunately, there is no feature available to gather NiFi lineage in Navigator.
03-16-2020
01:22 AM
Right, in CDH it's not available, but with CDP you have the option to install Atlas, which already has integration with NiFi.
https://docs.cloudera.com/cdpdc/7.0/overview/topics/cdpdc-overview.html
Data Engineering: Ingest, transform, and analyze data. Services: HDFS, YARN, YARN Queue Manager, Ranger, Atlas, Hive Metastore, Hive on Tez, Spark, Oozie, Hue, and Data Analytics Studio
Data Mart: Browse, query, and explore your data in an interactive way. Services: HDFS, YARN, YARN Queue Manager, Ranger, Atlas, Hive Metastore, Impala, and Hue
Operational Database: Low latency writes, reads, and persistent access to data for Online Transactional Processing (OLTP) use cases. Services: HDFS, Ranger, Atlas, and HBase
https://docs.cloudera.com/HDPDocuments/HDF3/HDF-3.4.1.1/installing-hdf/content/configure_nifi_for_atlas_integration.html