Member since 12-06-2022
32 Posts
2 Kudos Received
1 Solution
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 2752 | 06-08-2023 11:41 PM |
05-14-2023 09:48 PM
I have a cluster of 15 nodes and I'm getting a bit of a headache trying to assign roles to nodes. I'm following this Role distribution instruction from Cloudera. But I wonder: can I just add a Gateway role to every node for some services? Even if it's a NameNode, a master node, or whatever? I just want to make sure that everyone can access services from every node in the cluster. I always end up wondering things like "this is a Hive node, do I need to add a Gateway role? Oh, that is an HBase Master node, do I need to add a Gateway role?" and so on. Now I don't want to think too much about it. Is there any performance issue, or any incompatibility issue, if I add a Gateway role to every node for every service, just for the sake of simplicity?
Labels: Cloudera Manager
05-14-2023 07:53 PM
I successfully installed it on 3 nodes. Normally you only need to install everything on 1 node (prerequisites like Java and Python have to be installed on all 3 nodes first, of course). When you go to the CM UI, you can add another node and Cloudera will automatically install everything for you. In case you want to install things manually: install all 3 packages ("cloudera-manager-daemons", "cloudera-manager-agent", "cloudera-manager-server") on your main node, and on the other nodes install only "cloudera-manager-daemons" and "cloudera-manager-agent", then start the agent service. After that, you will see that the two nodes are "managed" in the CM UI, meaning you can skip the "Install Agents" step (since you've already installed "cloudera-manager-agent" and started it).
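To summarize the split described above, here is a minimal sketch. The hostnames (node1, node2, node3) and the assumption that node1 is the Cloudera Manager server host come from the posts in this thread; the install commands themselves are shown as comments because they require the Cloudera yum repository to be configured first.

```shell
# Packages for the Cloudera Manager server host (assumption: node1 is the CM server).
SERVER_PKGS="cloudera-manager-daemons cloudera-manager-agent cloudera-manager-server"
# Packages for every other host (node2, node3): agent + daemons only.
AGENT_PKGS="cloudera-manager-daemons cloudera-manager-agent"

# On node1 only:
#   sudo yum install -y $SERVER_PKGS
#   sudo systemctl start cloudera-scm-server
# On node2 and node3:
#   sudo yum install -y $AGENT_PKGS
#   sudo systemctl start cloudera-scm-agent
echo "$AGENT_PKGS"
```

Once the agents on node2 and node3 are running and pointed at the server, they show up as "managed" hosts in the CM UI, as described above.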
05-11-2023 06:32 AM
I have a cluster of 3 nodes (all brand new with CentOS 7: no Java, no MySQL, nothing at all). I'm following this official install guide to install CDH 6.2.0 on the first node (called node1). Everything was fine, but do I need to repeat everything in the guide the same way on node2 and node3? I mean, do I need to run sudo yum install cloudera-manager-daemons cloudera-manager-agent cloudera-manager-server and sudo /opt/cloudera/cm/schema/scm_prepare_database.sh mysql scm scm and all the other commands on all 3 nodes? The instructions are unclear. I read some articles on the internet saying that I only need to install "cloudera-manager-agent" and "cloudera-manager-daemons" on all nodes, and that "cloudera-manager-server" only needs to be installed on 1 node, or something like that. Which steps should I execute on all nodes, and which steps only on 1 node? Or do I only need to install CDH on 1 node, then add a new node through the Cloudera Manager UI, and it will automatically install everything on that new node?
Labels: Cloudera Manager
02-01-2023 05:49 PM
I'm using a tool in which I have to point to the master node (driver node) of the Cloudera Spark cluster (spark://<some-spark-master>:7077). As I learned, Spark has a "Master Node", a "Driver Node", and "Worker Nodes". So I went to the Cloudera Manager web UI and checked the Configuration tab of the Spark service, but all I found were "Gateway" and "History Server" instances. Where are the "Driver" and "Worker" instances? I can't add these two instances via "Add Role Instances" either. My guess is that this is in the YARN service configuration, but I can't find anything related to "Master", "Driver", or "Worker" there either. So what is the link to the "Spark Master" that ends with 7077 (which node is it)? I can't find it anywhere in the Configuration tab.
Labels: Apache Spark
01-31-2023 06:14 PM
I'm using a tool in which I have to point to the master node (driver node) of the Cloudera Spark cluster (spark://<some-spark-master>:7077). As I learned, Spark has a "Master Node" (Driver Node) and "Worker Nodes". So I went to the Cloudera Manager web UI and checked the Configuration tab of the Spark service, but all I found were "Gateway" and "History Server" instances. Where are the "Driver" and "Worker" instances? I can't add these two instances via "Add Role Instances" either. My guess is that this is in the YARN service configuration, but I can't find anything related to "Master"/"Driver" or "Worker" there either. So what is the link to the "Spark Master" that ends with 7077? I can't find it anywhere in the Configuration tab.
Labels: Apache Spark
12-28-2022 10:55 PM
What are the Kafka Gateway and Kafka MirrorMaker roles when adding Role Instances to Kafka? I created a Kafka service in Cloudera Manager. Now I want to add a new Kafka Broker instance inside the Kafka service. I followed a guide on the internet: choose Instances -> Add Role Instances, and a new window comes up. But I noticed that I can only add Kafka Broker instances (the guide said I could add Kafka Connect instances too). Also, there are two other roles called Gateway and MirrorMaker, and I don't know what they are. I searched Google and found some info about Kafka MirrorMaker, but had no luck finding anything about the Kafka Gateway.
Labels: Apache Kafka
12-20-2022 06:08 PM
Hi. I want to know where the jar folder for Spark is in Cloudera. In my previous company, we just put all the needed jars inside the $SPARK_HOME/jars folder (on every node), so that we didn't have to worry much about --jars, --packages, etc. when running spark-submit jobs. It also saves a lot of disk space and time, since we don't need to bundle every package when building a jar. But in my new company, which uses Cloudera, I don't know where this jar folder is. I found 2 places (maybe not the right ones):
- /opt/cloudera/parcels/CDH-6.2.0-1.cdh6.2.0.p0.967373/jars
- /opt/cloudera/parcels/CDH-6.2.0-1.cdh6.2.0.p0.967373/lib/spark/jars
Where should I put the needed jar files? It seems like all the jar files in the second location are linked to the first one. Also, I found dozens of lib/jar folders all over the Cloudera installation. Or is there another way to do this with Cloudera? I read some guides on the internet about modifying Spark configs in Cloudera Manager.
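One common alternative to copying jars into the parcel directories (which a parcel upgrade would replace) is to keep extra dependencies in a separate directory and pass them at submit time. A minimal sketch, assuming a hypothetical jar at /opt/extra-jars/mylib.jar; the parcel path below is the one quoted in the question, and the spark-submit line is shown as a comment since it needs a real application jar:

```shell
# Spark's jar directory inside the CDH parcel (path taken from the question above).
PARCEL_SPARK_JARS=/opt/cloudera/parcels/CDH-6.2.0-1.cdh6.2.0.p0.967373/lib/spark/jars
# Hypothetical location for site-specific dependencies, outside the parcel tree.
EXTRA_JARS=/opt/extra-jars/mylib.jar

# spark-submit --jars "$EXTRA_JARS" --class com.example.Main app.jar
echo "$EXTRA_JARS"
```

Keeping site jars outside the parcel means they survive CDH upgrades, at the cost of passing --jars (or setting an equivalent classpath option in Cloudera Manager) per job.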
Labels: Apache Spark
12-08-2022 08:15 PM
I figured it out. First, go to /etc/spark/conf.cloudera.spark_on_yarn/classpath.txt and delete the last line (which contains the path to the hbase jar). Then download hbase-spark-1.0.0.7.2.15.0-147.jar, and when you run spark-shell, add --jars pathToYourDownloadedJar. Finally, add .option("hbase.spark.pushdown.columnfilter", false) before loading the data, like this:

```scala
val sql = spark.sqlContext

// Read the HBase table "person" through the hbase-spark connector,
// with column pushdown filtering disabled.
val df = sql.read.format("org.apache.hadoop.hbase.spark")
  .option("hbase.columns.mapping",
    "name STRING :key, email STRING c:email, " +
    "birthDate STRING p:birthDate, height FLOAT p:height")
  .option("hbase.table", "person")
  .option("hbase.spark.use.hbasecontext", false)
  .option("hbase.spark.pushdown.columnfilter", false)
  .load()

df.createOrReplaceTempView("personView")
val results = sql.sql("SELECT * FROM personView WHERE name = 'alice'")
results.show()
```
12-06-2022 06:02 PM
I read a Cloudera documentation guide at this Schedule Job link. The problem is that I don't have access to the "Cloudera Data Platform (CDP) Management Console" (screenshot omitted). I only have access to the Cloudera web UI at xxx.xxx.xxx.xxx:7180 (screenshot omitted). Please note that we run Cloudera on a host machine running CentOS, and I have to SSH to that machine; there isn't any UI like the Management Console, only a terminal with command lines. I only have the web server. I have the same problem with many other guides on the website: they always require access to the CDP Management Console UI.