Member since: 01-19-2017
Posts: 3676
Kudos Received: 632
Solutions: 372
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 488 | 06-04-2025 11:36 PM |
|  | 1014 | 03-23-2025 05:23 AM |
|  | 538 | 03-17-2025 10:18 AM |
|  | 2027 | 03-05-2025 01:34 PM |
|  | 1264 | 03-03-2025 01:09 PM |
10-26-2020
02:32 PM
@ParthiCyberPunk Unfortunately, you didn't share the connect string. Below is an example you could use:
jdbc:hive2://host:10000/DB_name;ssl=true;sslTrustStore=$JAVA_HOME/jre/lib/security/certs_name;trustStorePassword=$password
Substitute the host, port, truststore location, certificate name, and password accordingly. Keep me posted.
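Just to illustrate, a quick way to test such a connect string from the shell is with Beeline; everything in the sketch below (hostname, database, truststore path, password) is a placeholder to swap for your own values.

```bash
# Hypothetical values -- substitute your own host, database, truststore and password
beeline -u "jdbc:hive2://hiveserver.example.com:10000/default;ssl=true;sslTrustStore=/etc/security/certs/truststore.jks;trustStorePassword=MySecret" \
        -e "SHOW DATABASES;"
```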
10-26-2020
11:47 AM
@anhthu You will need to fire up your cluster. Start by scrolling to the bottom (see the attached screenshot) and start Cloudera Manager [CM]: the blue triangular shape gives you a drop-down menu, choose Start. You will see some startup logs, and if all goes well the status turns green. You can see exactly the same error on my Quickstart sandbox because my services are not started; again, see the attached screenshot. On the blue inverted triangle you will see a drop-down list, choose Start the services. This will start all the services in the right order. Again, once everything is GREEN you are good to go. Happy Hadooping
10-26-2020
10:32 AM
1 Kudo
@sgovi Can you confirm you have only one network card enabled? Please share the output of the below command from the web-shell CLI:
$ ifconfig
Please revert.
10-18-2020
03:26 AM
@cbfr Before you decide whether the cluster has corrupt files, can you check the replication factor? If it's set to 2, then that's normal. The help option will give you a full list of sub-commands:
$ hdfs fsck / ?
fsck: can only operate on one path at a time: '?'
The list of sub-command options:
$ hdfs fsck <path> [-list-corruptfileblocks | [-move | -delete | -openforwrite] [-files [-blocks [-locations | -racks | -replicaDetails | -upgradedomains]]]] [-includeSnapshots] [-showprogress] [-storagepolicies]
[-maintenance] [-blockId <blk_Id>]
<path> start checking from this path
-move move corrupted files to /lost+found
-delete delete corrupted files
-files print out files being checked
-openforwrite print out files opened for write
-includeSnapshots include snapshot data of a snapshottable directory
-list-corruptfileblocks print out list of missing blocks and files they belong to
-files -blocks print out block report
-files -blocks -locations print out locations for every block
-files -blocks -racks print out network topology for data-node locations
-files -blocks -replicaDetails print out each replica details
-files -blocks -upgradedomains print out upgrade domains for every block
-storagepolicies print out storage policy summary for the blocks
-maintenance print out maintenance state node details
-showprogress show progress in output. Default is OFF (no progress)
-blockId print out which file this blockId belongs to, locations (nodes, racks)
It would be good to first check for corrupt files and then run the delete:
$ hdfs fsck / -list-corruptfileblocks
Connecting to namenode via http://mackenzie.test.com:50070/fsck?ugi=hdfs&listcorruptfileblocks=1&path=%2F
---output--
The filesystem under path '/' has 0 CORRUPT files
A simple demo: here my replication factor is 1 (see the above screenshot), so when I create a new file in HDFS the default replication factor is set to 1.
$ hdfs dfs -touch /user/tester.txt
Now check the replication factor; note the number 1 before the owner:group hdfs:hdfs.
$ hdfs dfs -ls /user/tester.txt
-rw-r--r-- 1 hdfs hdfs 0 2020-10-18 10:21 /user/tester.txt
Hope that helps
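If it turns out you do have corrupt files, a check-then-clean sequence along these lines should work; it assumes you run it as the hdfs superuser and that you have confirmed the flagged files are not recoverable before deleting anything.

```bash
# List any blocks/files HDFS currently considers corrupt (read-only check)
hdfs fsck / -list-corruptfileblocks

# Inspect a suspect path in detail: which blocks, on which DataNodes
hdfs fsck /user/tester.txt -files -blocks -locations

# Only once you are sure the files cannot be recovered, move or delete them
hdfs fsck / -move      # relocate corrupt files to /lost+found
# hdfs fsck / -delete  # or remove them outright
```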
10-17-2020
02:55 PM
1 Kudo
@sgovi I have just downloaded a VirtualBox sandbox image and imported it into VirtualBox successfully. In my configuration, I enabled only one network card (Bridged Adapter), so it picks up an IP from my LAN of 192.168.0.x. After uncompressing the Docker image, the initial screen shows it picked my local LAN IP, which I used to access the browser CLI as shown below. Make sure you update your Windows hosts file in C:\Windows\System32\drivers\etc. Using the above URL, change the initial root and Ambari passwords; see the steps. I completed the steps below, changing the initial root password (hadoop) and then resetting the Ambari admin password. Once that is successful, it starts the Ambari server.
sandbox-hdp login: root
root@sandbox-hdp.hortonworks.com's password:
You are required to change your password immediately (root enforced)
Last login: Sat Oct 17 20:21:47 2020
Changing password for root.
(current) UNIX password:
New password:
Ambari user password reset steps:
ambari-admin-password-reset
Please set the password for admin:
Please retype the password for admin:
The admin password has been set.
Restarting ambari-server to make the password change effective...
Using python /usr/bin/python
Restarting ambari-server
Waiting for server stop...
Ambari Server stopped
Ambari Server running with administrator privileges.
Organizing resource files at /var/lib/ambari-server/resources...
Ambari database consistency check started...
Server PID at: /var/run/ambari-server/ambari-server.pid
Server out at: /var/log/ambari-server/ambari-server.out
Server log at: /var/log/ambari-server/ambari-server.log
Waiting for server start................................
Server started listening on 8080
DB configs consistency check: no errors and warnings were found.
Using the local IP given to the VirtualBox VM from my LAN, I could access Ambari with the new password I reset above and restarted all services, though some were already running. Can you confirm you followed those steps and still failed? Happy Hadooping
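For the hosts-file step mentioned above, an entry like this sketch is what you want; 192.168.0.50 is just a placeholder for whatever LAN IP your sandbox splash screen shows.

```
# Add to C:\Windows\System32\drivers\etc\hosts (or /etc/hosts on Linux/macOS)
# Replace 192.168.0.50 with the IP shown on your sandbox splash screen
192.168.0.50   sandbox-hdp.hortonworks.com sandbox-hdp
```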
10-16-2020
01:47 PM
@bhoken In a Kerberized cluster, Kafka ACLs are managed by Ranger if the Kafka plugin is enabled, so don't look further than Ranger. Please share your Ranger Kafka policy.
10-10-2020
02:50 PM
@kumarkeshav The parameter you are looking for, hbase.regionserver.global.memstore.size, is found in hbase-site.xml. Depending on your setup, this value can be edited separately in that file or centrally through Ambari.
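For reference, a sketch of what the entry looks like in hbase-site.xml; 0.4 is only an illustrative value (the usual default), and on an Ambari-managed cluster you would change it from the HBase config page rather than editing the file by hand.

```xml
<!-- Fraction of RegionServer heap reserved for all memstores; 0.4 shown as an example -->
<property>
  <name>hbase.regionserver.global.memstore.size</name>
  <value>0.4</value>
</property>
```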
10-10-2020
02:08 PM
1 Kudo
@mike_bronson7 Once you connect the 10 new DataNodes to the cluster, Ambari automatically distributes the common Hadoop config files, i.e. hdfs-site.xml, mapred-site.xml, yarn-site.xml, etc., to those new nodes so they can start receiving data blocks.

My suggestion as a workaround would be to add the hostnames of these 10 new DataNodes, as FQDNs or IPs (separated by a newline character), to the dfs.exclude file on the NameNode machine: edit the <HADOOP_CONF_DIR>/dfs.exclude file, where <HADOOP_CONF_DIR> is the directory storing the Hadoop configuration files, for example /etc/hadoop/conf. First, ensure DNS resolution is working (or your /etc/hosts is updated) and that passwordless connections to those hosts work. Once the 10 new nodes are in the dfs.exclude file, the NameNode will consider them bad nodes, so no data will be replicated to them as long as these hosts remain in the dfs.exclude file. After you have updated the NameNode with the new set of excluded DataNodes, execute the following on the NameNode host machine:
su <HDFS_USER>
hdfs dfsadmin -refreshNodes
where <HDFS_USER> is the user owning the HDFS services. That should do the trick. Once these hosts are visible in Ambari, turn maintenance mode on so you don't receive any alerts.

The day you decide to add/enable these 10 new DataNodes, you simply cp or mv the entries from dfs.exclude to the dfs.include file located at <HADOOP_CONF_DIR>/dfs.include. These nodes will then start heartbeating and notifying the NameNode that they are ready to start receiving files and participating in data distribution in the cluster. On the NameNode host machine, remember to execute the following command:
su <HDFS_USER>
hdfs dfsadmin -refreshNodes
Don't forget to disable maintenance mode on the new DataNodes and to remove them from the dfs.exclude file if you didn't rename or delete it.

Then run the HDFS Balancer, a tool for balancing the data across the storage devices of an HDFS cluster:
sudo -u hdfs hdfs balancer
The balancer command has a couple of options: either a threshold, or again the dfs.include and dfs.exclude lists; see the explanation below.

Include and Exclude Lists
When the include list is non-empty, only the DataNodes specified in the list are balanced by the HDFS Balancer. An empty include list means including all the DataNodes in the cluster. The default value is an empty list.
[-include [-f <hosts-file> | <comma-separated list of hosts>]]
The DataNodes specified in the exclude list are excluded so that the HDFS Balancer does not balance them. An empty exclude list means that no DataNodes are excluded. When a DataNode is specified in both the include list and the exclude list, it is excluded. The default value is an empty list.
[-exclude [-f <hosts-file> | <comma-separated list of hosts>]]
If no dfs.include file is specified, all DataNodes are considered to be included in the cluster (unless excluded explicitly in the dfs.exclude file). The dfs.hosts and dfs.hosts.exclude properties in hdfs-site.xml are used to specify the dfs.include and dfs.exclude files.

Hope that helps
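As a rough sketch of the whole workflow, assuming /etc/hadoop/conf as <HADOOP_CONF_DIR>, hdfs as the service user, and a hypothetical hostname:

```bash
# 1. Exclude the new DataNodes so the NameNode replicates no blocks to them
echo "datanode11.example.com" >> /etc/hadoop/conf/dfs.exclude   # repeat for each of the 10 hosts
sudo -u hdfs hdfs dfsadmin -refreshNodes

# 2. Later, to bring them into service, add them to /etc/hadoop/conf/dfs.include,
#    remove them from dfs.exclude, and refresh again
sudo -u hdfs hdfs dfsadmin -refreshNodes

# 3. Rebalance the cluster; -threshold is the allowed deviation in disk usage (percent)
sudo -u hdfs hdfs balancer -threshold 10
```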
10-10-2020
11:35 AM
1 Kudo
@mike_bronson7 Always stick to the Cloudera documentation. Yes, there is no risk in running that command; I can understand your reservation.
10-10-2020
11:31 AM
@lxs I have helped resolve this kind of issue a couple of times. Can you share screenshots of your sandbox configuration?
- Memory
- Network
- Splash screen after restarting the sandbox
- Any other screenshot you deem important