Member since: 10-07-2015
Posts: 16
Kudos Received: 6
Solutions: 3
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 1504 | 02-28-2019 03:31 PM
 | 29021 | 12-02-2016 03:04 AM
 | 4159 | 11-23-2016 08:15 AM
02-28-2019 03:31 PM

Without the stack trace, we are going to have a hard time pinning down what is going wrong. With 5.3.1 being pretty old, it could easily be a bug. I wonder if this (the top answer) is causing the stack trace to be suppressed: https://stackoverflow.com/questions/2295015/log4j-not-printing-the-stacktrace-for-exceptions It could be worth a NN restart with "-XX:-OmitStackTraceInFastThrow" added to the JVM options to see if we get a stack trace out, if you are able to schedule a restart.

If your key concern is getting the missing blocks back, then you should be able to copy the block file and its corresponding meta file to another DN. The block will be in a given subfolder under its disk, eg:

/data/hadoop-data/dn/current/BP-1308070615-172.22.131.23-1533215887051/current/finalized/subdir0/subdir0/blk_1073741869

In that example the subfolder is "subdir0/subdir0". It does not matter which node you copy it to or which disk, but you must ensure the sub folders are maintained. Then restart the target DN and see if the block moves from missing to not missing when it checks in. I'd suggest trying this with one block first to ensure it works OK.
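As a rough sketch of that copy, assuming the old block files are still readable on a source host, and treating the hostnames and data directories below as placeholders for your own layout:

# On the host that still has the block - the block file and its .meta file sit side by side
# (the meta filename carries the generation stamp, eg blk_1073741869_1045.meta; yours will differ)
cd /data/hadoop-data/dn/current/BP-1308070615-172.22.131.23-1533215887051/current/finalized/subdir0/subdir0

# Copy both files into the same block-pool/subdir path on any disk of a live target DN
scp blk_1073741869 blk_1073741869_*.meta \
  target_dn_host:/data1/hadoop-data/dn/current/BP-1308070615-172.22.131.23-1533215887051/current/finalized/subdir0/subdir0/

# Make sure the copied files end up owned by the hdfs user on the target
ssh target_dn_host "chown hdfs:hdfs /data1/hadoop-data/dn/current/BP-1308070615-172.22.131.23-1533215887051/current/finalized/subdir0/subdir0/blk_1073741869*"

# Then restart the datanode on the target host (or restart the DataNode role in CM)
sudo service hadoop-hdfs-datanode restart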
02-28-2019 03:37 AM

Do you get a full stack trace in the Namenode log at the time of the error in the datanode? From the message in the DN logs, it is the NN that is throwing the NullPointerException, so there appears to be something in the block report it does not like. Do all these files with missing blocks have a replication factor of 1, or do they have a replication factor of 3?
02-09-2017 06:59 AM

You can put the s3 credentials in the s3 URI, or you can just pass the parameters on the command line, which is what I prefer, eg:

hadoop fs -Dfs.s3a.access.key="" -Dfs.s3a.secret.key="" -ls s3a://bucket-name/

It's also worth knowing that if you run the command like I have given above, it will override any other settings that are defined in the cluster config, such as core-site.xml etc.
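If you would rather not pass the keys on every invocation, the same two properties can also live in the client configuration. A minimal sketch, assuming you manage core-site.xml by hand on the node running the command (and keep the file readable only by trusted users, since the keys are stored in plain text):

<!-- core-site.xml: S3A credentials picked up by hadoop fs, distcp, etc. -->
<property>
  <name>fs.s3a.access.key</name>
  <value>YOUR_ACCESS_KEY</value>
</property>
<property>
  <name>fs.s3a.secret.key</name>
  <value>YOUR_SECRET_KEY</value>
</property>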
01-11-2017 09:16 AM

Hi Leo,

Ok - it's good that the Kerberos setup is all working, as that can be the hard part! If you want to copy files by running the command on cluster1, then you will need to modify the cluster1 hdfs-site.xml to hold the information about the other cluster. To do this, you can list a new nameservice in the hdfs-site.xml on cluster1 that points at the Namenode hosts of the other cluster. The nameservice on cluster1 is nameservice1, so let's create another nameservice called backupcluster. In the hdfs-site.xml on cluster1, add properties for the following (an XML sketch is given below):

dfs.ha.namenodes.backupcluster=nn1,nn2
dfs.namenode.rpc-address.backupcluster.nn1=cluster2_nn_host1:8020
dfs.namenode.rpc-address.backupcluster.nn2=cluster2_nn_host2:8020
dfs.client.failover.proxy.provider.backupcluster=org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider

Assuming you are using CM, you can edit the /etc/hadoop/conf/hdfs-site.xml on your gateway node to test this out, but afterwards you should add the settings to the CM cluster-wide safety valve for hdfs-site.xml. If the above settings worked, from cluster1 you should be able to run:

hadoop fs -ls hdfs://backupcluster/some/path

If that works, you can try distcp from cluster1:

hadoop distcp hdfs://nameservice1/path/on/cluster1 hdfs://backupcluster/target/path
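As a sketch, the same four settings written out as hdfs-site.xml properties would look like the following (the hostnames are placeholders, and depending on your version you may also need to append backupcluster to dfs.nameservices):

<property>
  <name>dfs.ha.namenodes.backupcluster</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.backupcluster.nn1</name>
  <value>cluster2_nn_host1:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.backupcluster.nn2</name>
  <value>cluster2_nn_host2:8020</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.backupcluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>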
01-09-2017 09:11 AM

Assuming all the Kerberos trust relationships are set up correctly, it is possible to copy between two secured HA clusters with the same namespace. The following documentation covers the Kerberos configuration required: https://www.cloudera.com/documentation/enterprise/latest/topics/cdh_admin_distcp_data_cluster_migrate.html#concept_hcs_srr_sr

The easiest way to do the copy is to identify the active namenode of the target cluster and run the command like:

hadoop distcp /path/to/source/file hdfs://namenode_host:8020/destination_path

Ie, instead of using the nameservice name, use the actual hostname of the active namenode.
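If you are not sure which namenode is currently active, one way to check is to run the following on a node of the target cluster (assuming its HA namenode IDs are nn1 and nn2 - yours may be named differently); each command prints either "active" or "standby":

hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2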
12-02-2016 03:04 AM
2 Kudos

Can you check your /etc/hadoop/conf/yarn-site.xml and ensure the following two parameters are set:

<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
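If you edit that file by hand (assuming a package-based install where yarn-site.xml is not managed by Cloudera Manager - if CM manages it, make the change through CM instead), restart the nodemanager so it loads the shuffle aux-service:

sudo service hadoop-yarn-nodemanager restart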
11-30-2016 01:26 PM
1 Kudo

Looks like the history directory permissions are wrong. Can you try running this command to reset the permissions and then try running the job again:

sudo -u hdfs hadoop fs -chmod -R 1777 /user/history
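To confirm the permissions afterwards, you can list the directory itself; the sticky bit should show up as a trailing "t" in the mode (drwxrwxrwt):

sudo -u hdfs hadoop fs -ls -d /user/history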
11-30-2016 05:57 AM

You could try adding the following to the bottom of the yarn-site.xml:

<property>
  <description>Minimum allocation unit.</description>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>256</value>
</property>

Also, set the nodemanager memory a bit higher and the vcores a bit higher (a yarn-site.xml sketch of these two is at the end of this post):

yarn.nodemanager.resource.memory-mb to 1536
yarn.nodemanager.resource.cpu-vcores to 4

And then add the following to the end of the mapred-site.xml and see if it gives better results:

<property>
  <name>mapreduce.map.memory.mb</name>
  <value>256</value>
</property>
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>256</value>
</property>
<property>
  <description>Application master allocation</description>
  <name>yarn.app.mapreduce.am.resource.mb</name>
  <value>256</value>
</property>
<property>
  <name>mapreduce.map.java.opts</name>
  <value>-Xmx204m</value>
</property>
<property>
  <name>mapreduce.reduce.java.opts</name>
  <value>-Xmx204m</value>
</property>
<property>
  <description>Application Master JVM opts</description>
  <name>yarn.app.mapreduce.am.command-opts</name>
  <value>-Xmx204m</value>
</property>
<property>
  <name>mapreduce.task.io.sort.mb</name>
  <value>50</value>
</property>

Note that it would be a good idea to make a copy of the original mapred-site and yarn-site before making these changes. After changing the settings, reboot the Quickstart VM so the settings take effect.
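For reference, the two nodemanager settings mentioned above, written as yarn-site.xml properties with the suggested values (adjust to whatever memory the VM can spare):

<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>1536</value>
</property>
<property>
  <name>yarn.nodemanager.resource.cpu-vcores</name>
  <value>4</value>
</property>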
11-30-2016 03:44 AM

It sounds like the jaas.conf is not being picked up. Could you try passing the full path to the jaas.conf to see if that works better, eg:

-Djava.security.auth.login.config=/full/path/to/jaas.conf
11-30-2016 03:35 AM
1 Kudo

It looks like the nodemanager process is stopped for some reason. Can you try starting it, and also restart the resource manager too:

sudo service hadoop-yarn-nodemanager start
sudo service hadoop-yarn-resourcemanager restart

Then see if your job will run?
11-30-2016 03:30 AM

Hi,

So the memory capacity of your single node is set to 1GB. When you run an oozie job, it always needs 2 containers - one for oozie and another to run the job - so it is likely you don't have enough memory allocated for the job to run. We will need to check some of your config settings. In /etc/hadoop/conf/mapred-site.xml, what values are set for the following (a quick way to pull them out is shown after the list):

mapreduce.map.memory.mb
mapreduce.reduce.memory.mb
yarn.app.mapreduce.am.resource.mb
mapreduce.map.java.opts
mapreduce.reduce.java.opts
yarn.app.mapreduce.am.command-opts

What memory setting have you got for the VM you are running? I am wondering if we can push up the yarn limits a little to let more containers run.
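Assuming the usual layout where each <value> sits on the line after its <name>, something like this should print each of those properties together with its value:

grep -A1 -E 'memory.mb|java.opts|am.resource.mb|am.command-opts' /etc/hadoop/conf/mapred-site.xml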
11-29-2016 06:16 AM

Sorry, I should have said: can you substitute your node_id for the one in my example, ie:

yarn node -status taha99:8041
11-29-2016 04:54 AM

Sounds like there is a chance there are not enough resources available in yarn to run the job. If you run the following command on the CDH host, what is the output:

yarn node -list

If this returns a host, eg:

[vagrant@standalone ~]$ yarn node -list
16/11/29 12:50:32 INFO client.RMProxy: Connecting to ResourceManager at standalone/192.168.33.6:8032
Total Nodes:1
         Node-Id    Node-State    Node-Http-Address    Number-of-Running-Containers
standalone:42578       RUNNING      standalone:8042

can you take the node-id and check what resources it has:

$ yarn node -status standalone:42578
11-29-2016 04:46 AM

From the screenshot showing "all applications", it seems there are no "active nodes". For yarn applications to run, there is a resource manager that accepts the jobs and then allocates containers on the node managers in the cluster. In your case it looks like no node managers are running, and therefore the jobs cannot be assigned and start running. Are you running this job on the quickstart VM? Can you check the logs for the node manager and resource manager in /var/log/hadoop-yarn and see if there are any relevant errors there?
11-28-2016 07:17 AM

If you need to get your data from Oracle to HDFS in *near real time*, then a Golden Gate solution is probably the best option. While you may need some additional redo logging for this to work, it probably results in the minimum overall impact on the database.

If you can tolerate some lag in the data appearing in HDFS, then you could set up some jobs to use Sqoop to pull data from Oracle at regular intervals. With the correct indexes and a way for Sqoop to identify new records, this could be done quite efficiently; a rough sketch is given at the end of this post.

Depending on how the application that writes the data to Oracle is architected, there could also be an option to change that application to write to Oracle and also write to Flume or Kafka, but that does require significant application changes, which may not be feasible in your case.
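For the Sqoop option, a sketch of an incremental import might look like the following; the connection string, credentials, table, check column and directories are all placeholders and depend entirely on your schema:

sqoop import \
  --connect jdbc:oracle:thin:@//oracle_host:1521/ORCL \
  --username app_user \
  --password-file /user/app_user/.oracle_pw \
  --table ORDERS \
  --incremental append \
  --check-column ORDER_ID \
  --last-value 0 \
  --target-dir /data/oracle/orders

With --incremental append and --check-column, Sqoop only fetches rows with a check-column value greater than the last import; wrapping this in a saved job (sqoop job --create ...) lets Sqoop remember the last value between scheduled runs.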
11-23-2016 08:15 AM
2 Kudos

Hi,

If you set the balancer bandwidth in Cloudera Manager, then when the datanodes are started they will have that bandwidth setting for balancing operations. However, using the command line it is possible to change the bandwidth while the datanodes or balancer are running, without restarting them. If you do this, you just need to remember that if you restart any datanodes, the bandwidth setting will revert back to the value set in Cloudera Manager and you will need to run the command again.

To set the balancer bandwidth from the command line and without a restart, you can run the following:

sudo -u hdfs hdfs dfsadmin -setBalancerBandwidth 104857600

If you have HA set up for HDFS, the above command may fail, and you should check which is the active namenode and run the command as follows (substituting the correct hostname for activeNamenode below):

sudo -u hdfs hdfs dfsadmin -fs hdfs://activeNamenode:8020/ -setBalancerBandwidth 104857600

To check this command worked, the following log entries should appear within the Datanode log files:

INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeCommand action: DNA_BALANCERBANDWIDTHUPDATE
INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Updating balance throttler bandwidth from 10485760 bytes/s to: 104857600 bytes/s.
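A quick way to look for those entries, assuming the datanode logs are under the usual /var/log/hadoop-hdfs location (adjust the path if your log directory differs):

grep BALANCERBANDWIDTHUPDATE /var/log/hadoop-hdfs/*.log*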