Member since: 07-31-2013
Posts: 1924
Kudos Received: 462
Solutions: 311
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1953 | 07-09-2019 12:53 AM |
| | 11791 | 06-23-2019 08:37 PM |
| | 9080 | 06-18-2019 11:28 PM |
| | 10035 | 05-23-2019 08:46 PM |
| | 4445 | 05-20-2019 01:14 AM |
11-09-2015
08:20 AM
Thanks. That was my issue. The nodes are balanced in terms of DFS Used%, even though the raw byte counts vary.
11-08-2015
11:51 PM
If by 'hard to analyse' you mean hard to parse/process the output, you can also consider using the Java API to fetch block location info: http://archive.cloudera.com/cdh5/cdh/5/hadoop/api/org/apache/hadoop/fs/FileSystem.html#getFileBlockLocations(org.apache.hadoop.fs.Path,%20long,%20long)
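For illustration, a minimal sketch of that call, assuming a hypothetical file path; it asks for the block locations covering the whole file and prints the hosts holding each block:

```java
// Sketch: list block locations for one HDFS file via the Java API.
import java.util.Arrays;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ListBlockLocations {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    Path file = new Path("/user/example/data.txt"); // hypothetical path
    FileStatus status = fs.getFileStatus(file);
    // Ask for locations for the whole file: offset 0 through its full length.
    BlockLocation[] blocks = fs.getFileBlockLocations(file, 0, status.getLen());
    for (BlockLocation block : blocks) {
      System.out.println("offset=" + block.getOffset()
          + " length=" + block.getLength()
          + " hosts=" + Arrays.toString(block.getHosts()));
    }
  }
}
```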
11-08-2015
07:27 AM
2 Kudos
Your understanding seems right, but note that none of the 'splitting' is automatic. In its simplest form, federation is a way to have multiple distinct NameNodes powered by a common set of DataNodes. Effectively, it's running and managing two or more *separate* namespaces on top of the same storage space. If you deploy two federated NameNodes, say hdfs://host-nn1/ and hdfs://host-nn2/, then they will have nothing in common except the live DataNode hostnames they share. A 'hadoop fs -ls' run against each will return completely independent results.
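To make the independence concrete, a small client-side sketch; the NameNode URIs are the ones from the example above, and the root listing is arbitrary:

```java
// Sketch: one client talking to two federated NameNodes. They share the
// same DataNodes underneath, yet each listing below is fully independent.
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FederatedListing {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem nn1 = FileSystem.get(new URI("hdfs://host-nn1/"), conf);
    FileSystem nn2 = FileSystem.get(new URI("hdfs://host-nn2/"), conf);
    // Equivalent to running 'hadoop fs -ls /' against each NameNode.
    for (FileStatus st : nn1.listStatus(new Path("/"))) {
      System.out.println("host-nn1: " + st.getPath());
    }
    for (FileStatus st : nn2.listStatus(new Path("/"))) {
      System.out.println("host-nn2: " + st.getPath());
    }
  }
}
```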
11-04-2015
12:52 AM
Thanks, will let you know. Bye
10-26-2015
01:14 AM
1 Kudo
Finally, I got it done, so I am posting my steps; maybe they will be helpful for someone who happens to hit the same problem. I use CDH 5.3.1. In outline: first, I set up a new CDH Manager and reconfigured all parameters and roles in it; then I added all the process_ids into the processes table in the SCM DB; and finally I changed server_host in /etc/cloudera-scm-agent/config.ini to point at the new manager and restarted all the agents.

First of all, we should back up and prepare for the worst. CDH provides two ways to back up: back up the database, or back up the config to a JSON file.

1.1 Back up the database: http://www.cloudera.com/content/www/en-us/documentation/enterprise/5-2-x/topics/cm_ag_backup_dbs.html
For example:
backup: pg_dump -h localhost -p 7432 -U scm -W -F c -b -v -f "scm_db.db" scm
restore: pg_restore -p 7432 -U scm -W -d scm -v scm_db.db

1.2 Write the config to a JSON file: http://www.cloudera.com/content/www/en-us/documentation/enterprise/5-3-x/topics/cm_intro_api.html#xd_583c10bfdbd326ba--7f25092b-13fba2465e5--7f20
For example:
export: curl -u admin:admin "http://localhost:8888/api/v9/cm/deployment" > ~/cmf_config.json
import: curl --upload-file ~/cmf_config.json -u admin:admin http://localhost:8888/api/v9/cm/deployment?deleteCurrentDeployment=true

If we did not back up and lost our database, it is hard to restore; however, it can be done. These are my steps:

1. Reinstall a new CDH Manager on another machine with a different hostname.
2. Export a JSON configuration file from another currently working CDH Manager of the same version. That manager should include all services (such as HDFS, YARN, HBase, HDFS HA, etc.).
3. Run a script on all machines to collect each host's hostid and hostname; the hostid is in the file /var/lib/cloudera-scm-agent/uuid.
4. Modify the JSON file from step 2 as follows:
4.1 Delete all hosts in the JSON file and add all hosts' hostids and hostnames from step 3.
4.2 Delete all roles in the clusters' services.
4.3 Change the cluster name to the old cluster id. You can get the old cluster id from the HDFS NameNode HTTP web page if you do not remember it.
5. Import the new JSON file into the newly created CDH Manager.
6. Reconfigure all service parameters and add all instances to the services as before. (You should not use a host template, because the agents are not reporting to the new CMF server yet; however, you can add instances from the service pages.)
7. If you have HDFS HA enabled, you have to export the JSON file, add the HDFS HA roles to it, and import it again.
8. If you do not want to stop all the services, you have to do the following steps to get every process_id from all hosts; however, if you can stop the services, you can jump to step 11.
9. Run the following script on all hosts to get the process_ids and service names:
grep "spawned:.*with pid" /var/log/cloudera-scm-agent/supervisord.log |awk -vhost=$HOSTNAME '{ idx=index($5,"-"); name=substr($5,idx+1,length($5)-idx-1);pid=substr($5,2,idx-2);cc[name]=pid;}END{a=host;for (b in cc){ a=a"\t"b"\t"cc[b]} print a}'
10. Parse that output and insert a record into the SCM database's processes table for each process; before inserting into the processes table, you have to insert another record into the commands table to get a new command_id. (This step is hard; a small parsing sketch follows at the end of this post.)
11. Change server_host in /etc/cloudera-scm-agent/config.ini on all hosts to the new CDH Manager, kill the cmf listener, and restart all cmf agents.

Everything should then be fine: the agents will report to the new CDH server, and the server will return the same process_ids to the agents, so the running processes will not be killed.
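As referenced in step 10, here is a minimal sketch for parsing the step-9 output, whose format the awk script defines: one tab-separated line per host, with the hostname followed by alternating role-name/pid fields. The actual inserts into the processes and commands tables depend on the SCM schema and are not shown:

```java
// Sketch: parse the per-host output produced by the step-9 grep/awk script.
// Line format (tab-separated): hostname, then repeated (role-name, pid) pairs.
import java.io.BufferedReader;
import java.io.FileReader;

public class ParseProcessIds {
  public static void main(String[] args) throws Exception {
    try (BufferedReader in = new BufferedReader(new FileReader(args[0]))) {
      String line;
      while ((line = in.readLine()) != null) {
        String[] fields = line.split("\t");
        String host = fields[0];
        // Walk the (role-name, pid) pairs that follow the hostname.
        for (int i = 1; i + 1 < fields.length; i += 2) {
          System.out.println(host + "\trole=" + fields[i] + "\tpid=" + fields[i + 1]);
        }
      }
    }
  }
}
```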
10-15-2015
08:21 AM
Thanks for the valuable information. I have fixed the above issue by changing the MapReduce Service property value in the Hive configuration to YARN.
10-14-2015
08:05 PM
I am able to connect to IBM MQ using the steps mentioned here, but when Flume tries to consume any messages from the queue, it throws the following exception:

com.ibm.msg.client.jms.DetailedMessageFormatException: JMSCC0053: An exception occurred deserializing a message, exception: 'java.lang.ClassNotFoundException: null class'. It was not possible to deserialize the message because of the exception shown.

1) I am using all the IBM MQ client jars. Flume starts without any exception, but the exception appears when it tries to consume the messages.
2) I am putting a custom message [Serializable object] into the queue, which Flume needs to consume.
3) Flume 1.5.0-cdh5.4.1
4) MQ version 8.x

a1.sources=s1
a1.channels=c1
a1.sinks=k1
a1.sources.s1.type=jms
a1.sources.s1.channels=c1
a1.sources.s1.initialContextFactory=com.sun.jndi.fscontext.RefFSContextFactory
a1.sources.s1.connectionFactory=FLUME_CF
a1.sources.s1.destinationName=MY.Q
a1.sources.s1.providerURL=file:///home/JNDI-Directory
a1.sources.s1.destinationType=QUEUE
a1.sources.s1.transportType=1
a1.sources.s1.userName=mqm
a1.sources.s1.batchSize=1
a1.channels.c1.type=memory
a1.channels.c1.capacity=10000
a1.channels.c1.transactionCapacity=100
a1.sinks.k1.type=logger
a1.sinks.k1.channel=c1
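For context, a minimal sketch of what the producer side might look like, assuming the same JNDI bindings the source config above reads (FLUME_CF and MY.Q); the payload class and password are hypothetical stand-ins. Note that deserializing a JMS ObjectMessage with standard Java serialization generally requires the payload's class to be on the consumer's (here, Flume's) classpath, which may be related to the ClassNotFoundException above:

```java
// Sketch: put a Serializable object message on the queue through the same
// file-based JNDI context the Flume source uses. Payload and password are
// hypothetical.
import java.io.Serializable;
import java.util.Hashtable;

import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.MessageProducer;
import javax.jms.ObjectMessage;
import javax.jms.Queue;
import javax.jms.Session;
import javax.naming.Context;
import javax.naming.InitialContext;

public class PutObjectMessage {
  public static void main(String[] args) throws Exception {
    Hashtable<String, String> env = new Hashtable<String, String>();
    env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.fscontext.RefFSContextFactory");
    env.put(Context.PROVIDER_URL, "file:///home/JNDI-Directory");
    Context ctx = new InitialContext(env);

    ConnectionFactory cf = (ConnectionFactory) ctx.lookup("FLUME_CF");
    Queue queue = (Queue) ctx.lookup("MY.Q");

    Connection conn = cf.createConnection("mqm", "secret"); // hypothetical password
    try {
      Session session = conn.createSession(false, Session.AUTO_ACKNOWLEDGE);
      MessageProducer producer = session.createProducer(queue);
      // A stand-in Serializable payload; the real post uses a custom class.
      Serializable payload = new java.util.HashMap<String, String>();
      ObjectMessage msg = session.createObjectMessage(payload);
      producer.send(msg);
    } finally {
      conn.close();
    }
  }
}
```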
10-09-2015
08:12 AM
1 Kudo
Thank you! I actually came across that solution just an hour ago while reading this, so I had already implemented the fix you suggested, but was still waiting to see if it would balance out evenly before posting: http://www.slideshare.net/cloudera/hadoop-troubleshooting-101-kate-ting-cloudera Regardless, thank you very much for your help in resolving this issue.
10-09-2015
02:00 AM
Hi Harsh, I have solved this issue. I think the problem was related to permissions, and the solution is that the agent should be started via sudo, even by the root user, like: # sudo ./cloudera-scm-agent start Then the distribution goes smoothly. Thank you for your tip about running 'curl'.