Member since: 07-15-2014
Posts: 57
Kudos Received: 9
Solutions: 6
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 6657 | 06-05-2015 05:09 PM
 | 1578 | 12-19-2014 01:03 PM
 | 3123 | 12-17-2014 08:23 PM
 | 7953 | 12-16-2014 03:07 PM
 | 13386 | 08-30-2014 11:14 PM
01-01-2015
12:15 PM
My organization is a services and consulting company. I am reading this PDF http://blog.cloudera.com/wp-content/uploads/2011/08/Cloudera-Connect-Program-Overview.pdf but after reading it, it is still not clear to me what the benefit is for a services and consulting company in becoming a Cloudera service provider. Is there some other material on this? ** Please forgive me if this is not the right place to post this. I could not find a forum that seemed suitable for this question. **
12-31-2014
11:18 AM
3 Kudos
I have often heard the term "edge node" in Hadoop architecture discussions. However, when I created a cluster using Cloudera Manager, it created a fully functional cluster for me without designating any node as an edge node. So my question is: if I want to follow the best practice of having an edge node, and I am using Cloudera Manager to create the cluster, how do I create an edge node?
Labels:
- Apache Hadoop
- Cloudera Manager
12-29-2014
05:09 PM
Hello, I have been tasked with creating three environments for my team: dev, UAT, and prod. I have two choices:
1. Create three completely separate instances of Cloudera Manager and build three completely separate CDH environments. The complete isolation of the three environments would give me maximum flexibility.
2. Build only one Cloudera Manager instance and then build three clusters on top of it. These three clusters can have different security policies.
Option 2 would be quite efficient, because I could manage and maintain all three from one place. However, suppose I want to test how CDH 5.4 will work after an upgrade. With option 1 this is easy: since my dev environment is totally separate, I can do whatever I want with it. But I am wondering how this works with option 2: if I have three clusters under the same Cloudera Manager, can I upgrade just one of them to, say, CDH 5.6 independently? Also, which approach does the community recommend?
12-24-2014
02:57 PM
I had to delete the directories in HDFS manually. It could be that the kite-dataset delete command only does a logical delete, meaning it removes only the metadata. In any case, running kite-dataset delete and then manually deleting the directories in HDFS works for me.
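A minimal sketch of that workaround as a dry-run helper (it only prints the commands rather than running them). The warehouse path /user/hive/warehouse is an assumption about where the dataset's files live, not something Kite guarantees; adjust it to your dataset's actual location:

```shell
# Hypothetical helper: print the two-step cleanup for a Kite dataset.
# Step 1 removes the metadata; step 2 removes the leftover HDFS files.
# The warehouse path below is an assumption for illustration.
kite_full_delete() {
  ds="$1"
  echo "kite-dataset delete ${ds}"
  echo "hadoop fs -rm -r /user/hive/warehouse/${ds}"
}

kite_full_delete users
# prints:
# kite-dataset delete users
# hadoop fs -rm -r /user/hive/warehouse/users
```

Pipe the output to `sh` once you have confirmed the paths are right for your cluster.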
12-24-2014
10:35 AM
From my 2nd machine in the cluster (hd1home) it connects fine:

[root@cm1home ~]# ssh hd1home
Last login: Sat Dec 20 22:14:43 2014 from abhishek-pc.home
[root@hd1home ~]# impala-shell
Starting Impala Shell without Kerberos authentication
Connected to hd1home:21000
Server version: impalad version 2.0.1-cdh5 RELEASE (build cc09df07d93c79351052eef2f4e5baf52373dc33)
Welcome to the Impala shell. Press TAB twice to see a list of available commands.
Copyright (c) 2012 Cloudera, Inc. All rights reserved.
(Shell build version: Impala Shell v2.0.1-cdh5 (cc09df0) built on Wed Nov 19 10:57:34 PST 2014)
[hd1home:21000] >

So why doesn't this work on all machines in the cluster? I thought this could be because the Impala services are not running on cm1home, but when I try to start the services on cm1home, it says they are not recognized:

[root@cm1home ~]# sudo service impala-state-store start
impala-state-store: unrecognized service
[root@cm1home ~]# sudo service impala-server start
impala-server: unrecognized service
[root@cm1home ~]# sudo service impala-catalog start
impala-catalog: unrecognized service
[root@cm1home ~]#
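One way to narrow this down is to check, from any one machine, which hosts actually have something listening on the impalad port 21000. The sketch below is an illustrative bash-only helper using the /dev/tcp redirection; the host names are taken from the posts in this thread and are assumptions about your cluster:

```shell
#!/bin/bash
# Report whether a TCP port on a host accepts connections,
# using bash's /dev/tcp pseudo-device (bash-specific, not POSIX sh).
check_port() {
  host="$1"; port="$2"
  if (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; then
    echo "${host}:${port} open"
  else
    echo "${host}:${port} closed"
  fi
}

for h in cm1home hd1home hd2home hd3home hd4home; do
  check_port "$h" 21000
done
```

Once an open host is found, impala-shell can be pointed at it explicitly with the -i option, e.g. `impala-shell -i hd1home:21000`, regardless of which machine you run the shell from.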
12-24-2014
10:30 AM
I have deployed CDH 5.2.x. All services are working fine: I am able to run MapReduce jobs, and I can access HBase, Hive, Hue, etc. But when I run impala-shell I get an error:

Error connecting: TTransportException, Could not connect to cm1home:21000

What I have checked so far:
1. I can run uname -a on all machines.
2. I can run python -c 'import socket; print socket.getfqdn(),socket.gethostbyname(socket.getfqdn())' successfully on all machines.
3. iptables is stopped and off on all machines.

In the Cloudera Manager console I can see that my Impala service is healthy. In the log file I do not see any errors, but these are the last few lines:

10:24:49.088 PM INFO init.cc:136 Using hostname: cm1home
10:24:49.088 PM INFO logging.cc:100 Flags (see also /varz are on debug webserver):
--catalog_service_port=26000
....
10:24:49.088 PM INFO webserver.cc:154 Starting webserver on 0.0.0.0:25010
10:24:49.088 PM INFO webserver.cc:168 Document root: /opt/cloudera/parcels/CDH-5.2.1-1.cdh5.2.1.p0.12/lib/impala
10:24:49.088 PM INFO webserver.cc:234 Webserver started
10:24:49.141 PM INFO thrift-server.cc:387 ThriftServer 'StatestoreService' started on port: 24000
10:24:49.394 PM INFO statestore.cc:314 Creating new topic: 'impala-request-queue' on behalf of subscriber: 'impalad@hd1home:22000'
10:24:49.394 PM INFO statestore.cc:314 Creating new topic: 'catalog-update' on behalf of subscriber: 'impalad@hd1home:22000'
10:24:49.394 PM INFO statestore.cc:314 Creating new topic: 'impala-membership' on behalf of subscriber: 'impalad@hd1home:22000'
10:24:49.394 PM INFO statestore.cc:321 Registering: impalad@hd1home:22000
10:24:49.396 PM INFO client-cache.cc:107 CreateClient(): creating new client for hd1home:23000
10:24:49.396 PM INFO statestore.cc:352 Subscriber 'impalad@hd1home:22000' registered (registration id: 8249428724f657e9:4bf0ac722e4c1e9d)
10:24:49.541 PM INFO statestore.cc:321 Registering: impalad@hd2home:22000
10:24:49.542 PM INFO statestore.cc:479 Preparing initial impala-membership topic update for impalad@hd2home:22000. Size = 100.00 B
10:24:49.542 PM INFO client-cache.cc:107 CreateClient(): creating new client for hd2home:23000
10:24:49.548 PM INFO statestore.cc:352 Subscriber 'impalad@hd2home:22000' registered (registration id: 3f4069d5f5910464:c9836fd7e24ca2b8)
10:24:49.902 PM INFO statestore.cc:479 Preparing initial impala-membership topic update for impalad@hd1home:22000. Size = 100.00 B
10:24:50.131 PM INFO statestore.cc:321 Registering: impalad@hd4home:22000
10:24:50.131 PM INFO statestore.cc:479 Preparing initial impala-membership topic update for impalad@hd4home:22000. Size = 100.00 B
10:24:50.131 PM INFO client-cache.cc:107 CreateClient(): creating new client for hd4home:23000
10:24:50.132 PM INFO statestore.cc:352 Subscriber 'impalad@hd4home:22000' registered (registration id: bb46f6d07ad5482b:bec763db82f31c8e)
10:24:50.547 PM INFO statestore.cc:479 Preparing initial impala-membership topic update for impalad@hd2home:22000. Size = 200.00 B
10:24:54.770 PM INFO statestore.cc:321 Registering: impalad@hd3home:22000
10:24:54.774 PM INFO statestore.cc:352 Subscriber 'impalad@hd3home:22000' registered (registration id: 834038f459d1cfe2:d18846305f34a4b6)
10:24:54.784 PM INFO statestore.cc:479 Preparing initial impala-membership topic update for impalad@hd3home:22000. Size = 300.00 B
10:24:54.784 PM INFO client-cache.cc:107 CreateClient(): creating new client for hd3home:23000
10:24:56.072 PM INFO statestore.cc:321 Registering: catalog-server@cm1home:26000
10:24:56.072 PM INFO client-cache.cc:107 CreateClient(): creating new client for cm1home:23020
10:24:56.072 PM INFO statestore.cc:352 Subscriber 'catalog-server@cm1home:26000' registered (registration id: 2744bfbe34bc0106:ba0fbae77b48ca8e)
10:24:56.653 PM INFO statestore.cc:479 Preparing initial catalog-update topic update for impalad@hd4home:22000. Size = 372.00 B
10:24:56.791 PM INFO statestore.cc:479 Preparing initial catalog-update topic update for impalad@hd3home:22000. Size = 372.00 B
10:24:56.923 PM INFO statestore.cc:479 Preparing initial catalog-update topic update for impalad@hd1home:22000. Size = 372.00 B
10:24:57.063 PM INFO statestore.cc:479 Preparing initial catalog-update topic update for impalad@hd2home:22000. Size = 372.00 B
10:24:57.083 PM INFO statestore.cc:479 Preparing initial catalog-update topic update for catalog-server@cm1home:26000. Size = 372.00 B
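When reading a statestore log like this, it can help to pull out just the "Registering:" lines to see which daemons have checked in. A small text-processing sketch (the two-line sample below is an excerpt for illustration, not the full log):

```shell
# Extract the registered subscribers from a statestore log excerpt.
log="10:24:49.394 PM INFO statestore.cc:321 Registering: impalad@hd1home:22000
10:24:56.072 PM INFO statestore.cc:321 Registering: catalog-server@cm1home:26000"

# Keep only the subscriber names, de-duplicated.
registered=$(printf '%s\n' "$log" | sed -n 's/.*Registering: //p' | sort -u)
echo "$registered"
# prints:
# catalog-server@cm1home:26000
# impalad@hd1home:22000
```

Applied to the full log above, this would show impalads registering from hd1home through hd4home but no impalad@cm1home, which would be consistent with connections to cm1home:21000 being refused.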
12-23-2014
02:35 PM
I started learning the Kite SDK today and am seeing some strange behaviour. I had created a "users" dataset before, which I now want to delete and re-create. To do this I take the following steps:

1. kite-dataset delete users

To confirm the delete succeeded, I run kite-dataset show users and get:

Hive table not found: default.users

OK, great, users is deleted. Now I run:

kite-dataset create users --schema users.avsc
kite-dataset show users

and I see a bunch of records. But wait, I never imported any data yet. So how come Kite is showing me data in the users dataset? So far I have only created the schema; the kite-dataset csv-import step has not been done yet. So where did the data come from? Is it the data from the previous users table? I have not imported anything yet.
Labels:
- Apache Hive
12-19-2014
01:03 PM
1 Kudo
I figured it out. There is a directory called /dfs. I deleted this directory on each machine and then retried; it worked perfectly.
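A sketch of that per-host cleanup as a dry-run script (it only prints the commands; the host names and passwordless root SSH are assumptions based on this cluster, and deleting /dfs destroys all HDFS data on the node, so verify before executing):

```shell
# Hypothetical helper: print the cleanup command for one host.
# /dfs is the leftover data directory from the deleted cluster.
dfs_cleanup_cmd() {
  echo "ssh root@$1 'rm -rf /dfs'"
}

for h in cm1home hd1home hd2home hd3home hd4home; do
  dfs_cleanup_cmd "$h"
done
```

Review the printed commands, then pipe the output to `sh` to run them.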
12-19-2014
10:02 AM
I created a cluster successfully and then deleted it. Now, when I try to create a new cluster, after I have chosen all the roles and hosts, it fails at the 3rd step, where it tries to format the NameNode. It seems the data from the previous cluster is still there, and that is why the new cluster is not able to format it. How do I remove the old data?
12-18-2014
01:17 PM
Actually, I did install all the agents manually. But when I start up http://cn1work:7180, it insists on installing them. For the JDK it gives me a yes/no checkbox for installation, but for the agent it just begins to install.