Member since: 07-29-2013
Posts: 162
Kudos Received: 8
Solutions: 7
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 7142 | 05-06-2015 06:52 AM |
 | 3094 | 06-09-2014 10:51 PM |
 | 5075 | 01-30-2014 10:40 PM |
 | 3725 | 08-22-2013 12:28 AM |
 | 5133 | 08-18-2013 11:23 PM |
10-10-2013
12:42 AM
We are now migrating to 4.4. Thanks, we'll check. The migration is still in progress; I'll report later how things go.
10-09-2013
04:27 AM
Hi, we opened "GET /oozie/list_workflows/ HTTP/1.1" and suddenly got an exception. What are we doing wrong?

19:19:54.000 INFO access 10.100.141.164 devops - "GET /oozie/list_workflows/ HTTP/1.1"
19:19:54.000 INFO middleware Processing exception: User matching query does not exist.:
Traceback (most recent call last):
  File "/opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/share/hue/build/env/lib/python2.6/site-packages/Django-1.2.3-py2.6.egg/django/core/handlers/base.py", line 100, in get_response
    response = callback(request, *callback_args, **callback_kwargs)
  File "/opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/share/hue/apps/oozie/src/oozie/views/editor.py", line 70, in list_workflows
    'json_jobs': json.dumps(list(data.values_list('id', flat=True))),
  File "/opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/share/hue/desktop/core/src/desktop/lib/django_util.py", line 221, in render
    **kwargs)
  File "/opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/share/hue/desktop/core/src/desktop/lib/django_util.py", line 144, in _render_to_response
    return django_mako.render_to_response(template, *args, **kwargs)
  File "/opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/share/hue/desktop/core/src/desktop/lib/django_mako.py", line 117, in render_to_response
    return HttpResponse(render_to_string(template_name, data_dictionary), **kwargs)
  File "/opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/share/hue/desktop/core/src/desktop/lib/django_mako.py", line 106, in render_to_string_normal
    result = template.render(**data_dict)
  File "/opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/share/hue/build/env/lib/python2.6/site-packages/Mako-0.7.2-py2.6.egg/mako/template.py", line 412, in render
    return runtime._render(self, self.callable_, args, data)
  File "/opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/share/hue/build/env/lib/python2.6/site-packages/Mako-0.7.2-py2.6.egg/mako/runtime.py", line 766, in _render
    **_kwargs_for_callable(callable_, data))
  File "/opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/share/hue/build/env/lib/python2.6/site-packages/Mako-0.7.2-py2.6.egg/mako/runtime.py", line 798, in _render_context
    _exec_template(inherit, lclcontext, args=args, kwargs=kwargs)
  File "/opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/share/hue/build/env/lib/python2.6/site-packages/Mako-0.7.2-py2.6.egg/mako/runtime.py", line 824, in _exec_template
    callable_(context, *args, **kwargs)
  File "/tmp/tmp6o407q/oozie/editor/list_workflows.mako.py", line 219, in render_body
    __M_writer(escape(unicode( workflow.owner.username )))
  File "/opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/share/hue/build/env/lib/python2.6/site-packages/Django-1.2.3-py2.6.egg/django/db/models/fields/related.py", line 302, in __get__
    rel_obj = QuerySet(self.field.rel.to).using(db).get(**params)
  File "/opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/share/hue/build/env/lib/python2.6/site-packages/Django-1.2.3-py2.6.egg/django/db/models/query.py", line 341, in get
    % self.model._meta.object_name)
DoesNotExist: User matching query does not exist.
10-07-2013
01:00 AM
Hi, each day we will receive 10-20 GB of binary files that we need to upload into HDFS. We also want to limit access to the cluster from the client side (the side that delivers the 10-20 GB of files). What are the best approaches? We have several ideas:
1. SFTP to our side (for example, to one of our data nodes) and then hadoop fs -put.
2. hadoop fs -put from the client side (the party that delivers the data). But we would like to forbid direct remote access to the cluster.
3. WebHDFS (does it work?). The problem is the same: we don't want to give the client access to the cluster or its interface.
*And we don't want to set up Kerberos or anything like that; we have a private, secure network for the cluster.
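To make idea 3 a bit more concrete, here is a minimal sketch of how the first WebHDFS request for an upload could be built. It assumes an unsecured cluster (no Kerberos) where the `user.name` query parameter identifies the caller, and it assumes the NameNode HTTP port is 50070. The function name, host, path, and user below are hypothetical placeholders, not values from our cluster.

```python
from urllib.parse import quote, urlencode


def webhdfs_create_url(host, hdfs_path, user, port=50070, overwrite=False):
    """Build the URL for the first step of a WebHDFS file upload (op=CREATE).

    host, hdfs_path and user are placeholders chosen by the caller;
    port=50070 is an assumed NameNode HTTP port.
    """
    if not hdfs_path.startswith("/"):
        raise ValueError("hdfs_path must be an absolute HDFS path")
    query = urlencode({
        "op": "CREATE",
        "user.name": user,                    # pseudo-authentication on an unsecured cluster
        "overwrite": str(overwrite).lower(),  # WebHDFS expects 'true' or 'false'
    })
    # quote() keeps '/' intact and escapes characters such as spaces in the path.
    return "http://{}:{}/webhdfs/v1{}?{}".format(host, port, quote(hdfs_path), query)


if __name__ == "__main__":
    print(webhdfs_create_url("namenode.example.com", "/data/incoming/file.bin", "loader"))
```

A PUT to this URL is answered by the NameNode with a redirect to a DataNode, and the file body is then sent with a second PUT to that redirect location. So with plain WebHDFS the uploading machine has to reach DataNodes as well as the NameNode, which is the same access problem as in idea 2.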
Labels: HDFS
09-29-2013
10:04 PM
Thank you!
09-29-2013
10:54 AM
Starting Impala Shell in unsecure mode
Connected to node05.kyc.megafon.ru:21000
Server version: impalad version 1.1.1 RELEASE (build 83d5868f005966883a918a819a449f636a5b3d5f)
Welcome to the Impala shell. Press TAB twice to see a list of available commands.
Copyright (c) 2012 Cloudera, Inc. All rights reserved.
(Shell build version: Impala Shell v1.1.1 (83d5868) built on Fri Aug 23 17:28:05 PDT 2013)

It works:
select cnt, msisdn from( select count(*) as cnt, msisdn from ZZZ_ROUTES_MSK group by msisdn ) t where cnt > 50 and cnt < 100 order by cnt desc limit 10;

It doesn't work:
select count(*) as cnt, msisdn from ZZZ_ROUTES_MSK where cnt > 50 and cnt < 100 group by msisdn order by cnt desc limit 10;

Query: select count(*) as cnt, msisdn from ZZZ_ROUTES_MSK group by msisdn where cnt > 50 and cnt < 100 order by cnt desc limit 10
[localhost:21000] > select count(*) as cnt, msisdn
> from ZZZ_ROUTES_MSK
>
> where cnt > 50 and cnt < 100
>
> group by msisdn
> order by cnt desc limit 10;
Query: select count(*) as cnt, msisdn from ZZZ_ROUTES_MSK where cnt > 50 and cnt < 100 group by msisdn order by cnt desc limit 10
ERROR: AnalysisException: couldn't resolve column reference: 'cnt'

Why?
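A likely explanation, sketched under assumptions: in standard SQL the WHERE clause is evaluated before GROUP BY, so the aggregate alias cnt does not exist yet at that point, while HAVING filters on the aggregate after grouping. The sketch below uses Python's built-in sqlite3 module as a stand-in engine (it is not Impala), a lowercased copy of the table name, and a hypothetical helper name, just to show the shape of the query.

```python
import sqlite3


def msisdns_with_count_between(msisdns, low, high, limit=10):
    """Return (cnt, msisdn) pairs whose row count is strictly between low and high.

    Uses an in-memory SQLite table as a stand-in for ZZZ_ROUTES_MSK.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE zzz_routes_msk (msisdn TEXT)")
        conn.executemany(
            "INSERT INTO zzz_routes_msk VALUES (?)", [(m,) for m in msisdns]
        )
        # HAVING runs after GROUP BY, so the aggregate can be filtered here.
        query = """
            SELECT count(*) AS cnt, msisdn
            FROM zzz_routes_msk
            GROUP BY msisdn
            HAVING count(*) > ? AND count(*) < ?
            ORDER BY cnt DESC
            LIMIT ?
        """
        return conn.execute(query, (low, high, limit)).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    sample = ["a"] * 3 + ["b"] * 5 + ["c"] * 1 + ["d"] * 10
    print(msisdns_with_count_between(sample, 1, 10))
```

The subquery form that already works is doing the same thing in two steps: the inner query computes cnt, and the outer WHERE filters on it as an ordinary column.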
Labels: Apache Impala
09-28-2013
09:47 AM
Reply by email didn't work, so I'm copy-pasting my second question using the web form: So, the solution is to issue "invalidate metadata" on the target impalad node before querying the partitioned table?
09-26-2013
04:00 AM
Hi, we use Impala 1.1.1 and submit queries to it over JDBC. The queries run against a partitioned table. An Oozie coordinator wakes up every hour, parses data, and adds a new partition to the table: alter table add partition... When our Java code gets a JDBC connection to a random impalad (chosen from a predefined list), we submit "refresh my_partitioned_table" and then hand the connection to the user. We assumed that "refresh some_table" forces impalad to fetch the current metadata for the target table from the metastore, but that does not seem to be the case. Here is what happens:

1. Let's refresh the metadata.
Returned 0 row(s) in 3.38s
[localhost:21000] > refresh web_resource_rating;
Query: refresh web_resource_rating
Query finished, fetching results ...

2. Let's check the result. We query the "virtual partition" column.
[localhost:21000] > select distinct fulldate from web_resource_rating order by fulldate desc limit 20;
Query: select distinct fulldate from web_resource_rating order by fulldate desc limit 20
Query finished, fetching results ...
+---------------+
| fulldate      |
+---------------+
| 2013-09-25-10 |
| 2013-09-25-09 |
| 2013-09-25-08 |
| 2013-09-25-07 |
| 2013-09-25-06 |
| 2013-09-25-05 |
| 2013-09-25-04 |
| 2013-09-25-03 |
| 2013-09-25-02 |
| 2013-09-25-01 |
| 2013-09-25-00 |
| 2013-09-24-23 |
| 2013-09-24-22 |
| 2013-09-24-21 |
| 2013-09-24-20 |
| 2013-09-24-19 |
| 2013-09-24-18 |
| 2013-09-24-17 |
| 2013-09-24-16 |
| 2013-09-24-15 |
+---------------+

OMG, where is the last day? Let's invalidate the metadata (I already did that yesterday...).
[localhost:21000] > invalidate metadata;
Query: invalidate metadata
Query finished, fetching results ...

Let's repeat the query.
Query: select distinct fulldate from web_resource_rating order by fulldate desc limit 20
Query finished, fetching results ...
+---------------+
| fulldate      |
+---------------+
| 2013-09-26-13 |
| 2013-09-26-12 |
| 2013-09-26-11 |
| 2013-09-26-10 |
| 2013-09-26-09 |
| 2013-09-26-08 |
| 2013-09-26-07 |
| 2013-09-26-06 |
| 2013-09-26-05 |
| 2013-09-26-04 |
| 2013-09-26-03 |
| 2013-09-26-02 |
| 2013-09-26-01 |
| 2013-09-26-00 |
| 2013-09-25-23 |
| 2013-09-25-22 |
| 2013-09-25-21 |
| 2013-09-25-20 |
| 2013-09-25-19 |
| 2013-09-25-18 |
+---------------+

Great, now I see the latest partitions. What am I doing wrong? Why doesn't refresh help?
Labels: Apache Impala, Apache Oozie
08-22-2013
11:29 PM
We use CDH 4.3 and CM 4.6.3. The first thing we checked was the metastore service. It works: Impala queries and other Hive queries run fine. All our "SQL-like" stuff uses a single metastore in a PostgreSQL database. I also watched how Hive tried to get the partitions for the table used in the failed query. There are no exceptions on the metastore side; it works 100%. We noticed one more interesting thing. The queried table has ~1500 partitions (hourly partitions). When we issue the same query and explicitly specify a partition range, it works. Even if we specify the range between the oldest and the newest partition (the same as not specifying them at all!), it works. I have no idea what is going on...
08-22-2013
09:08 AM
OMG, who set the limit on message length? When will there be normal HTML support? Google Groups handles Cloudera logs better. We are using CDH 4.3, and I don't understand what we are doing wrong... It happens on several tables; the other tables are fine.
08-22-2013
09:06 AM
19:56:40.570 INFO org.apache.hadoop.hive.ql.ppd.OpProcFactory (pers_id_type) IN ('lol', 'bob', 'tom')
19:57:00.597 WARN org.apache.hadoop.hive.metastore.RetryingMetaStoreClient MetaStoreClient lost connection. Attempting to reconnect.
org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
    at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
    at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_partitions_with_auth(ThriftHiveMetastore.java:1391)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_partitions_with_auth(ThriftHiveMetastore.java:1374)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.listPartitionsWithAuthInfo(HiveMetaStoreClient.java:692)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:74)
    at $Proxy9.listPartitionsWithAuthInfo(Unknown Source)
    at org.apache.hadoop.hive.ql.metadata.Hive.getPartitions(Hive.java:1565)
    at org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner.prune(PartitionPruner.java:202)
    at org.apache.hadoop.hive.ql.optimizer.pcr.PcrOpProcFactory$FilterPCR.process(PcrOpProcFactory.java:112)
    at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:87)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:124)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:101)
    at org.apache.hadoop.hive.ql.optimizer.pcr.PartitionConditionRemover.transform(PartitionConditionRemover.java:86)
    at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:102)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:8200)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:258)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:457)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:349)
    at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:355)
    at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:95)
    at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:76)
    at org.apache.hive.service.cli.operation.SQLOperation.run(SQLOperation.java:114)
    at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatement(HiveSessionImpl.java:194)
    at org.apache.hive.service.cli.CLIService.executeStatement(CLIService.java:154)
    at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:190)
    at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1193)
    at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1178)
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
    at org.apache.hive.service.cli.thrift.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:38)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.net.SocketTimeoutException: Read timed out
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:129)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
    ... 43 more
19:57:01.598 INFO hive.metastore Trying to connect to metastore with URI thrift://prod-beeswax.lol.ru:9083
19:57:01.600 INFO hive.metastore Waiting 1 seconds before next connection attempt.
19:57:02.600 INFO hive.metastore Connected to metastore.
19:57:22.621 ERROR hive.ql.metadata.Hive org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
    at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
    at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_partitions_with_auth(ThriftHiveMetastore.java:1391)