Member since: 07-01-2015
Posts: 460
Kudos Received: 78
Solutions: 43

My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 1365 | 11-26-2019 11:47 PM |
 | 1312 | 11-25-2019 11:44 AM |
 | 9526 | 08-07-2019 12:48 AM |
 | 2195 | 04-17-2019 03:09 AM |
 | 3524 | 02-18-2019 12:23 AM |
08-30-2018
06:22 AM
1 Kudo
Hi, you have to create an ssl_context before you open the connection and point it to the CA certificate:

import ssl
from cm_api.api_client import ApiResource

# Build an SSL context that trusts the Cloudera Manager CA certificate
context = ssl.create_default_context(cafile=cmcertpath)
api = ApiResource(cm_host, '7183', username=username, password=password,
                  use_tls=True, ssl_context=context)
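A quick smoke test for the TLS connection (get_all_clusters is a standard cm_api call; nothing here is specific to the original question):

# List the clusters to confirm the API answers over TLS
for cluster in api.get_all_clusters():
    print(cluster.name)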
08-30-2018
04:46 AM
poojary_sudth: Thank you, but that is more or less the same set of tasks I did before. I found out how to make the consumer work; it was necessary to add the parameter --partition 0:

KAFKA_OPTS="-Djava.security.auth.login.config=/root/jaas.conf" kafka-console-consumer --bootstrap-server ourHost:9092 --topic test --consumer.config /root/client.properties --partition 0

I cannot see all the messages coming into the topic, but at least the ones that fall into the specified partition are printed, which is enough for me to confirm that the Kafka broker works. I found this hint here: https://stackoverflow.com/questions/34844209/consumer-not-receiving-messages-kafka-console-new-consumer-api-kafka-0-9
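For reference, a minimal sketch of the two client files used above. The protocol and service name are the usual values for a Kerberized broker without TLS; the keytab path and principal are placeholders, not from the original setup.

client.properties:
security.protocol=SASL_PLAINTEXT
sasl.kerberos.service.name=kafka

jaas.conf:
KafkaClient {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  keyTab="/root/kafka.keytab"            // placeholder keytab path
  principal="kafka-client@EXAMPLE.COM";  // placeholder principal
};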
08-30-2018
01:46 AM
1 Kudo
Problem resolved: the issue was the overall memory consumption on the HiveServer2 host. The submitted Hive job does not run inside the HS2 JVM but as a separate Java process, and the OS was killing it because it could not allocate that much memory.
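A generic way to confirm that the kernel OOM killer was responsible (not from the original troubleshooting, just a common check):

# Look for OOM-killer entries in the kernel log on the HS2 host
dmesg | grep -i "killed process"
# or on systemd-based systems:
journalctl -k | grep -i "out of memory"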
08-29-2018
03:57 AM
You can materialize this metric table with 60 rows, and then Impala will broadcast it to every node (or Hive will run a map-side join), so it will not add any shuffling over the network. The result set will be N times larger, but that's the point of the query, right? (producing a 1M-row table from 100k rows and 10 metrics)
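A minimal sketch of the pattern (the table and column names are invented for illustration):

-- Tiny table: one row per metric
create table metrics (metric_name string);

-- Fans each fact row out once per metric (100k rows x 10 metrics = 1M rows);
-- Impala will typically broadcast a table this small instead of shuffling.
select f.id, m.metric_name
from facts f
cross join metrics m;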
08-27-2018
10:01 AM
Hi Tomas, I am using RHEL 7.1
08-26-2018
10:41 AM
3 Kudos
The dr.who issue is very common these days; I am not sure who is exploiting the open-source project, but the main cause is usually a remote shell script planted through your ResourceManager node, which causes the dr.who jobs to spawn. You don't need to Kerberize the cluster, just use a Linux firewall to restrict access. Thanks
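As an illustration of the firewall approach (assuming the default ResourceManager web port 8088 and a made-up trusted subnet):

# Allow the ResourceManager web/REST port only from a trusted subnet
iptables -A INPUT -p tcp --dport 8088 -s 10.0.0.0/24 -j ACCEPT
# Drop everything else hitting that port
iptables -A INPUT -p tcp --dport 8088 -j DROP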
08-24-2018
01:28 AM
Hi, you can inspect the Avro files with the avro-tools utility.

create table work.test_avro ( i int, s string ) stored as avro;
insert into work.test_avro select 1, "abc";
set hive.exec.compress.output = true;
set hive.exec.compress.intermediate = true;
set avro.output.codec = snappy;
insert into work.test_avro select 2, "abcdefgb";

In this table there are now two files, one compressed with Snappy and one without compression; you can check this with the getmeta command:

$ avro-tools getmeta 000000_0
avro.schema {"type":"record","name":"test_avro","namespace":"work","fields":[{"name":"i","type":["null","int"],"default":null},{"name":"s","type":["null","string"],"default":null}]}
$ avro-tools getmeta 000000_0_copy_1
avro.schema {"type":"record","name":"test_avro","namespace":"work","fields":[{"name":"i","type":["null","int"],"default":null},{"name":"s","type":["null","string"],"default":null}]}
avro.codec snappy
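To dump the records themselves rather than the metadata, avro-tools also has a tojson command; for the first file above it should print something like:

$ avro-tools tojson 000000_0
{"i":{"int":1},"s":{"string":"abc"}}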
08-23-2018
03:16 AM
There are HUGE tables with a lot of partitions; this case is not unique. Partitioning helps to address the slice of data that matters, and each partition still contains a lot of data. Oh well... it's Big Data after all.
08-22-2018
09:24 AM
Edit: adding a query timeout does not affect this behaviour. I configured Hue with a 30-second timeout, but the query keeps waiting to be closed for more than 2 minutes. This is directly from the query profile:

Query Options (set by configuration): MEM_LIMIT=419430400,QUERY_TIMEOUT_S=30
Query Options (set by configuration and planner): MEM_LIMIT=419430400,QUERY_TIMEOUT_S=30,MT_DOP=0