Member since: 07-01-2015
Posts: 460
Kudos Received: 78
Solutions: 43

My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 1365 | 11-26-2019 11:47 PM |
 | 1312 | 11-25-2019 11:44 AM |
 | 9526 | 08-07-2019 12:48 AM |
 | 2195 | 04-17-2019 03:09 AM |
 | 3524 | 02-18-2019 12:23 AM |
08-30-2018
06:22 AM
1 Kudo
Hi, you have to create an ssl_context before you open the connection and point it to the CA certificate:

import ssl
from cm_api.api_client import ApiResource

# Build an SSL context that trusts the Cloudera Manager CA certificate
context = ssl.create_default_context(cafile=cmcertpath)
api = ApiResource(cm_host, '7183', username=username, password=password,
                  use_tls=True, ssl_context=context)
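A quick smoke test for the TLS connection (get_all_clusters is a standard cm_api call; nothing here is specific to the original question):

# List the clusters to confirm the API answers over TLS
for cluster in api.get_all_clusters():
    print(cluster.name)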
08-30-2018
04:46 AM
poojary_sudth: Thank you, but that is more or less the same set of tasks I did before. I found out how to make the consumer work; it was necessary to add the parameter --partition 0:

KAFKA_OPTS="-Djava.security.auth.login.config=/root/jaas.conf" kafka-console-consumer --bootstrap-server ourHost:9092 --topic test --consumer.config /root/client.properties --partition 0

I cannot see all the messages coming into the topic, but at least the ones that fall into the specified partition are printed, which is enough for me to confirm that the Kafka broker works. I found this hint here: https://stackoverflow.com/questions/34844209/consumer-not-receiving-messages-kafka-console-new-consumer-api-kafka-0-9
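For reference, a minimal sketch of the two client files used above. The protocol and service name are the usual values for a Kerberized broker without TLS; the keytab path and principal are placeholders, not from the original setup.

client.properties:
security.protocol=SASL_PLAINTEXT
sasl.kerberos.service.name=kafka

jaas.conf:
KafkaClient {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  keyTab="/root/kafka.keytab"            // placeholder keytab path
  principal="kafka-client@EXAMPLE.COM";  // placeholder principal
};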
08-30-2018
01:46 AM
1 Kudo
Problem resolved: the issue was the overall memory consumption on the HiveServer2 host. The submitted Hive job does not run inside the HS2 JVM but as a separate Java process, and the OS was killing it because it could not allocate that much memory.
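A generic way to confirm that the kernel OOM killer was responsible (not from the original troubleshooting, just a common check):

# Look for OOM-killer entries in the kernel log on the HS2 host
dmesg | grep -i "killed process"
# or on systemd-based systems:
journalctl -k | grep -i "out of memory"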
08-29-2018
03:57 AM
You can materialize this metric table with 60 rows, and then Impala will broadcast it to every node (or Hive will run a map-side join), so it will not add any shuffling over the network. The result set will be N times larger, but that's the point of the query, right? (producing a 1M-row table from 100k rows and 10 metrics)
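A minimal sketch of the pattern (the table and column names are invented for illustration):

-- Tiny table: one row per metric
create table metrics (metric_name string);

-- Fans each fact row out once per metric (100k rows x 10 metrics = 1M rows);
-- Impala will typically broadcast a table this small instead of shuffling.
select f.id, m.metric_name
from facts f
cross join metrics m;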
08-27-2018
10:01 AM
Hi Tomas, I am using RHEL 7.1
08-26-2018
10:41 AM
3 Kudos
The dr.who issue is very common these days; I am not sure who is exploiting the open-source project, but the main cause is usually a remote shell script planted through your ResourceManager node, which causes the dr.who jobs to spawn. You don't need to Kerberize the cluster, just use a Linux firewall to restrict access. Thanks
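As an illustration of the firewall approach (assuming the default ResourceManager web port 8088 and a made-up trusted subnet):

# Allow the ResourceManager web/REST port only from a trusted subnet
iptables -A INPUT -p tcp --dport 8088 -s 10.0.0.0/24 -j ACCEPT
# Drop everything else hitting that port
iptables -A INPUT -p tcp --dport 8088 -j DROP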
08-24-2018
01:28 AM
Hi, you can inspect the Avro files with the avro-tools utility.

create table work.test_avro ( i int, s string ) stored as avro;
insert into work.test_avro select 1, "abc";
set hive.exec.compress.output = true;
set hive.exec.compress.intermediate = true;
set avro.output.codec = snappy;
insert into work.test_avro select 2, "abcdefgb";

In this table there are now two files, one compressed with Snappy and one without compression; you can check this with the getmeta command:

$ avro-tools getmeta 000000_0
avro.schema {"type":"record","name":"test_avro","namespace":"work","fields":[{"name":"i","type":["null","int"],"default":null},{"name":"s","type":["null","string"],"default":null}]}
$ avro-tools getmeta 000000_0_copy_1
avro.schema {"type":"record","name":"test_avro","namespace":"work","fields":[{"name":"i","type":["null","int"],"default":null},{"name":"s","type":["null","string"],"default":null}]}
avro.codec snappy
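To dump the records themselves rather than the metadata, avro-tools also has a tojson command; for the first file above it should print something like:

$ avro-tools tojson 000000_0
{"i":{"int":1},"s":{"string":"abc"}}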
08-23-2018
03:16 AM
There are HUGE tables with a lot of partitions; this case is not unique. Partitioning helps to address the slice of data that matters, and each partition still contains a lot of data. Oh well... it's Big Data after all.
08-22-2018
09:24 AM
Edit: adding a query timeout does not affect this behaviour. I configured Hue with a 30-second timeout, but the query keeps waiting to be closed for more than 2 minutes. This is directly from the query profile:

Query Options (set by configuration): MEM_LIMIT=419430400,QUERY_TIMEOUT_S=30
Query Options (set by configuration and planner): MEM_LIMIT=419430400,QUERY_TIMEOUT_S=30,MT_DOP=0