Member since: 09-29-2015
Posts: 142
Kudos Received: 45
Solutions: 15
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1690 | 06-08-2017 05:28 PM |
| | 6172 | 05-30-2017 02:07 PM |
| | 1540 | 05-26-2017 07:48 PM |
| | 3842 | 04-28-2017 02:48 PM |
| | 2350 | 04-28-2017 02:41 PM |
11-07-2016
02:44 PM
Hi Gobi, In your KafkaProducer constructor, you instantiate the class with a set of Properties, which should include a list of brokers. This allows the Producer to have knowledge of more than one server. If you only have one server listed, then, yes, if that server goes down, your Producer will be unable to send any more messages. However, this scenario is highly unlikely because it is a best practice to use more than one broker in your cluster. One benefit of configuring your Producer with a list of servers is that you can send messages without having to worry about the IP address of the particular server that will receive them. The topic to which you will send your messages is defined in the ProducerRecord and can be set with something like this:
Properties props = new Properties();
props.put("bootstrap.servers", "192.168.86.10:9092,host2:port,host3:port");
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
Producer<String, String> producer = new KafkaProducer<>(props);
producer.send(new ProducerRecord<String, String>("test-topic", "hello distributed commit log"));
producer.close();
Have a great day, Brian
11-04-2016
07:45 PM
1 Kudo
To do incremental imports, you need to add a few more arguments to your command. Take a look at the description in the Sqoop docs: https://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html#_incremental_imports Inside your shell script, I'm not sure how you're setting the values of your named parameters, unless you have more code that maps the positional parameters to local variables. The lastmodified mode requires a column in your table that holds a timestamp and is updated with the current timestamp whenever a row changes. You point the check column (--check-column) at that timestamp column and pass the timestamp of your previous import as --last-value.
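For illustration, here is a minimal sketch of what such a command might look like, assuming a hypothetical MySQL table named orders with a last_updated timestamp column and an order_id key (all of these names, the connection string, and the last-value timestamp are placeholders, not values from your script):
# incremental import that picks up rows changed since the previous run
sqoop import \
  --connect jdbc:mysql://dbhost:3306/salesdb \
  --username sqoop_user -P \
  --table orders \
  --target-dir /user/hive/warehouse/orders \
  --incremental lastmodified \
  --check-column last_updated \
  --last-value "2016-11-01 00:00:00" \
  --merge-key order_id
At the end of the run, Sqoop prints the --last-value to supply on the next incremental import, so your script can capture and store it, or you can let a Sqoop saved job track it for you.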
11-04-2016
06:27 PM
Hello! This is definitely not a single point of failure in a Kafka cluster. Let me quote from the Kafka doc:

"Each partition has one server which acts as the "leader" and zero or more servers which act as "followers". The leader handles all read and write requests for the partition while the followers passively replicate the leader. If the leader fails, one of the followers will automatically become the new leader. Each server acts as a leader for some of its partitions and a follower for others so load is well balanced within the cluster."

And there is a nice explanation of how leader election is performed: https://kafka.apache.org/documentation#design_replicatedlog

"Kafka takes a slightly different approach to choosing its quorum set. Instead of majority vote, Kafka dynamically maintains a set of in-sync replicas (ISR) that are caught-up to the leader. Only members of this set are eligible for election as leader. A write to a Kafka partition is not considered committed until all in-sync replicas have received the write. This ISR set is persisted to ZooKeeper whenever it changes. Because of this, any replica in the ISR is eligible to be elected leader."
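To see this on your own cluster (the ZooKeeper host and topic name below are just placeholders), you can create a topic with a replication factor of 3 and then ask Kafka which broker currently leads each partition and which replicas are in the ISR:
# create a topic whose partitions are replicated across three brokers
bin/kafka-topics.sh --create --zookeeper zk1:2181 --replication-factor 3 --partitions 3 --topic my-replicated-topic
# describe it; the output shows the Leader, Replicas, and Isr for each partition
bin/kafka-topics.sh --describe --zookeeper zk1:2181 --topic my-replicated-topic
If you stop the broker that is currently the leader and run the describe command again, you should see one of the followers listed as the new leader.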
10-31-2016
05:26 PM
Could you provide us with the output of ls -la for /usr/bin/ranger-admin-stop?
10-27-2016
04:00 PM
Joe, what do your hostnames look like? I always create hostnames with the pattern host.domain.top-level-domain. For example, in a small cluster, I might name a node centos7.node1.localdomain.
10-25-2016
06:36 PM
Sami, I don't see keywords listed as a property for the TwitterSource: https://flume.apache.org/FlumeUserGuide.html#twitter-1-firehose-source-experimental However, your upload looks to be an Avro file, which is what the documentation says you will receive from this source. What is it about your result that you think is incorrect?
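For comparison, a minimal agent configuration using only the properties that the Flume User Guide documents for this source might look like the following; the agent and channel names, the OAuth credentials, and the omitted sink are all placeholders:
TwitterAgent.sources = Twitter
TwitterAgent.channels = MemChannel
TwitterAgent.sources.Twitter.type = org.apache.flume.source.twitter.TwitterSource
TwitterAgent.sources.Twitter.channels = MemChannel
TwitterAgent.sources.Twitter.consumerKey = YOUR_CONSUMER_KEY
TwitterAgent.sources.Twitter.consumerSecret = YOUR_CONSUMER_SECRET
TwitterAgent.sources.Twitter.accessToken = YOUR_ACCESS_TOKEN
TwitterAgent.sources.Twitter.accessTokenSecret = YOUR_ACCESS_TOKEN_SECRET
TwitterAgent.sources.Twitter.maxBatchSize = 1000
TwitterAgent.sources.Twitter.maxBatchDurationMillis = 1000
TwitterAgent.channels.MemChannel.type = memory
There is no keywords property in that list, so any keyword filtering would need to happen downstream or with a different source implementation.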
10-25-2016
02:06 PM
The Ranger KMS has import/export scripts that you can use on both the source and target clusters. So you can export the keys from the source cluster, copy them over to the target cluster, import them into the target KMS, create your encryption zones on the target using the imported keys, and use distcp as described in the guide.
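As a rough sketch of that last copy step only (the namenode addresses and paths are placeholders, and the exact names of the KMS export/import scripts depend on your Ranger version, so check the docs for your release), the copy itself is an ordinary distcp between the two encryption zones:
# -update copies only what has changed; -skipcrccheck avoids checksum mismatches,
# since the data is re-encrypted with the target zone's keys and the stored bytes differ
hadoop distcp -update -skipcrccheck hdfs://source-nn:8020/data/secure_zone hdfs://target-nn:8020/data/secure_zone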
10-24-2016
01:26 PM
With the new Sandbox, you should be able to access http://localhost:8888, which will bring you to the Sandbox start page. I suggest clicking into the Advanced HDP page. From there you will see a link to a web-based SSH client. You can launch the Pig and Hive command-line clients from there.
10-24-2016
01:23 PM
2 Kudos
You’re seeing the effects of the sandbox running as a Docker container in a VM on your host. In the image that you uploaded, you have ssh’d into the Docker container, which runs in its own network, which is why you are unable to reach the site at that IP address. However, you should be able to use localhost and get a response from the sandbox. Before starting the tutorial, with the new sandbox you should be able to access http://localhost:8888, which will bring you to the Sandbox start page. I suggest clicking into the Advanced HDP page. From there you will see links to Ambari and a web-based command line.
10-19-2016
03:37 PM
Unfortunately, not yet. At this point, Atlas is best used with Hive, capturing the following events:
create database
create table/view, create table as select
load, import, export
DMLs (insert)
alter database
alter table (skewed table information, stored as, protection is not supported)
alter view

Also, Atlas works with Sqoop, capturing events from the hive import command.