Member since: 09-29-2015
Posts: 142
Kudos Received: 45
Solutions: 15
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1690 | 06-08-2017 05:28 PM |
| | 6172 | 05-30-2017 02:07 PM |
| | 1540 | 05-26-2017 07:48 PM |
| | 3842 | 04-28-2017 02:48 PM |
| | 2350 | 04-28-2017 02:41 PM |
11-07-2016
02:44 PM
Hi Gobi, In your KafkaProducer constructor, you instantiate the class with a set of Properties, which should include a list of brokers. This allows the Producer to have knowledge of more than one server. If you only have one server listed, then, yes, if that server goes down, your Producer will be unable to send any more messages. However, this scenario is highly unlikely because it is a best practice to use more than one broker in your cluster. One benefit of configuring your Producer with a list of servers is that you can send messages without having to worry about the IP address of the particular server that will receive them. The topic to which you will send your messages is defined in the ProducerRecord and can be set with something like this:
Properties props = new Properties();
props.put("bootstrap.servers", "192.168.86.10:9092,host2:port,host3:port");
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
Producer<String, String> producer = new KafkaProducer<>(props);
producer.send(new ProducerRecord<String, String>("test-topic", "hello distributed commit log"));
producer.close();
Have a great day, Brian
11-04-2016
07:45 PM
1 Kudo
To do incremental imports, you need to add a few more arguments to your command. Take a look at the description in the Sqoop docs: https://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html#_incremental_imports Inside your shell script, I'm not sure how you're setting the values of your named parameters, unless you have more code that maps the positional parameters to local variables. The lastmodified mode requires a column in your table that holds a timestamp and is updated with the current timestamp whenever a row changes. You point the check column (--check-column) at that timestamp column and pass the timestamp of your previous import as --last-value.
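For illustration, here is a minimal sketch of what such a command might look like, assuming a hypothetical MySQL table named orders with a last_updated timestamp column and an order_id key (all of these names, the connection string, and the last-value timestamp are placeholders, not values from your script):
# incremental import that picks up rows changed since the previous run
sqoop import \
  --connect jdbc:mysql://dbhost:3306/salesdb \
  --username sqoop_user -P \
  --table orders \
  --target-dir /user/hive/warehouse/orders \
  --incremental lastmodified \
  --check-column last_updated \
  --last-value "2016-11-01 00:00:00" \
  --merge-key order_id
At the end of the run, Sqoop prints the --last-value to supply on the next incremental import, so your script can capture and store it, or you can let a Sqoop saved job track it for you.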
11-04-2016
06:27 PM
Hello! This is definitely not a single point of failure in a Kafka cluster. Let me quote from the Kafka doc:

"Each partition has one server which acts as the "leader" and zero or more servers which act as "followers". The leader handles all read and write requests for the partition while the followers passively replicate the leader. If the leader fails, one of the followers will automatically become the new leader. Each server acts as a leader for some of its partitions and a follower for others so load is well balanced within the cluster."

And there is a nice explanation of how leader election is performed: https://kafka.apache.org/documentation#design_replicatedlog

"Kafka takes a slightly different approach to choosing its quorum set. Instead of majority vote, Kafka dynamically maintains a set of in-sync replicas (ISR) that are caught-up to the leader. Only members of this set are eligible for election as leader. A write to a Kafka partition is not considered committed until all in-sync replicas have received the write. This ISR set is persisted to ZooKeeper whenever it changes. Because of this, any replica in the ISR is eligible to be elected leader."
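To see this on your own cluster (the ZooKeeper host and topic name below are just placeholders), you can create a topic with a replication factor of 3 and then ask Kafka which broker currently leads each partition and which replicas are in the ISR:
# create a topic whose partitions are replicated across three brokers
bin/kafka-topics.sh --create --zookeeper zk1:2181 --replication-factor 3 --partitions 3 --topic my-replicated-topic
# describe it; the output shows the Leader, Replicas, and Isr for each partition
bin/kafka-topics.sh --describe --zookeeper zk1:2181 --topic my-replicated-topic
If you stop the broker that is currently the leader and run the describe command again, you should see one of the followers listed as the new leader.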
10-31-2016
05:26 PM
Could you provide us with the output of ls -la for /usr/bin/ranger-admin-stop?
10-27-2016
04:00 PM
Joe, what do your hostnames look like? I always create hostnames with the pattern host.domain.top-level-domain. For example, in a small cluster, I might name a node centos7.node1.localdomain.
10-25-2016
06:36 PM
Sami, I don't see keywords listed as a property for the TwitterSource: https://flume.apache.org/FlumeUserGuide.html#twitter-1-firehose-source-experimental However, your upload looks to be an Avro file, which is what the documentation says you will receive from this source. What is it about your result that you think is incorrect?
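For comparison, a minimal agent configuration using only the properties that the Flume User Guide documents for this source might look like the following; the agent and channel names, the OAuth credentials, and the omitted sink are all placeholders:
TwitterAgent.sources = Twitter
TwitterAgent.channels = MemChannel
TwitterAgent.sources.Twitter.type = org.apache.flume.source.twitter.TwitterSource
TwitterAgent.sources.Twitter.channels = MemChannel
TwitterAgent.sources.Twitter.consumerKey = YOUR_CONSUMER_KEY
TwitterAgent.sources.Twitter.consumerSecret = YOUR_CONSUMER_SECRET
TwitterAgent.sources.Twitter.accessToken = YOUR_ACCESS_TOKEN
TwitterAgent.sources.Twitter.accessTokenSecret = YOUR_ACCESS_TOKEN_SECRET
TwitterAgent.sources.Twitter.maxBatchSize = 1000
TwitterAgent.sources.Twitter.maxBatchDurationMillis = 1000
TwitterAgent.channels.MemChannel.type = memory
There is no keywords property in that list, so any keyword filtering would need to happen downstream or with a different source implementation.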
10-25-2016
02:06 PM
The Ranger KMS has import/export scripts that you can use on both the source and target clusters. So you can export the keys from the source cluster, copy them over to the target cluster, import them into the target KMS, create your encryption zones on the target using the imported keys, and use distcp as described in the guide.
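As a rough sketch of that last copy step only (the namenode addresses and paths are placeholders, and the exact names of the KMS export/import scripts depend on your Ranger version, so check the docs for your release), the copy itself is an ordinary distcp between the two encryption zones:
# -update copies only what has changed; -skipcrccheck avoids checksum mismatches,
# since the data is re-encrypted with the target zone's keys and the stored bytes differ
hadoop distcp -update -skipcrccheck hdfs://source-nn:8020/data/secure_zone hdfs://target-nn:8020/data/secure_zone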
10-24-2016
01:26 PM
With the new Sandbox, you should be able to access http://localhost:8888, which will bring you to the Sandbox start page. I suggest clicking into the Advanced HDP page. From there you will see a link to a web-based SSH client. You can launch the Pig and Hive command-line clients from there.
10-24-2016
01:23 PM
2 Kudos
You’re seeing the effects of the sandbox running as a Docker container in a VM on your host. In the image that you uploaded, you have ssh’d into the Docker container, which runs in its own network, which is why you are unable to reach the site at that IP address. However, you should be able to use localhost and get a response from the sandbox. Before starting the tutorial, with the new sandbox you should be able to access http://localhost:8888, which will bring you to the Sandbox start page. I suggest clicking into the Advanced HDP page. From there you will see links to Ambari and a web-based command line.
10-19-2016
03:37 PM
Unfortunately, not yet. At this point, Atlas is best used with Hive, capturing the following events:
create database
create table/view, create table as select
load, import, export
DMLs (insert)
alter database
alter table (skewed table information, stored as, protection is not supported)
alter view

Also, Atlas works with Sqoop, capturing events from the hive import command.