Member since: 10-01-2015
Posts: 3933
Kudos Received: 1150
Solutions: 374
My Accepted Solutions
Views | Posted |
---|---|
3365 | 05-03-2017 05:13 PM |
2796 | 05-02-2017 08:38 AM |
3076 | 05-02-2017 08:13 AM |
3006 | 04-10-2017 10:51 PM |
1517 | 03-28-2017 02:27 AM |
01-29-2017
11:42 AM
1 Kudo
The Apache Storm view can provide some functionality for deployed Storm topologies, letting you start/pause/restart them. The best experience, though, would be with Apache NiFi, where Ambari manages the NiFi service and full control of data streams, or flows in NiFi jargon, is provided with NiFi out of the box. NiFi has error handling, queuing, backpressure, scheduling, data expiry controls, and much more. If you are looking for a Flume replacement, NiFi is your best bet, and best of all, it is decoupled from HDP and offers bidirectional flows of data, to and from Hadoop.
01-27-2017
11:57 PM
4 Kudos
Avro in HDF is 1.7.7, and the timestamp logical type was only introduced in Avro 1.8.x. I would suggest treating the timestamp field as a string.
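For example, a minimal schema sketch that carries the timestamp as an ISO-8601 string (record and field names here are just placeholders):
{
  "type": "record",
  "name": "Event",
  "fields": [
    {"name": "id", "type": "long"},
    {"name": "event_time", "type": "string"}
  ]
}
Once you are on Avro 1.8.x you can switch that field to a long with the timestamp-millis logical type.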
01-26-2017
01:20 AM
2 Kudos
Assuming storage savings is the primary goal here, I'd like to suggest the following:
1. Leverage the HDFS tiered storage tier called ARCHIVE: http://www.ebaytechblog.com/2015/01/12/hdfs-storage-efficiency-using-tiered-storage/
2. Erasure coding is a new mechanism soon to be delivered in HDP that promises the same fault-tolerance guarantees as a replication factor of 3, but with only 1.5x storage overhead instead of 3x, which means you no longer need to store 3 full block replicas: https://hadoop.apache.org/docs/r3.0.0-alpha1/hadoop-project-dist/hadoop-hdfs/HDFSErasureCoding.html
I'd consider these paths before reducing block size.
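As a rough sketch of the tiered-storage route (the path and policy here are only examples, assuming some DataNode disks are tagged as [ARCHIVE] storage):
# pin a cold dataset to archival storage
hdfs storagepolicies -setStoragePolicy -path /data/cold -policy COLD
# migrate existing blocks so they conform to the new policy
hdfs mover -p /data/cold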
01-21-2017
03:30 PM
@Joan Viladrosa there was a JIRA, https://issues.apache.org/jira/browse/HBASE-5147, that proposed such functionality, but since major compaction is part of normal operations for HBase, it was marked as resolved. I don't know of any way of stopping a compaction in flight in a graceful manner. Perhaps you'd want to open a JIRA again and try your luck? In general, you are better off turning off automatic major compactions and managing the schedule via cron (see the sketch below). This seems like a minor inconvenience to me, so you won't have much luck getting it through.
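A minimal sketch of that approach, with a hypothetical table name: disable time-based major compactions in hbase-site.xml, then trigger them from cron through the HBase shell.
<property>
  <name>hbase.hregion.majorcompaction</name>
  <value>0</value>
</property>
# crontab: run a major compaction every Sunday at 02:00
0 2 * * 0 echo "major_compact 'my_table'" | hbase shell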
01-17-2017
02:58 PM
1. Only once.
2. Use a decision node driven by a property (see the sketch below): https://www.infoq.com/articles/oozieexample/
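A minimal decision-node sketch, assuming a hypothetical boolean property firstRun supplied in job.properties and action names defined elsewhere in the workflow:
<decision name="run-once-check">
  <switch>
    <case to="initial-load">${firstRun eq "true"}</case>
    <default to="regular-run"/>
  </switch>
</decision>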
01-15-2017
03:15 PM
Judging from the comments on the following JIRA, Spark2 support will arrive with Oozie 5.0: https://issues.apache.org/jira/plugins/servlet/mobile#issue/OOZIE-2767
01-14-2017
08:06 PM
I am not sure if Spark2 is supported via Oozie yet, but let's say it is: did you add the Spark 2 libraries to the Oozie sharelib? That will be your first step.
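A rough sketch of that step, assuming the Spark 2 client jars are installed locally and the sharelib sits in the default location (lib_<timestamp> and the host name are placeholders):
# copy the Spark 2 jars into a spark2 directory inside the current sharelib
hdfs dfs -mkdir /user/oozie/share/lib/lib_<timestamp>/spark2
hdfs dfs -put /usr/hdp/current/spark2-client/jars/* /user/oozie/share/lib/lib_<timestamp>/spark2/
# refresh the sharelib on the Oozie server and verify it is visible
oozie admin -oozie http://<oozie-host>:11000/oozie -sharelibupdate
oozie admin -oozie http://<oozie-host>:11000/oozie -shareliblist spark2
The workflow's Spark action can then be pointed at it with oozie.action.sharelib.for.spark=spark2.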
01-14-2017
04:33 PM
I don't see an example, but there's a mention of a parameter called queueName; can you pass that as an argument? Most likely your original job.properties file has it defined, so you can override that parameter with the correct queue in your proxy call.
From the Oozie docs, "Updating coordinator definition and properties": the existing coordinator definition and properties will be replaced by the new definition and properties.
PUT oozie/v2/job/0000000-140414102048137-oozie-puru-C?action=update
Response:
HTTP/1.1 200 OK
Content-Type: application/json;charset=UTF-8
{"update":
{"diff":"**********Job definition changes**********\n******************************************\n**********Job conf changes****************\n@@ -8,16 +8,12 @@\n
<value>hdfs:\/\/localhost:9000\/user\/purushah\/examples\/apps\/aggregator\/coordinator.xml<\/value>\r\n <\/property>\r\n <property>\r\n
- <name>user.name<\/name>\r\n
- <value>purushah<\/value>\r\n
- <\/property>\r\n
- <property>\r\n <name>start<\/name>\r\n
<value>2010-01-01T01:00Z<\/value>\r\n <\/property>\r\n <property>\r\n
- <name>newproperty<\/name>\r\n
- <value>new<\/value>\r\n
+ <name>user.name<\/name>\r\n
+ <value>purushah<\/value>\r\n <\/property>\r\n <property>\r\n
<name>queueName<\/name>\r\n******************************************\n"
}
}
queueName = default
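So, as a sketch (the queue name is just an example, and the coordinator id is the one from the docs snippet above): set queueName in job.properties to the correct queue and push the update, e.g. through the Oozie CLI, which wraps the same v2 REST call.
# job.properties now contains: queueName=etl_queue
oozie job -oozie http://<oozie-host>:11000/oozie -config job.properties -update 0000000-140414102048137-oozie-puru-C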
01-14-2017
04:14 PM
It is possible that your topology is already deployed; please run the storm list command to check its status. Other than that, I'm wondering whether you're running this code on Sandbox 2.5; from the code, it looks like you may need Sandbox 2.4.
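For reference, the check is just:
# lists deployed topologies along with their status and worker counts
storm list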
01-12-2017
05:39 PM
I can't find an exact reference to it, but to handle the finished file it seems you can implement an easy solution by doing the following. Both the HDFS bolt and the Trident State implementation allow you to register any number of RotationActions. What RotationActions do is provide a hook that lets you perform some action right after a file is rotated, for example moving the file to a different location or renaming it.
public class MoveFileAction implements RotationAction {
    private static final Logger LOG = LoggerFactory.getLogger(MoveFileAction.class);
    private String destination;
    public MoveFileAction withDestination(String destDir) {
        destination = destDir;
        return this;
    }
    // hook invoked after a file is rotated: move it into the destination directory
    @Override
    public void execute(FileSystem fileSystem, Path filePath) throws IOException {
        fileSystem.rename(filePath, new Path(destination, filePath.getName()));
    }
}
If you are using Trident and sequence files, you can do something like this:
HdfsState.Options seqOpts = new HdfsState.SequenceFileOptions()
.withFileNameFormat(fileNameFormat)
.withSequenceFormat(new DefaultSequenceFormat("key", "data"))
.withRotationPolicy(rotationPolicy)
.withFsUrl("hdfs://localhost:54310")
.addRotationAction(new MoveFileAction().withDestination("/dest2/"));
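For completeness, the core-Storm HDFS bolt accepts the same rotation action; here is a sketch reusing the hypothetical fileNameFormat and rotationPolicy objects from the snippet above:
HdfsBolt bolt = new HdfsBolt()
        .withFsUrl("hdfs://localhost:54310")
        .withFileNameFormat(fileNameFormat)
        .withRecordFormat(new DelimitedRecordFormat().withFieldDelimiter("|"))
        .withRotationPolicy(rotationPolicy)
        .withSyncPolicy(new CountSyncPolicy(1000))
        .addRotationAction(new MoveFileAction().withDestination("/dest2/"));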