Member since: 11-12-2015
Posts: 90
Kudos Received: 1
Solutions: 8
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 5768 | 06-09-2017 01:52 PM
 | 13494 | 02-24-2017 02:32 PM
 | 11449 | 11-30-2016 02:48 PM
 | 3849 | 03-02-2016 11:14 AM
 | 4667 | 12-16-2015 07:11 AM
06-28-2016
10:30 AM
Hello, this is my problem: I have a string column with values separated by ';', and I want to read it as an array using a cast. Here is what I want to do:
select cast("hello;how;are;you" as ARRAY(separated by ";"));
Is it possible to do this? I'm using Impala 2.5 on CDH 5.7. Regards,
02-25-2016
10:45 AM
When I try to install Oozie it gives me this error: 2016-02-25 18:22:24,260 INFO org.apache.oozie.service.ConfigurationService: SERVER[cloudera1] Overriding configuration with system property. Key [oozie.http.port], Value [11000]
2016-02-25 18:22:24,268 WARN org.apache.oozie.service.ConfigurationService: SERVER[cloudera1] Invalid configuration defined, [oozie.service.ProxyUserService.proxyuser.hue.hosts]
2016-02-25 18:22:24,268 WARN org.apache.oozie.service.ConfigurationService: SERVER[cloudera1] Invalid configuration defined, [oozie.service.GroupsService.hadoop.security.group.mapping]
2016-02-25 18:22:24,269 WARN org.apache.oozie.service.ConfigurationService: SERVER[cloudera1] Invalid configuration defined, [oozie.service.ProxyUserService.proxyuser.hue.groups]
2016-02-25 18:22:24,269 WARN org.apache.oozie.service.ConfigurationService: SERVER[cloudera1] Invalid configuration defined, [hadoop.security.credential.provider.path]
2016-02-25 18:22:24,269 WARN org.apache.oozie.service.ConfigurationService: SERVER[cloudera1] Invalid configuration defined, [oozie.email.from.address]
2016-02-25 18:22:24,269 WARN org.apache.oozie.service.ConfigurationService: SERVER[cloudera1] Invalid configuration defined, [oozie.email.smtp.port]
2016-02-25 18:22:24,269 WARN org.apache.oozie.service.ConfigurationService: SERVER[cloudera1] Invalid configuration defined, [oozie.email.smtp.host]
2016-02-25 18:22:24,270 WARN org.apache.oozie.service.ConfigurationService: SERVER[cloudera1] Invalid configuration defined, [oozie.email.smtp.auth]
2016-02-25 18:22:24,274 WARN org.apache.oozie.service.Services: SERVER[cloudera1] System ID [oozie-oozi] exceeds maximum length [10], trimming
2016-02-25 18:22:24,275 INFO org.apache.oozie.service.Services: SERVER[cloudera1] Exiting null Entering NORMAL
2016-02-25 18:22:24,276 INFO oozieops: SERVER[cloudera1] Exiting null Entering NORMAL
2016-02-25 18:22:24,276 INFO org.apache.oozie.service.Services: SERVER[cloudera1] Initialized runtime directory [/tmp/oozie-oozi6638948414612903101.dir] stdout Thu Feb 25 18:22:22 UTC 2016
JAVA_HOME=/usr/lib/jvm/java-7-oracle-cloudera
using 5 as CDH_VERSION
Validate DB Connection stderr Error: Could not connect to the database: org.postgresql.util.PSQLException: The connection attempt failed.
Stack trace for the error was (for debug purposes):
--------------------------------------
java.lang.Exception: Could not connect to the database: org.postgresql.util.PSQLException: The connection attempt failed.
at org.apache.oozie.tools.OozieDBCLI.validateConnection(OozieDBCLI.java:905)
at org.apache.oozie.tools.OozieDBCLI.createDB(OozieDBCLI.java:185)
at org.apache.oozie.tools.OozieDBCLI.run(OozieDBCLI.java:129)
at org.apache.oozie.tools.OozieDBCLI.main(OozieDBCLI.java:80)
Caused by: org.postgresql.util.PSQLException: The connection attempt failed.
at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:150)
at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:66)
at org.postgresql.jdbc2.AbstractJdbc2Connection.<init>(AbstractJdbc2Connection.java:125)
at org.postgresql.jdbc3.AbstractJdbc3Connection.<init>(AbstractJdbc3Connection.java:30)
at org.postgresql.jdbc3g.AbstractJdbc3gConnection.<init>(AbstractJdbc3gConnection.java:22)
at org.postgresql.jdbc4.AbstractJdbc4Connection.<init>(AbstractJdbc4Connection.java:30)
at org.postgresql.jdbc4.Jdbc4Connection.<init>(Jdbc4Connection.java:24)
at org.postgresql.Driver.makeConnection(Driver.java:393)
at org.postgresql.Driver.connect(Driver.java:267)
at java.sql.DriverManager.getConnection(DriverManager.java:571)
at java.sql.DriverManager.getConnection(DriverManager.java:215)
at org.apache.oozie.tools.OozieDBCLI.createConnection(OozieDBCLI.java:895)
at org.apache.oozie.tools.OozieDBCLI.validateConnection(OozieDBCLI.java:901)
... 3 more
Caused by: java.net.UnknownHostException: :
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:178)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at java.net.Socket.connect(Socket.java:528)
at java.net.Socket.<init>(Socket.java:425)
at java.net.Socket.<init>(Socket.java:208)
at org.postgresql.core.PGStream.<init>(PGStream.java:62)
at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:76)
... 15 more
--------------------------------------
It successfully creates the Oozie database but fails creating the Oozie database tables. I'm using the latest version of CDH on Ubuntu 14.04. Regards,
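For what it's worth, the "Caused by: java.net.UnknownHostException: :" line suggests the JDBC URL that Oozie built had an empty host and port, i.e. the database host/port settings were not picked up. A minimal sketch of that failure mode, using a hypothetical PostgreSQL JDBC URL with the host left blank (the PostgreSQL driver must be on the classpath, and the credentials here are placeholders):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class OozieDbUrlCheck {
    public static void main(String[] args) {
        // Hypothetical URL with the host and port missing; a correct URL would
        // look like jdbc:postgresql://dbhost:5432/oozie
        String url = "jdbc:postgresql://:/oozie";
        try (Connection c = DriverManager.getConnection(url, "oozie", "oozie")) {
            System.out.println("Connected to " + c.getMetaData().getURL());
        } catch (SQLException e) {
            // With an empty host this fails much like the log above:
            // "The connection attempt failed", caused by an UnknownHostException.
            e.printStackTrace();
        }
    }
}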
12-16-2015
07:11 AM
I solved the problem. I had to create a custom Java interceptor (based on the one you sent me), compile it with Maven and put it in the flume-ng directory. Thanks pdvorak for all the help 🙂
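For anyone hitting the same issue, here is a minimal sketch of what such a custom Flume interceptor can look like. It is not the actual interceptor from this thread; the package and class names are made up, and all it does is copy a "product" field out of the JSON body into a header while leaving the body untouched:

package com.example;  // hypothetical package

import java.util.List;
import org.apache.flume.Context;
import org.apache.flume.Event;
import org.apache.flume.interceptor.Interceptor;

public class JsonHeaderInterceptor implements Interceptor {

    @Override
    public void initialize() {
        // nothing to set up
    }

    @Override
    public Event intercept(Event event) {
        String body = new String(event.getBody());
        // Very naive extraction of "product":"<value>"; a real implementation
        // would use a JSON parser instead.
        int i = body.indexOf("\"product\":\"");
        if (i >= 0) {
            int start = i + "\"product\":\"".length();
            int end = body.indexOf('"', start);
            if (end > start) {
                event.getHeaders().put("product", body.substring(start, end));
            }
        }
        return event;  // body left exactly as it arrived
    }

    @Override
    public List<Event> intercept(List<Event> events) {
        for (Event e : events) {
            intercept(e);
        }
        return events;
    }

    @Override
    public void close() {
        // nothing to clean up
    }

    // The agent config references this nested class, e.g. (hypothetical names):
    // flume1.sources.kafka-source-1.interceptors.i1.type = com.example.JsonHeaderInterceptor$Builder
    public static class Builder implements Interceptor.Builder {
        @Override
        public Interceptor build() {
            return new JsonHeaderInterceptor();
        }

        @Override
        public void configure(Context context) {
            // no properties needed for this sketch
        }
    }
}

The compiled jar then has to be on the Flume agent's classpath so the agent can find the Builder class named in the config.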
12-14-2015
03:18 PM
Yes, I tried that. All the fields are set as headers, but the message body is transformed by:
event.setBody("Message modified by Jsoninterceptor".getBytes());
and that makes it useless, because I need the log in its original form. I tried to change the JsonInterceptor.java file inside the .jar using vim, but it can't be done; I think that is because of the .class file. I also tried to create a Java morphline, but I can't get it to compile correctly.
morphlines : [
java {
imports : """
import java.util.List;
import java.util.Map;
import org.apache.flume.Context;
import org.apache.flume.Event;
import org.apache.flume.interceptor.Interceptor;
import org.apache.log4j.Logger;
"""
code: """
Map<String, String> headers = event.getHeaders();
// example: add / remove headers
if (headers.containsKey("product")) {
headers.put("product", headers.get("product"));
}
if (headers.containsKey("client")) {
headers.put("client", headers.get("client"));
}
return event;
"""
}
]
Regards,
12-14-2015
01:10 PM
Hello, I created a Java file with the custom interceptor, but I don't know how to compile it or package it into a jar file properly. I tried the javac and jar commands, but the interceptor builder is not found.
12-11-2015
10:47 AM
Problem solved. Instead of using:
flume1.sources.kafka-source-1.interceptors.i1.serializers.ser1.type = default
I changed it to:
flume1.sources.kafka-source-1.interceptors.i1.serializers.ser1.type = org.apache.flume.interceptor.RegexExtractorInterceptorPassThroughSerializer
and it worked fine. I have two more questions: 1) The - (hyphen) cannot be read as part of a header, so if the value of the header has a -, the event goes to the default channel and not to the corresponding mapping. 2) I want to add a second regex, but how can I map two headers together? For example:
flume1.sources.kafka-source-1.selector.header = header1 header2
flume1.sources.kafka-source-1.selector.mapping.(value1)&(value2) = hdfs-channel-x
Is it possible without programming it? Because I'm not a programmer. Regards,
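On question 1, the behaviour is consistent with the regex in the posted config: \w+ matches letters, digits and underscore only, so for a value like bluecoat-syslog the pattern never reaches the closing quote, no "product" header is set, and the event falls through to the default channel. A small plain-Java sketch of the difference (this is not the poster's config; the widened character class [\w-] is just one possible assumption):

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HyphenHeaderDemo {
    public static void main(String[] args) {
        String log = "...,\"product\":\"bluecoat-syslog\",...";

        // The regex from the config: \w+ does not include '-', so the closing
        // quote is never reached and the pattern does not match at all.
        Matcher configRegex = Pattern.compile("\"product\":\"(\\w+)\"").matcher(log);
        System.out.println("config regex matches: " + configRegex.find());   // false

        // One possible widening (an assumption, not from the posted config):
        // allow '-' inside the captured value as well.
        Matcher widened = Pattern.compile("\"product\":\"([\\w-]+)\"").matcher(log);
        if (widened.find()) {
            System.out.println("widened regex captures: " + widened.group(1)); // bluecoat-syslog
        }
    }
}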
12-11-2015
05:27 AM
I changed the regex but it is still not working. The whole config file is this:

# Sources, channels, and sinks are defined per
# agent name, in this case flume1.
flume1.sources = kafka-source-1
flume1.channels = hdfs-channel-1 hdfs-channel-2 hdfs-channel-3 hdfs-channel-4 hdfs-channel-5 hdfs-channel-6 hdfs-channel-7 logChannel
flume1.sinks = hdfs-sink-1 hdfs-sink-2 hdfs-sink-3 hdfs-sink-4 hdfs-sink-5 hdfs-sink-6 hdfs-sink-7 logSink

# For each source, channel, and sink, set
# standard properties.
flume1.sources.kafka-source-1.type = org.apache.flume.source.kafka.KafkaSource
flume1.sources.kafka-source-1.zookeeperConnect = 192.168.70.23:2181
flume1.sources.kafka-source-1.topic = kafkatopic
flume1.sources.kafka-source-1.batchSize = 1000
flume1.sources.kafka-source-1.channels = hdfs-channel-1 hdfs-channel-2 hdfs-channel-3 hdfs-channel-4 hdfs-channel-5 hdfs-channel-6 hdfs-channel-7 logChannel

flume1.sinks.hdfs-sink-1.channel = hdfs-channel-1
flume1.sinks.hdfs-sink-2.channel = hdfs-channel-2
flume1.sinks.hdfs-sink-3.channel = hdfs-channel-3
flume1.sinks.hdfs-sink-4.channel = hdfs-channel-4
flume1.sinks.hdfs-sink-5.channel = hdfs-channel-5
flume1.sinks.hdfs-sink-6.channel = hdfs-channel-6
flume1.sinks.hdfs-sink-7.channel = hdfs-channel-7
flume1.sinks.logSink.channel = logChannel

flume1.channels.hdfs-channel-1.type = memory
flume1.channels.hdfs-channel-2.type = memory
flume1.channels.hdfs-channel-3.type = memory
flume1.channels.hdfs-channel-4.type = memory
flume1.channels.hdfs-channel-5.type = memory
flume1.channels.hdfs-channel-6.type = memory
flume1.channels.hdfs-channel-7.type = memory
flume1.channels.logChannel.type = memory
flume1.channels.hdfs-channel-1.capacity = 10000
flume1.channels.hdfs-channel-1.transactionCapacity = 1000
flume1.channels.hdfs-channel-2.capacity = 10000
flume1.channels.hdfs-channel-2.transactionCapacity = 1000
flume1.channels.hdfs-channel-3.capacity = 10000
flume1.channels.hdfs-channel-3.transactionCapacity = 1000
flume1.channels.hdfs-channel-4.capacity = 10000
flume1.channels.hdfs-channel-4.transactionCapacity = 1000
flume1.channels.hdfs-channel-5.capacity = 10000
flume1.channels.hdfs-channel-5.transactionCapacity = 1000
flume1.channels.hdfs-channel-6.capacity = 10000
flume1.channels.hdfs-channel-6.transactionCapacity = 1000
flume1.channels.hdfs-channel-7.capacity = 10000
flume1.channels.hdfs-channel-7.transactionCapacity = 1000
flume1.channels.logChannel.capacity = 10000
flume1.channels.logChannel.transactionCapacity = 1000

#Interceptors setup
flume1.sources.kafka-source-1.interceptors = i1
flume1.sources.kafka-source-1.interceptors.i1.type = regex_extractor
flume1.sources.kafka-source-1.interceptors.i1.regex = "product":"(\\w+)"
flume1.sources.kafka-source-1.interceptors.i1.serializers = ser1
flume1.sources.kafka-source-1.interceptors.i1.serializers.ser1.type = default
flume1.sources.kafka-source-1.interceptors.i1.serializers.ser1.name = product

#checkpoint,smgsyslog, sepsyslog, pgp, bluecoat-syslog,bluecoat
# channel selector configuration
flume1.sources.kafka-source-1.selector.type = multiplexing
flume1.sources.kafka-source-1.selector.header = product
flume1.sources.kafka-source-1.selector.mapping.ckeckpoint = hdfs-channel-1
flume1.sources.kafka-source-1.selector.mapping.smgsyslog = hdfs-channel-2
flume1.sources.kafka-source-1.selector.mapping.sepsyslog = hdfs-channel-3
flume1.sources.kafka-source-1.selector.mapping.pgp = hdfs-channel-4
flume1.sources.kafka-source-1.selector.mapping.bluecoat-syslog = hdfs-channel-5
flume1.sources.kafka-source-1.selector.mapping.bluecoat = hdfs-channel-6
flume1.sources.kafka-source-1.selector.default = hdfs-channel-7 logChannel

# sinks configuration
flume1.sinks.hdfs-sink-1.type = hdfs
flume1.sinks.hdfs-sink-1.hdfs.writeFormat = Text
flume1.sinks.hdfs-sink-1.hdfs.fileType = DataStream
flume1.sinks.hdfs-sink-1.hdfs.filePrefix = test-events
flume1.sinks.hdfs-sink-1.hdfs.useLocalTimeStamp = true
flume1.sinks.hdfs-sink-1.hdfs.path = /user/root/logs/checkpoint
flume1.sinks.hdfs-sink-1.hdfs.rollCount=1000
flume1.sinks.hdfs-sink-1.hdfs.rollSize=0
flume1.sinks.hdfs-sink-2.type = hdfs
flume1.sinks.hdfs-sink-2.hdfs.writeFormat = Text
flume1.sinks.hdfs-sink-2.hdfs.fileType = DataStream
flume1.sinks.hdfs-sink-2.hdfs.filePrefix = test-events
flume1.sinks.hdfs-sink-2.hdfs.useLocalTimeStamp = true
flume1.sinks.hdfs-sink-2.hdfs.path = /user/root/logs/smgsyslog
flume1.sinks.hdfs-sink-2.hdfs.rollCount=1000
flume1.sinks.hdfs-sink-2.hdfs.rollSize=0
flume1.sinks.hdfs-sink-3.type = hdfs
flume1.sinks.hdfs-sink-3.hdfs.writeFormat = Text
flume1.sinks.hdfs-sink-3.hdfs.fileType = DataStream
flume1.sinks.hdfs-sink-3.hdfs.filePrefix = test-events
flume1.sinks.hdfs-sink-3.hdfs.useLocalTimeStamp = true
flume1.sinks.hdfs-sink-3.hdfs.path = /user/root/logs/sepsyslog
flume1.sinks.hdfs-sink-3.hdfs.rollCount=1000
flume1.sinks.hdfs-sink-3.hdfs.rollSize=0
flume1.sinks.hdfs-sink-4.type = hdfs
flume1.sinks.hdfs-sink-4.hdfs.writeFormat = Text
flume1.sinks.hdfs-sink-4.hdfs.fileType = DataStream
flume1.sinks.hdfs-sink-4.hdfs.filePrefix = test-events
flume1.sinks.hdfs-sink-4.hdfs.useLocalTimeStamp = true
flume1.sinks.hdfs-sink-4.hdfs.path = /user/root/logs/pgp
flume1.sinks.hdfs-sink-4.hdfs.rollCount=1000
flume1.sinks.hdfs-sink-4.hdfs.rollSize=0
flume1.sinks.hdfs-sink-5.type = hdfs
flume1.sinks.hdfs-sink-5.hdfs.writeFormat = Text
flume1.sinks.hdfs-sink-5.hdfs.fileType = DataStream
flume1.sinks.hdfs-sink-5.hdfs.filePrefix = test-events
flume1.sinks.hdfs-sink-5.hdfs.useLocalTimeStamp = true
flume1.sinks.hdfs-sink-5.hdfs.path = /user/root/logs/bluecoatsyslog
flume1.sinks.hdfs-sink-5.hdfs.rollCount=1000
flume1.sinks.hdfs-sink-5.hdfs.rollSize=0
flume1.sinks.hdfs-sink-6.type = hdfs
flume1.sinks.hdfs-sink-6.hdfs.writeFormat = Text
flume1.sinks.hdfs-sink-6.hdfs.fileType = DataStream
flume1.sinks.hdfs-sink-6.hdfs.filePrefix = test-events
flume1.sinks.hdfs-sink-6.hdfs.useLocalTimeStamp = true
flume1.sinks.hdfs-sink-6.hdfs.path = /user/root/logs/bluecoat
flume1.sinks.hdfs-sink-6.hdfs.rollCount=1000
flume1.sinks.hdfs-sink-6.hdfs.rollSize=0
flume1.sinks.hdfs-sink-7.type = hdfs
flume1.sinks.hdfs-sink-7.hdfs.writeFormat = Text
flume1.sinks.hdfs-sink-7.hdfs.fileType = DataStream
flume1.sinks.hdfs-sink-7.hdfs.filePrefix = test-events
flume1.sinks.hdfs-sink-7.hdfs.useLocalTimeStamp = true
flume1.sinks.hdfs-sink-7.hdfs.path = /user/root/logs/otros
flume1.sinks.hdfs-sink-7.hdfs.rollCount=1000
flume1.sinks.hdfs-sink-7.hdfs.rollSize=0
flume1.sinks.logSink.type = logger

# Other properties are specific to each type of
# source, channel, or sink. In this case, we
# specify the capacity of the memory channel.

I think something is wrong with the channels, but I don't know what the problem is. The logger output without the interceptor part has two headers, timestamp and topic.
12-10-2015
11:44 AM
I added an interceptor that finds the product field in the log and creates a header from it. This is the code, and it is not working. What could be wrong?
#Interceptors setup
flume1.sources.kafka-source-1.interceptors = i1
flume1.sources.kafka-source-1.interceptors.i1.type = regex_extractor
flume1.sources.kafka-source-1.interceptors.i1.regex = "product":"(\\d+)"
flume1.sources.kafka-source-1.interceptors.i1.serializers = ser1
flume1.sources.kafka-source-1.interceptors.i1.serializers.ser1.type = default
flume1.sources.kafka-source-1.interceptors.i1.serializers.ser1.name = product
The product field in the log looks like this: ...,"product":"smgsyslog",...
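For what it's worth, \d+ only matches digits, while the value in the sample line ("smgsyslog") is letters, so the capture group never matches and no product header is created. A small plain-Java sketch of the difference, outside Flume (the class name is just for illustration):

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ProductRegexDemo {
    public static void main(String[] args) {
        String log = "...,\"product\":\"smgsyslog\",...";

        // Regex from the config above: \d+ matches digits only, so it never
        // matches the value smgsyslog and the header is never populated.
        Matcher digits = Pattern.compile("\"product\":\"(\\d+)\"").matcher(log);
        System.out.println("\\d+ matches: " + digits.find());   // false

        // \w+ matches word characters, so it captures the value as intended.
        Matcher words = Pattern.compile("\"product\":\"(\\w+)\"").matcher(log);
        if (words.find()) {
            System.out.println("\\w+ captures: " + words.group(1));  // smgsyslog
        }
    }
}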
12-10-2015
04:52 AM
This is the result:
2015-12-10 09:38:59,065 INFO org.apache.solr.servlet.SolrDispatchFilter: [admin] webapp=null path=/admin/cores params={action=STATUS&wt=json} status=0 QTime=0
Are the headers status and QTime? And if they are, how can I make a field of a log be read as a header?