02-16-2017 12:45 PM
I have been trying to put together a regex to create a Hive external table over a text log file in which some entries span multiple lines, but I haven't been successful yet.
Does anyone have experience with this kind of requirement? Does Hive's RegexSerDe support multi-line records?
Here's the sample text from the log file I am working with:
[2016-12-31T14:30:03.917+00:00] [SomeText] [NOTIFICATION:1]   [ecid: 005HFKbQ7Z_3zyZ8DyW0001DO000CH3,0:1:2:2:1:6] [tid: 514cd700]  Init block, 'Sample Tables', has more variables than the query select list.
[2016-12-31T14:30:03.918+00:00] [SomeText] [NOTIFICATION:1]   [ecid: 005HFKbPIy83zZ8DyW0001DO000CGz,0:1:2:2:1:6] [tid: 50875700]  Init block, 'Sample Tables', has more variables than the query select list.
[2016-12-31T14:56:18.467+00:00] [SomeText] [NOTIFICATION:1]   [ecid: 005HFM4dHTk3z8DyW0001A60020wY,0] [tid: 5fe23700]  Operation Purge Query Plan Cache succeeded!
[2016-12-31T15:00:01.46+00:00] [SomeText] [ERROR:1]   [ecid: 005HFMIa1ln3yZ8DyW0001A600211s,0:6] [tid: 64df7700] [nQSError: 17014] Could not connect to Oracle database. [[
Properties: description=RpScopeVar Exchange; producerID=1405455368; requestID=429096; sessionID=4236224; userName=User;
[nQSError: 17001] Oracle Error code: 12541, message: ORA-12541: TNS:no listener
at OCI call OCILogon.
********** Task: 1. Running for (mls): 82 **********
Description: DB Connect
DSN: TEXT; userName=USER
********** Task: 2. Running for (mls): 82 **********
Description: RpScopeVar GatewayDbGateway Prepare
DSN:Forecasting OLTP Connection Pool;userName:User
SQL:select max(DT) from APPS.TBL where TS is not null and DT <= SYSDATE
********** Task: 3. Running for (mls): 82 **********
Repository Name::Star Subject Area Name:: User Name::User
Logical Hash of SQL:: 0x0
The goal is to write a regex and create a Hive table on top of this file. The highlighted text should either go into a single field or be split into two fields: one containing the text before "[[" and a second containing everything between "[[" and "]]".
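To make the intended split concrete, here is a rough sketch in plain Python (not Hive). It assumes that every entry starts with a "[yyyy-mm-ddT..." timestamp and that a multi-line block is closed by "]]", which my sample above is cut off before reaching. The names START, ENTRY, group_entries and parse are just illustrative. As far as I understand, RegexSerDe applies its pattern to one record at a time and the default text input format treats each line as a record, so this only shows the fields I am after, not a working Hive setup.

```python
import re

# A new log entry starts with a bracketed ISO-style timestamp (assumption).
START = re.compile(r"\[\d{4}-\d{2}-\d{2}T")

# Header fields, then the message, then an optional [[ ... ]] detail block.
ENTRY = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+"
    r"\[(?P<component>[^\]]+)\]\s+"
    r"\[(?P<level>[^\]]+)\]\s+"
    r"\[ecid: (?P<ecid>[^\]]+)\]\s+"
    r"\[tid: (?P<tid>[^\]]+)\]\s+"
    r"(?P<message>.*?)"
    r"(?:\s*\[\[(?P<detail>.*?)\]\])?\s*$",
    re.DOTALL,  # let '.' cross newlines inside one entry
)


def group_entries(lines):
    """Glue continuation lines onto the entry that precedes them."""
    entry = []
    for line in lines:
        if START.match(line) and entry:
            yield "\n".join(entry)
            entry = []
        entry.append(line)
    if entry:
        yield "\n".join(entry)


def parse(text):
    """Return one dict per log entry; 'detail' is None when there is no [[ ]] block."""
    rows = []
    for entry in group_entries(text.splitlines()):
        m = ENTRY.match(entry)
        if m is None:
            continue  # skip anything that does not look like an entry
        row = m.groupdict()
        row["message"] = row["message"].strip()
        row["detail"] = row["detail"].strip() if row["detail"] else None
        rows.append(row)
    return rows


# Shortened sample; the closing "]]" line is assumed, not copied from the real log.
SAMPLE = """\
[2016-12-31T14:56:18.467+00:00] [SomeText] [NOTIFICATION:1]   [ecid: 005HFM4dHTk3z8DyW0001A60020wY,0] [tid: 5fe23700]  Operation Purge Query Plan Cache succeeded!
[2016-12-31T15:00:01.46+00:00] [SomeText] [ERROR:1]   [ecid: 005HFMIa1ln3yZ8DyW0001A600211s,0:6] [tid: 64df7700] [nQSError: 17014] Could not connect to Oracle database. [[
[nQSError: 17001] Oracle Error code: 12541, message: ORA-12541: TNS:no listener
at OCI call OCILogon.
]]
"""

for row in parse(SAMPLE):
    print(row["ts"], row["level"], repr(row["message"]), repr(row["detail"]))
```

If an entry has "[[" but no closing "]]", this sketch leaves everything in the message field and sets detail to None.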
Any help would be greatly appreciated.