Exercise 2 : Long time running + error from intermediate_access_logs table creation query

Re: Exercise 2 : Long time running + error from intermediate_access_logs table creation query

Contributor

Absolutely! You could also mark the solution for anyone else who comes across this issue in the future.

Cheers

Re: Exercise 2 : Long time running + error from intermediate_access_logs table creation query

New Contributor
I added the JAR file, but my query has still been running for the last 8 minutes.

CREATE EXTERNAL TABLE intermediate_access_logs (
ip STRING,
date STRING,
method STRING,
url STRING,
http_version STRING,
code1 STRING,
code2 STRING,
dash STRING,
user_agent STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
'input.regex' = '([^ ]*) - - \\[([^\\]]*)\\] "([^\ ]*) ([^\ ]*) ([^\ ]*)" (\\d*) (\\d*) "([^"]*)" "([^"]*)"',
'output.format.string' = "%1$$s %2$$s %3$$s %4$$s %5$$s %6$$s %7$$s %8$$s %9$$s")
LOCATION '/user/hive/warehouse/original_access_logs';

Does any other setting need to be changed?
Re: Exercise 2 : Long time running + error from intermediate_access_logs table creation query

New Contributor

I had the same problem: the CREATE EXTERNAL TABLE intermediate_access_logs query was hanging.

What helped in my case was increasing the memory of my virtual machine (VirtualBox). I set the VM memory to 20 GB. I did not experiment, so I don't know the working minimum; all I can say is that 20 GB works.
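
A minimal sketch of that memory change from the host's command line, assuming VirtualBox's VBoxManage tool is on the PATH and the VM is powered off. The VM name below is a placeholder, not something from this thread; `VBoxManage list vms` shows the real names. The 20 GB figure is the value reported above, not a verified minimum.

```shell
#!/bin/sh
# Placeholder VM name (an assumption); replace it with a name from `VBoxManage list vms`.
VM_NAME="${VM_NAME:-cloudera-quickstart-vm}"

# The value reported to work in this thread; smaller values were not tried.
MEM_GB=20

# `modifyvm --memory` expects the size in megabytes.
MEM_MB=$((MEM_GB * 1024))

if command -v VBoxManage >/dev/null 2>&1; then
    # The VM must be powered off before its memory setting can be changed.
    VBoxManage modifyvm "$VM_NAME" --memory "$MEM_MB"
else
    # Dry run: only print the command when VirtualBox is not installed on this machine.
    echo "VBoxManage not found; would run: VBoxManage modifyvm \"$VM_NAME\" --memory $MEM_MB"
fi
```

The host needs enough free RAM to back whatever value is set, so a smaller number may be the only option on some machines.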