Support Questions

The intermediate_access_log table is not meant to be queried directly,
especially not from Impala. In that tutorial step you're actually using Hive
to run an ETL (extract, transform, load) job. The Apache logs are in a format
that is hard to query directly with SQL, so we use one of Hive's extensions
to apply a regular expression that splits each log line into explicit
fields. After this step, the intermediate table is no longer useful. It's the
second table you create (tokenized_access_logs) that you should query from
Impala.
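
To make the regular-expression step more concrete, here is a rough Python sketch of the same idea: one pattern splits a raw Apache access log line into named fields. This is not the tutorial's Hive definition. The pattern, the field names, and the sample line are all assumptions made up for illustration, and it assumes the common "combined" log layout.

```python
import re

# Hypothetical pattern for an Apache "combined" log line (an assumption,
# not the expression used in the tutorial's Hive table definition).
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) (?P<http_version>\S+)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"$'
)


def tokenize(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line.strip())
    return match.groupdict() if match else None


# Made-up sample line, for illustration only.
sample = (
    '127.0.0.1 - - [14/Jun/2014:10:30:13 -0400] '
    '"GET /home HTTP/1.1" 200 1234 "-" "Mozilla/5.0"'
)
fields = tokenize(sample)
print(fields["ip"], fields["method"], fields["url"], fields["status"])
```

Once every line is broken into columns like this, the result is easy to query with plain SQL, which is the role the tokenized_access_logs table plays in the tutorial.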