12-13-2014 10:55 AM
The intermediate_access_log table is not intended to be queried directly, especially not from Impala. In that tutorial step you're actually using Hive to run an ETL (extract, transform, load) job. The raw Apache logs are in a format that is hard to query directly through SQL, so we use one of Hive's extensions to apply a regular expression that splits each log line into explicit fields. Once that step has run, the intermediate table has served its purpose. It's the second table you create (tokenized_access_logs) that should be queried from Impala.
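
If it helps to picture what that ETL step does, here is a rough Python sketch of the same idea: a regular expression that splits one Apache log line into named fields. This is not the tutorial's Hive statement. The pattern, the field names, and the sample line are all assumptions made up for illustration, and the tutorial's actual regular expression may differ.

```python
import re

# Hypothetical pattern for an Apache combined-format log line.
# The field names below are illustrative, not the tutorial's column names.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def tokenize(line):
    """Return the line's fields as a dict, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


# Made-up sample line, for illustration only.
SAMPLE = (
    '10.0.0.1 - - [14/Jun/2014:10:30:13 -0400] '
    '"GET /department/apparel HTTP/1.1" 200 1322 '
    '"-" "Mozilla/5.0"'
)

fields = tokenize(SAMPLE)
print(fields["ip"], fields["status"], fields["url"])
```

The raw line is awkward to filter or aggregate, but once it is broken into fields like these, each one can be stored as its own column, which is roughly the role the tokenized table plays for Impala.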