Hive Block Sampling by Row Count does not work via Hue in Quickstart VM


New Contributor

Hi,

 

I am trying to sample data in a Hive table via Hue in the Quickstart VM using the Block Sampling method. The query generates a parse exception when I use the row count basis but succeeds when the percentage basis is used. The Hive language manual indicates both methods are supported and, indeed, they both work when I try the queries in the Hortonworks sandbox VM.

 

Any assistance you could give would be much appreciated. Thanks in advance.

 

The queries I am using are:

 

Successful query:

 

  select * from default.sample_07  tablesample(0.1 percent)

 

Unsuccessful query:

 

  select * from default.sample_07  tablesample(50 rows)

 

Error message:

 

  OK FAILED: ParseException line 1:48 mismatched input 'rows' expecting KW_PERCENT near '50' in table split sample specification

 


Re: Hive Block Sampling by Row Count does not work via Hue in Quickstart VM

Master Guru
CDH4's Hive (based on Apache Hive 0.10) lacks support for certain sampling syntaxes, as they were added in 0.11 onwards. This is why the parser accepts only the PERCENT form and rejects 'rows'.

Please try using the CDH5 QuickStart VM (or Cloudera Live at http://demo.gethue.com). CDH5 includes Apache Hive 0.12 with important backports from 0.13.
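
If upgrading is not an option right away, a fixed-size sample can be approximated without the row-count syntax. The queries below are only a sketch: they reuse the default.sample_07 table from the question, the alias s and the bucket count of 16 are arbitrary choices for illustration, and neither query has been verified on the CDH4 QuickStart VM specifically.

```sql
-- Option 1: shuffle the rows randomly, then keep the first 50.
-- This scans the whole table, so it suits small tables such as sample_07.
SELECT *
FROM default.sample_07
DISTRIBUTE BY rand()
SORT BY rand()
LIMIT 50;

-- Option 2: bucket sampling on rand() keeps roughly 1/16 of the rows,
-- and LIMIT caps the result at 50 rows.
SELECT *
FROM default.sample_07 TABLESAMPLE(BUCKET 1 OUT OF 16 ON rand()) s
LIMIT 50;
```

Note that neither query is an exact equivalent of tablesample(50 rows): that form takes the requested number of rows from each input split, while the queries above draw an approximately random subset of the whole table.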