Currently only works in "hive" shell, does not work in beeline
<code>su guest
hive
paste the following code into the hive shellthis will use Groovy String replace function to replace all instances of lower case 'e' with 'E'
<code>compile `import org.apache.hadoop.hive.ql.exec.UDF \;
import org.apache.hadoop.io.Text \;
public class Replace extends UDF {
public Text evaluate(Text s){
if (s == null) return null \;
return new Text(s.toString().replace('e', 'E')) \;
}
} ` AS GROOVY NAMED Replace.groovy;
now create a temporary function to leverage the Groovy UDF
<code>CREATE TEMPORARY FUNCTION Replace as 'Replace';
now you can use the function in your SQL
<code>SELECT Replace(description) FROM sample_08 limit 5;
full example
<code>hive> compile `import org.apache.hadoop.hive.ql.exec.UDF \;
> import org.apache.hadoop.io.Text \;
> public class Replace extends UDF {
> public Text evaluate(Text s){
> if (s == null) return null \;
> return new Text(s.toString().replace('e', 'E')) \;
> }
> } ` AS GROOVY NAMED Replace.groovy;
Added [/tmp/0_1452022176763.jar] to class path
Added resources: [/tmp/0_1452022176763.jar]
hive> CREATE TEMPORARY FUNCTION Replace as 'Replace';
OK
Time taken: 1.201 seconds
hive> SELECT Replace(description) FROM sample_08 limit 5;
OK
All Occupations
ManagEmEnt occupations
ChiEf ExEcutivEs
GEnEral and opErations managErs
LEgislators
Time taken: 6.373 seconds, Fetched: 5 row(s)
hive>
Another example
this will duplicate any String passed to the function
<code>compile `import org.apache.hadoop.hive.ql.exec.UDF \;
import org.apache.hadoop.io.Text \;
public class Duplicate extends UDF {
public Text evaluate(Text s){
if (s == null) return null \;
return new Text(s.toString() * 2) \;
}
} ` AS GROOVY NAMED Duplicate.groovy;
CREATE TEMPORARY FUNCTION Duplicate as 'Duplicate';
SELECT Duplicate(description) FROM sample_08 limit 5;
All OccupationsAll Occupations
Management occupationsManagement occupations
Chief executivesChief executives
General and operations managersGeneral and operations managers
LegislatorsLegislators
JSON Parsing UDF
<code>compile `import org.apache.hadoop.hive.ql.exec.UDF \;
import groovy.json.JsonSlurper \;
import org.apache.hadoop.io.Text \;
public class JsonExtract extends UDF {
public int evaluate(Text a){
def jsonSlurper = new JsonSlurper() \;
def obj = jsonSlurper.parseText(a.toString())\;
return obj.val1\;
}
} ` AS GROOVY NAMED json_extract.groovy;
CREATE TEMPORARY FUNCTION json_extract as 'JsonExtract';
SELECT json_extract('{"val1": 2}') from date_dim limit 1;
2