Member since 10-01-2015 | 3933 Posts | 1150 Kudos Received | 374 Solutions
01-13-2016
06:18 AM
1 Kudo
Thanks, all of you, for your answers. Below are the answers I got from an Apache Atlas developer:

Apache Atlas supports integration with Hive; limited integration with Storm, Kafka, Sqoop, and Falcon is available in 0.6. Atlas metadata is stored in a Titan graph. Atlas doesn't currently support metadata exchange with Informatica; it's on the roadmap.
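For reference, the Hive integration works through a Hive execution hook. A minimal sketch of enabling it for a hive shell session (assuming the Atlas Hive hook jars are on Hive's classpath and Atlas itself is configured; normally this property is set once in hive-site.xml rather than per session):

-- register the Atlas hook so Hive operations are reported to Atlas
SET hive.exec.post.hooks=org.apache.atlas.hive.hook.HiveHook;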
02-05-2016
11:11 PM
Thanks, @rich. Your solution worked out. I accept this answer.
01-05-2016
08:25 PM
3 Kudos
Groovy UDF example. It can be compiled at run time. Currently this only works in the "hive" shell; it does not work in beeline.

su guest
hive

Paste the following code into the hive shell. It uses the Groovy String replace function to replace all instances of lowercase 'e' with 'E':

compile `import org.apache.hadoop.hive.ql.exec.UDF \;
import org.apache.hadoop.io.Text \;
public class Replace extends UDF {
public Text evaluate(Text s){
if (s == null) return null \;
return new Text(s.toString().replace('e', 'E')) \;
}
} ` AS GROOVY NAMED Replace.groovy;
Now create a temporary function to leverage the Groovy UDF:

CREATE TEMPORARY FUNCTION Replace as 'Replace';
Now you can use the function in your SQL:

SELECT Replace(description) FROM sample_08 limit 5;
Full example:

hive> compile `import org.apache.hadoop.hive.ql.exec.UDF \;
> import org.apache.hadoop.io.Text \;
> public class Replace extends UDF {
> public Text evaluate(Text s){
> if (s == null) return null \;
> return new Text(s.toString().replace('e', 'E')) \;
> }
> } ` AS GROOVY NAMED Replace.groovy;
Added [/tmp/0_1452022176763.jar] to class path
Added resources: [/tmp/0_1452022176763.jar]
hive> CREATE TEMPORARY FUNCTION Replace as 'Replace';
OK
Time taken: 1.201 seconds
hive> SELECT Replace(description) FROM sample_08 limit 5;
OK
All Occupations
ManagEmEnt occupations
ChiEf ExEcutivEs
GEnEral and opErations managErs
LEgislators
Time taken: 6.373 seconds, Fetched: 5 row(s)
hive>
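Note that a temporary function only lives for the current session. One possible way to keep it around, sketched here and untested (the HDFS location and function name are illustrative), is to copy the jar that compile generated (the /tmp/... path printed above) to HDFS and register a permanent function with Hive's CREATE FUNCTION ... USING JAR syntax (Hive 0.13+):

-- hypothetical location; substitute the jar your compile step printed
-- hdfs dfs -put /tmp/0_1452022176763.jar /apps/hive/udfs/
CREATE FUNCTION replace_e AS 'Replace'
USING JAR 'hdfs:///apps/hive/udfs/0_1452022176763.jar';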
Another example: this one will duplicate any String passed to the function.

compile `import org.apache.hadoop.hive.ql.exec.UDF \;
import org.apache.hadoop.io.Text \;
public class Duplicate extends UDF {
public Text evaluate(Text s){
if (s == null) return null \;
return new Text(s.toString() * 2) \;
}
} ` AS GROOVY NAMED Duplicate.groovy;
CREATE TEMPORARY FUNCTION Duplicate as 'Duplicate';
SELECT Duplicate(description) FROM sample_08 limit 5;
All OccupationsAll Occupations
Management occupationsManagement occupations
Chief executivesChief executives
General and operations managersGeneral and operations managers
LegislatorsLegislators
JSON parsing UDF:

compile `import org.apache.hadoop.hive.ql.exec.UDF \;
import groovy.json.JsonSlurper \;
import org.apache.hadoop.io.Text \;
public class JsonExtract extends UDF {
public int evaluate(Text a){
def jsonSlurper = new JsonSlurper() \;
def obj = jsonSlurper.parseText(a.toString())\;
return obj.val1\;
}
} ` AS GROOVY NAMED json_extract.groovy;
CREATE TEMPORARY FUNCTION json_extract as 'JsonExtract';
SELECT json_extract('{"val1": 2}') from date_dim limit 1;
2
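For simple extractions like this one, Hive's built-in get_json_object function gives the same result without any compilation; the Groovy route pays off when you need custom parsing logic:

SELECT get_json_object('{"val1": 2}', '$.val1') from date_dim limit 1;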
02-02-2016
08:50 PM
Thanks, @sindhu seenivasan, for the final follow-up.
01-06-2016
12:40 AM
Glad it's not an abandoned feature. Are there more examples and/or docs available? I created a few of my own, but I think we need better examples. Thank you, @gopal.
03-08-2016
02:30 AM
Hey guys. The tutorial mentioned above has been updated and is also compatible with the latest Sandbox, HDP 2.4. It addresses the issue of permissions. Here is the link: http://hortonworks.com/hadoop-tutorial/how-to-process-data-with-apache-hive/ When you get a chance, can you go through the tutorial on our new Sandbox?
01-21-2016
08:44 PM
3 Kudos
Today I spoke with Robert Molina from Hortonworks and we possibly found what was creating all those alerts! The sandbox is intended to be run on a desktop with a NAT network interface. I set it up on a dedicated headless server with a bridged adapter. It looks like the sandbox has a problem with that, which caused some of the service configs to not function properly! As a result, some services worked but reported network connection alerts! After some config changes, the related alerts were gone. So always use a VM as it was intended to be used. Thanks to the Hortonworks team and Robert, who wanted to get to the bottom of this. Conclusion: if you want, like me, to test drive Hortonworks on a headless server, start from scratch and build it! That's what every sysadmin should do anyway... and that's what I'll do this weekend. P
01-01-2016
02:09 AM
Good call, Vladimir. Here is the error:

mkdir: Permission denied: user=yarn, access=WRITE, inode="/user/ambari-qa/falcon/demo/primary/input/enron/2015-12-30-01":ambari-qa:hdfs:drwxr-xr-x

I executed the job from Falcon as ambari-qa. Is there any configuration I can change so it uses the user ambari-qa during execution?
01-04-2016
02:07 PM
Dear Grace,

We can start with this template and improve it:

#!/bin/bash

# authenticate with Kerberos
kinit ......

# clear the previous import directory
hdfs dfs -rm -r hdfs://....

# import the driver table that lists the tables to load
sqoop import --connect "jdbc:sqlserver://....:1433;username=.....;password=….;database=....DB" --table ..... \
  -m 1 --where "...... > 0"
CR=$?
if [ $CR -ne 0 ]; then
  echo 'Sqoop job failed'
  exit 1
fi

# pull the imported list down to a local control file
hdfs dfs -cat hdfs://...../* > export_fs_table.txt
CR=$?
if [ $CR -ne 0 ]; then
  echo 'hdfs cat failed'
  exit 1
fi

# import each table named in the control file
while IFS=',' read -r id tablename nbr flag; do
  sqoop import --connect "jdbc:sqlserver://......:1433;username=......;password=......;database=.......DB" --table $tablename
  CR=$?
  if [ $CR -ne 0 ]; then
    echo 'sqoop import failed for '$tablename
    exit 1
  fi
done < export_fs_table.txt

Kind regards
12-30-2015
02:17 AM
4 Kudos
I’m going to show you a neat way to work with CSV files and Apache Hive. Usually you’d have to do some preparatory work on CSV data before you could consume it with Hive, but I’d like to show you a built-in SerDe (Serializer/Deserializer) for Hive that makes it a lot more convenient to work with CSV. This work was merged in Hive 0.14, and there are no additional steps necessary to work with CSV from Hive. Suppose you have a CSV file with the following entries:
id first_name last_name email gender ip_address
1 James Coleman jcoleman0@cam.ac.uk Male 136.90.241.52
2 Lillian Lawrence llawrence1@statcounter.com Female 101.177.15.130
3 Theresa Hall thall2@sohu.com Female 114.123.153.64
4 Samuel Tucker stucker3@sun.com Male 89.60.227.31
5 Emily Dixon edixon4@surveymonkey.com Female 119.92.21.19

To consume it from within Hive, you’ll need to upload it to HDFS:

hdfs dfs -put sample.csv /tmp/serdes/
Now all it takes is to create a table schema on top of the file:

drop table if exists sample;
create external table sample(id int,first_name string,last_name string,email string,gender string,ip_address string)
row format serde 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
stored as textfile
location '/tmp/serdes/';
Now you can query the table as is:

select * from sample limit 10;
But what if your CSV file is tab-delimited rather than comma-delimited? Well, the SerDe has you covered there too:

drop table if exists sample;
create external table sample(id int,first_name string,last_name string,email string,gender string,ip_address string)
row format serde 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
with serdeproperties (
"separatorChar" = "\t"
)
stored as textfile
location '/tmp/serdes/';
Notice the separatorChar argument. In all, the SerDe accepts two more arguments: a custom quote character and a custom escape character; see the sketch below.
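For example, here is a sketch of a table definition that sets all three serdeproperties explicitly (the values shown are the defaults and purely illustrative):

drop table if exists sample;
create external table sample(id int,first_name string,last_name string,email string,gender string,ip_address string)
row format serde 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
with serdeproperties (
"separatorChar" = ",",
"quoteChar" = "\"",
"escapeChar" = "\\"
)
stored as textfile
location '/tmp/serdes/';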
Take a look at the wiki for more info https://cwiki.apache.org/confluence/display/Hive/CSV+Serde.