
Can anyone give me good examples of Hive queries used with RStudio?


 
1 ACCEPTED SOLUTION

Re: Can anyone give me good examples of Hive queries used with RStudio?

@khushi kalra

Take a look at these tutorials. They should get you a few steps further.

http://henning.kropponline.de/2014/07/13/hive-r/

First, this R/JDBC tutorial (or @Sindhu's post in this thread) walks you through making a database connection. It also includes a couple of lines where the author pulls data from a table and draws a simple plot:

sample_08<-dbReadTable(conn,"sample_08")
plot(sample_08$sample_08.salary)

You'll probably want to do more sophisticated SQL and plots, though.

The documentation for RJDBC can be found here: https://cran.r-project.org/web/packages/RJDBC/index.html

To run an arbitrary query, use dbSendQuery() and dbFetch(), as in this tutorial: http://www.inside-r.org/packages/cran/DBI/docs/dbGetQuery

# 'conn' is the connection object created earlier; 'mtcars' stands in for your own table
res <- dbSendQuery(conn, "SELECT * FROM mtcars WHERE cyl = 4")
data <- dbFetch(res)

Now 'data' will have the results you can plot.
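
If you want the whole result in one step, DBI also provides dbGetQuery(), which sends the query and fetches the rows in a single call. A minimal sketch, assuming an open connection object named conn; the helper name run_hive_query is made up for illustration, and the query in the usage comment is only an example against the sample_08 table mentioned earlier:

```r
# Hypothetical helper: run a query and return the result as a data frame.
# Assumes `conn` is an open DBI connection (for example, one created with RJDBC).
run_hive_query <- function(conn, sql) {
  DBI::dbGetQuery(conn, sql)
}

# Usage (needs a live Hive connection, so it is left commented out):
# top <- run_hive_query(conn, "SELECT * FROM sample_08 ORDER BY salary DESC LIMIT 10")
# head(top)
```

Either way you end up with an ordinary data frame, so everything that follows about plotting applies unchanged.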

For more sophisticated plots in R, the usual choice is the 'ggplot2' library. There are lots of tutorials out there. The connection to what you've done with RJDBC is that the 'data' object above is a data frame that you can use to build your charts. Here's one ggplot2 tutorial: http://tutorials.iq.harvard.edu/R/Rgraphics/Rgraphics.html

hist(data$some.value)
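
The same kind of histogram in ggplot2 might look roughly like this. The data frame below is made up so the snippet runs without a Hive connection; in practice 'data' would be the data frame returned by your query, and some.value is the same placeholder column name used above:

```r
library(ggplot2)

# Made-up stand-in for a query result; replace with the data frame from dbFetch()
data <- data.frame(some.value = c(12, 15, 15, 18, 21, 21, 21, 30))

# Histogram of one numeric column
p <- ggplot(data, aes(x = some.value)) +
  geom_histogram(bins = 5) +
  labs(title = "Distribution of some.value", x = "some.value", y = "Count")

print(p)
```

From there you can swap geom_histogram() for other layers such as geom_point() or geom_boxplot(), depending on the columns your query returns.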


5 REPLIES

Re: Can anyone give me good examples of Hive queries used with RStudio?

There's a really simple example that uses RODBC to query Hive from R. It should work in RStudio just fine, though you may need to adapt some of the instructions, since the example targets HDInsight rather than your own Hive environment.

https://blogs.technet.microsoft.com/meacoex/2014/06/07/connecting-r-to-hdinsight-through-hive/
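
For orientation, the RODBC route boils down to opening a channel, sending a query, and closing the channel. A rough sketch, assuming an ODBC data source (DSN) for Hive is already configured on your machine; the helper name query_hive_odbc, the DSN name, and the query in the usage comment are all made up for illustration:

```r
# Hypothetical helper: query Hive over ODBC and return a data frame.
# Assumes an ODBC DSN for Hive has already been set up on this machine.
query_hive_odbc <- function(dsn, sql) {
  ch <- RODBC::odbcConnect(dsn)
  # odbcConnect() returns -1 instead of a channel when the connection fails
  if (!inherits(ch, "RODBC")) {
    stop("Could not open ODBC connection to DSN: ", dsn)
  }
  on.exit(RODBC::odbcClose(ch))  # close the channel even if the query fails
  RODBC::sqlQuery(ch, sql)
}

# Usage (DSN name and query are illustrative only):
# df <- query_hive_odbc("HiveDSN", "SELECT * FROM sample_08 LIMIT 10")
```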

Re: Can anyone give me good examples of Hive queries used with RStudio?

@khushi kalra

You can also use RJDBC as below to connect to Hive:

library("DBI")
library("rJava")
library("RJDBC")

# Collect the Hive and Hadoop client jars for the JVM classpath
hive.class.path <- list.files(path = "/usr/hdp/current/hive-client/lib", pattern = "jar", full.names = TRUE)
hadoop.lib.path <- list.files(path = "/usr/hdp/current/hive-client/lib", pattern = "jar", full.names = TRUE)
hadoop.class.path <- list.files(path = "/usr/hdp/2.4.0.0-169/hadoop", pattern = "jar", full.names = TRUE)
cp <- c(hive.class.path, hadoop.lib.path, hadoop.class.path, "/usr/hdp/2.4.0.0-169/hadoop-mapreduce/hadoop-mapreduce-client-core.jar")

# Start the JVM with that classpath, then load the Hive JDBC driver
.jinit(classpath = cp)
drv <- JDBC("org.apache.hive.jdbc.HiveDriver", "hive-jdbc.jar", identifier.quote = "`")

# Connect to HiveServer2 (host, user, and password are specific to this cluster)
url.dbc <- "jdbc:hive2://ironhide.hdp.local:10000/default"
conn <- dbConnect(drv, url.dbc, "hive", "redhat")
# Console output at this point in the original session:
# log4j:WARN No appenders could be found for logger (org.apache.hadoop.util.Shell).
# log4j:WARN Please initialize the log4j system properly.
# log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

# List the tables in the default database
dbListTables(conn)

Thanks and Regards,

Sindhu

Re: Can anyone give me good examples of Hive queries used with RStudio?

How would you modify this code to connect to the Hortonworks HDP sandbox? I know you have to modify the variables url.dbc and conn, but I'm not quite sure how that would work. Could you please help?
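
For what it's worth, the only parts of that code tied to a particular cluster are the JDBC URL and the credentials passed to dbConnect(). A small sketch of building the URL from its parts; the helper name hive_jdbc_url is made up for illustration, and the sandbox host in the usage comment is an assumption you should check against your own sandbox setup:

```r
# Hypothetical helper: assemble a HiveServer2 JDBC URL from host, port, and database
hive_jdbc_url <- function(host, port = 10000, database = "default") {
  sprintf("jdbc:hive2://%s:%d/%s", host, port, database)
}

# Reproduces the URL used in the post above
url.dbc <- hive_jdbc_url("ironhide.hdp.local")

# For a locally running sandbox the host is often localhost (an assumption,
# not something confirmed in this thread); fill in your own credentials:
# url.dbc <- hive_jdbc_url("localhost")
# conn <- dbConnect(drv, url.dbc, "<user>", "<password>")
```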

Re: Can anyone give me good examples of Hive queries used with RStudio?

I have already made the connection, but I want to analyze the data by plotting graphs. Can you help me with that?
