Explorer
Posts: 19
Registered: ‎02-15-2016
Accepted Solution

Unable to run R code to read from Hive Table

Hi Everyone,

 

I am new to R and after reading a number of articles online, I thought of playing with it in Cloudera Data Science Workbench.

 

I have installed sparklyr and am trying to run the following code:

========================================================================== 

library(shiny)
library(ggplot2)
library(dygraphs)
library(circlize)
library(stringr)
library(xts)
library(sparklyr)
library(dplyr)

# Install any packages that are missing, then attach every package in pkgList.
installLoadPkgs <- function(pkgList) {
  print(pkgList)
  pkgsToInstall <- pkgList[!(pkgList %in% installed.packages()[, "Package"])]

  if (length(pkgsToInstall)) {
    install.packages(pkgsToInstall, dependencies = TRUE)
  }

  for (package_name in pkgList) {
    library(package_name, character.only = TRUE, quietly = FALSE)
  }
}

# install / load packages and report how long it took
startTime <- proc.time()
pkgs <- c("sparklyr", "dplyr", "ggplot2", "maps", "geosphere", "DBI", "dygraphs", "circlize", "stringr", "xts")
installLoadPkgs(pkgs)
proc.time() - startTime

sc <- spark_connect(master="yarn-client", app_name='manual_monitoring')


assetstatuses_tbl <- copy_to(sc, "assetstatuses")

latestAssets_tbl <- assetstatuses_tbl %>%
  filter(createdat > 2017-06-26) %>%
  group_by(assetid, id) %>%
  select(assetid, id)
sdf_register(latestAssets_tbl, "latestAssets")

==================================================================================== 

However, it's failing with the following error. Could anyone help me troubleshoot this?

 

assetstatuses_tbl <- copy_to(sc, "assetstatuses")
Error: java.lang.IllegalArgumentException: Invaid type null
	at sparklyr.SQLUtils$.getSQLDataType(sqlutils.scala:60)
	at sparklyr.SQLUtils$.createStructField(sqlutils.scala:65)
	at sparklyr.SQLUtils.createStructField(sqlutils.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at sparklyr.Invoke$.invoke(invoke.scala:94)
	at sparklyr.StreamHandler$.handleMethodCall(stream.scala:89)
	at sparklyr.StreamHandler$.read(stream.scala:55)
	at sparklyr.BackendHandler.channelRead0(handler.scala:49)
	at sparklyr.BackendHandler.channelRead0(handler.scala:14)
	at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:367)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:353)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:346)
	at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:367)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:353)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:346)
	at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:293)
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:267)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:367)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:353)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:346)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:367)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:353)
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911)
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:652)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:575)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:489)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:451)
	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:140)
	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
	at java.lang.Thread.run(Thread.java:745)
In addition: Warning message:
In as.data.frame.vector(x, ..., nm = nm) :
  'row.names' is not a character vector of length 1 -- omitting it. Will be an error!.  

 

Cloudera Employee
Posts: 29
Registered: ‎04-28-2017

Re: Unable to run R code to read from Hive Table

Hi,

 

sparklyr is supported by RStudio, so it may be better to ask this question directly to RStudio or in a forum like StackOverflow. However, looking at the code, it appears you are passing a string to copy_to rather than a data frame. If assetstatuses is a data frame available in your R session, try copying it with copy_to(sc, assetstatuses), without quotes around assetstatuses.
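
To make the distinction concrete, here is a minimal sketch rather than a definitive fix. get_assetstatuses is a hypothetical helper, not part of sparklyr. It assumes sc is an open connection returned by spark_connect(), and that when no local data frame is supplied, a table named assetstatuses already exists in Hive and is visible to the Spark session, so it can be referenced by name with dplyr's tbl().

```r
# Hypothetical helper (not part of sparklyr): returns a Spark table reference
# for "assetstatuses", either by pointing at an existing Hive table or by
# uploading a local data frame.
get_assetstatuses <- function(sc, local_df = NULL, table_name = "assetstatuses") {
  if (is.null(local_df)) {
    # Assumption: the table already exists in the Hive metastore that the
    # Spark session can see, so it only needs to be referenced by name.
    return(dplyr::tbl(sc, table_name))
  }
  if (!is.data.frame(local_df)) {
    # A string such as "assetstatuses" is rejected here with a readable
    # message instead of failing later inside Spark.
    stop("local_df must be a data frame, not ", class(local_df)[1])
  }
  # Upload the local data frame to Spark under the given table name.
  dplyr::copy_to(sc, local_df, name = table_name, overwrite = TRUE)
}

# Usage, assuming `sc` is an open connection from spark_connect():
# assetstatuses_tbl <- get_assetstatuses(sc)                 # existing Hive table
# assetstatuses_tbl <- get_assetstatuses(sc, assetstatuses)  # local data frame
```

The explicit is.data.frame() check is a design choice for this sketch only: it turns the quoted-name mistake into an immediate R error instead of a Java stack trace.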

 

See: http://spark.rstudio.com/reference/sparklyr/latest/copy_to.html
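
One more thing worth checking once copy_to works: the filter in the posted code compares createdat with an unquoted 2017-06-26. R parses that as subtraction, not as a date. The short base R sketch below shows the difference; the createdat values are made up for illustration, and whether a quoted string or an explicit date works best inside filter() depends on the column's type in Spark, which is not shown in the post.

```r
# Unquoted, the "date" is parsed as arithmetic: 2017 - 6 - 26.
unquoted <- 2017-06-26
print(unquoted)

# Quoted, it stays a date string that can be converted or compared as a date.
cutoff <- "2017-06-26"
print(as.Date(cutoff))

# Local illustration with hypothetical createdat values.
createdat <- as.Date(c("2017-06-20", "2017-06-27", "2017-07-01"))
is_recent <- createdat > as.Date(cutoff)
print(is_recent)
```

In the pipeline this would likely mean writing filter(createdat > "2017-06-26"), assuming createdat is stored as a date or timestamp.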

 

Best,
Tristan

Explorer
Posts: 19
Registered: ‎02-15-2016

Re: Unable to run R code to read from Hive Table

Thanks Tristan!

 

I had found that mistake and corrected it. Thanks for your response.

 

Regards,

MG
