Member since: 12-16-2015
Posts: 17
Kudos Received: 10
Solutions: 2
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 6130 | 06-13-2016 12:35 PM
 | 4830 | 02-03-2016 04:17 PM
01-27-2017
01:04 PM
Cannot display/show/print a pivoted DataFrame with PySpark. The pivoted DataFrame is apparently created fine, but calling show() on it fails with AttributeError: 'GroupedData' object has no attribute 'show'. Here's the code:

meterdata = sqlContext.read.format("com.databricks.spark.csv").option("delimiter", ",").option("header", "false").load("/CBIES/meters/")
metercols = meterdata.groupBy("C0").pivot("C1")
metercols.show()

Output:

Traceback (most recent call last):
  File "/tmp/zeppelin_pyspark-8003809301447367155.py", line 239, in <module>
    eval(compiledCode)
  File "<string>", line 1, in <module>
AttributeError: 'GroupedData' object has no attribute 'show'
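For context on the error: groupBy().pivot() returns a GroupedData object, not a DataFrame, so an aggregation is needed before show() can be called. A minimal sketch of one likely fix, assuming the values to pivot live in a third column called "C2" (the post doesn't name the value column, so that identifier is hypothetical):

from pyspark.sql import functions as F

# pivot() alone yields GroupedData; agg() turns it into a DataFrame.
# "C2" is a hypothetical value column -- substitute the real one.
metercols = meterdata.groupBy("C0").pivot("C1").agg(F.first("C2"))
metercols.show()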
Labels:
- Apache Spark
06-13-2016
12:35 PM
2 Kudos
Hello, thanks everyone. As it turned out, some Ambari features were in maintenance mode, which meant there actually was a discrepancy between the discoverable folder structures. Turning off maintenance mode and rebooting did the trick! Thanks
Aidan
06-10-2016
11:24 AM
Thanks guys, but those answers aren't quite on point. I suppose the real question is how to access HDFS through SparkR. For example, I know Hive tables are accessible, but if they are not in the default /apps/warehouse/ location, how do I find and read them? Thanks a million!
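One way to read a Hive table that lives outside the default warehouse, sketched in PySpark rather than SparkR purely for illustration (the table and path names below are hypothetical, and sqlContext is assumed to be a HiveContext, as in the HDP sandbox):

# Read via the metastore by name -- this works wherever the data lives:
df = sqlContext.sql("SELECT * FROM my_db.my_table")

# Or read the underlying files directly by HDFS path (hypothetical path):
df2 = sqlContext.read.orc("/data/custom_location/my_table")
df2.show(5)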
06-09-2016
02:54 PM
1 Kudo
Hello, I know these questions seem very basic, but there seems to be a discrepancy between the HDFS structure in my SparkR session and what I see in Ambari. In SparkR, the default working directory is "/usr/hdp/2.4.0.0-169/spark". But in Ambari I don't see /usr, only /user, which does contain a /spark directory, but that just contains an empty /.sparkStaging directory. I have tried to change the working directory with setwd(), but if I just pass a directory path as a string, e.g. "/user/", it throws "cannot change working directory". I can only seem to change to /tmp. I could include more details, but I think I am missing something basic here, which will probably solve lots of other questions. Help please? Thanks Aidan
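A likely source of the confusion: setwd() changes the local Linux working directory on the driver machine, while Ambari's Files view browses HDFS, a separate filesystem; /usr/hdp/2.4.0.0-169/spark is a local path, and /user is an HDFS path. A minimal sketch of listing an HDFS directory from PySpark, for illustration (sc is the usual SparkContext; sc._jvm is an internal PySpark bridge, so treat this as an assumption rather than a stable API):

# setwd() touches the LOCAL filesystem; list HDFS through the JVM
# Hadoop FileSystem API instead.
hadoop_fs = sc._jvm.org.apache.hadoop.fs
fs = hadoop_fs.FileSystem.get(sc._jsc.hadoopConfiguration())
for status in fs.listStatus(hadoop_fs.Path("/user")):
    print(status.getPath().toString())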
Labels:
- Apache Hadoop
- Apache Spark
02-03-2016
04:17 PM
It was because, when I thought I was creating an elecMonthly_Orc file, I had actually created an elecMonthly_Orc folder containing several files: _SUCCESS, part-r-00000-1a0c14e3-0dd0-42db-abc7-7f655a02f634.orc, and other similar ORC files. The files within the elecMonthly_Orc directory were owned by Hive, which is why the permissions error occurred. Resolved by running this on the command line as the superuser hdfs: hadoop fs -chown admin:admin /user/admin/elecMonthly_Orc/*.* Now I just have to figure out how to recombine the ORC files in Hive!
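On the "recombine" point: Spark and Hive both treat a directory of part files as a single dataset, so the ORC parts shouldn't need to be merged by hand. A minimal sketch in PySpark, with the path taken from the post above (read.orc is available in Spark 1.5+, and sqlContext is assumed to be a HiveContext, as in the sandbox):

# Reads the whole directory as one DataFrame; the part-r-* files are
# combined automatically and the _SUCCESS marker is skipped.
df = sqlContext.read.orc("/user/admin/elecMonthly_Orc")
df.show(5)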
02-03-2016
11:08 AM
1 Kudo
Thanks Andrew. I think it was a mismatch between the Hive account running the statement and admin owning the file. Thanks for this, and sorry for the delay in replying.
01-05-2016
02:43 PM
admin is running the statement, as per the tutorial. I thought I did an hdfs chown on the files; they are shown in Ambari as owned by admin.
01-05-2016
02:35 PM
1 Kudo
Thanks for answering so quickly!
01-05-2016
02:21 PM
Problem loading data to a table in this tutorial: http://hortonworks.com/hadoop-tutorial/how-to-process-data-with-apache-hive/ This command:

LOAD DATA INPATH '/user/admin/Batting.csv' OVERWRITE INTO TABLE temp_batting;

produces error H110 Unable to submit statement. Error while compiling statement: FAILED: HiveAccessControlException Permission denied: user [admin] does not have [READ] privilege on [hdfs://sandbox.hortonworks.com:8020/user/admin/elecMonthly_Orc] [ERROR_STATUS]

I created both the user/admin and temp/admin folders. I used the hdfs superuser to make admin the owner of the file, the folder, and even the parent folder. I gave full permissions in HDFS, and this is clearly shown in Ambari. The error persists. Can anyone help? Thanks
Labels:
- Apache Hive
12-16-2015
04:02 PM
Labels:
- Apache HBase