Member since: 06-23-2016
Posts: 136
Kudos Received: 8
Solutions: 8
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2703 | 11-24-2017 08:17 PM |
| | 3192 | 07-28-2017 06:40 AM |
| | 1233 | 07-05-2017 04:32 PM |
| | 1381 | 05-11-2017 03:07 PM |
| | 5519 | 02-08-2017 02:49 PM |
07-04-2017 01:27 PM
Can someone explain what I need to do to get the Stanford CoreNLP wrapper for Apache Spark working in Zeppelin/Spark, please? I have done this:

```
%spark.dep
z.reset() // clean up previously added artifacts and repositories
// add artifact recursively
z.load("databricks:spark-corenlp:0.2.0-s_2.10")
```

and this:

```scala
import com.databricks.spark.corenlp.functions._

val dfLemmas = filteredDF.withColumn("lemmas", lemmas('noURL)).select("racist", "filtered", "noURL", "lemmas")
dfLemmas.show(20, false)
```

but I get this:

```
<console>:42: error: not found: value lemmas
       val dfLemmas = filteredDF.withColumn("lemmas", lemmas('noURL)).select("racist", "filtered", "noURL", "lemmas")
```

Do I have to download the files and build them or something? If so, how do I do that? Or is there an easier way? TIA!!!!
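For reference: the spark-corenlp README names its lemmatisation function `lemma` (singular), not `lemmas`, which would explain the `not found: value lemmas` error. A minimal sketch of the kind of call I'm after, assuming that API and a toy DataFrame in place of my `filteredDF` (the README also notes the Stanford CoreNLP models jar has to be on the classpath):

```scala
import com.databricks.spark.corenlp.functions._ // from the z.load artifact above

// Toy stand-in for my filteredDF; in a Zeppelin %spark paragraph the
// SQL implicits (toDF, 'symbol columns) are already in scope.
val toyDF = Seq(
  (1, "Stanford University is located in California"),
  (2, "The cats were running quickly")
).toDF("id", "noURL")

// `lemma` (not `lemmas`) is the function the wrapper's README documents.
val dfLemmas = toyDF.withColumn("lemmas", lemma('noURL))
dfLemmas.show(20, false)
```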
Labels:
- Apache Spark
- Apache Zeppelin
06-27-2017 05:40 AM
I created the folder under /user/admin, for which I have permissions. Makes sense, I suppose.
06-27-2017 05:26 AM
Thanks Jay, but I was trying to add the folder, not write into it. I was adding it to the root folder /.
06-26-2017 02:15 PM
Hi. So I log in to Ambari as admin, then I try to add a folder in Files View, and get:

```
Permission denied: user=admin, access=WRITE, inode="/ml-in-a-nutshell":hdfs:hdfs:drwxr-xr-x
```

I am accessing Ambari via Chrome under another user, 'ed'. So should I be logging into Ambari as hdfs? Or maybe changing admin's permissions? But if I log in as hdfs, will it see my existing cluster, and what would the password be? This user thing is quite confusing.
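For reference, a sketch of the usual superuser workaround (an assumption on my part; it needs shell access on a cluster node, and `admin` is the Ambari user from the error above):

```bash
# Create the folder as the HDFS superuser, then hand ownership to the
# Ambari 'admin' user so Files View can write into it.
sudo -u hdfs hdfs dfs -mkdir /ml-in-a-nutshell
sudo -u hdfs hdfs dfs -chown admin:hdfs /ml-in-a-nutshell
```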
06-26-2017 01:43 PM
Yes it did, thanks!
06-21-2017 03:09 PM
Hi, I tried moving my data to a different directory (/data/hdfs/data) by adding the new directory to the DataNode data dir in the HDP configs and then copying the data over, but I get this error:

```
2017-06-21 15:29:53,432 ERROR impl.FsDatasetImpl (FsDatasetImpl.java:activateVolume(398)) - Found duplicated storage UUID: DS-011fd6ee-105d-4c21-ba03-8f43bc75f0b2 in /data/hdfs/data/current/VERSION.
2017-06-21 15:29:53,432 ERROR datanode.DataNode (BPServiceActor.java:run(772)) - Initialization failed for Block pool <registering> (Datanode Uuid 18224fd5-7fbe-4700-b22b-64352741f4a7) service to master.royble.co.uk/192.168.1.1:8020. Exiting.
java.io.IOException: Found duplicated storage UUID: DS-011fd6ee-105d-4c21-ba03-8f43bc75f0b2 in /data/hdfs/data/current/VERSION.
    at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.activateVolume(FsDatasetImpl.java:399)
    at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.addVolume(FsDatasetImpl.java:425)
    at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.<init>(FsDatasetImpl.java:329)
    at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetFactory.newInstance(FsDatasetFactory.java:34)
    at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetFactory.newInstance(FsDatasetFactory.java:30)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1556)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1504)
    at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:319)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:269)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:760)
    at java.lang.Thread.run(Thread.java:745)
```

Should I just delete the original copied data files (/hadoop/hdfs/data)? TIA!!
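For reference, the storage UUID the error complains about lives in each data directory's VERSION file, so the clash can be confirmed by comparing the old and new directories (paths as in my configs; a byte-for-byte copy keeps the same storageID):

```bash
# Both directories are now listed in dfs.datanode.data.dir, and the copy
# carried the storageID across, so the DataNode sees the same ID twice.
cat /hadoop/hdfs/data/current/VERSION
cat /data/hdfs/data/current/VERSION
```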
06-07-2017 09:22 AM
Actually, presumably Hive can't find the 112 version, hence the error. I updated to 112, but the error is still there 😞
06-07-2017 09:18 AM
Thanks, very helpful! The Java on the Hive side is 112, as opposed to my 111. Does that sound like it would be a problem?
06-07-2017 07:32 AM
I am trying to use Hive View 2.0 on my cluster (HDP 2.6.0.3, Ambari 2.5.0.3), but I get a java.lang.NullPointerException error, and as per this question I have tried everything except checking the Java versions. Can anyone tell me how to check the Java used by HDP/Hive? And how do I upgrade my Java to match? TIA! My Java:

```
$ java -version
openjdk version "1.8.0_111"
OpenJDK Runtime Environment (build 1.8.0_111-8u111-b14-3~14.04.1-b14)
OpenJDK 64-Bit Server VM (build 25.111-b14, mixed mode)
```
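For reference, two ways I could presumably check which JVM the cluster side uses (an assumption; exact paths vary by install):

```bash
# The JDK Ambari was configured with (and hands to stack services):
grep java.home /etc/ambari-server/conf/ambari.properties

# The JVM a running HiveServer2 process was actually launched from
# (the [h] keeps grep from matching its own command line):
ps -ef | grep -i [h]iveserver2
```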