Member since: 08-03-2014
Posts: 15
Kudos Received: 0
Solutions: 3

My Accepted Solutions
| Title | Views | Posted |
| --- | --- | --- |
| | 23562 | 01-23-2015 11:14 AM |
| | 9505 | 08-28-2014 09:22 AM |
| | 11942 | 08-05-2014 07:18 AM |
01-23-2015
11:14 AM
The problem was indeed in the packaging. I fixed it by including the maven-assembly-plugin:

<plugin>
  <artifactId>maven-assembly-plugin</artifactId>
  <version>2.5.3</version>
  <configuration>
    <descriptorRefs>
      <descriptorRef>jar-with-dependencies</descriptorRef>
    </descriptorRefs>
  </configuration>
  <executions>
    <execution>
      <id>make-assembly</id> <!-- this is used for inheritance merges -->
      <phase>package</phase>
      <goals>
        <goal>single</goal>
      </goals>
    </execution>
  </executions>
</plugin>
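With this plugin in place, mvn package also writes a *-jar-with-dependencies.jar under target/, and that assembled jar is the artifact to hand to the cluster. As a quick smoke test that a dependency really travels with the job, a minimal sketch (the object name here is made up for illustration):

import org.apache.spark.{SparkConf, SparkContext}

object ClasspathCheck {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("Classpath Check"))
    // Class.forName runs inside the tasks, i.e. against the executors'
    // classpath rather than the driver's, so a missing class fails fast here.
    sc.parallelize(1 to sc.defaultParallelism)
      .foreach(_ => Class.forName("org.json.simple.JSONValue"))
    sc.stop()
  }
}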
01-15-2015
07:55 PM
Actually, it doesn't work anywhere. I had it working within the IDE, but when I packaged it up and sent it to the node it did not work even outside the mapPartitions call. Never mind, it must be a packaging error.
01-15-2015
07:06 PM
Here is a snippet of the code:

object SparkCSVParser {
  def main(args: Array[String]) {
    val sc = new SparkContext(new SparkConf().setAppName("Spark Count"))
    val restClient = Client.create()
    val webResource = restClient.resource(tsd)

    // JSONValue works outside of the mapPartitions call; the class is found here
    tokenized.mapPartitions(lines => {
      val parser = new CSVParser(',')
      lines.map(line => {
        import org.json.simple.JSONValue
        val columns = parser.parseLine(line)
        var sensorMetrics = ArrayBuffer[String]()
        for ((sName, sValuePos) <- nameToPos) {
          val json = ("metric" -> sName) ~
            ("timestamp" -> (hoursInSec * columns(otherNameToPos("HOURS")).toInt).+(today / toSeconds - dayInSec)) ~
            ("value" -> columns(sValuePos)) ~
            ("tags" -> Map {"engine_index" -> columns(otherNameToPos("ENGINE_INDEX"))})
          sensorMetrics += pretty(json)
        }
        JSONValue.toJSONString(sensorMetrics) // <- it does not work here; the class is not found
      })
    }).take(10).foreach(
      webResource
        .accept(MediaType.APPLICATION_JSON_TYPE)
        .`type`(MediaType.APPLICATION_FORM_URLENCODED_TYPE)
        .post(classOf[ClientResponse], _))
  }
}

This is running with "--master local". Here is the error:

15/01/15 21:43:35 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.lang.NoClassDefFoundError: org/json/simple/JSONValue
        at com.decisioniq.spark.SparkTrainCSVParser$$anonfun$main$1$$anonfun$apply$1.apply(SparkTrainCSVParser.scala:136)
        at com.decisioniq.spark.SparkTrainCSVParser$$anonfun$main$1$$anonfun$apply$1.apply(SparkTrainCSVParser.scala:119)
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
        at scala.collection.Iterator$$anon$10.next(Iterator.scala:312)
        at scala.collection.Iterator$class.foreach(Iterator.scala:727)

How can I make it discover this class?
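For reference, one way to make a dependency like this visible to the executors is to list its jar on the SparkConf so Spark ships it with the job; a minimal sketch, with a placeholder path for wherever the json-simple jar actually lives:

import org.apache.spark.{SparkConf, SparkContext}

// Sketch only: setJars distributes the listed jars to every executor,
// so org.json.simple.JSONValue resolves inside mapPartitions as well.
// The jar path below is a placeholder, not an actual location.
val conf = new SparkConf()
  .setAppName("Spark Count")
  .setJars(Seq("/path/to/json-simple-1.1.1.jar"))
val sc = new SparkContext(conf)

The equivalent on the command line is the --jars flag of spark-submit.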
Labels:
- Apache Spark
08-28-2014
09:29 AM
I am getting a lot of "I/O error constructing remote block reader" messages when performing batch file uploads to HBase:

java.io.IOException: Got error for OP_READ_BLOCK, self=/10.2.4.24:43598, remote=/10.2.4.21:50010, for file /user/hbase/.staging/job_1407795783485_1084/libjars/hbase-server-0.98.1-cdh5.1.0.jar, for pool BP-504567843-10.1.1.148-1389898314433 block 1075088397_1099513328611
        at org.apache.hadoop.hdfs.RemoteBlockReader2.checkSuccess(RemoteBlockReader2.java:432)
        at org.apache.hadoop.hdfs.RemoteBlockReader2.newBlockReader(RemoteBlockReader2.java:397)
        at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReader(BlockReaderFactory.java:786)

c1d001.in.wellcentive.com:50010:DataXceiver error processing READ_BLOCK operation src: /10.2.4.24:43598 dest: /10.2.4.21:50010
org.apache.hadoop.hdfs.server.datanode.ReplicaNotFoundException: Replica not found for BP-504567843-10.1.1.148-1389898314433:blk_1075088397_1099513328611
        at org.apache.hadoop.hdfs.server.datanode.BlockSender.getReplica(BlockSender.java:419)
        at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:228)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:466)
        at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:110)
        at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:68)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:229)
        at java.lang.Thread.run(Thread.java:745)

I don't seem to be getting errors during the application processing itself, so I am not sure whether this is a problem to worry about, but I would like to know what is causing it so that I can keep my eyes peeled.
Labels:
- HDFS
08-28-2014
09:22 AM
This looks like a permissions problem. You should check who has permission to create files under the following directory: /logs/prod/apache/2014/08/19/. If the owner is not root:root (or something similar that allows root to write there), then you may need to grant permissions.
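If it helps, a small sketch for inspecting the owner and mode from code (on the command line, ls -ld on the directory gives the same information):

import java.nio.file.{Files, Paths}
import java.nio.file.attribute.PosixFilePermissions

object PermCheck {
  def main(args: Array[String]): Unit = {
    val dir = Paths.get("/logs/prod/apache/2014/08/19/")
    // Print the owner and POSIX permission bits the writer would need.
    println(s"owner: ${Files.getOwner(dir)}")
    println(s"mode:  ${PosixFilePermissions.toString(Files.getPosixFilePermissions(dir))}")
  }
}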
08-08-2014
10:21 AM
Thanks, the link gives a 404 because there is a colon on the end, but I was still able to get there.
08-08-2014
08:37 AM
Thanks for your response and the documentation; I've read it more carefully now. I have one more scenario to ask about: what does the failure of 2 ZooKeeper servers in a 3-server cluster imply? Does the final ZooKeeper server shut down (not having a master to communicate with)? Or does it remain as a read-only machine? I would appreciate it if you could point me to documentation that covers this as well.
08-07-2014
01:36 PM
So I have noticed that operations can still continue when ZooKeeper loses 1 node of its 3-node quorum, although I understand that it can no longer accept writes because it has no leader (is this right?). I know we have to satisfy the ceil(N/2) requirement, but that is for choosing leadership; it says nothing about how the dependencies (HBase in particular) would be affected. My question, at its core, is: what is the behavior of ZooKeeper when failure takes it from 3 nodes to 2? Does it turn into a read-only coordinator? As an immediate fix, can one node be assigned master? Thanks.
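To spell out the arithmetic I am assuming (the usual majority formula, which matches ceil(N/2) for odd N):

majority(N) = floor(N/2) + 1
N = 3  =>  majority = 2, so the ensemble tolerates 1 failed server
2 failed servers  =>  1 remaining server < 2, so no quorum can form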
Labels:
- Apache HBase
- Apache Zookeeper
08-05-2014
07:18 AM
The problem was in the following call (found in the error log of the installation UI; check out the original question):

python -c 'import socket; import sys; s = socket.socket(socket.AF_INET); s.settimeout(5.0); s.connect((sys.argv[1], int(sys.argv[2]))); s.close();' hadooop-test.in.wellcentive.com 7182

It was calling hadooop (three o's) instead of the actual server name hadoop (two o's). I checked with my systems team and there was a duplicate entry in the DNS with the three o's. Once that was fixed, the problem went away.
08-05-2014
07:16 AM
This was an upgrade, not a first-time install, but I followed the upgrade instructions here: http://www.cloudera.com/content/cloudera-content/cloudera-docs/CM5/latest/Cloudera-Manager-Administration-Guide/cm5ag_upgrade_cm5.html?scroll=cmig_topic_9_4 It was a problem with my DNS setup.