Member since: 02-02-2016
Posts: 583
Kudos Received: 518
Solutions: 98
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 3273 | 09-16-2016 11:56 AM |
 | 1375 | 09-13-2016 08:47 PM |
 | 5482 | 09-06-2016 11:00 AM |
 | 3181 | 08-05-2016 11:51 AM |
 | 5261 | 08-03-2016 02:58 PM |
06-08-2016
09:02 PM
2 Kudos
AFAIK, we don't support this kind of installation with HDP. As for log locations: yes, the log directory for each component can be changed in its configuration.
06-08-2016
04:38 PM
2 Kudos
@Smart Solutions I have seen application_* files inside the /spark-history directory, but I don't know where you got ".a5555e556-3301-433e-44de-23311665ed" from. Can you check the contents of this file/directory? Also, what happens if you move this file/directory somewhere else and restart the Spark History Server?
06-08-2016
03:49 PM
2 Kudos
@Banana Joe It may be related to the resources allocated to the VM; please see the doc for the resource requirements: http://hortonworks.com/wp-content/uploads/2016/02/Import_on_Vbox_3_1_2016.pdf RAM: "At least 8 GB of RAM (the more, the better). If you wish to enable services such as Ambari, HBase, Storm, Kafka, or Spark, please ensure you have at least 10 GB of physical RAM in order to run the VM using 8 GB."
06-08-2016
03:43 PM
5 Kudos
@Hamza FRIOUA The best option would be to use the MongoDB Hadoop connector with Hive external tables, but you need to build that jar manually or use a prebuilt one: https://github.com/mongodb/mongo-hadoop/wiki/Hive-Usage

```sql
CREATE TABLE individuals
(
  id INT,
  name STRING,
  age INT,
  work STRUCT<title:STRING, hours:INT>
)
STORED BY 'com.mongodb.hadoop.hive.MongoStorageHandler'
WITH SERDEPROPERTIES('mongo.columns.mapping'='{"id":"_id","work.title":"job.position"}')
TBLPROPERTIES('mongo.uri'='mongodb://localhost:27017/test.persons');
```
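Once the table exists, it can be queried like any other Hive table. A minimal sketch of querying it from Spark, assuming Spark 1.6 with Hive support and the mongo-hadoop jars on the classpath (the query itself is hypothetical):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object QueryMongoBackedTable {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("mongo-hive-demo"))
    val hive = new HiveContext(sc)
    // Rows are read through the storage handler, straight from the
    // MongoDB collection that backs the 'individuals' table.
    hive.sql("SELECT name, work.title FROM individuals WHERE age > 30").show()
    sc.stop()
  }
}
```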
06-08-2016
03:14 PM
@Benjamin Leonhardi How does --files differ from SparkContext.addFile(), apart from the way we use them?
06-08-2016
02:54 PM
"Took 65944ms to send a batch of 1 edits (205 bytes) to remote journal 192.168.1.47:8485" (2016-06-02). Taking ~66 seconds to ship a 205-byte edit batch means you either have a serious network problem between the nodes, or the underlying disk writes on the JournalNode are very slow.
06-08-2016
02:34 PM
@clukasik I don't see any performance issue with running it in yarn-client mode; however, per the initial info, they need something like a distributed cache in Spark, which they can achieve through SparkContext.addFile().
06-08-2016
01:54 PM
@Eric Periard Technically, two NameNodes can't be in the same HA state (one must be active and the other standby); if that's happening, you either have configuration issues or you're hitting a bug.
06-08-2016
01:47 PM
@akeezhadath Kindly use the API below to cache the file on all the nodes: SparkContext.addFile(). From the Spark API docs: "Add a file to be downloaded with this Spark job on every node. The path passed can be either a local file, a file in HDFS (or other Hadoop-supported filesystems), or an HTTP, HTTPS or FTP URI. To access the file in Spark jobs, use SparkFiles.get(fileName) to find its download location. A directory can be given if the recursive option is set to true. Currently directories are only supported for Hadoop-supported filesystems."
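A minimal sketch of the pattern, assuming Spark 1.6; the lookup file path is hypothetical:

```scala
import org.apache.spark.{SparkConf, SparkContext, SparkFiles}

object AddFileDemo {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("addfile-demo"))
    // Ship the file to every node; "hdfs:///data/lookup.txt" is a placeholder path.
    sc.addFile("hdfs:///data/lookup.txt")
    val counts = sc.parallelize(1 to 4).map { _ =>
      // On each executor, resolve the local copy by file name.
      val localPath = SparkFiles.get("lookup.txt")
      scala.io.Source.fromFile(localPath).getLines().size
    }.collect()
    println(counts.mkString(","))
    sc.stop()
  }
}
```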
06-08-2016
01:30 PM
4 Kudos
@akeezhadath You can place the file on HDFS and access it through "hdfs:///path/file".
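For example, a minimal sketch assuming Spark with HDFS configured; "hdfs:///path/file" is the placeholder path from above:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object ReadFromHdfs {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("hdfs-read-demo"))
    // Every executor reads the file directly from HDFS, so no
    // per-node file distribution (addFile/--files) is needed here.
    val lines = sc.textFile("hdfs:///path/file")
    println(lines.count())
    sc.stop()
  }
}
```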