Member since: 03-16-2016
Posts: 707
Kudos Received: 1753
Solutions: 203

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 6974 | 09-21-2018 09:54 PM |
| | 8726 | 03-31-2018 03:59 AM |
| | 2617 | 03-31-2018 03:55 AM |
| | 2754 | 03-31-2018 03:31 AM |
| | 6180 | 03-27-2018 03:46 PM |
12-26-2016
09:44 PM
As your question states and as the exception trace shows:

Caused by: java.lang.NullPointerException
at DevMain$anonfun$5.apply(DevMain.scala:2)
at (DevMain.scala:2)
at scala.collection.Iterator$anon$11.next(Iterator.scala:328)

this is hard to guess without debugging the code. My no-brainer advice :) : debug your code line by line to find where a null value is passed where a non-null value is expected.
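To illustrate the general pattern once you find the spot, here is a minimal Java sketch of guarding a transformation against null records (your code is Scala, but the idea is the same; the Event type and its field are made up for illustration):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

public class NullGuardExample {

    // Hypothetical record type, only for illustration.
    static class Event {
        final String payload;
        Event(String payload) { this.payload = payload; }
    }

    public static void main(String[] args) {
        List<Event> events = Arrays.asList(new Event("a"), null, new Event("c"));

        // Without the filters, the map step throws a NullPointerException when the
        // iteration reaches the null element, the same symptom as in the trace above.
        List<String> payloads = events.stream()
                .filter(Objects::nonNull)              // guard against null records
                .filter(e -> e.payload != null)        // guard against null fields
                .map(e -> e.payload.toUpperCase())
                .collect(Collectors.toList());

        System.out.println(payloads);  // [A, C]
    }
}
```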
12-26-2016
09:33 PM
2 Kudos
@kiran gutha Since Solr 4.7 there is a class, MiniSolrCloudCluster, that actually "deploys" locally (in RAM only or on a temp dir, if you want) a complete Solr cluster, with ZooKeeper, shards and everything, for your tests. You can find the JIRA here: https://issues.apache.org/jira/browse/SOLR-5865

Here is an example. Put these members in a JUnit test class; the imports assume SolrJ 4.x and the solr-test-framework artifact:

import java.io.File;

import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.client.solrj.impl.XMLResponseParser;
import org.apache.solr.client.solrj.request.QueryRequest;
import org.apache.solr.client.solrj.request.RequestWriter;
import org.apache.solr.cloud.MiniSolrCloudCluster;
import org.apache.solr.cloud.ZkController;
import org.apache.solr.common.cloud.SolrZkClient;
import org.apache.solr.common.params.CollectionParams.CollectionAction;
import org.apache.solr.common.params.CoreAdminParams;
import org.apache.solr.common.params.ModifiableSolrParams;
import org.apache.solr.common.util.NamedList;
import org.junit.AfterClass;
import org.junit.BeforeClass;
import org.junit.Test;

private static MiniSolrCloudCluster miniCluster;
private static CloudSolrServer cloudSolrServer;

@BeforeClass
public static void setup() throws Exception {
    miniCluster = new MiniSolrCloudCluster(2, null, new File("src/main/solr/solr.xml"), null, null);
    uploadConfigToZk("src/main/solr/content/conf/", "content");

    // override settings in the solrconfig include
    System.setProperty("solr.tests.maxBufferedDocs", "100000");
    System.setProperty("solr.tests.maxIndexingThreads", "-1");
    System.setProperty("solr.tests.ramBufferSizeMB", "100");

    // use non-test classes so RandomizedRunner isn't necessary
    System.setProperty("solr.tests.mergeScheduler", "org.apache.lucene.index.ConcurrentMergeScheduler");
    System.setProperty("solr.directoryFactory", "solr.RAMDirectoryFactory");

    cloudSolrServer = new CloudSolrServer(miniCluster.getZkServer().getZkAddress(), false);
    cloudSolrServer.setRequestWriter(new RequestWriter());
    cloudSolrServer.setParser(new XMLResponseParser());
    cloudSolrServer.setDefaultCollection("content");
    cloudSolrServer.setParallelUpdates(false);
    cloudSolrServer.connect();

    createCollection(cloudSolrServer, "content", 2, 1, "content");
}

// uploads the collection config (solrconfig.xml, schema.xml, stopwords) to the embedded ZooKeeper
protected static void uploadConfigToZk(String configDir, String configName) throws Exception {
    SolrZkClient zkClient = null;
    try {
        zkClient = new SolrZkClient(miniCluster.getZkServer().getZkAddress(), 10000, 45000, null);
        uploadConfigFileToZk(zkClient, configName, "solrconfig.xml", new File(configDir, "solrconfig.xml"));
        uploadConfigFileToZk(zkClient, configName, "schema.xml", new File(configDir, "schema.xml"));
        uploadConfigFileToZk(zkClient, configName, "stopwords_en.txt", new File(configDir, "stopwords_en.txt"));
        uploadConfigFileToZk(zkClient, configName, "stopwords_it.txt", new File(configDir, "stopwords_it.txt"));
        System.out.println(zkClient.getChildren(ZkController.CONFIGS_ZKNODE + "/" + configName, null, true));
    } finally {
        if (zkClient != null)
            zkClient.close();
    }
}

protected static void uploadConfigFileToZk(SolrZkClient zkClient, String configName, String nameInZk, File file) throws Exception {
    zkClient.makePath(ZkController.CONFIGS_ZKNODE + "/" + configName + "/" + nameInZk, file, false, true);
}

@AfterClass
public static void shutDown() throws Exception {
    miniCluster.shutdown();
}

// creates a collection through the Collections API
protected static NamedList createCollection(CloudSolrServer server, String name, int numShards, int replicationFactor, String configName) throws Exception {
    ModifiableSolrParams modParams = new ModifiableSolrParams();
    modParams.set(CoreAdminParams.ACTION, CollectionAction.CREATE.name());
    modParams.set("name", name);
    modParams.set("numShards", numShards);
    modParams.set("replicationFactor", replicationFactor);
    modParams.set("collection.configName", configName);
    QueryRequest request = new QueryRequest(modParams);
    request.setPath("/admin/collections");
    return server.request(request);
}

@Test
public void test() throws Exception {
    // Do your stuff here using cloudSolrServer as a normal solrServer
}
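For example, the test body could index a document and query it back. This is only a sketch; the field names "id" and "text" are assumptions about whatever your "content" schema defines:

```java
import static org.junit.Assert.assertEquals;

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrInputDocument;

@Test
public void test() throws Exception {
    // index one document into the "content" collection
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "1");                     // "id" assumed to exist in schema.xml
    doc.addField("text", "hello mini cluster");  // "text" is a hypothetical field
    cloudSolrServer.add(doc);
    cloudSolrServer.commit();

    // query it back through the same CloudSolrServer
    QueryResponse rsp = cloudSolrServer.query(new SolrQuery("*:*"));
    assertEquals(1, rsp.getResults().getNumFound());
}
```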
12-26-2016
09:31 PM
2 Kudos
@Anand Verma

Exception in thread "main" java.lang.RuntimeException: org.apache.storm.utils.NimbusLeaderNotFoundException: Could not find leader nimbus from seed hosts [sandbox.hortonworks.com]. Did you specify a valid list of nimbus hosts for config nimbus.seeds?

Check this: https://community.hortonworks.com/articles/8844/solutions-for-storm-nimbus-failure.html
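Usually this means Nimbus is not actually running, or the client's storm.yaml does not point nimbus.seeds at the right host. If you submit the topology from code, you can also set the seed list programmatically; here is a hedged sketch (the class name and topology wiring are placeholders, and the host is taken from your error message):

```java
import java.util.Arrays;

import org.apache.storm.Config;
import org.apache.storm.StormSubmitter;
import org.apache.storm.topology.TopologyBuilder;

public class SubmitWithNimbusSeeds {
    public static void main(String[] args) throws Exception {
        Config conf = new Config();
        // Point the client at the host where Nimbus actually runs.
        conf.put(Config.NIMBUS_SEEDS, Arrays.asList("sandbox.hortonworks.com"));

        TopologyBuilder builder = new TopologyBuilder();
        // ... builder.setSpout(...) / builder.setBolt(...) as usual ...

        StormSubmitter.submitTopology("my-topology", conf, builder.createTopology());
    }
}
```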
12-26-2016
09:24 PM
2 Kudos
@Timothy Spann What version of Ambari? I read the question caption "Blank Machine" as meaning only the OS was present and that you installed a fresh version of Ambari first. I have seen this issue with versions up to 2.2.2.
12-26-2016
09:02 PM
2 Kudos
Introduction
h2o is a package for running H2O via its REST API from within R. The package lets the user run basic H2O commands using R commands. No actual data is stored in the R workspace, and no actual work is carried out by R: R only saves the named objects, which uniquely identify the data set, model, etc. on the server. When the user makes a request, R queries the server via the REST API, which returns a JSON file with the relevant information that R then displays in the console.
Scope
I tested this installation guide on CentOS 7.2, but it should work on similar RedHat/Fedora/CentOS releases.
Steps
1. Install R
sudo yum install R
2. Install Java
https://www.java.com/en/download/help/linux_x64rpm_install.xml
3. Start R and install dependencies
install.packages("RCurl")
install.packages("bitops")
install.packages("rjson")
install.packages("statmod")
install.packages("tools")
4. Install the h2o package and load the library for use
install.packages("h2o")
library(h2o)
If this is your first time using CRAN, it will ask for a mirror to use. If you want H2O installed site-wide (i.e., usable by all users on that machine), run R as root (sudo R), then type install.packages("h2o").
5. Test H2O installation
Type:
library(h2o)
If nothing complains, launch h2o:
h2o.init()
If all went well, you will see lots of output about how it is starting up H2O on your behalf, and then it should tell you all about your cluster. If not, the error message should tell you which dependency is missing or what the problem is. Post a note to this article and I will get back to you.
Tips
#1 - The version of H2O on CRAN might be up to a month or two behind the latest and greatest. Unless you are affected by a bug that you know has been fixed, don't worry about it.
#2 - By default, h2o.init() will only use two cores on your machine and maybe a quarter of your system memory. To resize resources, call h2o.shutdown() and start it again:
a) using all your cores:
h2o.init(nthreads = -1)
b) using all your cores and 4 GB:
h2o.init(nthreads = -1, max_mem_size = "4g")
#3 - To run H2O on your local machine, you can call h2o.init() without any arguments, and H2O will be automatically launched at localhost:54321, where the IP is "127.0.0.1" and the port is 54321.
#4 - If H2O is running on a cluster, you must provide the IP and port of the remote machine as arguments to the h2o.init() call. The operation will be carried out on the server associated with the data object, where H2O is running, not within the R environment.
Tutorials
H2O Tutorial on the Hortonworks Data Platform Sandbox:
http://hortonworks.com/blog/oxdata-h2o-tutorial-hortonworks-sandbox/
Walk-Through Tutorials for the Web UI:
http://h2o-release.s3.amazonaws.com/h2o/rel-lambert/5/docs-website/tutorial/top.html
12-26-2016
08:15 PM
@ALFRED CHAN It is present in Oregon too. Ohio is a new region that Amazon just added; we will upload the image to that region as well.
12-26-2016
08:10 PM
@Rishit shah It seems that you found the answer to your own question. I believe the new question should be kept separate from the current one; we want to avoid open-ended threads. Could you open a separate question and notify @Artem Ervits? That way he can post his response and, if it helps, please vote for and accept his answer (or whichever answer makes sense to you) on the new question. That question is worth a full discussion and is of broader interest.
12-26-2016
08:06 PM
2 Kudos
@Santhosh B Gowda Was this a fresh install or an upgrade from an older version of HDP? If it was an upgrade, this thread may be useful: http://stackoverflow.com/questions/33852044/why-can-i-not-read-from-the-aws-s3-in-spark-application-anymore
In your last post you mention the path /usr/hdp/2.5.3.0-14/hadoop/lib/jets3t-0.9.0.jar; could you also run the following and post the result?
ls -lrt /usr/hdp/
12-26-2016
07:57 PM
1 Kudo
In your code, you may have to add the following line:
conf.set("dfs.nameservices", "HadoopTestHA")
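If the client still cannot resolve the logical nameservice, it usually needs the rest of the HA client settings too. A hedged sketch in Java (the namenode ids and host names below are placeholders; use the values from your cluster's hdfs-site.xml):

```java
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class HaClientConfigExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://HadoopTestHA");
        conf.set("dfs.nameservices", "HadoopTestHA");
        // Placeholder namenode ids and hosts: copy the real ones from hdfs-site.xml.
        conf.set("dfs.ha.namenodes.HadoopTestHA", "nn1,nn2");
        conf.set("dfs.namenode.rpc-address.HadoopTestHA.nn1", "namenode1.example.com:8020");
        conf.set("dfs.namenode.rpc-address.HadoopTestHA.nn2", "namenode2.example.com:8020");
        conf.set("dfs.client.failover.proxy.provider.HadoopTestHA",
                "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");

        FileSystem fs = FileSystem.get(URI.create("hdfs://HadoopTestHA"), conf);
        System.out.println(fs.getUri());
    }
}
```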
12-26-2016
07:54 PM
2 Kudos
@Mon key In HDFS, reads normally go through the DataNode: when the client asks the DataNode to read a file, the DataNode reads that file off the disk and sends the data to the client over a TCP socket. So-called "short-circuit" reads bypass the DataNode, allowing the client to read the file directly. Obviously, this is only possible in cases where the client is co-located with the data. Short-circuit reads provide a substantial performance boost to many applications. To configure short-circuit local reads, you must enable libhadoop.so; see the Native Libraries documentation for details on enabling this library. Windows is not a supported OS for this feature, so you need to turn it off and re-execute your job.
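A hedged sketch of turning the feature off on the client side; the property is the standard dfs.client.read.shortcircuit switch, and it can equally be set to false in hdfs-site.xml:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class DisableShortCircuitReads {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Fall back to normal reads through the DataNode (no libhadoop.so required).
        conf.setBoolean("dfs.client.read.shortcircuit", false);

        FileSystem fs = FileSystem.get(conf);
        System.out.println(fs.exists(new Path("/tmp")));
    }
}
```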