Member since: 03-16-2016
Posts: 707
Kudos Received: 1753
Solutions: 203

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 6974 | 09-21-2018 09:54 PM |
| | 8726 | 03-31-2018 03:59 AM |
| | 2617 | 03-31-2018 03:55 AM |
| | 2754 | 03-31-2018 03:31 AM |
| | 6180 | 03-27-2018 03:46 PM |
12-26-2016
09:44 PM
As your question states and as the exception trace shows:

Caused by: java.lang.NullPointerException
at DevMain$anonfun$5.apply(DevMain.scala:2)
at (DevMain.scala:2)
at scala.collection.Iterator$anon$11.next(Iterator.scala:328)

this is hard to guess without debugging the code. My no-brainer advice :) : debug your code line by line to find where a null value is passed where a non-null value is expected.
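To illustrate the general pattern once you find the spot, here is a minimal Java sketch of guarding a transformation against null records (your code is Scala, but the idea is the same; the Event type and its field are made up for illustration):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

public class NullGuardExample {

    // Hypothetical record type, only for illustration.
    static class Event {
        final String payload;
        Event(String payload) { this.payload = payload; }
    }

    public static void main(String[] args) {
        List<Event> events = Arrays.asList(new Event("a"), null, new Event("c"));

        // Without the filters, the map step throws a NullPointerException when the
        // iteration reaches the null element, the same symptom as in the trace above.
        List<String> payloads = events.stream()
                .filter(Objects::nonNull)              // guard against null records
                .filter(e -> e.payload != null)        // guard against null fields
                .map(e -> e.payload.toUpperCase())
                .collect(Collectors.toList());

        System.out.println(payloads);  // [A, C]
    }
}
```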
12-26-2016
09:33 PM
2 Kudos
@kiran gutha Since Solr 4.7 there is a class, MiniSolrCloudCluster, that actually "deploys" locally (in RAM only or on a temp dir, if you want) a complete Solr cluster, with ZooKeeper, shards and everything, for your tests. You can find the JIRA here: https://issues.apache.org/jira/browse/SOLR-5865

Here is an example. Put these members in a JUnit test class; the imports assume SolrJ 4.x and the solr-test-framework artifact:

import java.io.File;

import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.client.solrj.impl.XMLResponseParser;
import org.apache.solr.client.solrj.request.QueryRequest;
import org.apache.solr.client.solrj.request.RequestWriter;
import org.apache.solr.cloud.MiniSolrCloudCluster;
import org.apache.solr.cloud.ZkController;
import org.apache.solr.common.cloud.SolrZkClient;
import org.apache.solr.common.params.CollectionParams.CollectionAction;
import org.apache.solr.common.params.CoreAdminParams;
import org.apache.solr.common.params.ModifiableSolrParams;
import org.apache.solr.common.util.NamedList;
import org.junit.AfterClass;
import org.junit.BeforeClass;
import org.junit.Test;

private static MiniSolrCloudCluster miniCluster;
private static CloudSolrServer cloudSolrServer;

@BeforeClass
public static void setup() throws Exception {
    miniCluster = new MiniSolrCloudCluster(2, null, new File("src/main/solr/solr.xml"), null, null);
    uploadConfigToZk("src/main/solr/content/conf/", "content");

    // override settings in the solrconfig include
    System.setProperty("solr.tests.maxBufferedDocs", "100000");
    System.setProperty("solr.tests.maxIndexingThreads", "-1");
    System.setProperty("solr.tests.ramBufferSizeMB", "100");

    // use non-test classes so RandomizedRunner isn't necessary
    System.setProperty("solr.tests.mergeScheduler", "org.apache.lucene.index.ConcurrentMergeScheduler");
    System.setProperty("solr.directoryFactory", "solr.RAMDirectoryFactory");

    cloudSolrServer = new CloudSolrServer(miniCluster.getZkServer().getZkAddress(), false);
    cloudSolrServer.setRequestWriter(new RequestWriter());
    cloudSolrServer.setParser(new XMLResponseParser());
    cloudSolrServer.setDefaultCollection("content");
    cloudSolrServer.setParallelUpdates(false);
    cloudSolrServer.connect();

    createCollection(cloudSolrServer, "content", 2, 1, "content");
}

// uploads the collection config (solrconfig.xml, schema.xml, stopwords) to the embedded ZooKeeper
protected static void uploadConfigToZk(String configDir, String configName) throws Exception {
    SolrZkClient zkClient = null;
    try {
        zkClient = new SolrZkClient(miniCluster.getZkServer().getZkAddress(), 10000, 45000, null);
        uploadConfigFileToZk(zkClient, configName, "solrconfig.xml", new File(configDir, "solrconfig.xml"));
        uploadConfigFileToZk(zkClient, configName, "schema.xml", new File(configDir, "schema.xml"));
        uploadConfigFileToZk(zkClient, configName, "stopwords_en.txt", new File(configDir, "stopwords_en.txt"));
        uploadConfigFileToZk(zkClient, configName, "stopwords_it.txt", new File(configDir, "stopwords_it.txt"));
        System.out.println(zkClient.getChildren(ZkController.CONFIGS_ZKNODE + "/" + configName, null, true));
    } finally {
        if (zkClient != null)
            zkClient.close();
    }
}

protected static void uploadConfigFileToZk(SolrZkClient zkClient, String configName, String nameInZk, File file) throws Exception {
    zkClient.makePath(ZkController.CONFIGS_ZKNODE + "/" + configName + "/" + nameInZk, file, false, true);
}

@AfterClass
public static void shutDown() throws Exception {
    miniCluster.shutdown();
}

// creates a collection through the Collections API
protected static NamedList createCollection(CloudSolrServer server, String name, int numShards, int replicationFactor, String configName) throws Exception {
    ModifiableSolrParams modParams = new ModifiableSolrParams();
    modParams.set(CoreAdminParams.ACTION, CollectionAction.CREATE.name());
    modParams.set("name", name);
    modParams.set("numShards", numShards);
    modParams.set("replicationFactor", replicationFactor);
    modParams.set("collection.configName", configName);
    QueryRequest request = new QueryRequest(modParams);
    request.setPath("/admin/collections");
    return server.request(request);
}

@Test
public void test() throws Exception {
    // Do your stuff here using cloudSolrServer as a normal solrServer
}
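For example, the test body could index a document and query it back. This is only a sketch; the field names "id" and "text" are assumptions about whatever your "content" schema defines:

```java
import static org.junit.Assert.assertEquals;

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrInputDocument;

@Test
public void test() throws Exception {
    // index one document into the "content" collection
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "1");                     // "id" assumed to exist in schema.xml
    doc.addField("text", "hello mini cluster");  // "text" is a hypothetical field
    cloudSolrServer.add(doc);
    cloudSolrServer.commit();

    // query it back through the same CloudSolrServer
    QueryResponse rsp = cloudSolrServer.query(new SolrQuery("*:*"));
    assertEquals(1, rsp.getResults().getNumFound());
}
```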
12-26-2016
09:31 PM
2 Kudos
@Anand Verma

Exception in thread "main" java.lang.RuntimeException: org.apache.storm.utils.NimbusLeaderNotFoundException: Could not find leader nimbus from seed hosts [sandbox.hortonworks.com]. Did you specify a valid list of nimbus hosts for config nimbus.seeds?

Check this: https://community.hortonworks.com/articles/8844/solutions-for-storm-nimbus-failure.html
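Usually this means Nimbus is not actually running, or the client's storm.yaml does not point nimbus.seeds at the right host. If you submit the topology from code, you can also set the seed list programmatically; here is a hedged sketch (the class name and topology wiring are placeholders, and the host is taken from your error message):

```java
import java.util.Arrays;

import org.apache.storm.Config;
import org.apache.storm.StormSubmitter;
import org.apache.storm.topology.TopologyBuilder;

public class SubmitWithNimbusSeeds {
    public static void main(String[] args) throws Exception {
        Config conf = new Config();
        // Point the client at the host where Nimbus actually runs.
        conf.put(Config.NIMBUS_SEEDS, Arrays.asList("sandbox.hortonworks.com"));

        TopologyBuilder builder = new TopologyBuilder();
        // ... builder.setSpout(...) / builder.setBolt(...) as usual ...

        StormSubmitter.submitTopology("my-topology", conf, builder.createTopology());
    }
}
```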
12-26-2016
09:24 PM
2 Kudos
@Timothy Spann What version of Ambari? I read the question caption "Blank Machine" as meaning only the OS was present and that you installed a fresh version of Ambari first. I have seen this issue with versions up to 2.2.2.
12-26-2016
09:02 PM
2 Kudos
Introduction
h2o is a package for running H2O via its REST API from within R. The package lets the user run basic H2O commands using R commands. No actual data is stored in the R workspace, and no actual work is carried out by R: R only saves the named objects, which uniquely identify the data set, model, etc. on the server. When the user makes a request, R queries the server via the REST API, which returns a JSON file with the relevant information that R then displays in the console.
Scope
I tested this installation guide on CentOS 7.2, but it should work on similar RedHat/Fedora/CentOS releases.
Steps
1. Install R
sudo yum install R
2. Install Java
https://www.java.com/en/download/help/linux_x64rpm_install.xml
3. Start R and install dependencies
install.packages("RCurl")
install.packages("bitops")
install.packages("rjson")
install.packages("statmod")
install.packages("tools")
4. Install the h2o package and load the library for use
install.packages("h2o")
library(h2o)
If this is your first time using CRAN, it will ask for a mirror to use. If you want H2O installed site-wide (i.e., usable by all users on that machine), run R as root (sudo R), then type install.packages("h2o").
5. Test H2O installation
Type:
library(h2o)
If nothing complains, launch h2o:
h2o.init()
If all went well, you will see lots of output about how it is starting up H2O on your behalf, and then it should tell you all about your cluster. If not, the error message should tell you which dependency is missing or what the problem is. Post a note to this article and I will get back to you.
Tips
#1 - The version of H2O on CRAN might be up to a month or two behind the latest and greatest. Unless you are affected by a bug that you know has been fixed, don't worry about it.
#2 - By default, h2o.init() will only use two cores on your machine and maybe a quarter of your system memory. To resize resources, call h2o.shutdown() and start it again:
a) using all your cores:
h2o.init(nthreads = -1)
b) using all your cores and 4 GB:
h2o.init(nthreads = -1, max_mem_size = "4g")
#3 - To run H2O on your local machine, you can call h2o.init() without any arguments, and H2O will be automatically launched at localhost:54321, where the IP is "127.0.0.1" and the port is 54321.
#4 - If H2O is running on a cluster, you must provide the IP and port of the remote machine as arguments to the h2o.init() call. The operation will be carried out on the server associated with the data object, where H2O is running, not within the R environment.
Tutorials
H2O Tutorial on the Hortonworks Data Platform Sandbox:
http://hortonworks.com/blog/oxdata-h2o-tutorial-hortonworks-sandbox/
Walk-Through Tutorials for the Web UI:
http://h2o-release.s3.amazonaws.com/h2o/rel-lambert/5/docs-website/tutorial/top.html
12-26-2016
08:15 PM
@ALFRED CHAN It is present in Oregon too. Ohio is a new region that Amazon just added; we will upload the image to that region as well.
12-26-2016
08:10 PM
@Rishit shah It seems that you found the answer to your own question. I believe the new question should be kept separate from the current one; we want to avoid open-ended threads. Could you open a separate question and notify @Artem Ervits? That way he can post his response and, if it helps, please vote for and accept his answer (or whichever answer makes sense to you) on the new question. That question is worth a full discussion and is of broader interest.
12-26-2016
08:06 PM
2 Kudos
@Santhosh B Gowda Was this a fresh install or an upgrade from an older version of HDP? If it was an upgrade, this thread may be useful: http://stackoverflow.com/questions/33852044/why-can-i-not-read-from-the-aws-s3-in-spark-application-anymore
In your last post you mention the path /usr/hdp/2.5.3.0-14/hadoop/lib/jets3t-0.9.0.jar; could you also run the following and post the result?
ls -lrt /usr/hdp/
12-26-2016
07:57 PM
1 Kudo
In your code, you may have to add the following line:
conf.set("dfs.nameservices", "HadoopTestHA")
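If the client still cannot resolve the logical nameservice, it usually needs the rest of the HA client settings too. A hedged sketch in Java (the namenode ids and host names below are placeholders; use the values from your cluster's hdfs-site.xml):

```java
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class HaClientConfigExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://HadoopTestHA");
        conf.set("dfs.nameservices", "HadoopTestHA");
        // Placeholder namenode ids and hosts: copy the real ones from hdfs-site.xml.
        conf.set("dfs.ha.namenodes.HadoopTestHA", "nn1,nn2");
        conf.set("dfs.namenode.rpc-address.HadoopTestHA.nn1", "namenode1.example.com:8020");
        conf.set("dfs.namenode.rpc-address.HadoopTestHA.nn2", "namenode2.example.com:8020");
        conf.set("dfs.client.failover.proxy.provider.HadoopTestHA",
                "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");

        FileSystem fs = FileSystem.get(URI.create("hdfs://HadoopTestHA"), conf);
        System.out.println(fs.getUri());
    }
}
```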
12-26-2016
07:54 PM
2 Kudos
@Mon key In HDFS, reads normally go through the DataNode: when the client asks the DataNode to read a file, the DataNode reads that file off the disk and sends the data to the client over a TCP socket. So-called "short-circuit" reads bypass the DataNode, allowing the client to read the file directly. Obviously, this is only possible in cases where the client is co-located with the data. Short-circuit reads provide a substantial performance boost to many applications. To configure short-circuit local reads, you must enable libhadoop.so; see the Native Libraries documentation for details on enabling this library. Windows is not a supported OS for this feature, so you need to turn it off and re-execute your job.
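A hedged sketch of turning the feature off on the client side; the property is the standard dfs.client.read.shortcircuit switch, and it can equally be set to false in hdfs-site.xml:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class DisableShortCircuitReads {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Fall back to normal reads through the DataNode (no libhadoop.so required).
        conf.setBoolean("dfs.client.read.shortcircuit", false);

        FileSystem fs = FileSystem.get(conf);
        System.out.println(fs.exists(new Path("/tmp")));
    }
}
```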