Member since: 11-11-2014
Posts: 21
Kudos Received: 3
Solutions: 0
02-02-2017
07:57 PM
And guessing from what is happening here, I thought I should try starting the Thrift server in HTTP mode on an HTTP port. But the HTTP port for the Thrift server is not defined anywhere by Ambari, so I created a custom property for the Thrift server, 'hive.server2.thrift.http.port: 10013', and defined 'hive.server2.transport.mode: http'. And it started the Thrift server on a different port (10013) in HTTP mode! Regards Rakesh
02-02-2017
07:35 PM
Hi @Smart Solutions, I am a little late to the party, but I was able to run both HS2 and STS on the same machine on a kerberized cluster using:
HS2 => hive.server2.thrift.http.port: 10001, hive.server2.transport.mode: http
STS => hive.server2.thrift.port: 10015, hive.server2.transport.mode: binary
STS does not start and throws the bind exception when I use:
STS => hive.server2.thrift.port: 10015, hive.server2.transport.mode: http
So changing the transport mode for STS to binary works for me. Tested on HDP 2.4.2.29-4. Regards Rakesh
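(Not part of the original post.) A minimal Python sketch, just to confirm which port each server is actually accepting connections on after a restart; the hostname is a placeholder and the ports are the ones from the working combination above:

import socket

# Ports taken from the working combination above; replace 'localhost' with the real host.
checks = {
    "HS2 (http transport)": ("localhost", 10001),
    "STS (binary transport)": ("localhost", 10015),
}
for name, (host, port) in checks.items():
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(3)
    try:
        s.connect((host, port))
        print("%s: port %d is accepting connections" % (name, port))
    except socket.error as err:
        print("%s: port %d is not reachable (%s)" % (name, port, err))
    finally:
        s.close()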
08-14-2016
09:51 PM
Thanks Artem, you are correct, but due to some constraints we cannot wait until the upgrade. I am unable to find a fix for this.
08-11-2016
03:45 PM
We are using Spark 1.3 on HDP 2.2.4, and I found there is a bug in the spark-assembly-1.3.1.2.3.0.0-2557-hadoop2.7.1.2.3.0.0-2557.jar that ships with Spark: the MLlib check for the version of NumPy is incorrect, so MLlib throws an exception. I know the fix; I have to change the file mllib/__init__.py inside the jar. Below is the current code in the above-mentioned Python file:

import numpy
if numpy.version.version < '1.4':
    raise Exception("MLlib requires NumPy 1.4+")

It can be fixed by changing it to:

import numpy
ver = [int(x) for x in numpy.version.version.split('.')[:2]]
if ver < [1, 4]:
    raise Exception("MLlib requires NumPy 1.4+")
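To see why the original check is wrong (this example is mine, not taken from the jar): Python compares version strings character by character, so any NumPy release with a two-digit minor version trips the old check, while comparing numeric components does not.

# String comparison is lexicographic, so '1.10...' sorts before '1.4':
print('1.10.4' < '1.4')                                      # True  -> the old check wrongly raises the exception
print([int(x) for x in '1.10.4'.split('.')[:2]] < [1, 4])    # False -> the fixed check passes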
To apply the fix, I have tried editing the 'spark-assembly-1.3.1.2.3.0.0-2557-hadoop2.7.1.2.3.0.0-2557.jar' to correct the code: I unzipped the jar file, fixed the code, and repacked it using zip. But after placing the fix, it gives an EOF error:

Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 6, xxxxxx.xxxx.uk.hxxx): org.apache.spark.SparkException:
Error from python worker:
/opt/anaconda/envs/sparkAnaconda/bin/python: No module named pyspark
PYTHONPATH was:
/data/4/hadoop/yarn/local/usercache/xxxxxxxx/filecache/33/spark-assembly-1.3.1.2.3.0.0-2557-hadoop2.7.1.2.3.0.0-2557.jar
java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at org.apache.spark.api.python.PythonWorkerFactory.startDaemon(PythonWorkerFactory.scala:163)
at org.apache.spark.api.python.PythonWorkerFactory.createThroughDaemon(PythonWorkerFactory.scala:86)
at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:62)
at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:105)
at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
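(Added note, not from the original post.) A small diagnostic sketch I would run on a worker node against the repacked assembly; the path below is the one from the PYTHONPATH in the error above, so substitute the real location. The idea: the Python workers load pyspark from the jar through zipimport, so if zipfile can still see pyspark/__init__.py but zipimport cannot read the archive (which can happen when the repacked file is in a form this Python's zipimport does not support), that would explain "No module named pyspark" even though the entry is present.

import zipfile
import zipimport

# Path copied from the PYTHONPATH shown in the error above; adjust to the real jar location.
jar = "/data/4/hadoop/yarn/local/usercache/xxxxxxxx/filecache/33/spark-assembly-1.3.1.2.3.0.0-2557-hadoop2.7.1.2.3.0.0-2557.jar"

# zipfile is more permissive than zipimport, so first confirm the entry survived the repack.
zf = zipfile.ZipFile(jar)
print("pyspark/__init__.py present:", "pyspark/__init__.py" in zf.namelist())
zf.close()

# zipimport is what PYTHONPATH-on-a-zip actually uses at runtime.
try:
    importer = zipimport.zipimporter(jar)
    print("zipimport can locate pyspark:", importer.find_module("pyspark") is not None)
except zipimport.ZipImportError as err:
    print("zipimport cannot read the archive:", err)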
07-28-2016
08:13 PM
I have disabled audit logging to a database by setting XAAUDIT.DB.IS_ENABLED=false, but it still asks for a username.
07-28-2016
06:43 PM
at com.xasecure.utils.install.XmlConfigChanger.run(XmlConfigChanger.java:208)
at com.xasecure.utils.install.XmlConfigChanger.main(XmlConfigChanger.java:77)
Caused by: com.xasecure.utils.install.XmlConfigChanger$ValidationException: ERROR: configuration token [XAAUDIT.DB.USER_NAME] is not defined in the file: [/usr/hdp/2.2.4.10-3/ranger-hdfs-plugin/install.properties]
at com.xasecure.utils.install.XmlConfigChanger.replaceProp(XmlConfigChanger.java:447)
at com.xasecure.utils.install.XmlConfigChanger.run(XmlConfigChanger.java:205)
... 1 more
*************************************************************************
ERROR: Unable to make changes to config. file: /usr/hdp/2.2.4.10-3/hadoop/conf/xasecure-audit.xml
07-20-2016
05:09 PM
I am facing a similar issue; I am kind of new to KMS. It would really help if you could elaborate on the steps.
06-17-2016
11:23 AM
Many thanks for sharing this! It worked for me as well, but I am not sure if this is the correct way of fixing it, or if it is only a workaround. I need to put a fix in the production environment for the same issue.
12-18-2015
05:30 PM
1 Kudo
Thanks Billie for your response! I was able to run Solr on YARN; the mistake was that "site.global.app_root" did not have the correct name of my Solr version, which was solr-5.3.1. However, when I stop the Solr application via Slider (slider stop solr-yarn8) and restart it:
1) The cores I created disappear, which is bad.
2) New instances start on new ports; can I fix the ports?
3) I am only able to connect to one of the Solr instances (Solr UI).
4) Is it yet possible to deploy SolrCloud on YARN using multiple instances of Solr?
Regards, Rakesh
12-11-2015
11:19 AM
Thanks for the response, but the slider application failed to start again. When I look at the HDFS path:

[solr@sandbox solr-slider]$ hadoop fs -cat /user/solr/.slider/cluster/solr-yarn4/app_config.json
{
"schema" : "http://example.org/specification/v2.0.0",
"metadata" : { },
"global" : {
"site.global.gc_tune" : "-XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThreshold=8 -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:ConcGCThreads=4 -XX:ParallelGCThreads=4 -XX:+CMSScavengeBeforeRemark -XX:PretenureSizeThreshold=64m -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=50 -XX:CMSMaxAbortablePrecleanTime=6000 -XX:+CMSParallelRemarkEnabled -XX:+ParallelRefProcEnabled -verbose:gc -XX:+PrintHeapAtGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime",
"site.fs.default.name" : "hdfs://sandbox.hortonworks.com:8020",
"site.global.solr_host" : "${SOLR_HOST}",
"site.global.solr_opts" : "",
"zookeeper.hosts" : "sandbox.hortonworks.com",
"site.global.server_module" : "--module=http",
"site.global.stop_key" : "solrrocks",
"java_home" : "/usr/lib/jvm/java-1.7.0-openjdk.x86_64/",
"site.fs.defaultFS" : "hdfs://sandbox.hortonworks.com:8020",
"site.global.zk_timeout" : "15000",
"env.MALLOC_ARENA_MAX" : "4",
"zookeeper.path" : "/services/slider/users/solr/solr-yarn4",
"site.global.listen_port" : "8983",
"zookeeper.quorum" : "sandbox.hortonworks.com:2181",
"site.global.xmx_val" : "1g",
"site.global.zk_host" : "${ZK_HOST}",
"site.global.app_root" : "${AGENT_WORK_ROOT}/app/install/solr-5.3.1-SNAPSHOT",
"application.def" : "/user/solr/.slider/package/solr-yarn/solr-on-yarn.zip",
"site.global.xms_val" : "1g"
},
"credentials" : { },
"components" : {
"slider-appmaster" : {
"jvm.heapsize" : "512M"
},
"SOLR" : { }
}
}

- Shouldn't the variable names like "${ZK_HOST}" be replaced with actual values?
- Where should I look for the Solr-specific logs? I am not able to find anything in the container logs.
- What is the value of ${AGENT_WORK_ROOT}? What is the absolute path?
- Is there any detailed documentation on how to deploy a Solr application on YARN via Slider?
Regards,
12-10-2015
05:32 PM
2 Kudos
Hi, I am trying to run Solr on YARN using the lucidworksSolrSlider link, along with help from slider.incubator.apache.org/docs/getting_started.html. Here is my folder structure:

[solrs@ip-10-0-0-217 solr-slider]$ ls -lrt
total 131744
-rw-rw-r--. 1 solrs solrs 3182 Dec 10 01:17 README.md
drwxrwxr-x. 4 solrs solrs 32 Dec 10 01:17 package
-rw-rw-r--. 1 solrs solrs 2089 Dec 10 01:17 metainfo.xml
-rw-rw-r--. 1 solrs solrs 11358 Dec 10 01:17 LICENSE
-rw-rw-r--. 1 solrs solrs 134874517 Dec 10 01:37 solr-on-yarn.zip
-rw-rw-r--. 1 solrs solrs 277 Dec 10 01:49 resources-default.json
-rw-rw-r--. 1 solrs solrs 1355 Dec 10 15:33 appConfig-default.json
appConfig-default.json:
{
"schema": "http://example.org/specification/v2.0.0",
"metadata": {
},
"global": {
"application.def": "/user/solrs/.slider/package/solryarn/solr-on-yarn.zip",
"java_home": "/usr/jdk64/jdk1.8.0_40",
"site.global.app_root": "${AGENT_WORK_ROOT}/app/install/solr-5.2.0-SNAPSHOT",
"site.global.zk_host": "localhost:2181",
"site.global.solr_host": "${SOLR_HOST}",
"site.global.listen_port": "${SOLR.ALLOCATED_PORT}",
"site.global.xmx_val": "1g",
"site.global.xms_val": "1g",
"site.global.gc_tune": "-XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThreshold=8 -XX:+UseConcMarkSweepGC -XX:+UseParNewG$
"site.global.zk_timeout": "15000",
"site.global.server_module": "--module=http",
"site.global.stop_key": "solrrocks",
"site.global.solr_opts": ""
},
"components": {
"slider-appmaster": {
"jvm.heapsize": "512M"
},
"SOLR": {
}
}
}
resources-default.json:
{
"schema" : "http://example.org/specification/v2.0.0",
"metadata" : {
},
"global" : {
},
"components": {
"slider-appmaster": {
},
"SOLR": {
"yarn.role.priority": "1",
"yarn.component.instances": "3",
"yarn.memory": "1024"
}
}
}
Could you please suggest what the values of the below parameters in the appConfig-default.json file should be?

"site.global.app_root": "${AGENT_WORK_ROOT}/app/install/solr-5.2.0-SNAPSHOT",
"site.global.solr_host": "${SOLR_HOST}",
"site.global.listen_port": "${SOLR.ALLOCATED_PORT}", Basically where should I find "/app/install/solr-5.2.0-SNAPSHOT"?? My Environment: HDP 2.3, Slider Core-0.80.0.2.3.2.0-2950 Thanks, hoping a quick reply.
05-01-2015
01:58 AM
Hi, What happens if the exam is aborted because of a faulty internet connection or some other technical problem, or anything else that is not under my control? Regards, Rakesh
01-15-2015
05:47 AM
Some logs from cloudera-scm-server:

2015-01-15 11:00:48,683 INFO Metric-schema-update:com.cloudera.cmon.components.MetricSchemaManager: Updating schema work aggregates
2015-01-15 11:00:50,314 INFO Metric-schema-update:com.cloudera.cmon.components.MetricSchemaManager: Registering work aggregates
2015-01-15 11:00:50,656 INFO CMMetricsForwarder-0:com.cloudera.server.cmf.components.ClouderaManagerMetricsForwarder: Failed to send metrics.
java.lang.reflect.UndeclaredThrowableException
at com.sun.proxy.$Proxy100.writeMetrics(Unknown Source)
at com.cloudera.server.cmf.components.ClouderaManagerMetricsForwarder.sendWithAvro(ClouderaManagerMetricsForwarder.java:287)
at com.cloudera.server.cmf.components.ClouderaManagerMetricsForwarder.sendMetrics(ClouderaManagerMetricsForwarder.java:274)
at com.cloudera.server.cmf.components.ClouderaManagerMetricsForwarder.run(ClouderaManagerMetricsForwarder.java:129)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.avro.AvroRemoteException: java.net.ConnectException: Connection refused
at org.apache.avro.ipc.specific.SpecificRequestor.invoke(SpecificRequestor.java:88)
... 11 more
Caused by: java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
at sun.net.www.http.HttpClient.<init>(HttpClient.java:211)
at sun.net.www.http.HttpClient.New(HttpClient.java:308)
at sun.net.www.http.HttpClient.New(HttpClient.java:326)
at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:996)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:932)
at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:850)
at sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:1091)
at org.apache.avro.ipc.HttpTransceiver.writeBuffers(HttpTransceiver.java:71)
at org.apache.avro.ipc.Transceiver.transceive(Transceiver.java:58)
at org.apache.avro.ipc.Transceiver.transceive(Transceiver.java:72)
at org.apache.avro.ipc.Requestor.request(Requestor.java:147)
at org.apache.avro.ipc.Requestor.request(Requestor.java:101)
at org.apache.avro.ipc.specific.SpecificRequestor.invoke(SpecificRequestor.java:72)
... 11 more
01-15-2015
05:39 AM
The cluster is pretty small, just 3 nodes with 1 GB of RAM each, running Ubuntu 14.04 Trusty 64-bit. It fails at the connection tests on the embedded database test page, before installing the components. I tried different combinations of roles so that no single node is more pressed for resources than the others. That did work, but now the process hangs while installing the various services on the nodes. The cloudera-scm-server has to be restarted manually to proceed again. Thanks!
01-15-2015
03:12 AM
I tried to install CDH 5.3 using Cloudera Manager, but the process hangs in between. It might be because the extra services provided by the Cloudera Manager Enterprise edition are not supported on my hardware. I want to go back to Cloudera Manager Express.
12-11-2014
06:57 AM
So concretely:
1) I have to remove the CDH components like hadoop-namenode, etc. (which I installed from the repository using apt-get install).
2) Remove all the directories I created for the different Hadoop components like /data/1/dfs/nn etc., basically doing 'hadoop fs -rmr /' and also removing the local directories.
3) Start the CM admin console and run the CM wizard to install the CDH components using parcels (which will download everything again and install it).
Can't we make use of the existing installation of the CDH components? Maybe the CM wizard would find that the components are already installed and move on to the next step of defining roles and services? Do we really need to remove them in step 1? Thanks for the support!
12-11-2014
05:26 AM
Thanks for responding so quickly. I do not have significant data on my cluster, so I can go with this approach. I had installed the CDH components by adding the CDH 5 repository and then using apt-get install, and I followed mostly the default configuration. As I understand the process:
1) Back up my configuration files, or any other non-default configurations.
2) Install the components again via CM5.
3) Define services/roles.
4) Reapply the configurations from step 1.
Two doubts: do I need to uninstall the earlier CDH packages before doing the above steps? And in step 2, will CM install the existing components again? Thanks for the support!
12-11-2014
04:43 AM
I have a CDH 5.2 installation with YARN, HDFS, and Spark 1.1.0. I was able to successfully install CM5 using the manual installation path B (http://www.cloudera.com/content/cloudera/en/documentation/cloudera-manager/v5-0-0/Cloudera-Manager-Installation-Guide/cm5ig_install_path_B.html#cmig_topic_6_6_10_unique_1). Now I can see the Cloudera Manager admin screen and my nodes under the "Hosts" tab; when I go to "Hosts >> Components" I can see my installed components, but I am not able to see anything under the "Cluster" tab. How do I go about configuring the cluster and the various services? Please let me know if I am on the right path and how I should move forward now. Regards, Rakesh
11-14-2014
03:43 AM
Looks like I have to try upgrading CDH to 5.2 and use the Spark that comes with it, which does support all modes of Spark, i.e. 'yarn-cluster', 'yarn-client', etc.
11-11-2014
06:09 AM
More Logs:

Application application_1415193640322_0016 failed 2 times due to Error launching appattempt_1415193640322_0016_000002. Got exception: org.apache.hadoop.yarn.exceptions.YarnException: java.io.EOFException
at org.apache.hadoop.yarn.ipc.RPCUtil.getRemoteException(RPCUtil.java:38)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.startContainers(ContainerManagerImpl.java:710)
at org.apache.hadoop.yarn.api.impl.pb.service.ContainerManagementProtocolPBServiceImpl.startContainers(ContainerManagementProtocolPBServiceImpl.java:60)
at org.apache.hadoop.yarn.proto.ContainerManagementProtocol$ContainerManagementProtocolService$2.callBlockingMethod(ContainerManagementProtocol.java:95)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:587)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
Caused by: java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:197)
at java.io.DataInputStream.readUTF(DataInputStream.java:609)
at java.io.DataInputStream.readUTF(DataInputStream.java:564)
at org.apache.hadoop.yarn.security.ContainerTokenIdentifier.readFields(ContainerTokenIdentifier.java:151)
at org.apache.hadoop.security.token.Token.decodeIdentifier(Token.java:142)
at org.apache.hadoop.yarn.server.utils.BuilderUtils.newContainerTokenIdentifier(BuilderUtils.java:262)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.startContainers(ContainerManagerImpl.java:696)
... 10 more
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:101)
at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagementProtocolPBClientImpl.startContainers(ContainerManagementProtocolPBClientImpl.java:99)
at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.launch(AMLauncher.java:118)
at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.run(AMLauncher.java:249)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.yarn.exceptions.YarnException): java.io.EOFException
at org.apache.hadoop.yarn.ipc.RPCUtil.getRemoteException(RPCUtil.java:38)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.startContainers(ContainerManagerImpl.java:710)
at org.apache.hadoop.yarn.api.impl.pb.service.ContainerManagementProtocolPBServiceImpl.startContainers(ContainerManagementProtocolPBServiceImpl.java:60)
at org.apache.hadoop.yarn.proto.ContainerManagementProtocol$ContainerManagementProtocolService$2.callBlockingMethod(ContainerManagementProtocol.java:95)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:587)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
Caused by: java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:197)
at java.io.DataInputStream.readUTF(DataInputStream.java:609)
at java.io.DataInputStream.readUTF(DataInputStream.java:564)
at org.apache.hadoop.yarn.security.ContainerTokenIdentifier.readFields(ContainerTokenIdentifier.java:151)
at org.apache.hadoop.security.token.Token.decodeIdentifier(Token.java:142)
at org.apache.hadoop.yarn.server.utils.BuilderUtils.newContainerTokenIdentifier(BuilderUtils.java:262)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.startContainers(ContainerManagerImpl.java:696)
... 10 more
at org.apache.hadoop.ipc.Client.call(Client.java:1409)
at org.apache.hadoop.ipc.Client.call(Client.java:1362)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
at com.sun.proxy.$Proxy69.startContainers(Unknown Source)
at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagementProtocolPBClientImpl.startContainers(ContainerManagementProtocolPBClientImpl.java:96)
... 5 more
Failing the application.

When I go to node Manager logs:

Log Type: stderr
Log Length: 87
Error: Could not find or load main class org.apache.spark.deploy.yarn.ExecutorLauncher
11-11-2014
05:04 AM
Hello!! I have a similar issue. I have CDH 5 installed on my cluster (version Hadoop 2.3.0-cdh5.1.3), and I have installed and configured a prebuilt version of Spark 1.1.0 (Apache version), built for Hadoop 2.3, on my cluster. When I run the Pi example in 'client' mode it runs successfully, but it fails in 'yarn-cluster' mode. The Spark job is successfully submitted, but fails after some time saying:

$ ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster --num-executors 2 --driver-memory 500m --executor-cores 2 lib/spark-examples*.jar 3

Logs:
14/11/05 20:47:47 INFO yarn.Client: Application report from ResourceManager:
application identifier: application_1415193640322_0013
appId: 13
clientToAMToken: null
appDiagnostics: Application application_1415193640322_0013 failed 2 times due to AM Container for appattempt_1415193640322_0013_000002 exited with exitCode: 1 due to: Exception from container-launch: org.apache.hadoop.util.Shell$ExitCodeException: org.apache.hadoop.util.Shell$ExitCodeException:

Can you please suggest a solution? Do you think I should compile the Spark code on my cluster, or should I use the Spark provided with CDH 5.1? Any help will be appreciated!