Member since: 06-09-2016
Posts: 529
Kudos Received: 129
Solutions: 104

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 1737 | 09-11-2019 10:19 AM |
|  | 9341 | 11-26-2018 07:04 PM |
|  | 2490 | 11-14-2018 12:10 PM |
|  | 5335 | 11-14-2018 12:09 PM |
|  | 3152 | 11-12-2018 01:19 PM |
05-25-2018
03:38 PM
@Tamil Selvan K HTTP is the more firewall-friendly protocol, and that is usually the reason you end up using it when you need to connect to Hive from remote clients. Keep in mind that each HiveServer2 instance can be configured with only one transport protocol. However, you can run multiple HiveServer2 instances on your cluster, so if necessary you could configure some of them with binary transport and others with HTTP. HTH *** If you found this answer addressed your question, please take a moment to login and click the "accept" link on the answer.
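A hedged sketch of what connecting over each transport looks like from beeline (hostnames are placeholders; 10000 and 10001 are the usual default ports, adjust to your setup):

# Binary (plain Thrift) transport
beeline -u "jdbc:hive2://hs2-binary-host:10000/default"

# HTTP transport
beeline -u "jdbc:hive2://hs2-http-host:10001/default;transportMode=http;httpPath=cliservice"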
05-25-2018
03:27 PM
@Mokkan Mok The NameNode does not write blocks to DataNodes; blocks are written only by the client to a DataNode and by DataNodes to each other (depending on the replication factor). The protocol between client and DataNode depends on the client you are using: with WebHDFS, for example, you will be using HTTP, while other clients such as the hdfs CLI use the RPC/data-transfer protocol. I believe DN-to-DN replication is always RPC. HTH *** If you found this answer addressed your question, please take a moment to login and click the "accept" link on the answer.
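A rough illustration of the client-to-DataNode path with WebHDFS (hostnames are placeholders; 50070/50075 are the legacy default HTTP ports): the NameNode only answers with a redirect, and the client then sends the data to the DataNode itself.

# Step 1: ask the NameNode where to write; it replies with a 307 redirect pointing at a DataNode
curl -i -X PUT "http://namenode-host:50070/webhdfs/v1/tmp/test.txt?op=CREATE"
# Step 2: upload the file contents to the DataNode URL returned in the Location header
curl -i -X PUT -T test.txt "http://datanode-host:50075/webhdfs/v1/tmp/test.txt?op=CREATE&..."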
05-25-2018
03:21 PM
1 Kudo
@bharat sharma Notebooks are not Python modules. If you are trying to import a notebook as if it were a Python module, AFAIK that won't work. If you are trying to import modules into your PySpark application, there are different ways to do it. One way is to copy the Python file to HDFS and use the following:

%pyspark
sc.addPyFile("/user/zeppelin/my_settings.py")
import my_settings

HTH *** If you found this answer addressed your question, please take a moment to login and click the "accept" link on the answer.
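A minimal sketch of the copy step, assuming my_settings.py is in your current local directory and /user/zeppelin is the target HDFS directory:

# Upload the local module to HDFS so the interpreter can fetch it with sc.addPyFile
hdfs dfs -put my_settings.py /user/zeppelin/my_settings.py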
05-24-2018
02:33 PM
@Mokkan Mok Yes, the NameNode issues the delegation token. The command-line tool is: # hdfs fetchdt More on it here: https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html#fetchdt Note: If you are satisfied with the answer, please take a moment to login and click the "accept" link on the answer.
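A hedged usage sketch (the renewer name and token file path are placeholders):

# Fetch an HDFS delegation token into a local file, then inspect it
hdfs fetchdt --renewer hdfs /tmp/mytoken.dt
hdfs fetchdt --print /tmp/mytoken.dt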
05-24-2018
02:26 PM
@Pramod GM Have you tried yarn-client mode? I would recommend testing with spark-shell using the same configuration arguments and checking whether a simple sc.textFile("hdfs://...") works or not. Try pointing directly to the active NameNode, both with and without the port. Are the NameNodes of both clusters configured in HA? HTH
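A rough sketch of that test (the nameservice and host names are placeholders for your remote cluster):

# Start a shell in yarn-client mode with the same configuration arguments as your job
spark-shell --master yarn --deploy-mode client
# Then, inside the shell, try both forms of the path:
#   sc.textFile("hdfs://remote-nameservice/tmp/sample.txt").count()
#   sc.textFile("hdfs://active-nn-host:8020/tmp/sample.txt").count()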
05-24-2018
02:20 PM
@Mokkan Mok
1. We can get a delegation token, and even if we kdestroy the tickets, we can still access using the delegation token? Yes, the following HCC link shows exactly this with an example: https://community.hortonworks.com/articles/50069/demystifying-delegation-token.html
2. Is the delegation token part of Kerberos, or does it just depend on Kerberos? The delegation token is not part of Kerberos, but in order to get a delegation token you need a valid Kerberos ticket.
3. Is it just a separate package? Each Hadoop service (HDFS, YARN, Hive, HBase) provides a way to fetch delegation tokens through its client API. Each delegation token has an expiration and a max issue date; as long as it is valid, clients can use the delegation token to authenticate with the service.
HTH *** If you found this answer addressed your question, please take a moment to login and click the "accept" link on the answer.
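A hedged sketch of the flow in question 1 (the principal, renewer, and token file path are placeholders):

kinit user@EXAMPLE.COM                        # get a Kerberos ticket first
hdfs fetchdt --renewer hdfs /tmp/mytoken.dt   # fetch an HDFS delegation token while the ticket is valid
kdestroy                                      # throw the Kerberos ticket away
export HADOOP_TOKEN_FILE_LOCATION=/tmp/mytoken.dt
hdfs dfs -ls /tmp                             # still works: authentication now uses the delegation token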
05-23-2018
03:48 PM
@vishal dutt Try this; only replace the path to the jars and make sure sqlserver.py is in the working directory (leave the rest as is):

spark-submit --master yarn --deploy-mode cluster --jars /path/to/driver/sqljdbc42.jar --conf "spark.driver.extraClassPath=sqljdbc42.jar" --conf "spark.executor.extraClassPath=sqljdbc42.jar" sqlserver.py

HTH *** If you found this answer addressed your question, please take a moment to login and click the "accept" link on the answer.
05-23-2018
01:21 AM
@skekatpuray I see you are using the sessions API instead of batches. Try running it with:

curl -X POST --data '{"kind":"pyspark", "conf":{ "pyFiles" : "/user/skekatpu/pw/codebase/splitter.py"} }' -H "Content-Type: application/json" -H "X-Requested-By: root" http://localhost:8999/batches

HTH *** If you found this answer addressed your question, please take a moment to login and click the "accept" link on the answer.
05-22-2018
10:06 PM
1 Kudo
@skekatpuray --py-files is for the command line only. With Livy, try using spark.submit.pyFiles instead; you should add it via the Spark configurations in the "conf" field of the REST request. Check this link for more information: https://community.hortonworks.com/articles/151164/how-to-submit-spark-application-through-livy-rest.html You should probably also put those pyFiles in HDFS and point to them there instead of on the local file system, since they won't be present locally for Livy. HTH *** If you found this answer addressed your question, please take a moment to login and click the "accept" link on the answer.
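A hedged sketch of that request (the host, port, and paths are placeholders, and main.py stands in for your actual application file; the linked article covers the details):

# Put the dependency in HDFS first so Livy can reach it
hdfs dfs -put splitter.py /user/skekatpu/pw/codebase/splitter.py
# Submit the batch, passing the dependency through spark.submit.pyFiles in "conf"
curl -X POST -H "Content-Type: application/json" -H "X-Requested-By: root" \
  --data '{"file":"hdfs:///user/skekatpu/pw/codebase/main.py", "conf":{"spark.submit.pyFiles":"hdfs:///user/skekatpu/pw/codebase/splitter.py"}}' \
  http://localhost:8999/batches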