Member since 05-10-2017 · 35 Posts · 1 Kudos Received · 0 Solutions
11-05-2018 12:25 PM
@Muhammad Umar Each executor can use one or more threads to perform parallel computation. In YARN mode the default is 1; it can be increased with the command-line parameter --executor-cores.

a) If I have specified 8 num_executors for an application and I don't set executor-cores, will each executor use all the cores?
In YARN mode the default is 1, so each executor will use only one core by default.

b) As each node has 8 cores, if I specify executor_cores = 4, does that mean the cores used by each executor are limited to 4 while the total cores per node are 8?
Yes. The assignment of cores is static, not dynamic, and it remains the same for the duration of the application. If you set executor cores to 4, each executor will start and run using 4 cores/threads to perform parallel computation.

c) What are the criteria for specifying executor_cores for a Spark application?
Increasing executor cores affects performance. You need to take into consideration the number of virtual cores available on each node; my recommendation is that you should not increase this over 4 in most cases.

HTH

*** If you found this answer addressed your question, please take a moment to login and click the "accept" link on the answer.
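As an illustration, the settings discussed above could be passed on the command line like this (the class name, jar, and memory value are placeholders, not from the original question):

```shell
# Sketch: 8 executors, 4 cores each, submitted to YARN.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --num-executors 8 \
  --executor-cores 4 \
  --executor-memory 4g \
  --class com.example.MyApp \
  myapp.jar
```

With these flags the application gets 8 x 4 = 32 task slots in total, regardless of how many cores each individual node has.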
07-26-2018 06:31 AM · 1 Kudo
@Muhammad Umar If you ran the Spark job in yarn-cluster mode, you can get both driver and executor container logs through yarn logs -applicationId <appId>. In yarn-client mode, the executor container logs are still available through yarn logs -applicationId <appId>, but the driver logs are printed to the console. You can collect the driver logs to a file by configuring log4j and passing it in the driver options, as explained at https://community.hortonworks.com/articles/138849/how-to-capture-spark-driver-and-executor-logs-in-y.html
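The two cases above might look like this in practice (the application id is a placeholder):

```shell
# Cluster mode: driver and executor logs both come back from YARN.
yarn logs -applicationId application_1234567890123_0001 > app.log

# Client mode: executor logs come from YARN as above, but the driver
# prints to the console, so capture spark-submit's output directly:
spark-submit --master yarn --deploy-mode client myapp.jar 2>&1 | tee driver.log
```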
07-19-2018 12:59 PM
@Muhammad Umar What Python version are you using? One of the imports seems to point to Python 3. If that is the case, you will need to export a few environment settings for this to run correctly. Check: https://community.hortonworks.com/questions/138351/how-to-specify-python-version-to-use-with-pyspark.html When running on YARN in client deploy mode, the executors will run on any of the cluster worker nodes. This means you need to make sure that all the Python libraries you are using, along with the desired Python version, are installed on all cluster worker nodes in advance. Finally, it would be good to have both the driver log (which is printed to the stdout of spark-submit) and the complete yarn logs -applicationId <appId> output for further diagnosis. HTH
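A minimal sketch of the environment settings in question (the interpreter path is an assumption; use wherever python3 is actually installed, and it must exist at the same path on every worker node):

```shell
# Make both the driver and the executors use the same Python 3 interpreter.
export PYSPARK_PYTHON=/usr/bin/python3
export PYSPARK_DRIVER_PYTHON=/usr/bin/python3

spark-submit --master yarn --deploy-mode client my_job.py
```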
06-11-2017 08:54 PM
It looks like you are trying to connect via the ZooKeeper port, but there are some issues with this:
- You'll likely have trouble because of the issue described in NIFI-2575.
- Even if that were not the issue, I believe you'd have to set some variables in the URL (such as serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2).
- The ZooKeeper port is not exposed by the sandbox by default; you would have to forward that port if using NAT.
For these reasons, I recommend you connect via the standard HiveServer2 port (10000), which is exposed by default in the sandbox's NAT configuration.
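For comparison, the two connection styles look like this with Beeline (the hostname is a placeholder for your sandbox address):

```shell
# Direct connection to HiveServer2 on the default port 10000
# (recommended for the sandbox):
beeline -u "jdbc:hive2://sandbox.hortonworks.com:10000/default"

# ZooKeeper service-discovery URL, for contrast: requires the ZooKeeper
# port (2181) to be reachable and the discovery parameters to be set.
beeline -u "jdbc:hive2://sandbox.hortonworks.com:2181/default;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2"
```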
07-05-2019 07:00 PM
The ExecuteStreamCommand processor can be event driven. Use it to run a shell script that moves the file into the folder that GetFile is monitoring.
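A minimal sketch of such a script, assuming hypothetical staging and watched directories (the paths and function name are illustrative, not from the original post):

```shell
#!/bin/sh
# Move every regular file from a staging directory into the directory
# that GetFile is monitoring. Intended to be invoked from an
# ExecuteStreamCommand processor.
move_staged_files() {
  src_dir="$1"
  dest_dir="$2"
  mkdir -p "$dest_dir"
  for f in "$src_dir"/*; do
    # Skip the unexpanded glob when the directory is empty.
    [ -f "$f" ] && mv "$f" "$dest_dir"/
  done
  return 0
}
```

In ExecuteStreamCommand you would point Command Path at /bin/sh and pass the script and its two directory arguments via Command Arguments.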
05-24-2017 12:15 PM
@Muhammad Umar This error is being thrown by your MoveFile.bash script, not by NiFi. NiFi is simply capturing the error response from the script and putting it in the NiFi app log. Thanks, Matt
05-19-2017 12:36 PM
@Muhammad Umar What is the OS of the system? What filesystem is the partition mounted as, xfs or ext4? Is this a VM or a bare-metal server? What is returned when you run the command "mount"?
05-15-2017 05:27 PM
@Muhammad Umar When NiFi starts and has not been configured with a specific hostname or IP (nifi.web.http.host=) in the nifi.properties file, it binds to the IP address registered to every NIC present on the host system. If you specify a hostname or IP that does not resolve or does not match the IP registered to any of your NICs, NiFi will fail to start: NiFi cannot bind to a port that belongs to an IP it does not own.
You can run the "ifconfig" command on the host running NiFi to see all NICs and the IP registered to each. You should see the 172.17.x.x address, and not the 192.168.x.x address, shown there. It definitely sounds like there is some network address translation going on here. The fact that you can reach NiFi over http://192.168.x.x:8078/ confirms this: the network is simply routing all traffic from the 192.168.x.x address to the internal 172.17.x.x address. We have already confirmed that your browser cannot resolve a path directly to 172.17.x.x, because if it could, NiFi's UI would have opened. NiFi is in fact bound to 172.17.x.x and not 192.168.x.x, and NiFi cannot control how traffic is routed to this endpoint by the network. Thanks, Matt
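To check and pin the binding described above, something like the following would work (the install path and the example IP are assumptions; adjust to your environment):

```shell
# Inspect which address NiFi is configured to bind to.
grep 'nifi.web.http' /opt/nifi/conf/nifi.properties

# Leaving nifi.web.http.host empty binds to all interfaces; setting it
# pins NiFi to one NIC, for example:
#   nifi.web.http.host=172.17.0.2
#   nifi.web.http.port=8078
# Restart NiFi after editing nifi.properties for the change to apply.
```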
05-17-2017 03:16 PM
@Muhammad Umar Did changing to absolute paths resolve your issue here? If so, please mark answer as accepted. Thank you, Matt
05-11-2017 06:21 PM · 1 Kudo
It looks like you are not looking on the correct host. Can you confirm that you are logged onto the correct host?