09-28-2016
02:15 AM
This example code is for Storm 0.10: https://github.com/gbraccialli/telco-cdr-monitoring/blob/master/pom.xml The Guava dependency may have changed; see: https://community.hortonworks.com/questions/14998/storm-how-do-i-fix-the-googleguava-dependency-whil.html
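If the build fails on Guava, a hedged sketch of the usual pom.xml fix: exclude Guava from the Storm dependency and pin the version the rest of the topology expects (the version numbers here are assumptions; check the linked thread for the ones matching your Storm build):

```xml
<dependency>
  <groupId>org.apache.storm</groupId>
  <artifactId>storm-core</artifactId>
  <version>0.10.0</version>
  <exclusions>
    <!-- keep Storm's transitive Guava out of the shaded jar -->
    <exclusion>
      <groupId>com.google.guava</groupId>
      <artifactId>guava</artifactId>
    </exclusion>
  </exclusions>
</dependency>
<!-- then pin a single Guava version explicitly (hypothetical version) -->
<dependency>
  <groupId>com.google.guava</groupId>
  <artifactId>guava</artifactId>
  <version>16.0.1</version>
</dependency>
```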
09-27-2016
03:15 PM
1 Kudo
1. Can you split the load across two smaller nodes?
2. Can you split the input CSV so that multiple processors read it at once?
3. Do you have any throttling in any of the processors or queues in the system? How fast is the HDFS writer? How large is your HDFS cluster?
4. Can you try it with output to a non-HDFS file directory?
5. Can you look at the NiFi Data Provenance for each step to see where it is slow? Check the ingest times.
6. For PutHDFS, did you set IO Buffer Size, Block Size, or a Compression Codec?
7. What version of NiFi are you using? The HDF 2 version is a bit faster than older ones.
8. Which JDK version?
9. Can you post the contents of your bootstrap.conf?
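On point 2, a minimal sketch of pre-splitting the CSV outside NiFi so that several GetFile processors can each pick up a chunk in parallel (the file names and chunk size here are hypothetical):

```shell
# Generate a sample CSV with a header and 1000 data rows (hypothetical data).
printf 'id,value\n' > input.csv
seq 1 1000 | awk '{print $1",v"$1}' >> input.csv

# Strip the header and split the body into 250-line chunks
# named part_aa, part_ab, ... that separate processors can consume.
tail -n +2 input.csv | split -l 250 - part_

ls part_* | wc -l    # 1000 rows / 250 per chunk = 4 chunk files
```

Inside NiFi itself, the SplitText processor achieves the same fan-out without touching the filesystem.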
09-27-2016
12:22 PM
Any chance you can upgrade? The newest version in HDP 2.5 works great. If not:
1. Restart all the interpreters; if that does not work, restart the Zeppelin service.
2. Did you physically add those JARs to the classpath of Zeppelin? They must also be the correct version. Not all machines will have those JARs; you may need to copy them from another machine. Was this machine installed with the Hive client? You may have to copy files to /usr/hdp/current/zeppelin-server/lib or somewhere else in the classpath. You may also need to add/edit hive-site.xml, and possibly add the driver to ZEPPELIN_CLASSPATH and set the interpreter's default.driver to org.apache.hive.jdbc.HiveDriver.
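For reference, a hedged sketch of the JDBC interpreter properties involved, as set in the Zeppelin interpreter UI (the host, port, and user are placeholders; exact property names can differ between Zeppelin versions):

```
# JDBC/Hive interpreter settings (placeholders marked with <>)
default.driver = org.apache.hive.jdbc.HiveDriver
default.url    = jdbc:hive2://<hiveserver2-host>:10000/default
default.user   = <user>
```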
09-26-2016
06:29 PM
I think the issue is either reusing the same cursor or using the wrong version of Phoenix / the Phoenix connection. Remember, you also need to be using an older version: the newest version of Phoenix uses Google Protocol Buffers for data retrieval, while python-phoenixdb only supports the older JSON serialization. See: https://bitbucket.org/lalinsky/python-phoenixdb If your server is running Phoenix 4.7, the query server speaks Protocol Buffers, which the Python client does not support. You can restart your Phoenix Query Server with the phoenix.queryserver.serialization parameter set to JSON. Hopefully python-phoenixdb will be updated to use the current protocol.
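A sketch of that setting, assuming the query server reads it from hbase-site.xml on its node (the property name is from the answer above; where your distribution keeps this file may differ):

```xml
<!-- hbase-site.xml on the Phoenix Query Server node; restart PQS afterward -->
<property>
  <name>phoenix.queryserver.serialization</name>
  <value>JSON</value>
</property>
```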
09-26-2016
04:59 PM
1 Kudo
I have NiFi installed on CentOS 7 and it was working the other day. Today I tried it and could not access anything. The logs showed nothing, so I restarted it. Now it never starts, but there are no errors.

```
2016-09-26 16:55:34,366 INFO [main] /nifi-api Initializing Spring root WebApplicationContext
2016-09-26 16:55:36,773 INFO [main] o.a.nifi.properties.NiFiPropertiesLoader Determined default nifi.properties path to be '/opt/demo/HDF/centos7/tars/nifi/nifi-1.0.0.2.0.0.0-579/./conf/nifi.properties'
2016-09-26 16:55:36,778 INFO [main] o.a.nifi.properties.NiFiPropertiesLoader Determined default nifi.properties path to be '/opt/demo/HDF/centos7/tars/nifi/nifi-1.0.0.2.0.0.0-579/./conf/nifi.properties'
2016-09-26 16:55:36,779 INFO [main] o.a.nifi.properties.NiFiPropertiesLoader Loaded 117 properties from /opt/demo/HDF/centos7/tars/nifi/nifi-1.0.0.2.0.0.0-579/./conf/nifi.properties
```

java -version:

```
openjdk version "1.8.0_101"
OpenJDK Runtime Environment (build 1.8.0_101-b13)
OpenJDK 64-Bit Server VM (build 25.101-b13, mixed mode)
```

Nothing else is running on the machine.
Labels:
- Apache NiFi
- Cloudera DataFlow (CDF)
09-26-2016
01:15 PM
2 Kudos
You need to compile and run your Spark job with an SBT build that points to the same version of Spark and the same version of Scala. You must be using Scala 2.10 and adjust the code accordingly; 2.0.0 doesn't make sense in that SBT file:

```
"org.apache.spark" % "spark-streaming-kafka_2.10" % "1.6.2",
"org.apache.spark" %% "spark-streaming" % "2.0.0" % "provided",
"org.apache.spark" %% "spark-core" % "2.0.0" % "provided"
```

It must also match your server: HDP 2.5 runs Spark 1.6.2 and a tech preview of Spark 2.0. For Spark 2.0 you must set an environment variable.

https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.0/bk_spark-component-guide/content/ch_introduction-spark.html
http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.0/bk_command-line-installation/content/installing_spark.html
http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.0/bk_spark-component-guide/content/run-sample-apps.html
https://spark.apache.org/docs/1.6.2/streaming-programming-guide.html

Change your SBT:

```
libraryDependencies += "org.apache.spark" % "spark-streaming_2.10" % "1.6.2"
```

groupId = org.apache.spark
artifactId = spark-core_2.10
version = 1.6.2
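Pulling those pieces together, a minimal build.sbt consistent with the advice above, with everything on Scala 2.10 and Spark 1.6.2 (the project name is hypothetical):

```
name := "spark-streaming-example"   // hypothetical project name
scalaVersion := "2.10.6"

libraryDependencies ++= Seq(
  // %% appends the Scala binary version (_2.10) to the artifact name
  "org.apache.spark" %% "spark-core"                 % "1.6.2" % "provided",
  "org.apache.spark" %% "spark-streaming"            % "1.6.2" % "provided",
  "org.apache.spark" %  "spark-streaming-kafka_2.10" % "1.6.2"
)
```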
09-25-2016
02:10 PM
5 Kudos
As part of a live drone ingest, I want to identify what is in the image. The metadata provides a ton of information on GPS, altitude and image characteristics, but not what's in the image. IBM, Microsoft and Google all have APIs that do a good job of this, and for the most part they have "free" tiers. I wanted to run something locally using libraries installed on my cluster. For my first option, I used TensorFlow Inception-v3 image recognition. In future articles I will cover PaddlePaddle, OpenCV and some other deep learning and non-deep-learning options for image recognition. I will also show the entire drone-to-front-end flow, including Phoenix, Spring Boot, Zeppelin, LeafletJS and more. This will be done as part of a meetup presentation with a certified drone pilot.

To Run My TensorFlow Binary From HDF 2.0

I use the ExecuteStreamCommand processor to run a shell script containing the information below:

```
source /usr/local/lib/bazel/bin/bazel-complete.bash
export JAVA_HOME=/opt/jdk1.8.0_101/
hdfs dfs -get /drone/raw/$@ /tmp/
/opt/demo/tensorflow/bazel-bin/tensorflow/examples/label_image/label_image --image="/tmp/$@" --output_layer="softmax:0" --input_layer="Mul:0" --input_std=128 --input_mean=128 --graph=/opt/demo/tensorflow/tensorflow/examples/label_image/data/tensorflow_inception_graph.pb --labels=/opt/demo/tensorflow/tensorflow/examples/label_image/data/imagenet_comp_graph_label_strings.txt 2>&1 | cut -c48-
```

In my script I pull the file out of HDFS (it was loaded there by HDF 2.0) and then run the binary version of TensorFlow that I compiled for CentOS 7. If you can't or don't want to install Bazel and build that, then you can run the Python script instead; it's a little slower and has slightly different output:

```
python /usr/lib/python2.7/site-packages/tensorflow/models/image/imagenet/classify_image.py --image_file
```

The C++ version that I compiled is Google's example, and you can take a look at it:
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/label_image/README.md
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/label_image/main.cc
It's very clean code if you wish to tweak it.

Installing TensorFlow

You must have JDK 1.8 (and know the path), not just JRE 1.8. You also need Python 2.7 or 3.x and pip, plus Google's Bazel build tool.

```
sudo yum groupinstall "Development Tools"
sudo yum install gettext-devel openssl-devel perl-CPAN perl-devel zlib-devel
sudo yum -y install epel-release
sudo yum -y install gcc gcc-c++ python-pip python-devel atlas atlas-devel gcc-gfortran openssl-devel libffi-devel
pip install --upgrade numpy scipy wheel cryptography
export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.10.0-cp27-none-linux_x86_64.whl
pip install --upgrade $TF_BINARY_URL
git clone https://github.com/tensorflow/tensorflow
cd tensorflow/
wget https://storage.googleapis.com/download.tensorflow.org/models/inception_dec_2015.zip -O tensorflow/examples/label_image/data/inception_dec_2015.zip
unzip tensorflow/examples/label_image/data/inception_dec_2015.zip -d tensorflow/examples/label_image/data/
cd tensorflow/examples/label_image
/opt/demo/bazel/output/bazel label_image
```
Run Results of TensorFlow

```
python classify_image.py --image_file /opt/demo/dronedataold/Bebop2_20160920083655-0400.jpg
solar dish, solar collector, solar furnace (score = 0.98316)
window screen (score = 0.00196)
manhole cover (score = 0.00070)
radiator (score = 0.00041)
doormat, welcome mat (score = 0.00041)
```

```
bazel-bin/tensorflow/examples/label_image/label_image --image=/opt/demo/dronedataold/Bebop2_20160920083655-0400.jpg
solar dish (577): 0.983162
window screen (912): 0.00196204
manhole cover (763): 0.000704005
radiator (571): 0.000408321
doormat (972): 0.000406186
```
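To push these results downstream (for example, into flowfile attributes or a Phoenix table), a minimal sketch for turning the label_image output into CSV rows; the file names here are hypothetical:

```shell
# Sample label_image output lines (copied from the run above).
cat > results.txt <<'EOF'
solar dish (577): 0.983162
window screen (912): 0.00196204
manhole cover (763): 0.000704005
EOF

# Drop the "(id)" token and emit "label,score" rows.
awk -F': ' '{label=$1; sub(/ \([0-9]+\)$/, "", label); print label "," $2}' results.txt > results.csv
head -1 results.csv   # → solar dish,0.983162
```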
The image is a picture of solar panels on a residential black tar roof.

Resources
https://www.tensorflow.org/versions/r0.10/tutorials/index.html
https://www.tensorflow.org/versions/r0.10/tutorials/image_recognition/index.html
http://hoolihan.net/blog-tim/2016/03/02/installing-tensorflow-on-centos/
https://databricks.com/blog/2016/01/25/deep-learning-with-apache-spark-and-tensorflow.html
http://neuralnetworksanddeeplearning.com/chap1.html
http://colah.github.io/posts/2014-07-Conv-Nets-Modular/
http://neuralnetworksanddeeplearning.com/chap6.html
https://www.tensorflow.org/versions/r0.10/tutorials/deep_cnn/index.html
https://www.tensorflow.org/versions/r0.10/tutorials/mnist/beginners/index.html
http://image-net.org/
https://www.bazel.io/versions/master/docs/install.html
https://github.com/bazelbuild/bazel/releases
http://tecadmin.net/install-java-8-on-centos-rhel-and-fedora/
https://cloud.google.com/vision/
https://www.microsoft.com/cognitive-services/en-us/computer-vision-api
http://www.ibm.com/watson/developercloud/visual-recognition/api/v3/#introduction
09-24-2016
03:36 PM
You can kill them in YARN as well if they are hung, but follow Tom's advice first: stop and clean up your jobs.