1973 Posts
1225 Kudos Received
124 Solutions

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 1924 | 04-03-2024 06:39 AM |
|  | 3018 | 01-12-2024 08:19 AM |
|  | 1652 | 12-07-2023 01:49 PM |
|  | 2424 | 08-02-2023 07:30 AM |
|  | 3371 | 03-29-2023 01:22 PM |
01-23-2017
09:28 PM
https://community.hortonworks.com/questions/63249/how-to-debug-each-nifi-processor.html
https://community.hortonworks.com/questions/71001/debugging-nifi-flow.html
A useful article: https://community.hortonworks.com/articles/71839/decyphering-error-messages-in-apache-nifi.html
01-20-2017
09:47 PM
3 Kudos
You can run the attack library for OS X or Linux from an edge node or from outside the cluster. I ran it from my OS X laptop against a cluster I had network access to. You should try scanning from inside your network, from an edge node, and from a remote site on the Internet. You will need Python 2.7 or Python 3.x installed first.
```
git clone git@github.com:CERT-W/hadoop-attack-library.git
pip install requests lxml
```
You may need root or sudo access to install these on your machine. One of the scanners hits the WebHDFS link that you may have seen a warning about.
```
python hdfsbrowser.py timscluster
Beginning to test services accessibility using default ports ...
Testing service WebHDFS
[+] Service WebHDFS is available
Testing service HttpFS
[-] Exception during requesting the service
[+] Sucessfully retrieved 1 services
drwxrwxrwx hdfs:hdfs 2017-01-15T05:50:27+0000 /
drwxrwxrwx yarn:hadoop 2017-01-11T19:25:26+0000 app-logs /app-logs
drwxrwxrwx hdfs:hdfs 2016-12-21T23:12:40+0000 apps /apps
drwxrwxrwx yarn:hadoop 2016-09-15T21:02:30+0000 ats /ats
drwxrwxrwx root:hdfs 2016-12-21T23:08:34+0000 avroresults /avroresults
drwxrwxrwx hdfs:hdfs 2016-12-13T03:42:55+0000 banking /banking
```
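For context, the WebHDFS check above boils down to a single unauthenticated REST call. A minimal sketch with requests, assuming the default Hadoop 2.x NameNode web port on the same hypothetical host:
```python
# Sketch: list the HDFS root over WebHDFS, the same API hdfsbrowser.py probes.
# "timscluster" and port 50070 are assumptions carried over from the scan above.
import requests

url = "http://timscluster:50070/webhdfs/v1/?op=LISTSTATUS"
resp = requests.get(url, timeout=10)
resp.raise_for_status()

# Print permission, owner:group, and name for each root entry, like the output above
for entry in resp.json()["FileStatuses"]["FileStatus"]:
    print("%s %s:%s %s" % (entry["permission"], entry["owner"],
                           entry["group"], entry["pathSuffix"]))
```
If this returns anything without credentials, your cluster is browsable by anyone with network access.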
To see how much of your Hadoop configuration is exposed, you can use Hadoop Snooper, found under Tools Techniques and Procedures/Getting the target environment configuration.
```
python hadoopsnooper.py timscluster -o test
Specified destination path does not exist, do you want to create it ? [y/N]y
[+] Creating configuration directory
[+] core-site.xml successfully created
[+] mapred-site.xml successfully created
[+] yarn-site.xml successfully created
```
This downloads all of those configuration files to a directory named test. They were not the full configuration files, but they point to the correct internal servers and give an attacker more information.
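Hadoop Snooper is essentially automating the /conf servlet that Hadoop daemons expose on their web ports. A minimal sketch of the same fetch with requests (the host name and the assumption that the NameNode web port is open come from the scan above):
```python
# Sketch: pull the live configuration XML from the NameNode's /conf servlet.
# The ResourceManager exposes the same endpoint on its own web port.
import requests

resp = requests.get("http://timscluster:50070/conf", timeout=10)
resp.raise_for_status()
with open("conf-dump.xml", "w") as out:
    out.write(resp.text)
```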
Another scan worth running is sqlmap, a tool that lets you check the various SQL engines in the system. sqlmap requires Python 2.6 or 2.7.
```
➜ projects git clone https://github.com/sqlmapproject/sqlmap.git sqlmap-dev
Cloning into 'sqlmap-dev'...
remote: Counting objects: 55560, done.
remote: Compressing objects: 100% (41/41), done.
remote: Total 55560 (delta 22), reused 0 (delta 0), pack-reused 55519
Receiving objects: 100% (55560/55560), 47.25 MiB | 2.28 MiB/s, done.
Resolving deltas: 100% (42960/42960), done.
Checking connectivity... done.
➜ projects python sqlmap.py --update
➜ projects cd sqlmap-dev
➜ sqlmap-dev git:(master) python sqlmap.py --update
___
__H__
___ ___[.]_____ ___ ___ {1.1.1.14#dev}
|_ -| . [)] | .'| . |
|___|_ [']_|_|_|__,| _|
|_|V |_| http://sqlmap.org
[!] legal disclaimer: Usage of sqlmap for attacking targets without prior mutual consent is illegal. It is the end user's responsibility to obey all applicable local, state and federal laws. Developers assume no liability and are not responsible for any misuse or damage caused by this program
[*] starting at 16:49:13
[16:49:13] [INFO] updating sqlmap to the latest development version from the GitHub repository
[16:49:13] [INFO] update in progress .
[16:49:14] [INFO] already at the latest revision 'f542e82'
[*] shutting down at 16:49:14
```
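The run above only updates sqlmap itself; to actually probe something you point it at a parameterized URL. A hedged sketch, driven from Python to match the rest of the tooling here — the target URL is hypothetical, while -u, --batch, and --banner are standard sqlmap flags:
```python
# Sketch: run sqlmap non-interactively against a hypothetical URL you own.
import subprocess

subprocess.call([
    "python", "sqlmap.py",
    "-u", "http://timscluster:8888/app?id=1",  # hypothetical parameterized endpoint
    "--batch",   # accept default answers instead of prompting
    "--banner",  # just try to pull the DBMS banner as a quick check
])
```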
References:
- http://sqlmap.org/
- http://www.slideshare.net/bunkertor/hadoop-security-54483815
- http://tools.kali.org/
- https://github.com/savio-code/hexorbase
- https://community.hortonworks.com/articles/73035/running-dns-and-domain-scanning-tools-from-apache.html
01-20-2017
02:36 AM
```scala
import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext

val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
val df = sqlContext.table("tablename")
df.select("location").show(5)
```
This fails with:
```
java.lang.AssertionError: assertion failed
at scala.Predef$.assert(Predef.scala:165)
at org.apache.spark.sql.execution.datasources.LogicalRelation$anonfun$1.apply(LogicalRelation.scala:39)
at org.apache.spark.sql.execution.datasources.LogicalRelation$anonfun$1.apply(LogicalRelation.scala:38)
at scala.Option.map(Option.scala:145)
at org.apache.spark.sql.execution.datasources.LogicalRelation.<init>(LogicalRelation.scala:38)
at org.apache.spark.sql.execution.datasources.LogicalRelation.copy(LogicalRelation.scala:31)
at org.apache.spark.sql.hive.HiveMetastoreCatalog.org$apache$spark$sql$hive$HiveMetastoreCatalog$convertToOrcRelation(HiveMetastoreCatalog.scala:588)
at org.apache.spark.sql.hive.HiveMetastoreCatalog$OrcConversions$anonfun$apply$2.applyOrElse(HiveMetastoreCatalog.scala:647)
at org.apache.spark.sql.hive.HiveMetastoreCatalog$OrcConversions$anonfun$apply$2.applyOrElse(HiveMetastoreCatalog.scala:643)
at org.apache.spark.sql.catalyst.trees.TreeNode$anonfun$transformUp$1.apply(TreeNode.scala:335)
at org.apache.spark.sql.catalyst.trees.TreeNode$anonfun$transformUp$1.apply(TreeNode.scala:335)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:69)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:334)
at org.apache.spark.sql.hive.HiveMetastoreCatalog$OrcConversions$.apply(HiveMetastoreCatalog.scala:643)
at org.apache.spark.sql.hive.HiveMetastoreCatalog$OrcConversions$.apply(HiveMetastoreCatalog.scala:637)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$anonfun$execute$1$anonfun$apply$1.apply(RuleExecutor.scala:83)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$anonfun$execute$1$anonfun$apply$1.apply(RuleExecutor.scala:80)
at scala.collection.LinearSeqOptimized$class.foldLeft(LinearSeqOptimized.scala:111)
at scala.collection.immutable.List.foldLeft(List.scala:84)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$anonfun$execute$1.apply(RuleExecutor.scala:80)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$anonfun$execute$1.apply(RuleExecutor.scala:72)
at scala.collection.immutable.List.foreach(List.scala:318)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:72)
at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:36)
at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:36)
at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:34)
at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:133)
at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:52)
at org.apache.spark.sql.SQLContext.table(SQLContext.scala:831)
at org.apache.spark.sql.SQLContext.table(SQLContext.scala:827)
```
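Not in the original post, but for anyone hitting this: the trace goes through HiveMetastoreCatalog$OrcConversions, so one thing worth trying is disabling Spark's metastore ORC conversion. A minimal PySpark sketch of that idea, assuming Spark 1.6 with Hive support and an ORC-backed table; the property spark.sql.hive.convertMetastoreOrc is a real Spark setting, but that it resolves this particular assertion is an assumption:
```python
# Sketch only: assumes Spark 1.6 with Hive support and an ORC-backed Hive table.
# The stack trace runs through HiveMetastoreCatalog$OrcConversions, so skipping
# that conversion path is a plausible, untested workaround.
from pyspark import SparkContext
from pyspark.sql import HiveContext

sc = SparkContext(appName="orc-conversion-workaround")
sqlContext = HiveContext(sc)

# Keep Spark from converting metastore ORC tables to its native ORC relation
sqlContext.setConf("spark.sql.hive.convertMetastoreOrc", "false")

df = sqlContext.table("tablename")
df.select("location").show(5)
```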
Labels:
- Apache Spark
01-18-2017
01:09 AM
Are you using the out-of-the-box Zeppelin installed through Ambari, version 0.6.0? How much RAM do you have? What version of HDP? Ambari? JDK? Does your cluster have Spark? Any logs?
01-17-2017
06:45 PM
This is the latest HDF 2.x. It sometimes happens in ExecuteStreamCommand; if I have a flow running continuously for a few days, there will be a few per day:
```
2017-01-17 18:42:46,932 ERROR [Thread-617894] o.a.n.p.standard.ExecuteStreamCommand
java.io.IOException: Broken pipe
at java.io.FileOutputStream.writeBytes(Native Method) ~[na:1.8.0_77]
at java.io.FileOutputStream.write(FileOutputStream.java:326) ~[na:1.8.0_77]
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122) ~[na:1.8.0_77]
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122) ~[na:1.8.0_77]
at org.apache.nifi.stream.io.StreamUtils.copy(StreamUtils.java:36) ~[nifi-utils-1.1.0.2.1.1.0-2.jar:1.1.0.2.1.1.0-2]
at org.apache.nifi.processors.standard.ExecuteStreamCommand$2.run(ExecuteStreamCommand.java:489) ~[nifi-standard-processors-1.1.0.2.1.1.0-2.jar:1.1.0.2.1.1.0-2]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_77]
```
With PutHiveQL, it usually happens any time it first starts up:
```
2017-01-17 18:42:37,563 ERROR [Timer-Driven Process Thread-10] o.apache.nifi.processors.hive.PutHiveQL PutHiveQL[id=71b732e9-f140-109c-a315-47f0af695760] Failed to update Hive for StandardFlowFileRecord[uuid=d95fadbd-d349-4efe-88da-cb76c3f2aca8,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1484678529444-3613, container=default, section=541], offset=596859, length=425],offset=0,name=1653249247047835.json.orc,size=425] due to java.sql.SQLException: org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe; it is possible that retrying the operation will succeed, so routing to retry: java.sql.SQLException: org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe
2017-01-17 18:42:37,568 ERROR [Timer-Driven Process Thread-10] o.apache.nifi.processors.hive.PutHiveQL
java.sql.SQLException: org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe
at org.apache.hive.jdbc.HiveStatement.runAsyncOnServer(HiveStatement.java:305) ~[hive-jdbc-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:238) ~[hive-jdbc-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
at org.apache.hive.jdbc.HivePreparedStatement.execute(HivePreparedStatement.java:98) ~[hive-jdbc-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
at org.apache.commons.dbcp.DelegatingPreparedStatement.execute(DelegatingPreparedStatement.java:172) ~[commons-dbcp-1.4.jar:1.4]
at org.apache.commons.dbcp.DelegatingPreparedStatement.execute(DelegatingPreparedStatement.java:172) ~[commons-dbcp-1.4.jar:1.4]
at org.apache.nifi.processors.hive.PutHiveQL.onTrigger(PutHiveQL.java:161) ~[nifi-hive-processors-1.1.0.2.1.1.0-2.jar:1.1.0.2.1.1.0-2]
at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27) [nifi-api-1.1.0.2.1.1.0-2.jar:1.1.0.2.1.1.0-2]
at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1099) [nifi-framework-core-1.1.0.2.1.1.0-2.jar:1.1.0.2.1.1.0-2]
at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:136) [nifi-framework-core-1.1.0.2.1.1.0-2.jar:1.1.0.2.1.1.0-2]
at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47) [nifi-framework-core-1.1.0.2.1.1.0-2.jar:1.1.0.2.1.1.0-2]
at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:132) [nifi-framework-core-1.1.0.2.1.1.0-2.jar:1.1.0.2.1.1.0-2]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_77]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_77]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_77]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_77]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_77]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_77]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_77]
Caused by: org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe
at org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:147) ~[hive-exec-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
at org.apache.thrift.transport.TTransport.write(TTransport.java:107) ~[hive-exec-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
at org.apache.thrift.transport.TSaslTransport.writeLength(TSaslTransport.java:391) ~[hive-exec-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
at org.apache.thrift.transport.TSaslTransport.flush(TSaslTransport.java:499) ~[hive-exec-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
at org.apache.thrift.transport.TSaslClientTransport.flush(TSaslClientTransport.java:37) ~[hive-exec-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:73) ~[hive-exec-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:62) ~[hive-exec-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
at org.apache.hive.service.cli.thrift.TCLIService$Client.send_ExecuteStatement(TCLIService.java:223) ~[hive-service-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
at org.apache.hive.service.cli.thrift.TCLIService$Client.ExecuteStatement(TCLIService.java:215) ~[hive-service-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
at sun.reflect.GeneratedMethodAccessor1087.invoke(Unknown Source) ~[na:na]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_77]
at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_77]
at org.apache.hive.jdbc.HiveConnection$SynchronizedHandler.invoke(HiveConnection.java:1363) ~[hive-jdbc-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
at com.sun.proxy.$Proxy211.ExecuteStatement(Unknown Source) ~[na:na]
at org.apache.hive.jdbc.HiveStatement.runAsyncOnServer(HiveStatement.java:296) ~[hive-jdbc-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
... 17 common frames omitted
Caused by: java.net.SocketException: Broken pipe
at java.net.SocketOutputStream.socketWrite0(Native Method) ~[na:1.8.0_77]
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109) ~[na:1.8.0_77]
at java.net.SocketOutputStream.write(SocketOutputStream.java:153) ~[na:1.8.0_77]
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) ~[na:1.8.0_77]
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126) ~[na:1.8.0_77]
at org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:145) ~[hive-exec-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
... 31 common frames omitted
```
Labels:
- Apache NiFi
01-16-2017
10:23 PM
You install both, not even HDP Client
01-15-2017
05:42 PM
2 Kudos
Raspberry Pis and other small devices often have cameras or can have cameras attached; the Raspberry Pi has cheap camera add-ons that can capture still images and video (https://www.raspberrypi.org/products/camera-module/). Using a simple Python script, we can capture images and ingest them into our central Hadoop data lake. This is a nice, simple use case for Connected Data Platforms, with both Data in Motion and Data at Rest. The data can be processed in-line with deep learning libraries like TensorFlow for image recognition and assessment, and with OpenCV and other tools we can process it in motion, looking for issues like security breaches, leaks, and other events.

The most difficult part is the Python code, which reads from the camera, adds a watermark, converts the image to bytes, sends it to MQTT, and then FTPs it to an FTP server. I do both since networking is always tricky. You could also add a fallback: if the script fails to connect to either, store the image to a directory on a mapped USB drive and send it out once the network returns; that would be easy to do with MiNiFi reading that directory. Once the file lands in the MQTT broker or FTP server, NiFi pulls it and brings it into the flow. I first store to HDFS, our Data at Rest permanent storage, for future deep learning processing. I also run three processors to extract image metadata, and then call jp2a to convert the image into an ASCII picture.

(Figures: ExecuteStreamCommand for running jp2a; the output ASCII; the HDFS directory of uploaded files; metadata extracted from the image; an example imported image; other extracted metadata; a JPG converted to ASCII.)

Running jp2a on images stored in HDFS via the WebHDFS REST API:
```
/opt/demo/jp2a-master/src/jp2a "http://hdfsnode:50070/webhdfs/v1/images/$@?op=OPEN"
```
Python on the RPi:
```
#!/usr/bin/python
import os
import datetime
import ftplib
import traceback
import math
import random, string
import base64
import json
import paho.mqtt.client as mqtt
import picamera
from time import sleep
from time import gmtime, strftime
packet_size=3000
def randomword(length):
    return ''.join(random.choice(string.lowercase) for i in range(length))
# Create unique image name
img_name = 'pi_image_{0}_{1}.jpg'.format(randomword(3),strftime("%Y%m%d%H%M%S",gmtime()))
# Capture Image from Pi Camera
camera = picamera.PiCamera()
try:
    camera.annotate_text = " Stored with Apache NiFi "
    camera.capture(img_name, resize=(500, 281))
finally:
    # always release the camera, even if the capture fails
    camera.close()
# MQTT
client = mqtt.Client()
client.username_pw_set("CloudMqttUserName","!MakeSureYouHaveAV@5&L0N6Pa55W0$4!")
client.connect("cloudmqttiothoster", 14162, 60)
# Read the image as binary and base64-encode it so the JSON payload is valid text
f = open(img_name, 'rb')
fileContent = f.read()
f.close()
message = json.dumps({"image": {"bytearray": base64.b64encode(fileContent)}})
print client.publish("image", payload=message, qos=1, retain=False)
client.disconnect()
# FTP upload of the same image, as a second delivery path
ftp = ftplib.FTP()
ftp.connect("ftpserver", 21)
try:
    ftp.login("reallyLongUserName", "FTP PASSWORDS SHOULD BE HARD")
    ftp.storbinary('STOR ' + img_name, open(img_name, 'rb'))
finally:
    ftp.quit()
# clean up sent file
os.remove(img_name)
```
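For testing, a matching subscriber for the payload above (not from the original post) can decode the base64 field back into a JPEG. A minimal sketch, assuming the same paho-mqtt library and the same CloudMQTT broker details as the publisher:
```python
#!/usr/bin/python
# Sketch: subscribe to the "image" topic and write each message back out as a JPEG.
import base64
import json
import paho.mqtt.client as mqtt

def on_connect(client, userdata, flags, rc):
    client.subscribe("image", qos=1)

def on_message(client, userdata, msg):
    payload = json.loads(msg.payload)
    data = base64.b64decode(payload["image"]["bytearray"])
    with open("received.jpg", "wb") as out:
        out.write(data)

client = mqtt.Client()
client.username_pw_set("CloudMqttUserName", "!MakeSureYouHaveAV@5&L0N6Pa55W0$4!")
client.on_connect = on_connect
client.on_message = on_message
client.connect("cloudmqttiothoster", 14162, 60)
client.loop_forever()
```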
References:
- https://community.hortonworks.com/repos/77987/rpi-picamera-mqtt-nifi.html
- https://github.com/bikash/RTNiFiStreamProcessors
- http://stackoverflow.com/questions/37499739/how-can-i-send-a-image-by-using-mosquitto
- https://www.raspberrypi.org/learning/getting-started-with-picamera/
- https://www.raspberrypi.org/learning/getting-started-with-picamera/worksheet/
- https://www.cloudmqtt.com/
- https://developer.ibm.com/recipes/tutorials/sending-and-receiving-pictures-from-a-raspberry-pi-via-mqtt/
- https://developer.ibm.com/recipes/tutorials/displaying-image-from-raspberry-pi-in-nodered-ui-hosted-on-bluemix/
- https://github.com/jpmens/twitter2mqtt
- http://www.ev3dev.org/docs/tutorials/sending-and-receiving-messages-with-mqtt/
- https://github.com/njh/mqtt-http-bridge
- https://www.raspberrypi.org/learning/parent-detector/worksheet/
- http://picamera.readthedocs.io/en/release-1.10/recipes1.html
- http://picamera.readthedocs.io/en/release-1.10/recipes1.html#capturing-to-an-opencv-object
- http://picamera.readthedocs.io/en/release-1.10/faq.html
- http://www.eclipse.org/paho/
- https://github.com/cslarsen/jp2a
- https://www.raspberrypi.org/learning/tweeting-babbage/worksheet/
- https://csl.name/jp2a/