Member since
05-30-2018
1322
Posts
715
Kudos Received
148
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 4067 | 08-20-2018 08:26 PM |
| | 1963 | 08-15-2018 01:59 PM |
| | 2390 | 08-13-2018 02:20 PM |
| | 4139 | 07-23-2018 04:37 PM |
| | 5046 | 07-19-2018 12:52 PM |
02-25-2017
04:17 AM
1 Kudo
I have not tried this, but these settings look relevant:
hive.server2.idle.operation.timeout
With hive.server2.session.check.interval set to a positive time value, an operation is closed when it has not been accessed for this duration. Setting it to zero disables the timeout.
With a positive value, only operations in a terminal state (FINISHED, CANCELED, CLOSED, ERROR) are checked.
With a negative value, all operations are checked regardless of state.
hive.server2.idle.session.timeout
With hive.server2.session.check.interval set to a positive time value, a session is closed when it has not been accessed for this duration. Setting it to zero or a negative value disables the timeout.
And if it was a metastore issue, did you try:
hive.metastore.client.socket.timeout
MetaStore client socket timeout in seconds.
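To make the sign convention concrete, here is a minimal Python sketch of the rules as the property descriptions state them. It is only a model of the documented behaviour, not HiveServer2 code; the function names and the millisecond units are assumptions made for illustration.

```python
# Terminal states named in the hive.server2.idle.operation.timeout description.
TERMINAL_STATES = {"FINISHED", "CANCELED", "CLOSED", "ERROR"}


def operation_should_close(idle_ms, timeout_ms, state):
    """Model of the idle operation timeout rules (hypothetical helper)."""
    if timeout_ms == 0:
        return False  # zero disables the check
    if timeout_ms > 0 and state not in TERMINAL_STATES:
        return False  # positive value: only terminal-state operations are checked
    # Negative value: every operation is checked, using the absolute duration.
    return idle_ms > abs(timeout_ms)


def session_should_close(idle_ms, timeout_ms):
    """Model of the idle session timeout rules (hypothetical helper)."""
    if timeout_ms <= 0:
        return False  # zero or negative disables the check
    return idle_ms > timeout_ms
```

With a positive timeout, a RUNNING operation is never closed by this check, while the same idle time with a negative timeout would close it.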
02-25-2017
03:22 AM
2 Kudos
Create a snippet via an HTTP POST. Example URI: https://mynifiinstance.com:9091/nifi-api/snippets
The body includes my parent process group ID and my process group ID:
{
"snippet": {
"parentGroupId": "69ea5920-0157-1000-0000-0000028e1b90",
"processors": {},
"funnels": {},
"inputPorts": {},
"outputPorts": {},
"remoteProcessGroups": {},
"processGroups": {
"7ce7597d-0157-1000-ffff-ffffc161e771": {
"clientId": "50b3ec1a-c123-1e4f-718c-b0323fb1e175",
"version": 0
}
},
"connections": {},
"labels": {}
}
}
This returns a snippet JSON object:
{
"snippet": {
"id": "50b3ec79-c123-1e4f-0000-000009e32e47",
"uri": "https://hdf20-0.field.hortonworks.com:9091/nifi-api/process-groups/69ea5920-0157-1000-0000-0000028e1b90/snippets/50b3ec79-c123-1e4f-0000-000009e32e47",
"parentGroupId": "69ea5920-0157-1000-0000-0000028e1b90",
"processGroups": {
"7ce7597d-0157-1000-ffff-ffffc161e771": {
"clientId": "50b3ec1a-c123-1e4f-718c-b0323fb1e175",
"version": 0
}
},
"remoteProcessGroups": {},
"processors": {},
"inputPorts": {},
"outputPorts": {},
"connections": {},
"labels": {},
"funnels": {}
}
}
Here you grab the snippet ID and do an HTTP POST to https://mynifiinstance.com:9091/nifi-api/process-groups/69ea5920-0157-1000-0000-0000028e1b90/templates with this body:
{
"name": "dummy5",
"description": "",
"snippetId": "50b3ec79-c123-1e4f-0000-000009e32e47"
}
Now you have a template created with the name "dummy5". The response from this POST contains the template ID, which you can then use with the NiFi REST GET API /templates/{id}/download. Here is the response with the template ID:
{
"template":{
"uri":"https://mynifiinstance.com:9091/nifi-api/templates/522819e6-e721-3c6b-95a4-81a0591fc9a3",
"id":"522819e6-e721-3c6b-95a4-81a0591fc9a3",
"groupId":"69ea5920-0157-1000-0000-0000028e1b90",
"name":"dummy5",
"description":"",
"timestamp":"02/25/2017 03:36:17 UTC",
"encoding-version":"1.0"
}
}
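The three calls above can be sketched in Python. This is a rough outline that only builds the request bodies and URLs shown in this post; the helper names are hypothetical, and authentication and TLS certificate handling are left out and would need to match your own NiFi instance.

```python
import json
import urllib.request


def snippet_body(parent_group_id, process_group_id, client_id, version=0):
    """Body for POST /nifi-api/snippets, mirroring the JSON shown above."""
    return {
        "snippet": {
            "parentGroupId": parent_group_id,
            "processors": {}, "funnels": {}, "inputPorts": {}, "outputPorts": {},
            "remoteProcessGroups": {},
            "processGroups": {
                process_group_id: {"clientId": client_id, "version": version}
            },
            "connections": {}, "labels": {},
        }
    }


def template_body(name, snippet_id, description=""):
    """Body for POST /nifi-api/process-groups/{parentGroupId}/templates."""
    return {"name": name, "description": description, "snippetId": snippet_id}


def template_download_url(api_base, template_id):
    """URL for GET /templates/{id}/download."""
    return f"{api_base}/templates/{template_id}/download"


def post_json(url, body):
    """POST a JSON body and decode the JSON reply (no auth, assumed for brevity)."""
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

A run would chain them: post `snippet_body(...)` to `/snippets`, read `["snippet"]["id"]` from the reply, post `template_body("dummy5", snippet_id)` to the templates URI, then read `["template"]["id"]` and pass it to `template_download_url`.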
02-24-2017
11:10 PM
3 Kudos
Waterline has a feature called the HDFS crawler, which uses an algorithm to tag data. Attivio is another tool that can tag data based on a data mart concept. Both tools are best in class in my opinion.
02-24-2017
08:23 PM
1 Kudo
Does HDP or HDF support the Kafka REST API?
Labels: Apache Kafka
02-24-2017
03:28 PM
I suggest you use SplitText a few times in sequence so that no single split has to hold every resulting flow file in memory at once. Go from 1 million --> 100,000 --> 10,000 --> 1,000 --> 1 lines per flow file. You can cut the number of stages down as well, for example 1 million --> 10,000 --> 1,000 --> 1. From there, use RouteText to route the header to one relationship and the rest of the lines to another.
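A quick way to see why the cascade helps is to count how many flow files one SplitText invocation emits at each stage. The sketch below is plain arithmetic, not NiFi code; `split_cascade` is a hypothetical helper and each entry in `line_counts` stands for that stage's Line Split Count.

```python
def split_cascade(total_lines, line_counts):
    """Return (line_split_count, files_per_input, total_files) for each stage."""
    stages = []
    lines_per_input = total_lines
    for count in line_counts:
        files_per_input = -(-lines_per_input // count)  # ceiling division
        total_files = -(-total_lines // count)
        stages.append((count, files_per_input, total_files))
        lines_per_input = count  # the next stage receives files of this size
    return stages


four_stage = split_cascade(1_000_000, [100_000, 10_000, 1_000, 1])
single_stage = split_cascade(1_000_000, [1])

# Largest number of flow files a single split has to emit in one go.
print(max(per_input for _, per_input, _ in four_stage))
print(max(per_input for _, per_input, _ in single_stage))
```

The total number of one-line flow files is the same either way. What changes is the burst size per split, which is what drives memory pressure.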
02-22-2017
09:56 PM
3 Kudos
I find that those who install NiFi via Ambari using local repos are generally required, for security purposes, to call out the ports that need to be opened. This is typical in cloud environments. I plan to update this list with community feedback to keep it fresh. These ports are not set in stone: NiFi ports are configured by simply changing the port in the properties file. Let's get to it:
Ambari: 8080
Zookeeper: 2181
Protocol port: 9088
HTTP port (ssl): 9091
HTTP port (non-ssl): 9090
Certificate Authority: 10443
nifi.remote.input.socket.port: 8022
nifi.cluster.node.protocol.port: 8021
Remote Process Group (Raw): 8022
Remote Process Group (HTTP): 8070
nifi.remote.input.socket.port: 9999
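Once the ports have been opened, a quick reachability check from another machine can confirm it. Below is a small sketch using only the Python standard library; the host name in the usage note is a placeholder, and the port list simply repeats a few defaults from the list above, which may differ in your install.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# A few (label, port) pairs taken from the list above; adjust to your properties file.
PORTS_TO_CHECK = [
    ("Ambari", 8080),
    ("Zookeeper", 2181),
    ("HTTP port (ssl)", 9091),
    ("HTTP port (non-ssl)", 9090),
]


def check_ports(host, ports=PORTS_TO_CHECK):
    """Map each label to whether its port accepted a connection."""
    return {label: port_is_open(host, port) for label, port in ports}
```

For example, `check_ports("nifi-node-1.example.com")` returns a dictionary of label to True/False. A False result can mean either a closed firewall port or a service that is not running.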
02-20-2017
11:46 PM
Yes, you can do it via ambari.properties. For full info go here: https://cwiki.apache.org/confluence/display/AMBARI/Recovery%3A+auto+start+components#Recovery:autostartcomponents-Pre240
Specifically, here is how auto start works in Ambari versions 2.3.x/2.2.x:
When an Ambari agent starts, it bootstraps with the Ambari server via registration. The server sends information to the agent about the components that have been enabled for auto start, along with the other auto start properties in ambari.properties. The agent compares the current state of these components against the desired state to determine whether they are to be installed, started, restarted or stopped.
ambari.properties
To enable components for auto start, specify them using recovery.enabled_components=A,B,C
# Enable Metrics Collector auto-restart
recovery.type=AUTO_START
recovery.enabled_components=METRICS_COLLECTOR
recovery.lifetime_max_count=1024
Here's a sample snippet of the auto start configuration that is sent to the agent by the server during agent registration:
"recoveryConfig": {
"type" : "AUTO_START",
"maxCount" : 10,
"windowInMinutes" : 60,
"retryGap" : 0,
"enabledComponents" : "a,b",
"disabledComponents": "c,d"
}
For example, if the current state of the METRICS_COLLECTOR component on a host is INSTALLED but it is enabled for auto start, the desired state is STARTED. The recovery manager generates a start command for METRICS_COLLECTOR, which is executed by the controller. More at the link provided.
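The METRICS_COLLECTOR example can be modelled in a few lines of Python. This is a deliberately simplified sketch of the one case described (enabled component, current state INSTALLED, so a start command is generated); it is not the Ambari agent's recovery manager, and both helper names are hypothetical.

```python
def parse_properties(text):
    """Parse key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def recovery_commands(props, current_states):
    """Return START commands for enabled components that are only INSTALLED."""
    if props.get("recovery.type") != "AUTO_START":
        return []
    enabled = [c.strip() for c in props.get("recovery.enabled_components", "").split(",")]
    return [("START", c) for c in enabled if c and current_states.get(c) == "INSTALLED"]


sample = """
# Enable Metrics Collector auto-restart
recovery.type=AUTO_START
recovery.enabled_components=METRICS_COLLECTOR
recovery.lifetime_max_count=1024
"""
print(recovery_commands(parse_properties(sample), {"METRICS_COLLECTOR": "INSTALLED"}))
```

A component that is already STARTED, or one that is not listed in recovery.enabled_components, produces no command in this model.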
02-20-2017
10:47 PM
1 Kudo
I recommend doing this via the Ambari REST API, starting each service in the order you wish.
API docs:
https://community.hortonworks.com/articles/47170/automate-hdp-installation-using-ambari-blueprints.html
https://community.hortonworks.com/articles/47171/automate-hdp-installation-using-ambari-blueprints-1.html
https://community.hortonworks.com/articles/61358/automate-hdp-installation-using-ambari-blueprints-2.html
https://community.hortonworks.com/articles/70189/automate-hdp-installation-using-ambari-blueprints-3.html
https://cwiki.apache.org/confluence/display/AMBARI/Blueprints
Start/Stop all host components:
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=41812517
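As a rough illustration of the ordered start, the sketch below builds one Ambari REST request per service, in the order given. The cluster name, host, credentials and service order are placeholders, and the helper names are hypothetical. The requests are only constructed here, not sent.

```python
import base64
import json
import urllib.request


def service_state_request(base_url, cluster, service, state, user, password):
    """Build a PUT that asks Ambari to move one service to the given state."""
    body = {
        "RequestInfo": {"context": f"Set {service} to {state} via REST"},
        "Body": {"ServiceInfo": {"state": state}},
    }
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        f"{base_url}/api/v1/clusters/{cluster}/services/{service}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Authorization": f"Basic {token}", "X-Requested-By": "ambari"},
        method="PUT",
    )


def ordered_start_requests(base_url, cluster, services, user, password):
    """One STARTED request per service, preserving the caller's order."""
    return [
        service_state_request(base_url, cluster, s, "STARTED", user, password)
        for s in services
    ]
```

Each request would be sent with `urllib.request.urlopen`. Because Ambari handles the start asynchronously, a real script would wait for one service's request to complete before sending the next.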
02-13-2017
02:52 PM
1 Kudo
I have often used this article: https://danieladeniji.wordpress.com/2013/05/06/hadoop-sqoop-importing-data-from-microsoft-sql-server/
Syntax:
sqoop import --connect jdbc:sqlserver://sqlserver-name \
--username <username> \
--password <password> \
--driver <driver-manager-class> \
--table <table-name> \
--target-dir <target-folder-name>
Sample:
sqoop import --connect "jdbc:sqlserver://labDB;database=demo" \
--username sqoop \
--password simp1e \
--driver com.microsoft.sqlserver.jdbc.SQLServerDriver \
--table "dbo.customer" \
--target-dir "/tmp/dbo-customer"
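If the import is launched from a script, the same command can be assembled as an argument list. The sketch below is one way to do that, assuming you would rather not put the password on the command line: it uses Sqoop's `-P` (prompt) or `--password-file` options instead of `--password`. The function name and the password file path in the test are hypothetical.

```python
import shlex


def sqoop_import_args(connect, username, driver, table, target_dir, password_file=None):
    """Build the argv for a sqoop import like the sample above."""
    args = ["sqoop", "import", "--connect", connect, "--username", username]
    # Prompt for the password unless a password file is supplied.
    args += ["--password-file", password_file] if password_file else ["-P"]
    args += ["--driver", driver, "--table", table, "--target-dir", target_dir]
    return args


args = sqoop_import_args(
    connect="jdbc:sqlserver://labDB;database=demo",
    username="sqoop",
    driver="com.microsoft.sqlserver.jdbc.SQLServerDriver",
    table="dbo.customer",
    target_dir="/tmp/dbo-customer",
)
print(shlex.join(args))  # shell-quoted form, useful for logging
```

The list can be passed straight to `subprocess.run(args, check=True)`, which avoids shell quoting problems with the semicolon in the JDBC URL.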
02-13-2017
05:20 AM
For SQL Server I have often used this article: https://danieladeniji.wordpress.com/2013/05/06/hadoop-sqoop-importing-data-from-microsoft-sql-server/
Syntax:
sqoop import --connect jdbc:sqlserver://sqlserver-name \
--username <username> \
--password <password> \
--driver <driver-manager-class> \
--table <table-name> \
--target-dir <target-folder-name>
Sample:
sqoop import --connect "jdbc:sqlserver://labDB;database=demo" \
--username sqoop \
--password simp1e \
--driver com.microsoft.sqlserver.jdbc.SQLServerDriver \
--table "dbo.customer" \
--target-dir "/tmp/dbo-customer"