Member since: 07-28-2016
Posts: 37
Kudos Received: 9
Solutions: 0
01-06-2020
10:11 AM
What version of NiFi are you running? Our source Oracle systems will be upgrading from 12c to 19c and I want to make sure our flows will still work. I'm on NiFi 1.8.0.
05-01-2019
05:57 PM
@Kei Miyauchi Did you ever resolve this? I am running into the same issue and have not been able to resolve it.
05-03-2018
07:23 PM
Hi @Pierre Villard, Will I be able to take the fixed SiteToSiteBulletinReportingTask.java, build a NAR, and drop it into my NiFi 1.2 instance (I'm on HDF 3.0.1.1)? If so, how would I go about doing that? Thanks, Chad
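For reference, here is the rough approach I had in mind, in case it helps frame the question (the release tag, module path, and NiFi install location below are my guesses, not verified against HDF 3.0.1.1):

# Hypothetical sketch: rebuild the site-to-site reporting NAR from source
# and drop it into the NiFi lib directory. Tag and paths are assumptions.
git clone https://github.com/apache/nifi.git
cd nifi
git checkout rel/nifi-1.2.0   # assumed tag; use whichever matches your build
mvn -pl nifi-nar-bundles/nifi-site-to-site-reporting-bundle -am clean package -DskipTests
cp nifi-nar-bundles/nifi-site-to-site-reporting-bundle/nifi-site-to-site-reporting-nar/target/*.nar /opt/nifi/lib/   # assumed NiFi home
# then restart NiFi so the replacement NAR is picked up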
04-18-2018
11:52 AM
Hi @Pierre Villard, Thanks for responding to this and picking it up. Reading your description in the JIRA issue, it makes total sense why I would start receiving bulletins after a long period of time! I will look for your fix, and it's great to hear I will be able to port it to my version of NiFi (1.2) in HDF (3.0.1.1). Thanks, Chad
04-12-2018
05:51 PM
I am running HDF 3.0.1.1, which comes with NiFi 1.2.0.3.0.1.1-5. We are using SiteToSiteBulletinReportingTask to monitor bulletins (for things like disk usage and memory usage). When we restart NiFi via Ambari (either with a Restart or a Stop and then a Start), the SiteToSiteBulletinReportingTask no longer works once NiFi comes back up. It throws the following error when it first tries to start:

SiteToSiteBulletinReportingTask[id=ba6b4499-0162-1000-0000-00003ccd7573] org.apache.nifi.remote.client.PeerSelector@34e976af Unable to refresh Remote Group's peers due to response code 409:Conflict with explanation: null

No matter how long we wait, it never recovers. The ways I have been able to get it working again are:
- Stop and then start the Remote Input Port the SiteToSiteBulletinReportingTask is using
- Delete the SiteToSiteBulletinReportingTask and create a new one
- Wait a while, then stop and start the SiteToSiteBulletinReportingTask (though this doesn't work consistently)

I have tested the same flow steps with a process that uses a Remote Process Group and a different Remote Input Port; that RPG throws the same error when first coming up but then starts working after a period of time. So maybe the SiteToSiteBulletinReportingTask isn't retrying the connection to the Remote Input Port enough times?
Labels: Apache NiFi, Cloudera DataFlow (CDF)
03-14-2018
02:23 PM
@sbabu Copying the files to the HDP directory did not work for me. What did work was FTPing the two NiFi JSON files to my local computer, logging into Grafana, and importing the JSON files there.
03-14-2018
01:07 PM
Got it, thanks!
03-14-2018
11:31 AM
I am going to be using RAW for my NiFi RPG. I have set 'nifi.remote.input.socket.port' to port 8022. Now when I am adding the RPG to the canvas, do I still put my NiFi UI port in the URL field (https://example.com:9091/nifi/), or do I put in the new RAW port (https://example.com:8022)? I tried testing it and it only works with https://example.com:9091/nifi/, and I wasn't sure if that is how it is meant to work or if I have some configuration wrong. Thanks, Chad
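For reference, this is my current understanding of how the two ports relate (the conf path is from my install; the values are the ones above):

# The RPG URL targets the NiFi web UI/API port; the RAW socket port is only
# used for the actual data transfer once peers have been negotiated.
grep -E 'nifi\.web\.https\.port|nifi\.remote\.input' /opt/nifi/conf/nifi.properties
# nifi.web.https.port=9091             <- this goes in the RPG URL
# nifi.remote.input.socket.port=8022   <- used internally for RAW transfers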
Labels: Apache NiFi
03-07-2018
12:46 PM
Hi @Pierre Villard, Thanks for the help! '{{nifi_node_host}}' worked for getting RAW to work. As for choosing HTTP or RAW, I was reading some posts by @Matt Clarke and chose RAW based on the following comments of his:

"When using the RAW format (socket-based transfer), the "nifi.remote.input.host" and "nifi.remote.input.socket.port" configured values from each of the target NiFi instances are used by the NiFi client as the destination for sending FlowFiles. When using the HTTP format, the "nifi.remote.input.host" and the "nifi.web.http.port" or "nifi.web.https.port" configured values from each of the target NiFi instances are used by the NiFi client as the destination for sending FlowFiles. The advantage of the RAW format is that there is a dedicated port for all S2S transfers, so under high load its effect on the NiFi HTTP interface is minimal. The advantage of HTTP is that you do not need to open an additional S2S port, since the same HTTP/HTTPS port is used to transfer FlowFiles."

Having a separate port from the UI seemed like a good idea to me, especially in a production environment. Could you add to what Matt has stated any other advantages/disadvantages of using either RAW or HTTP?
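For anyone else setting this up, a quick sanity check I used (the ports are the ones from my setup; adjust to yours):

# With RAW enabled there should be two listeners per node: the HTTPS UI/API
# port and the dedicated S2S socket port.
ss -ltn | grep -E ':(9091|8022)\s'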
03-06-2018
08:14 PM
I am using Ambari to set up NiFi S2S to use with RPGs, and I have a question about the 'nifi.remote.input.host' config. I have a load balancer (HAProxy) in front of my two NiFi UIs to balance access to them (on port 9091, since SSL is enabled). Am I better off using the load balancer host in 'nifi.remote.input.host', or should I create a separate NiFi config group per NiFi node and put each NiFi host into that config?
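To illustrate the per-node option I mean (the hostnames are placeholders):

# One Ambari config group per NiFi node, each advertising its own address
# instead of the HAProxy VIP:
#   node 1: nifi.remote.input.host=nifi1.example.com
#   node 2: nifi.remote.input.host=nifi2.example.com
# Ambari's per-host variable achieves the same without separate groups:
#   nifi.remote.input.host={{nifi_node_host}}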
Labels: Apache NiFi, Cloudera DataFlow (CDF)
11-15-2017
07:41 PM
2 Kudos
@Benjamin Hopp Interesting. I performed several restarts and kept receiving the error "Unable to obtain password from user". I then decided to perform a full stop, wait a couple of minutes, and then start (I have 2 NiFi nodes), and now it's working. Very strange...
11-15-2017
06:07 PM
@Benjamin Hopp Did you get this resolved? I'm facing the exact same issue as you are in this thread.
11-01-2017
03:34 PM
Did this ever get resolved? I am having the exact same issues and getting the exact same error using the exact same versions as you.
09-21-2017
06:43 PM
One thing you will want to change: you are missing a space in your curl command! There should be a space between "X-Requested-By: ambari" and -X. For example, step 7 should look like this:

curl -H "X-Requested-By: ambari" -X POST -u admin:admin http://<ambari-server-hostname>:8080/api/v1/clusters/multinode-hdp -d @hostmap.json

You will want to update all the curl examples that have this issue in your helpful guides!
07-19-2017
12:17 PM
I need to move SmartSense logs from /var/log to /data/disk1/log. In Ambari, for the SmartSense component, I have updated "activity_log_dir" to /data/disk1/log/smartsense-activity and "hst_log_dir" to /data/disk1/log/hst. I then restarted all SmartSense components, and most of the ".log" files are now going to /data/disk1/log; however, some of the ".out" files are still going to /var/log. Does anyone know how I can fix this?

For instance, here are the logs going to /data/disk1/log/smartsense-activity:
- activity-explorer.out
- activity-explorer.log
- activity-analyzer.log

Here are the logs still going to /var/log/smartsense-activity:
- activity-analyzer.out

Here are the logs going to /data/disk1/log/hst/:
- hst-server.log

Here are the logs still going to /var/log/hst/:
- hst-agent.log
- hst-server.log
- hst-server.out
- hst-gateway.log

(Also notice how hst-server.log is going to both /data/disk1/log and /var/log.) I am running Ambari 2.5.1, HDP 2.6.1, and SmartSense 1.4.0.2.5.1.0-159.
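A diagnostic I was planning to try, in case it helps anyone else (the conf paths are guesses for a typical HDP layout):

# .out files usually come from stdout/stderr redirects in the daemon start
# scripts rather than the log4j config, so search for a hard-coded /var/log:
grep -rn "/var/log" /etc/hst/conf/ /etc/smartsense-activity/conf/ 2>/dev/null
# Then confirm which .out file each running daemon actually has open:
for p in $(pgrep -f hst); do ls -l /proc/$p/fd 2>/dev/null | grep '\.out'; done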
Labels: Hortonworks SmartSense
05-24-2017
07:52 PM
I am having the same issue. Did you ever find a solution for this?
03-31-2017
11:06 AM
I have set up a job that runs several Hadoop commands, including 'hdfs dfsadmin' and 'hdfs dfs -du', and I was wondering whether these could be taxing on my cluster if I run them every 5 minutes, or whether that is harmless. The reason I am running these commands every 5 minutes is that the job parses their output into a structured format for a Hive table, so we can create historical reports about our system.
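For context, the collector is essentially something like this (the script name, delimiter, and output path are placeholders):

#!/usr/bin/env bash
# Hypothetical 5-minute snapshot: capture HDFS usage and append
# pipe-delimited rows for an external Hive table to read.
# (The column layout of 'hdfs dfs -du' varies by Hadoop version.)
ts=$(date '+%Y-%m-%d %H:%M:%S')
hdfs dfs -du / 2>/dev/null | while read -r size path; do
  echo "${ts}|${path}|${size}"
done >> /data/metrics/hdfs_du.psv
# cron entry: */5 * * * * /usr/local/bin/hdfs_usage_snapshot.sh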
Labels: Apache Hadoop
02-16-2017
07:47 PM
I was thinking there would possibly be some REST API I could call to get the same info as the Tez View. I was able to execute this REST API command to get the information into a .json file; however, it puts all the information for each query/DAG onto one line (the whole file is one line):

curl -u admin:password -o test.json -X GET http://rpc3751.daytonoh.ncr.com:8080/api/v1/views/TEZ/versions/0.7.0.2.5.3.0-136/instances/TEZ_CLUSTER_INSTANCE/resources/atsproxy/ws/v1/timeline/TEZ_DAG_ID?limit=11&_=1487183055002

Does anyone think this could work, or am I heading in the right direction?
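A follow-up tweak that should help with the one-line output: quote the URL so the shell doesn't treat the '&' as a background operator, and pretty-print the JSON after download (this assumes python is on the path):

curl -u admin:password -X GET "http://rpc3751.daytonoh.ncr.com:8080/api/v1/views/TEZ/versions/0.7.0.2.5.3.0-136/instances/TEZ_CLUSTER_INSTANCE/resources/atsproxy/ws/v1/timeline/TEZ_DAG_ID?limit=11&_=1487183055002" | python -m json.tool > test.json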
02-16-2017
07:44 PM
Unfortunately this is not an option right now, as I am on an HDP version that does not have those features. I am running HDP 2.3.4.
02-15-2017
05:59 PM
1 Kudo
I am looking for a way to monitor and track the number of Hive queries that have been run per day/week/month, who ran them, how long they took, etc. Basically, I want to be able to run my own queries against the data that powers the Tez View in Ambari. The data in the Tez View is exactly what I want, but I want to run queries against it rather than just clicking around in the UI. What is a recommended way to go about this? Has anyone done this before?
Labels: Apache Hive, Apache Tez
01-18-2017
02:44 PM
I am testing out Ranger column masking, and it is successfully masking for HiveServer2 but not in the Hive CLI. Is this the expected result, or could I have something set up incorrectly?
Labels: Apache Hive, Apache Ranger
12-22-2016
03:19 PM
I just upgraded my cluster from HDP 2.3.4 to HDP 2.5.0. When doing this, I had to uninstall Atlas (0.5.0.2.3) before the upgrade and then install Atlas (0.7.0.2.5) after the upgrade. Now Atlas wants to use HBase, and a lot of the configs require HBase-specific items, but my cluster doesn't have HBase installed and I do not want to install it. Is there something I can do to get around this? I tried setting the following config items:

atlas.graph.storage.backend = berkeleyje
atlas.graph.index.search.backend = elasticsearch

The Atlas Metadata Server was finally able to come up (per Ambari), but I cannot access the Atlas Metadata Server Web UI. Any help would be appreciated!
Labels: Apache Atlas, Apache HBase
12-15-2016
01:24 PM
Thanks. I just did my own testing to see if "for columns" would also update the TABLE_PARAMS table, and I found that it did not. For instance, when I run "analyze table svcrpt.predictive_customers compute statistics;" the transient_lastDdlTime value in TABLE_PARAMS gets updated, but when I run "analyze table svcrpt.predictive_customers compute statistics for columns;" transient_lastDdlTime is not updated. So does this mean "for columns" does not update the basic stats?
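For anyone repeating the test, this is roughly how I checked TABLE_PARAMS (a MySQL metastore is assumed here; the credentials and the PARAM_KEY list are placeholders):

# Compare stats-related TABLE_PARAMS rows before/after each analyze command:
mysql -u hive -p hive -e "
  SELECT tp.PARAM_KEY, tp.PARAM_VALUE
  FROM TABLE_PARAMS tp
  JOIN TBLS t ON t.TBL_ID = tp.TBL_ID
  WHERE t.TBL_NAME = 'predictive_customers'
    AND tp.PARAM_KEY IN ('transient_lastDdlTime','numRows','totalSize','COLUMN_STATS_ACCURATE');"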
12-15-2016
01:06 PM
1 Kudo
Got it, thanks! Does the "for columns" command also compute the basic stats that the first analyze command does, or would I have to run both commands to get both sets of stats?
12-15-2016
12:48 PM
1 Kudo
Can someone explain the difference between these two Hive analyze commands?

analyze table svcrpt.predictive_customers compute statistics;
analyze table svcrpt.predictive_customers compute statistics for columns;

What more does the "for columns" part do?
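For reference, one way to see what each command populates (the column name is just a placeholder):

# Basic table stats (numRows, totalSize, ...) show up in the table parameters:
hive -e "DESCRIBE FORMATTED svcrpt.predictive_customers;"
# Per-column stats (min/max/distinct/nulls) only appear after 'for columns':
hive -e "DESCRIBE FORMATTED svcrpt.predictive_customers customer_id;"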
Tags: Data Processing, Hive
Labels: Apache Hive