Member since: 07-19-2018
Posts: 613
Kudos Received: 101
Solutions: 117

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 5096 | 01-11-2021 05:54 AM |
|  | 3422 | 01-11-2021 05:52 AM |
|  | 8789 | 01-08-2021 05:23 AM |
|  | 8385 | 01-04-2021 04:08 AM |
|  | 36689 | 12-18-2020 05:42 AM |
05-16-2020
05:54 AM
@gbukovszki The behavior you are describing is just how NiFi escapes the string representation of the JSON inside the schema; the escaping is required so the schema can be passed between the different Avro processors. Assuming you have the schema in an attribute named JSONAttribute, you can unescape it in UpdateAttribute with the expression language below:

${JSONAttribute:unescapeJson()}

You can do the same when the escaped values are in a FlowFile's content, using ReplaceText with this Replacement Value:

${'$1':unescapeJson()}

If this answer resolves your issue or allows you to move forward, please choose to ACCEPT this solution and close this topic. If you have further dialogue on this topic, please comment here or feel free to private message me. If you have new questions related to your use case, please create a separate topic and feel free to tag me in your post.

Thanks, Steven @ DFHZ
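To illustrate what the escaping itself is (independent of NiFi internals), here is a minimal Python sketch with a made-up schema, showing how a JSON document turns into an escaped JSON string when embedded in another JSON value, and how unescaping recovers it; json.loads plays the role of unescapeJson():

```python
import json

# A made-up Avro-style schema, represented as a Python dict.
schema = {"type": "record", "name": "user",
          "fields": [{"name": "id", "type": "int"}]}

# First serialization: the plain JSON text of the schema.
plain = json.dumps(schema)

# Serializing that text again embeds it as an escaped JSON string,
# which is what a schema held inside another JSON document looks like.
escaped = json.dumps(plain)
print(escaped)  # "{\"type\": \"record\", \"name\": \"user\", ...}"

# Parsing reverses the escaping, analogous to unescapeJson() in NiFi EL.
assert json.loads(escaped) == plain
```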
05-16-2020
05:33 AM
@johndcal A namespace is not required within the Avro schema source in Schema Registry, even though the Avro spec defines one. To create an Avro schema in the Schema Registry, you first send a call that creates the schema entity; the next call then adds the actual Avro schema to that existing entity. This is just the behavior of the Schema Registry. You can find some lessons I created on how to use the registry:

https://community.cloudera.com/t5/Community-Articles/Using-the-Schema-Registry-API/ta-p/286194

I also have an article showing how to fully automate the creation of Avro schemas from a CSV file (column names and data types) using the Schema Registry, Hive, and NiFi:

https://community.cloudera.com/t5/Community-Articles/How-to-automate-creation-of-Avro-and-Hive-Schemas-using-NiFi/ta-p/293183

If this answer resolves your issue or allows you to move forward, please choose to ACCEPT this solution and close this topic. If you have further dialogue on this topic, please comment here or feel free to private message me. If you have new questions related to your use case, please create a separate topic and feel free to tag me in your post.

Thanks, Steven @ DFHZ
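For reference, the two calls can be scripted; below is a minimal Python sketch of the sequence. The host, port, schema name, and payload fields are assumptions based on my recollection of the registry's v1 REST API, so verify them against the API article linked above and your registry version:

```python
import requests

# Hypothetical registry endpoint; adjust host and port for your environment.
BASE = "http://registry-host:7788/api/v1/schemaregistry"

# Call 1: create the schema entity (metadata only, no schema text yet).
meta = {
    "name": "users",
    "type": "avro",
    "schemaGroup": "demo",
    "description": "user records",
    "compatibility": "BACKWARD",
}
requests.post(f"{BASE}/schemas", json=meta).raise_for_status()

# Call 2: add the actual Avro schema as a version of the existing entity.
version = {
    "schemaText": '{"type":"record","name":"users","fields":'
                  '[{"name":"id","type":"int"}]}',
    "description": "initial version",
}
requests.post(f"{BASE}/schemas/users/versions", json=version).raise_for_status()
```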
05-16-2020
05:21 AM
@Genentech I am not sure if this is the answer you are looking for, but my recommendation is to leave your original table as it is and select the results from it into the Parquet table. I am a firm believer in keeping backup, staging, or temporary copies of original data sources on the path from translation to final source. Make a new, empty table with the Parquet format you want; its columns must match the source table. Next execute:

INSERT INTO final_table SELECT * FROM source_table;

If you need to retain the original table name, you can alter or drop the original table and then execute a rename statement (ALTER TABLE final_table RENAME TO original_table;) on the final_table above.

If this answer resolves your issue or allows you to move forward, please choose to ACCEPT this solution and close this topic. If you have further dialogue on this topic, please comment here or feel free to private message me. If you have new questions related to your use case, please create a separate topic and feel free to tag me in your post.

Thanks, Steven @ DFHZ
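If you drive Hive from scripts, the same sequence can be run over PyHive; here is a minimal sketch, where the connection details and the two example columns are hypothetical and must be replaced with your real schema:

```python
from pyhive import hive  # pip install pyhive

# Hypothetical HiveServer2 connection; adjust host, port, and username.
conn = hive.connect(host="hiveserver2-host", port=10000, username="hive")
cur = conn.cursor()

# Stage an empty Parquet table; the columns must match source_table.
cur.execute("""
    CREATE TABLE final_table (id INT, name STRING)
    STORED AS PARQUET
""")

# Copy the rows; Hive rewrites them into Parquet on the way in.
cur.execute("INSERT INTO final_table SELECT * FROM source_table")

# Optional swap so consumers keep using the original table name.
cur.execute("ALTER TABLE source_table RENAME TO source_table_backup")
cur.execute("ALTER TABLE final_table RENAME TO source_table")
```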
05-14-2020
04:34 AM
@Regis Previously open source, the newest versions of Ambari (2.7.5) and HDP (3.1.5) have moved behind a paywall as part of an open-core strategy; you cannot access them without a Cloudera subscription. I recommend using the last free versions, Ambari 2.7.4 and HDP 3.1.4. You can find those repos here:

https://docs.cloudera.com/HDPDocuments/Ambari-2.7.4.0/bk_ambari-installation/content/ambari_repositories.html
https://docs.cloudera.com/HDPDocuments/Ambari-2.7.4.0/bk_ambari-installation/content/hdp_314_repositories.html

Below is more information about the authorization needed for the paywalled repos:

https://docs.cloudera.com/HDPDocuments/Ambari-2.7.5.0/bk_ambari-installation/content/access_ambari_paywall.html

If this answer resolves your issue or allows you to move forward, please choose to ACCEPT this solution and close this topic. If you have further dialogue on this topic, please comment here or feel free to private message me. If you have new questions related to your use case, please create a separate topic and feel free to tag me in your post.

Thanks, Steven @ DFHZ
05-13-2020
08:27 AM
@satishjan1 The initial question is asking about setting the hostname. The information you reference is telling you to do the same thing, but for a different operating system; my first response told you how to do it for RHEL. For your next question: you do not have to set the hostname in /etc/sysconfig/network specifically, you have to set it the way your operating system requires (see above). The hostname must be set and must persist after a reboot. If you do not set the hostname before installing the cluster, you will have untold problems with services and components later on down the road.
05-12-2020
06:08 AM
@satishjan1 The command to set the hostname is:
hostnamectl set-hostname host.name.com
Depending on your OS configuration, you may also need to update anything else that manages the hostname and the hosts file, such as /etc/hosts (for example, a line like 192.168.1.10 host.name.com, where the address is the node's IP). You can confirm by running the command above and then rebooting; the hostname should persist after the reboot.
If this answer resolves your issue or allows you to move forward, please choose to ACCEPT this solution and close this topic. If you have further dialogue on this topic, please comment here or feel free to private message me. If you have new questions related to your use case, please create a separate topic and feel free to tag me in your post.
Thanks,
05-12-2020
06:02 AM
@johndcal I am excited that you are learning NiFi. It is my favorite tool for solving any use case I have.

Problem 1: Your ExtractText logic is matching all of the values, not just one, and is putting them into attributes as an indexed series (0, 1, 2, 3, 4, etc.). It is okay to use it in this manner; just use the values you need and ignore the rest. There are things you can do to single out each value more precisely, but I would recommend using a CSV record reader controller service to parse the CSV instead, since it gives you much more control over the values as well as the schema (see the sketch below for how the capture groups fan out).

Problem 2: Can you send a screenshot of your flow? You said it doesn't stop inserting FlowFiles: is the first processor in your flow always on? What I mean is that if the first processor's run schedule is 0 sec, it will always run and continuously generate FlowFiles. When I am creating a flow for the first time, I set the first processor to some controllable run schedule, for example 30 seconds. I push play, immediately stop it, and then step the FlowFiles through each downstream processor one at a time, testing at each queue that the FlowFile attributes are as expected. Once I know I have a fully operating flow, I then address how to trigger the flow to start, or the appropriate timing for it to always run.
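As a rough illustration of problem 1 in plain Python (not NiFi itself): a regular expression with capture groups applied to a CSV line yields one value per group, which mirrors how ExtractText fans the matches out into numbered attributes. The CSV line and the pattern here are made up for the example:

```python
import re

# A made-up CSV line and a pattern with one capture group per field.
line = "alice,42,engineer,nyc"
pattern = re.compile(r"([^,]+),([^,]+),([^,]+),([^,]+)")

match = pattern.match(line)
if match:
    # group(0) is the entire match; groups 1..n correspond to the
    # numbered attributes (csv.1, csv.2, ...) ExtractText would create.
    for i in range(pattern.groups + 1):
        print(f"csv.{i} = {match.group(i)}")
```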
If this answer resolves your issue or allows you to move forward, please choose to ACCEPT this solution and close this topic. If you have further dialogue on this topic, please comment here or feel free to private message me. If you have new questions related to your use case, please create a separate topic and feel free to tag me in your post.
Thanks,
05-09-2020
05:24 AM
@michaelli Not from the UI. You will need to go back to the actual metastore server and look at the command(s) that created the hive user. If it is a MySQL metastore, it is easy to go to the MySQL prompt and arrow up through the history to find the previously executed commands containing the hive user and password.
05-06-2020
04:51 PM
1 Kudo
@Udhav You will need to create permissions for the hive user to access the database you create in the metastore; the error shows the connection being rejected:

Underlying cause: java.sql.SQLException : Access denied for user 'hive'@'localhost' (using password: YES) SQL Error code: 1045

For example, in MySQL:

CREATE DATABASE hive;
CREATE USER 'hive'@'localhost' IDENTIFIED BY 'hive';
GRANT ALL PRIVILEGES ON *.* TO 'hive'@'localhost' WITH GRANT OPTION;
FLUSH PRIVILEGES;

In the Ambari Admin Hive Config Database tab, and during the Cluster Install Wizard for Hive, there should be a Test Connection button for the Hive Metastore; use this feature to test the connection during install. Also, just to make sure: Ambari requires the MySQL connector as well. To use MySQL with Hive, you must download the connector from https://dev.mysql.com/downloads/connector/j/. Once it is on the Ambari Server host, run:

ambari-server setup --jdbc-db=mysql --jdbc-driver=/path/to/mysql/mysql-connector-java.jar

If this answer resolves your issue or allows you to move forward, please choose to ACCEPT this solution and close this topic. If you have further dialogue on this topic, please comment here or feel free to private message me. If you have new questions related to your use case, please create a separate topic and feel free to tag me in your post.

Thanks, Steven @ DFHZ
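To verify the grants outside of Ambari, a quick connection test mimics what the Test Connection button does. Here is a minimal sketch using the mysql-connector-python package, reusing the example credentials from the GRANT statements above; adjust the host if the metastore database is not local:

```python
import mysql.connector  # pip install mysql-connector-python

try:
    # Credentials match the example CREATE USER / GRANT statements above.
    conn = mysql.connector.connect(
        host="localhost",
        user="hive",
        password="hive",
        database="hive",
    )
    print("Connection OK; the grants are in place.")
    conn.close()
except mysql.connector.Error as err:
    # Error 1045 here is the same access-denied failure Ambari reports.
    print(f"Connection failed: {err}")
```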
05-04-2020
07:40 AM
@arunnalpet Yes, that is correct. I wanted to avoid $set in basic testing, and I was also advocating for valid coding practice to avoid further issues with the data source.

If the answer above resolves your issue or allows you to move forward, please choose to ACCEPT this solution and close this topic. If you have further dialogue on this topic, please comment here or feel free to private message me. If you have new questions related to your use case, please create a separate topic and feel free to tag me in your post.

Thanks, Steven @ DFHZ