Member since
07-06-2017
53
Posts
12
Kudos Received
5
Solutions
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 16006 | 05-03-2018 08:01 AM
 | 9816 | 10-11-2017 08:17 AM
 | 10632 | 07-20-2017 07:04 AM
 | 1193 | 04-05-2017 07:32 AM
 | 3090 | 03-09-2017 12:05 PM
03-21-2017
09:18 AM
Hello,

I'm still new to Hadoop and I'm struggling to define the best approach for the following two similar challenges.

As a first step, I'm trying to ingest the files into Hive. InferAvroSchema in NiFi is limited, as it does not always recognize the right data type, which generates a fair number of errors when the files are ingested. Switching to specifying the schema manually brings the following problems:

- Ingesting CSV files whose schema has been updated over time. I have versioning documentation describing the schema changes, but the dates in that document do not match the dates on which the changes actually took effect.
- Ingesting hourly CSV files whose schema depends on the business activity (one set of columns is mandatory; a large set is optional and only appears when the underlying options have been used). The schema differs from hour to hour, and I can't predict which one to expect.

My feeling is that I have to move to a NoSQL type of database / storage, but I'm not sure how best to tackle this. Has anyone faced a similar problem?

Thanks
Christophe
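To make the second case more concrete, here is a minimal sketch of one possible way to handle files whose optional columns vary: read each file by its own header, check the mandatory columns, and pad everything to the union of all columns seen. The column names (`id`, `timestamp`, `option_a`, `option_b`) and the helper names are made up for illustration; this is not a NiFi or Hive feature, only plain Python showing the idea.

```python
import csv
import io

# Hypothetical mandatory columns, used only for illustration.
MANDATORY = ["id", "timestamp"]


def read_file(text):
    """Parse one CSV file (given as text) using its own header row."""
    reader = csv.DictReader(io.StringIO(text))
    header = list(reader.fieldnames or [])
    missing = [c for c in MANDATORY if c not in header]
    if missing:
        raise ValueError(f"missing mandatory columns: {missing}")
    return header, list(reader)


def merge_files(texts):
    """Union the headers of all files; absent optional columns become None."""
    columns = list(MANDATORY)
    all_rows = []
    for text in texts:
        header, rows = read_file(text)
        for name in header:
            if name not in columns:
                columns.append(name)
        all_rows.extend(rows)
    padded = [{c: row.get(c) for c in columns} for row in all_rows]
    return columns, padded


# Two hourly files with different optional columns (made-up data).
hour_1 = "id,timestamp,option_a\n1,2017-03-21T09:00,x\n"
hour_2 = "id,timestamp,option_b\n2,2017-03-21T10:00,y\n"
columns, rows = merge_files([hour_1, hour_2])
print(columns)
for row in rows:
    print(row)
```

The superset of columns grows as new optional columns show up, so a target table built this way would need nullable columns for everything outside the mandatory set.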
Labels: Apache Hadoop, Apache Hive, Apache NiFi
03-16-2017
10:39 AM
Hi, This bug can have consequences for Spark / YARN as well. We were encountering out-of-memory conditions when running Spark jobs: no matter how much memory we assigned, we kept exhausting it completely. This behaviour disappeared when we applied the fix listed here. I'll post back when I know more about the root cause and the link between the issues. Regards, Christophe
03-14-2017
04:21 PM
Hi @Matt Clarke You nailed it again. The two "old" hosts have Owner: CN=FQDN, etc. The new host has Owner: CN=<Shortname>, etc. It seems the certificates for the two initial hosts were created by a different script than the one I got. It's not clear yet where the problem is. I'll check and update the certificates where needed. Thanks!
03-14-2017
02:18 PM
Hi @Matt Clarke You nailed it. I was looking at the controller services at cluster level, not at group level. I found it and removed it. Thanks! Christophe
03-14-2017
01:24 PM
Hello, I found my nifi-app.log filling up with errors:

StandardControllerServiceNode HiveConnectionPool[id=<masked>] Failed to invoke @OnDisabled method due to java.lang.NullPointerException

However, the NiFi GUI does not show any controller service for Hive, and the ID is also "unknown". Looking at the content of flow.xml, I did indeed find a definition for the controller service, which is not referenced anywhere. How can I (safely) remove the definition of the controller service? Thanks! Christophe
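For reference, a rough sketch of how such an unreferenced definition could be spotted. It assumes that each controller service in the flow file is a `controllerService` element with an `id` child, and that components refer to a service by putting its ID in a property `value` element; the sample XML and the IDs `cs-1` / `cs-2` are made up. The script only lists candidates and does not modify anything.

```python
import xml.etree.ElementTree as ET

# Made-up sample that mimics the assumed layout of a flow file.
SAMPLE = """<flowController>
  <rootGroup>
    <processor>
      <id>p-1</id>
      <property><name>Connection Pool</name><value>cs-1</value></property>
    </processor>
    <controllerService><id>cs-1</id><name>HiveConnectionPool</name></controllerService>
    <controllerService><id>cs-2</id><name>HiveConnectionPool</name></controllerService>
  </rootGroup>
</flowController>"""


def unreferenced_services(xml_text):
    """Return IDs of controller services that no property value points to."""
    root = ET.fromstring(xml_text)
    # IDs of every controller service defined anywhere in the document.
    service_ids = [cs.findtext("id") for cs in root.iter("controllerService")]
    # Every non-empty property value; a reference is assumed to be the bare ID.
    referenced = {v.text.strip() for v in root.iter("value") if v.text}
    return [sid for sid in service_ids if sid and sid not in referenced]


print(unreferenced_services(SAMPLE))
```

On a real flow file the same function could be fed the decompressed XML text; whether the layout assumption holds should be checked against the actual file first.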
Labels: Apache NiFi
03-14-2017
01:19 PM
Hi @Matt Clarke, the principals of the 3 nodes are located in the same OU and follow the same pattern (servicePrincipalName does include the full FQDN). I can access the 3 nodes by their full FQDN. Hosts 1 & 2 report to Ranger with their FQDN, host 3 only with its short name. I'm still struggling to understand why I get different behavior for host 3. Thanks
03-13-2017
04:18 PM
Hello, I found the cause of this one: the keystore was specified as the truststore for the Ranger plugin. I missed it while reviewing the configs. Thanks @Bryan Bende!
03-10-2017
07:48 PM
@bryan bende Thanks for the answers. The truststore & keystore listed in the NiFi configuration (xasecure.policymgr.clientssl.*) are the ones I checked, and they contain the right certificates as far as I can tell. The trustore.jks does contain the root CA used to issue the certificates. I have rechecked again and made sure that nifi:hadoop was the owner of the stores, but no luck. I don't think the JIRA is linked: in my case the SSL connection is not even established, so I can't yet be impacted by Kerberos. Thanks!
03-09-2017
03:39 PM
@Matt Clarke, Looking at the AD object, I see CN=<name>,OU=<OU1>,OU=<OU2>,OU=<OU3>,DC=<DC1>,DC=<DC2> which would make my mapping correct, am I right? Thanks Christophe
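A quick way to sanity-check a mapping like this outside NiFi is to run the regular expression against a sample DN. The sketch below assumes the mapping is a regex that captures the CN and uses it as the identity; the pattern, the function name, and the example DN values are illustrative assumptions, not my actual configuration.

```python
import re

# Assumed mapping: capture the CN of a DN whose next component is an OU.
# \s* tolerates both "CN=x,OU=y" and "CN=x, OU=y" spellings.
DN_PATTERN = re.compile(r"^CN=(.*?),\s*OU=.*$")


def map_identity(dn):
    """Return the captured CN, or the input unchanged if the pattern does not match."""
    match = DN_PATTERN.match(dn)
    return match.group(1) if match else dn


# Made-up DN in the same shape as the AD object above.
sample = "CN=nifi-node1.example.com,OU=Servers,OU=Hadoop,OU=Prod,DC=example,DC=com"
print(map_identity(sample))
```

One detail worth checking with real values is whitespace after the commas: a pattern written without `\s*` matches only one of the two spellings.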
03-09-2017
03:34 PM
@Matt Clarke thanks. I updated the tags; I was not too sure where to submit this.