I don't currently see a processor that does this for you. If you are comfortable with a scripting language such as Groovy, JavaScript, or Clojure, you could use ExecuteScript with any of these libraries to infer the character set of an incoming stream. For more information on including the library JAR(s) in your ExecuteScript configuration, see my ExecuteScript Cookbook (part 3) and/or my separate blog post. Since this seems like a good feature to have in NiFi proper, I have created a Jira (NIFI-4550) to add an InferCharacterSet processor.
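
To give a rough idea of what "inferring the character set" involves, here is a minimal sketch that uses only the JDK rather than any of the detection libraries mentioned above. The class and method names (`CharsetSniffer`, `infer`) are hypothetical, and the heuristic is deliberately simple: check for a byte-order mark, then try a strict UTF-8 decode, then fall back to ISO-8859-1 as an assumed default. A real detection library uses statistical analysis and covers far more encodings.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of NiFi or of any detection library.
final class CharsetSniffer {

    // Returns a best-guess charset for the given bytes.
    static Charset infer(byte[] data) {
        // 1. A byte-order mark, when present, is the strongest signal.
        //    (UTF-32 BOMs are not handled in this sketch.)
        if (startsWith(data, 0xEF, 0xBB, 0xBF)) return StandardCharsets.UTF_8;
        if (startsWith(data, 0xFE, 0xFF)) return StandardCharsets.UTF_16BE;
        if (startsWith(data, 0xFF, 0xFE)) return StandardCharsets.UTF_16LE;

        // 2. Bytes that decode as strict UTF-8 are very likely UTF-8
        //    (plain ASCII also lands here, since ASCII is a subset of UTF-8).
        if (decodesCleanly(data, StandardCharsets.UTF_8)) return StandardCharsets.UTF_8;

        // 3. Assumed fallback: ISO-8859-1 accepts every byte sequence,
        //    so this is a default, not a detection.
        return StandardCharsets.ISO_8859_1;
    }

    // True if the data begins with the given unsigned byte values.
    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }

    // True if the bytes decode without malformed or unmappable input.
    private static boolean decodesCleanly(byte[] data, Charset charset) {
        try {
            charset.newDecoder()
                   .onMalformedInput(CodingErrorAction.REPORT)
                   .onUnmappableCharacter(CodingErrorAction.REPORT)
                   .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }
}
```

In an ExecuteScript script you would presumably read the flow file content (or a leading sample of it) into a byte array, pass it to something like `CharsetSniffer.infer(bytes)`, and write the resulting charset name to a flow file attribute. Note that if you sample only a prefix, a multi-byte sequence cut off at the end of the sample will make the strict UTF-8 check fail.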