Unable to write a file with a Chinese-character filename to a Windows share (Samba)


I am running Apache NiFi on CentOS 7, with some network drives (Windows shares) mounted via Samba. In my NiFi data flow, I use PutFile to write files with Chinese characters in their filenames to these network drives.

I then got this error:

PutFile[id=e0ae7c13-0162-1000-f401-ecb9b39fa9de] Penalizing StandardFlowFileRecord[uuid=04563401-54ff-4a1f-9117-02e4472ff0c4,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1525659272432-6, container=default, section=6], offset=0, length=6981],offset=0,name=用户划款授权通知书.pdf,size=6981] and transferring to failure due to Malformed input or input contains unmappable characters: .用户划款授权通知书.pdf: java.nio.file.InvalidPathException: Malformed input or input contains unmappable characters: .用户划款授权通知书.pdf

Please advise on how to fix this.

1 ACCEPTED SOLUTION


This was fixed by doing the following:

  1. Set the default encoding with the JAVA_TOOL_OPTIONS environment variable in nifi-env.sh (I have not applied this one so far):
export JAVA_TOOL_OPTIONS=-Dfile.encoding=utf8
  2. Add the default encoding parameter to NiFi's bootstrap.conf file:
java.arg.8=-Dfile.encoding=UTF8

Of course, adjust the argument number so it does not clash with the java.arg entries already in your configuration.
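If you are unsure which number to use, something like the following can pick the next unused one. It is only a sketch: the class name BootstrapArgs and the sample lines are made up for illustration, and it assumes the numbered entries in your bootstrap.conf follow the java.arg.N=... pattern shown above. Commented-out entries are ignored.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BootstrapArgs {
    // Matches active (not commented-out) lines such as "java.arg.8=-Dfile.encoding=UTF8".
    private static final Pattern JAVA_ARG =
            Pattern.compile("^\\s*java\\.arg\\.(\\d+)\\s*=");

    // Returns one more than the highest java.arg.N index found, or 1 if there is none.
    static int nextFreeIndex(List<String> lines) {
        int max = 0;
        for (String line : lines) {
            Matcher m = JAVA_ARG.matcher(line);
            if (m.find()) {
                max = Math.max(max, Integer.parseInt(m.group(1)));
            }
        }
        return max + 1;
    }

    public static void main(String[] args) {
        // Hypothetical excerpt; in practice read the lines from your own bootstrap.conf.
        List<String> sample = Arrays.asList(
                "java.arg.2=-Xms512m",
                "java.arg.3=-Xmx512m",
                "#java.arg.7=-XX:ReservedCodeCacheSize=256m",
                "java.arg.4=-Djava.net.preferIPv4Stack=true");
        System.out.println("java.arg." + nextFreeIndex(sample) + "=-Dfile.encoding=UTF8");
    }
}
```

For the sample lines this prints java.arg.5=-Dfile.encoding=UTF8, because the commented-out java.arg.7 entry is not counted.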

And in nifi-env.sh, set the locale:

export LANG="en_US.UTF-8"
export LC_ALL="en_US.UTF-8"
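To check whether the settings took effect, a small Java program can print the encodings the JVM actually picked up and test whether a given charset can represent the filename. This is only a sketch: the class name EncodingCheck is made up for illustration, sun.jnu.encoding is a JDK-internal property that may be missing on some JVMs, and the test filename is written with Unicode escapes (用户.pdf, the first two characters of the name in the error) so the source compiles whatever the locale is. Run it as the same user and with the same environment as NiFi.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingCheck {
    // True if every character of the name can be encoded in the given charset.
    static boolean canEncode(String name, Charset charset) {
        return charset.newEncoder().canEncode(name);
    }

    public static void main(String[] args) {
        // "\u7528\u6237" are the first two characters of the filename from the error.
        String name = "\u7528\u6237.pdf";

        // Encodings the JVM picked up from the environment and command line.
        System.out.println("file.encoding    = " + System.getProperty("file.encoding"));
        System.out.println("sun.jnu.encoding = " + System.getProperty("sun.jnu.encoding"));
        System.out.println("default charset  = " + Charset.defaultCharset());

        // An ASCII-only charset cannot represent the name; UTF-8 can.
        System.out.println("US-ASCII can encode name: "
                + canEncode(name, StandardCharsets.US_ASCII));
        System.out.println("UTF-8 can encode name:    "
                + canEncode(name, StandardCharsets.UTF_8));
        System.out.println("default can encode name:  "
                + canEncode(name, Charset.defaultCharset()));
    }
}
```

If the first lines show something like ANSI_X3.4-1968 rather than UTF-8, and the last line prints false, the locale exports are probably not reaching the JVM, which would match the "unmappable characters" message.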


REPLIES


Just tested: I am not able to write to the local path "/tmp" either.
