How to change the encoding my SFTP server is using?

Preliminary remarks:

  • sftp must use UTF-8 as filename encoding (for example, see here as reference). However, some clients and servers violate the specification in this respect, which may be the cause of your problem.

  • You wrote: "Note that I have commented out: AcceptEnv LANG LC_*. According to here, this means the server will not allow the client to pass locale environment variables."

    There may be a misunderstanding regarding how this works and what it is good for. Whenever two machines communicate, they must use the same data formats. For example, suppose that VisualCron puts file names encoded as ISO 8859-1 into the byte stream it sends to the Ubuntu server, but you force the Ubuntu server to interpret the incoming (file name) byte stream as if it were encoded in UTF-8. That will not solve problems, but cause them.
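
To make the mismatch concrete, here is a minimal Python sketch. It does not involve VisualCron or sftp at all; it only assumes a hypothetical sender that encodes the name as ISO 8859-1 and a receiver that insists on UTF-8 (and the reverse case).

```python
# File name as the user sees it.
name = "Liège.txt"

# Hypothetical sender that encodes the name as ISO 8859-1.
sent_bytes = name.encode("iso-8859-1")
print(sent_bytes)

# Receiver that insists on UTF-8: the lone byte 0xE8 is not valid UTF-8 here,
# so strict decoding fails.
try:
    sent_bytes.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False
print(decoded_ok)

# The reverse mismatch (UTF-8 bytes read as ISO 8859-1) does not fail,
# it silently produces a garbled name instead.
mojibake = name.encode("utf-8").decode("iso-8859-1")
print(mojibake)
```

Either way, the name that arrives is not the name that was sent, which is why both sides have to agree on the encoding.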

Having said this:

I would first try to find out where exactly the problem arises. I assume that you have SSH access or even physical access (keyboard) to the Ubuntu server. Then:

  • Check whether the locale en_US.UTF-8 is installed on the Ubuntu server at all. Please note that just setting the LC_* and LANG environment variables does not install a locale.

    Instead, you would install a locale during OS installation or by something like dpkg-reconfigure locales (on Debian - I don't know Ubuntu).

  • If using SSH, make sure that your SSH terminal software (e.g. PuTTY) uses the same encoding as the server.

  • Then, the most crucial step: Using your SSH terminal, manually create a file with a problematic name in the respective directory so that the sftp client on your Windows laptop can see it.

    For example, coming back to your question, you could create a file named Liège.txt in your /tickets directory (touch /tickets/Liège.txt). Again using your SSH terminal, carefully double-check that the file name appears correctly when you let Ubuntu list the files in that directory (ls -al /tickets).

  • Now use the sftp client on your Windows laptop and check whether it correctly downloads that newly created file.

    If this works, it means that your problem arises when VisualCron transfers the files to the Ubuntu server. If it does not work, the problem is between your Windows laptop and the Ubuntu server.
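
For the first check in the list above (whether en_US.UTF-8 is installed at all), a small Python sketch might look as follows. It assumes that the locale command is on the PATH and that locale -a prints one installed locale per line; glibc systems typically list the name as en_US.utf8, so the comparison ignores case and the hyphen. The function names are made up for this illustration.

```python
import subprocess

def normalize(locale_name):
    """Make 'en_US.UTF-8' and 'en_US.utf8' compare equal."""
    return locale_name.strip().lower().replace("-", "")

def locale_listed(wanted, listing):
    """Check whether `wanted` appears in text formatted like `locale -a` output."""
    return normalize(wanted) in {normalize(line) for line in listing.splitlines()}

def installed_locales():
    """Run `locale -a` (assumed to exist on this machine) and return its output."""
    result = subprocess.run(
        ["locale", "-a"],
        capture_output=True,
        encoding="utf-8",
        errors="replace",  # some locale names may not be valid UTF-8
        check=True,
    )
    return result.stdout
```

On the server, locale_listed("en_US.UTF-8", installed_locales()) should then return True if the locale is available.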

In both cases, there are tools which can help you analyze the situation.

For example, you can gain some insight from playing around with convmv, which can convert file names from one encoding to another. Notably, you could convert the encoding of your file names from UTF-8 to UTF-8. If a file name is already encoded in UTF-8, that conversion must leave it unchanged.
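
If convmv is not at hand, a similar check can be sketched in a few lines of Python. This is not what convmv does internally; it merely reads the file names as raw bytes and reports those that are not valid UTF-8. The function names are made up for this illustration.

```python
import os

def is_valid_utf8(raw):
    """Return True if the byte string `raw` decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

def non_utf8_names(directory):
    """Return the raw names in `directory` that are not valid UTF-8."""
    # Passing a bytes path makes os.listdir return the names as raw bytes,
    # i.e. as stored on disk, without any decoding applied by Python.
    raw_names = os.listdir(os.fsencode(directory))
    return sorted(raw for raw in raw_names if not is_valid_utf8(raw))
```

Calling non_utf8_names("/tickets") on the server would list the suspicious names as byte strings. An empty list means every name in that directory is at least well-formed UTF-8, although it cannot prove that the names are the ones you intended.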

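You might also want to have a look into chardet, a Python library that tries to guess the encoding of e.g. filenames. I am not a Python guy, so I can't help you much with source code. According to the accepted answer to this question, you would have a line like chardet.detect(os.popen("ls yourfilename.txt").read()) in your Python script, which will output the most probable encoding along with a confidence rating. Note that in Python 3, chardet.detect expects bytes, whereas os.popen(...).read() returns an already decoded string, so you would have to pass the raw file name instead (for example, one taken from os.listdir(b".")).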
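
As a much cruder, standard-library-only stand-in for chardet, the following sketch simply tries a few candidate encodings and shows what the name would look like under each. The candidate list is an assumption (typical Western European setups); note that ISO 8859-1 can decode any byte sequence, so a successful decode alone proves little and you still have to judge the result by eye.

```python
def candidate_encodings(raw, candidates=("ascii", "utf-8", "cp1252", "iso-8859-1")):
    """Return (encoding, decoded text) for every candidate that decodes `raw`."""
    results = []
    for encoding in candidates:
        try:
            results.append((encoding, raw.decode(encoding)))
        except UnicodeDecodeError:
            # This candidate cannot represent the byte sequence; skip it.
            pass
    return results

# Example: the name as a hypothetical ISO 8859-1 client would have sent it.
for encoding, text in candidate_encodings(b"Li\xe8ge.txt"):
    print(encoding, text)
```

If the name looks right under exactly one family of encodings (here cp1252 and ISO 8859-1, but not UTF-8), that is a strong hint about what the uploading client actually used.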

To summarize:

  • Make sure that the file names on your Ubuntu server really are encoded in UTF-8, by following the steps shown above and using the tools mentioned above.

  • Once you are absolutely sure that the file names on your Ubuntu server are encoded in UTF-8, check whether your sftp client on your laptop can download them. If not, try other clients until it works.

  • If you have found that not all of the file names on the Ubuntu server are encoded in UTF-8, adjust the settings in VisualCron accordingly. I don't know VisualCron and thus can't help you with that.