Linux filenames in LC Server

matthias_livecode_150811 at m-r-d.de
Tue Aug 15 04:06:51 EDT 2023


What definitely works, at least here, is to urlEncode the filename before creating the file,
so that e.g. testä would be created as test%E4.
Since urlEncode does no "harm", you could use it for all filenames, not only for non-ASCII ones.
And if you want to display the "real" name, you just have to urlDecode the filename again.
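
To make the round trip concrete, here is a minimal sketch in Python rather than LiveCode, just to show the idea. The helper names encode_filename and decode_filename are made up for this illustration. It assumes the name is percent-encoded from its Latin-1 bytes, which is what gives %E4 for ä; whether the engine's urlEncode behaves exactly like this for every character is an assumption on my part.

```python
from urllib.parse import quote, unquote


def encode_filename(name: str, encoding: str = "latin-1") -> str:
    """Percent-encode a display name into an ASCII-only filename.

    safe="" makes sure "/" is escaped as well, so the result can never
    be mistaken for a path with several components.
    """
    return quote(name, safe="", encoding=encoding)


def decode_filename(stored: str, encoding: str = "latin-1") -> str:
    """Turn a percent-encoded filename back into the display name."""
    return unquote(stored, encoding=encoding)


if __name__ == "__main__":
    stored = encode_filename("testä")
    print(stored)                   # ASCII-only name to create on disk
    print(decode_filename(stored))  # name to show to the user
```

Characters outside Latin-1 (Kanji, emoji) cannot be encoded with the default used here. For those, passing encoding="utf-8" to both helpers should work, as long as the same encoding is used in both directions.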





> On 15.08.2023 at 09:42, Neville Smythe via use-livecode <use-livecode at lists.runrev.com> wrote:
> 
> Thanks Mark and Matthias
> 
> I think it is clear the problem is not related to variant forms - if I replace [e-acute] by any other non-ascii character, such as a Kanji character or emoji, I get the same “can’t open that file” error. And the weird decoding of [e-acute] to [E-grave] would be explained if textDecode is failing in LC Server.
> 
> So if I understand Mark correctly, while one can create UTF-8 encoded filenames directly in a terminal session, LC Server internally accesses Apache environment variables to encode/decode the filename before opening a file rather than directly using the shell. Presumably this has something to do with the engine being a server app having to respect the server environment.
> 
> On Dreamhost, as far as I can determine, the LANG and LC_ALL variables are *not* set (though WordPress is running and it adds support for a swathe of languages, so surely has support for non-ascii filenames?) The site is on shared hosting, so I do not have permissions to change the Apache conf files. I tried adding the SetEnv commands in the .htaccess file but that didn’t work, although I could well be doing it wrong; I am fumbling around in the dark here.
> 
> Unless there is some way to fix the configuration, it would seem that not only will opening files fail, but the detailed files (the long files) command will also fail if non-ascii characters are encountered, since it uses textEncode. I presume that shell commands could be used as a workaround for accessing the filesystem, as long as LC doesn’t do an internal textEncode as it passes the variables to the shell!
> 
> However it also means one cannot use textDecode/Encode at all, not just for the filenames but also content; and that could be a bummer. I haven’t encountered this so far because to this point I have encoded content before uploading binary files to the server, but I can envision situations where I would want to encode or decode server-side.
> 
> I’m puzzled that this problem hasn’t been raised before. Surely the vast majority of website host providers use Linux servers, and the Dreamhost configuration for shared hosting is most likely standard. So has no-one in Europe (or Asia..) using LC Server wanted to create native-language filenames? I think LC Server is a magnificent tool, but perhaps it is not as widely used as it deserves! Or: they all found the fix and haven’t told us.
> 
>> So, when you run lc-server from a terminal session directly, it's almost
>> certainly the case that the LC_ALL and LANG environment variables are 
>> set to en_US.UTF-8 (or some other language code DOT UTF-8 - it is the 
>> UTF-8 which is the important bit).
>> 
>> On Linux, a C API nl_langinfo() is used to fetch the encoding to use 
>> when talking to the system APIs (e.g. filesystem APIs) - this (I 
>> believe) derives its information from LANG/LC_ALL.
>> 
>> If the latter *are not set* then it will likely default to the 'C' 
>> locale which has no interpretation of any non-ascii chars, and thus 
>> attempts to encode/decode utf-8 encoded filenames will fail.
>> 
>> My theory is that these variables are not set in the configuration for 
>> running CGIs in Apache (or whatever web server is being used in this 
>> instance).
>> 
>> Digging around it looks like Apache (at least) has a `SetEnv` directive 
>> which would allow these environment variables to be set, e.g.
>> 
>>  SetEnv LC_ALL en_US.UTF-8
>>  SetEnv LANG en_US.UTF-8
>> 
>> Although I'm not 100% sure where such things go, perhaps someone more 
>> conversant with apache config could chime in to suggest.
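
To check this theory on a given host, one could look at what a CGI process actually inherits. Below is a rough sketch in Python (not LiveCode, and not a description of how the engine does it internally). It lists the locale variables, asks nl_langinfo() for the codeset, and tests whether a non-ASCII name can be encoded with it. The function names are made up for this illustration.

```python
import locale
import os


def locale_environment():
    """Return the locale-related variables the current process inherited."""
    return {name: os.environ.get(name) for name in ("LC_ALL", "LC_CTYPE", "LANG")}


def reported_codeset():
    """Adopt the environment's locale and return nl_langinfo(CODESET)."""
    try:
        locale.setlocale(locale.LC_ALL, "")
    except locale.Error:
        # The environment names a locale this system does not provide.
        locale.setlocale(locale.LC_ALL, "C")
    return locale.nl_langinfo(locale.CODESET)


def can_encode(name, codeset):
    """True if every character of name is representable in codeset."""
    try:
        name.encode(codeset)
        return True
    except (UnicodeEncodeError, LookupError):
        return False


if __name__ == "__main__":
    print(locale_environment())
    codeset = reported_codeset()
    print(codeset, can_encode("testä", codeset))
```

Run once from a terminal and once as a CGI script, the two outputs should show whether LANG and LC_ALL reach the web server's child processes. Python may adjust LC_CTYPE on its own at startup when it finds the C locale, so a missing LANG/LC_ALL is probably the more reliable signal than the codeset line.
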
> Neville Smythe
> 
> 
> 
> 


