Converting from unicode to ASCII
Richard Gaskin
ambassador at fourthworld.com
Wed Sep 23 11:03:44 EDT 2020
J. Landman Gay wrote:
> I'm looking for a way to create non-unicode file names
> based on the string that comes out of the database.
Ah, public clouds...
Amazon's S3 docs say that encoding names in UTF-8 should suffice, but
they also list a lot of characters they consider "special" that common
usage doesn't consider special at all, so conflicts like this are
apparently abundant.
One workaround for their storage name limitations I've seen used
elsewhere is hash-based names, giving you a string that is plain ASCII,
of a fixed and usable length, and is derived from the file name so
systems don't need to maintain a lookup table to find the file based on
a given string.
This will give you a 40-char hex string in plain ol' ASCII that is, for
practical purposes, unique to the input:
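function CleanHash s
   -- Encode to UTF-8 bytes first so non-ASCII names hash consistently,
   -- then decode the 20-byte SHA-1 digest into 40 hex characters
   get binaryDecode("h*", sha1Digest(textEncode(s, "UTF-8")), tHash)
   return tHash
end CleanHash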
e.g.:
get CleanHash("MyFile.txt")
...returns:
d9275b8f757ce47c240d276c1e1192dae8585eba
> ...When the user selects a name from a list, the selection is munged
> to match the server name and the download URL is obtained from the
> cron job's lookup file.
>
> We don't have a field in the database for a file name.
Since a hash is derived from the file name, you don't need to maintain a
lookup table as you would with an arbitrary string like a UUID.
If I understand your problem correctly, and file identification need
only work in one direction (from the displayed name to the file), just
add the hash as part of your existing munge and you're pretty much done.
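As a rough sketch of that last step, assuming the CleanHash function
from earlier in this message is in the same script: the handler name
HashedURL and the pBaseURL parameter are made up for illustration, and
the example URL is a placeholder, not your actual server.

```livecode
-- Hypothetical helper: maps a displayed name to an ASCII-only URL.
-- pBaseURL is an assumed folder or bucket URL with no trailing slash.
function HashedURL pDisplayName, pBaseURL
   -- CleanHash returns the 40-character hex digest of the name
   return pBaseURL & "/" & CleanHash(pDisplayName)
end HashedURL

-- Example call with a non-ASCII name and a placeholder URL:
-- put HashedURL("Résumé 2020.pdf", "https://example.com/files")
```

Whatever stores the file on the server would need to derive its name
from the same string with the same function, so both sides arrive at
the same 40 characters.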
--
Richard Gaskin
Fourth World Systems
Software Design and Development for the Desktop, Mobile, and the Web
____________________________________________________________________
Ambassador at FourthWorld.com http://www.FourthWorld.com