Guessing the encoding of a text file...
mark at livecode.com
Fri Mar 20 03:18:23 EDT 2020
Rather than throwing ‘the baby out with the bath water’ so to speak... What are the precise cases in which the method you have fails? And why do you expect it to work in those cases?
> On 19 Mar 2020, at 20:32, Paul Dupuis via use-livecode <use-livecode at lists.runrev.com> wrote:
> This has come up many times before, but I'll ask once again in case something has changed or someone new sees this.
> Does anyone have a routine that will take a filespec to a text file and return the guessed encoding of the text file?
> First, please don't respond with "you should know the encoding" or "the users should know the encoding of their files". That is not possible in the widely uncontrolled real world.
> I do already have a routine to guess file encodings. It was written by someone else. There are instances where it should work and does not. I fear there may be errors in the algorithm and I do not have the original algorithm to check it against. Hence, I am looking for an alternative that is either free to use or to be licensed for a modest fee.
> My current routine attempts to return the encoding as a string that can be directly passed to textDecode(binaryData,encoding)
> "CP1252" *
> "MacRoman" *
> * For these last 2: if the file is MacRoman on a Windows system, you actually have to call textDecode(macToISO(data),"CP1252"), and if you have CP1252 on the Mac, you need to call textDecode(isoToMac(data),"MacRoman"). There is an enhancement request to support MacRoman decoding under Windows and vice versa at https://quality.livecode.com/show_bug.cgi?id=22391 if you want to CC yourself to show interest.
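For readers who want a baseline to compare against, here is a minimal sketch of one common guessing heuristic, written in Python rather than LiveCode purely for illustration. It is not the routine Paul describes and makes no claim to match it. It checks for a byte-order mark, then tests whether the bytes are valid UTF-8, and otherwise falls back to a caller-supplied legacy encoding. The function names, the `legacy_default` parameter, and the returned label strings are all assumptions of this sketch; check the labels against what textDecode accepts before reusing them. The sketch cannot tell CP1252 from MacRoman, which is the genuinely hard part of the problem.

```python
import codecs

# Byte-order marks, ordered so that the UTF-32 LE mark (FF FE 00 00)
# is tested before the UTF-16 LE mark (FF FE), which is its prefix.
_BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32LE"),
    (codecs.BOM_UTF32_BE, "UTF-32BE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
]


def guess_encoding(data: bytes, legacy_default: str = "CP1252") -> str:
    """Return a best-guess encoding label for raw file bytes.

    Hypothetical helper: the labels are illustrative, and the
    legacy fallback is simply whatever the caller passes in.
    """
    # 1. A byte-order mark is the strongest available signal.
    for bom, label in _BOMS:
        if data.startswith(bom):
            return label

    # 2. Bytes that decode strictly as UTF-8 are very likely UTF-8.
    #    Pure ASCII also passes this test, so it is reported as UTF-8.
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        # 3. Not UTF-8: assume a single-byte legacy encoding. Nothing here
        #    distinguishes CP1252 from MacRoman; that needs extra heuristics.
        return legacy_default
    return "UTF-8"


def guess_file_encoding(path: str, legacy_default: str = "CP1252") -> str:
    """Read a file in binary mode and guess its encoding."""
    with open(path, "rb") as f:
        return guess_encoding(f.read(), legacy_default)
```

A caller on a Mac might pass `legacy_default="MacRoman"` so that non-UTF-8 files default to the platform's historical encoding. Feeding the same failing files to a baseline like this and to the existing routine is one way to narrow down where the two disagree.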