Unicode is not "everywhere"...
paul at researchware.com
Thu Aug 22 21:07:10 EDT 2019
On 8/22/2019 8:46 PM, Monte Goulding via use-livecode wrote:
> Both of these are anomalies we could only resolve with new syntax I think… at least the urlEncode one is. I’m not sure if the expectation of shell is it returns text or binary data… The workaround there would be to open process for UTF8 text read instead of using shell... not sure if UTF8 would be right on Windows… possibly UTF16 there.
> Regarding url encoding the anomaly bug is https://quality.livecode.com/show_bug.cgi?id=14015 so your report should be closed as a duplicate of it I suspect.
> Probably the simplest way to resolve the detailed files/folders issue is to have a new parameter for the files and folders function to return an array. Anyone want to suggest a name for the parameter?
I reported what I thought were 3 bugs in 1 report in
https://quality.livecode.com/show_bug.cgi?id=22213. I have edited that
report to focus on a single bug: the detailed files (and probably
the detailed folders) is broken for Unicode, as every Unicode character
in a file name is encoded as %3F or ?. Originally I had thought that
meant there was a problem with urlEncode and urlDecode, as per bug
https://quality.livecode.com/show_bug.cgi?id=14015 that you reference,
Monte. However, I now consider that urlEncode and urlDecode are NOT
broken and that bug 14015 is really a Documentation bug: the urlEncode
and urlDecode Dictionary entries should be updated.
If you look at the Wikipedia entry for URL (or percent) encoding, it
states that the standard practice for non-ASCII characters in a URL is to
encode them as UTF8 BEFORE percent-encoding. Therefore, urlDecode and
urlEncode are working correctly by the accepted standards. The
Dictionary entries need updating to note that any non-ASCII text should
be UTF8 encoded before urlEncoding and UTF8 decoded after urlDecoding.
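
To illustrate the convention described above, here is a minimal sketch in
Python rather than LiveCode (the sample file name is made up). It converts
the text to UTF8 bytes first, percent-encodes those bytes, and reverses the
two steps to get the original text back. In LiveCode the equivalent would
presumably be wrapping urlEncode around textEncode with "UTF-8", and
textDecode around urlDecode, but that mapping is an assumption and is not
tested here.

```python
from urllib.parse import quote, unquote

# Hypothetical file name containing non-ASCII characters.
name = "résumé.txt"

# Step 1: UTF8 encode the text. Step 2: percent-encode the resulting bytes.
utf8_bytes = name.encode("utf-8")
encoded = quote(utf8_bytes)

# Reverse: percent-decode, then interpret the bytes as UTF8 text.
decoded = unquote(encoded, encoding="utf-8")

print(encoded)
print(decoded == name)
```

Each "é" becomes two percent-encoded bytes, and decoding restores the
original name exactly.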
See my DOC bugs:
I believe these replace bug
Also, the detailed files and detailed folders don't need to return an
array (although that would be nice). The issue is that the
percent-encoding is not following the standard convention of UTF8
encoding non-ASCII characters before percent-encoding. LC should follow
industry conventions in this regard.
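
A short Python sketch of why the %3F result matters (the function names and
file names are hypothetical, and the "replace with ?" step is only an
assumed stand-in for whatever the engine does internally): once every
non-ASCII character has been turned into ?, different file names become
indistinguishable and the original cannot be recovered, whereas UTF8
encoding before percent-encoding round-trips cleanly.

```python
from urllib.parse import quote, unquote

def lossy_percent_encode(name):
    # Assumed stand-in for the reported behavior: characters outside
    # ASCII are replaced with "?" before percent-encoding.
    return quote(name.encode("ascii", errors="replace"))

def utf8_percent_encode(name):
    # Standard convention: UTF8 encode first, then percent-encode.
    return quote(name.encode("utf-8"))

# Two hypothetical, distinct file names.
names = ["日本.txt", "中国.txt"]

lossy = [lossy_percent_encode(n) for n in names]
standard = [utf8_percent_encode(n) for n in names]

# The lossy results are identical, so the two files cannot be told apart.
print(lossy)

# The UTF8-based results decode back to the original names.
restored = [unquote(s, encoding="utf-8") for s in standard]
print(restored == names)
```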