Getting Kanji from a .csv file

Howard Bornstein bornstein at designeq.com
Sat Jun 8 01:14:20 EDT 2013


Hmmm, I tried what you suggested but it didn't seem to work.

Here's my code with your snippet inserted:

   put uniEncode (the unicodeText of field "ConvertedText",  "UTF8") into thetext

   set the useUnicode to true -- make numToChar use 16-bit chunks not bytes

   put numToChar( 0xFEFF ) before thetext

   put thetext into URL ("file:Messages.txt")

I noticed that at one point you suggested <put uniDecode( the unicodeText
of field "Processed File", "UTF8" ) into URL ("file:"&kFile2)>


If I change the first line of my code to use uniDecode:

   put uniDecode (the unicodeText of field "ConvertedText",  "UTF8") into thetext

then the entire document shows up as Kanji and lots of garbage characters.
It should contain only English and Kanji.
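
One possible explanation for the "everything turns into Kanji" symptom, sketched in Python rather than LiveCode so the bytes are easy to inspect. If the two-byte UTF-16 form of U+FEFF ends up in front of UTF-8 data, a reader that trusts the signature may decode the whole file as UTF-16, and ordinary English letters then pair up into CJK-range code points. The sample string and variable names are made up for illustration, big-endian byte order is an assumption, and whether TextEdit behaves exactly this way is not confirmed here.

```python
import codecs

sample = "English \u6f22\u5b57"          # made-up content: "English" plus two Kanji
utf8_body = sample.encode("utf-8")       # 14 bytes: 8 ASCII + 2 x 3 for the Kanji

# Mismatch: the two-byte UTF-16 signature sits in front of UTF-8 data.
mismatched = codecs.BOM_UTF16_BE + utf8_body
# A reader that trusts the signature pairs the bytes up as 16-bit units.
garbled = mismatched.decode("utf-16")

# Match: the three-byte UTF-8 form of U+FEFF sits in front of UTF-8 data.
matched = codecs.BOM_UTF8 + utf8_body
restored = matched.decode("utf-8-sig")   # "utf-8-sig" strips the signature

print([hex(ord(c)) for c in garbled])    # 'E','n' became 0x456e, and so on
print(restored == sample)
```

In this sketch the fix is to make the signature and the body use the same encoding: either add U+FEFF while the text is still UTF-16 and convert everything to UTF-8 together, or add the three-byte UTF-8 form after converting.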


On Fri, Jun 7, 2013 at 7:24 PM, Dar Scott <dsc at swcp.com> wrote:

> OK, using the Unicode byte order mark as a signature does work for
> TextEdit.
>
> The "byte order mark" is a non displaying Unicode character.  The code is
> U+FEFF.  That is, it is FEFF in base 16, which we write as the numeral
> 0xFEFF in LiveCode.  It is just a big character.  It can be used as a
> pattern, a signature, to indicate what form and encoding scheme the Unicode
> in the file uses.  It can even be used to recognize UTF8 in contrast with
> other encodings.
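
To make the "signature" idea concrete, here is a small Python sketch (not LiveCode) showing how the single character U+FEFF is written in three encoding schemes, and how a reader could use those leading bytes to guess the encoding. The function name sniff_signature is hypothetical, and UTF-32 signatures are deliberately left out to keep the example short.

```python
import codecs
from typing import Optional

# U+FEFF serialised in three encoding schemes.
print("\ufeff".encode("utf-8"))      # b'\xef\xbb\xbf'
print("\ufeff".encode("utf-16-be"))  # b'\xfe\xff'
print("\ufeff".encode("utf-16-le"))  # b'\xff\xfe'

def sniff_signature(data: bytes) -> Optional[str]:
    """Guess the encoding from a leading U+FEFF, or return None if absent.

    Simplification: UTF-32 signatures are not checked.
    """
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    return None

print(sniff_signature("\ufeffabc".encode("utf-8")))  # utf-8
print(sniff_signature(b"abc"))                       # None
```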
>
> It is sufficient for TextEdit to decide the file is UTF8.  (TextEdit is
> not that smart and relies on cheats it puts into resources, so the
> signature is important.)
>
> You can put it in front of your unicode data before you put it into the
> field or after.  It is preserved by the field (my worries were for naught).
>
> Just make sure you have it at the front of the file before you save.
>
> Here is how to put it in front of your unicode text:
>
> set the useUnicode to true -- make numToChar use 16-bit chunks not bytes
> put numToChar( 0xFEFF ) before myUnicodeText
>
> That's it!
>
> After you convert myUnicodeText (so named in my example) to UTF8 and save
> it, your file will be 3 bytes bigger than the original (that character is
> expanded to 3 bytes in UTF8).  The file can grow if you keep editing the
> same file, so once you have the above working, work on only adding it if it
> is not already there.
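
The "only add it if it is not already there" step can be sketched as follows, again in Python rather than LiveCode. The helper name ensure_utf8_signature and the sample text are made up for illustration; the sketch assumes the data has already been converted to UTF-8 bytes.

```python
import codecs

def ensure_utf8_signature(data: bytes) -> bytes:
    """Prepend the UTF-8 signature unless data already starts with it."""
    if data.startswith(codecs.BOM_UTF8):
        return data                      # already signed, leave it alone
    return codecs.BOM_UTF8 + data

original = "English \u6f22\u5b57".encode("utf-8")
once = ensure_utf8_signature(original)
twice = ensure_utf8_signature(once)      # simulates saving the same file again

print(len(once) - len(original))   # 3
print(twice == once)               # True
```

Because the check makes the operation idempotent, re-saving the same file does not grow it by another three bytes each time.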
>
> I know this is a lot to take in and I apologize for not being able to
> explain things simply.  Just ask and I will try.  Or somebody who can
> figure out what I'm saying might be able to explain it better.
>
> Dar
>
>
>
> On Jun 7, 2013, at 7:46 PM, Dar Scott wrote:
>
> > Oh, TextEdit cheats.  Did this come from TextEdit?  It puts some info in
> > the resource fork.  That is lost when you write back out.
> >
> > I'll ponder this.  Or maybe some OS X resource experts might know.
> >
> > Dar
> >
> > On Jun 7, 2013, at 5:21 PM, Howard Bornstein wrote:
> >
> >>> I don't know what characters the field might throw away.  So, putting the
> >>> file into the field and then modifying the field seems scary to me.  Maybe
> >>> all the data is there, but maybe not.
> >>>
> >>
> >> The actual command I used was <set the unicodetext of fld "ProcessedFile"
> >> to uniencode(fld "ProcessedFile", "UTF8")> (extraneous "the" in my first
> >> example)
> >>
> >> I had no problems with this. In fact, it processed a file with about
> >> 300,000 lines in just a few seconds.
> >>
> >>> And then save the field much like this:
> >>>
> >>> put uniDecode( the unicodeText of field "Processed File", "UTF8" ) into
> >>> URL ("file:"&kFile2)
> >>>
> >>
> >> I tried some variations of this but was not able to save the file from
> >> within LC and still have the Kanji viewable in TextEdit. I guess you didn't
> >> read the part about teaching to the imbecile because the rest of your
> >> explanation was way over my head.
> >>
> >> But thanks for trying.
> >>
> >> I would still like to find a way to do this from within LC.
> >>
> >> --
> >> Regards,
> >>
> >> Howard Bornstein
> >> -----------------------
> >> www.designeq.com
> >> _______________________________________________
> >> use-livecode mailing list
> >> use-livecode at lists.runrev.com
> >> Please visit this url to subscribe, unsubscribe and manage your subscription preferences:
> >> http://lists.runrev.com/mailman/listinfo/use-livecode



-- 
Regards,

Howard Bornstein
-----------------------
www.designeq.com


