Sun Jan 13 15:17:51 EST 2019

Hi All,

The recent conversations on using offset() with Unicode strings were very enlightening. Thanks to all who took part!

I have data stored in UTF8mb4. I use textDecode after loading it from the DB to put it into a format that LC understands. I then use offset() to find certain tags, text, etc. to work with. However, if there are emoji in that string, the offset() function hard crashes with an out-of-range error.

Due to these troubles with offset(), I’m looking for a way to remove the emoji before I have to use the offset function.

Short of compiling a list of emoji and their decimal equivalents, does anyone have a way to do this in LC?

My offset code has been rock solid, except for these rare instances where there are emoji in the text, and I am not really looking to change it if I don’t have to. I would prefer to just remove the emoji if possible.


Steve MacLean

