Finding words with diacriticals
dunbarx at aol.com
Sun Mar 15 17:48:23 EDT 2020
I may not really understand what you want, but doesn't the "Find string" variant solve your problem?
Say you have a field 1 with "cat" on line 1, "cât" on line 2 and "cat" on line 3; that is, the line 2 word has numToChar(137) in place of the standard "a".
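on mouseUp
   -- locate the first occurrence of the accented character in the field
   find string numToChar(137) in fld 1
   -- count the words up to the found character to get the word number
   put the number of words in char 1 to word 2 of the foundChunk of fld 1 into temp
   answer "Word" && temp && "=" && word temp of fld 1
end mouseUp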
The point being that once you have the result of "find String", you can engineer all the other stuff you need, such as the words that contain the odd char, the lines they reside in, etc.
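As a rough sketch of that follow-on work, a plain loop over the text can list every word that contains the odd char together with its line number. The handler name wordsContaining and its parameters are made up for illustration, and "word" here is LiveCode's whitespace-delimited chunk, so attached punctuation stays on the word.

```livecode
-- hypothetical helper: returns one "lineNumber,word" entry per line
-- for each word of pText that contains pChar
function wordsContaining pChar, pText
   local tLineNum, tResult
   put 0 into tLineNum
   repeat for each line tLine in pText
      add 1 to tLineNum
      repeat for each word tWord in tLine
         if tWord contains pChar then
            put tLineNum & comma & tWord & return after tResult
         end if
      end repeat
   end repeat
   -- drop the trailing return (harmless if nothing was found)
   delete the last char of tResult
   return tResult
end wordsContaining
```

It could be called as: put wordsContaining(numToChar(137), fld 1)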
From: Peter Bogdanoff via use-livecode <use-livecode at lists.runrev.com>
To: How to use LiveCode <use-livecode at lists.runrev.com>
Cc: Peter Bogdanoff <bogdanoff at me.com>
Sent: Sat, Mar 14, 2020 7:48 pm
Subject: Finding words with diacriticals
I have a text search in which I’m trying to improve the UI.
I have this text:
Edgard Varèse (Poème électronique) was a pioneer in the application of tape recording technology to composition.
The search database, built with Scott McDonald’s rrpSearch plugin, can only be searched using the exact characters. So, I’m building a supplementary array of words with alternate spellings that the user might type in the search box. I would reference the array to get an equivalent word and so provide the user with a usable result.
So if the user types in “poeme” — I would find “poeme” in the array and its equivalent “Poème” and I would actually search for “Poème” — and the user would get a result that included “Poème”.
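A minimal sketch of that lookup, assuming a script-local array keyed by the unaccented lowercase spelling. The names sEquivalents, addEquivalent and searchTermFor are hypothetical, and the replace list covers only the four characters mentioned here; a real version would need one line per accented character in the text. If two accented words reduce to the same plain spelling, the later one overwrites the earlier.

```livecode
local sEquivalents -- hypothetical script-local array: plain spelling -> accented word

-- store pWord under its unaccented, lowercase form
command addEquivalent pWord
   local tPlain
   put toLower(pWord) into tPlain
   replace "è" with "e" in tPlain
   replace "é" with "e" in tPlain
   replace "ü" with "u" in tPlain
   replace "î" with "i" in tPlain
   put pWord into sEquivalents[tPlain]
end addEquivalent

-- return the accented equivalent if one is known, else what the user typed
function searchTermFor pTyped
   local tKey
   put toLower(pTyped) into tKey
   if sEquivalents[tKey] is not empty then return sEquivalents[tKey]
   return pTyped
end searchTermFor
```

After addEquivalent "Poème", a call to searchTermFor("poeme") should hand back "Poème" for the real search.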
So I want to build this array of word equivalents. The search database is built by rrpSearch from text on cards, so I have to go back to these cards to get my data. I’m using the find command to search cards to find every instance of “è” or “é” or “ü” or “î” or whatever. There are many non-English words in the text. The foundText function should give me the words that contain that character—except it doesn’t in every case. It only finds words that BEGIN with the search text. So
électronique — found (char begins the word)
Varèse — not found (char is in middle of the word)
Poème — not found (char is in middle of the word)
I’m using “find” and “the foundText” which returns the whole word that contains the search character. No other form of find will return the whole word. The dictionary for foundText:
<For example, the command find "hurl" can find any word that starts with the string "hurl", such as "hurling" or "hurler". In this case, the entire word --not just the portion specified in the find command --is surrounded by a box, and the foundText returns the entire word.>
Is there another relatively simple way to get the whole word in which the desired characters live? There are dozens of fields on thousands of cards to search.
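One possible approach that does not rely on find at all is to walk every field of every card and test each word directly. The sketch below is only that: the handler name wordsWithChar is made up, it assumes LiveCode 7 or later (for the trueWord chunk, which leaves off surrounding punctuation), and it has not been timed against thousands of cards.

```livecode
-- hypothetical helper: returns a return-delimited list of the distinct
-- words in this stack's fields that contain pChar
function wordsWithChar pChar
   local tFound, tText, tCard, tField
   repeat with tCard = 1 to the number of cards of this stack
      repeat with tField = 1 to the number of fields of card tCard
         put the text of field tField of card tCard into tText
         -- skip fields that cannot contain a match
         if pChar is not in tText then next repeat
         repeat for each trueWord tWord in tText
            -- array keys act as a set, so repeated words collapse
            if tWord contains pChar then put true into tFound[tWord]
         end repeat
      end repeat
   end repeat
   return the keys of tFound
end wordsWithChar
```

Because the match is tested with "contains", a word is picked up wherever the character sits, so "Varèse" and "Poème" would be collected as well as "électronique".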
(I realize that there are far better ways to handle a search, and in the future, I will have a database that I will design myself--but not yet.)