Line numbers for soft-wrapped styled text?
hh
hh at hyperhh.de
Sun Apr 2 20:24:17 EDT 2017
Alex,
a brilliant technique. It is certainly usable for other "searching"
tasks. And it simply needs for this use case a faster measurement-method
as input. We could try the selectedLoc:
The script below gets for me the exakt topLefts of the visible ones of
20000 "worse case"-lines in < 1 second, with LC 9, and in LC 6 it is
three times faster, on a medium fast machine (2.5 GHz macMini).
I'm sure you can take the following quick and dirty routine again down
to at least 50% of needed time with your optimization technique.
Hermann
## Commented for occasional readers.
## You still has to use the locations for a numbers display: Use either
## one num-field per line or one for all using Alex's method of space below.
local rg=TEXT", l0, t0, b0, v0
local gpoints -- collects the toplefts of the lines
on updateNbs2
put the millisecs into m1
lock screen; lock messages
put the top of fld rg into t0
put the left of fld rg into l0
put the bottom of fld rg into b0
put the height of fld rg into h0
put the selectedChunk into sc
put the vscroll of fld rg into v0
put the num of lines of fld rg into nL
put visibleTextLines(nL) into tL
put "Lines: " & tL & " of " & nL into fld "range"
put gPoints into fld "info" -- ready to use
-- avoid bug(?) with the field num:
if sc is not empty then
do ("select "& (word 1 to 4 of sc) &" of fld ""e& rg "e)
end if
set the vscroll of fld rg to v0
put (the millisecs-m1) & " ms" into fld "timing"
unlock screen; unlock messages
end updateNbs2
-- returns the visible lines range and (in gPoints) the toplefts
-- n is the num of lines
function visibleTextLines n
put the scrollbarWidth of fld rg into sw
put the margins of fld rg into m
put (m,m,m,m) into m -- now we have always at least 4 items
put findTopLine(v0+t0-1+item 2 of m,1,n) into L1
put L1-1 into L2; put v0+b0+6-item 4 of m into x
put empty into gPoints
-- now a simple line-after-line check could/should be made faster,
-- its advantage: We get at the same time already the locations!
repeat while L2 <= n
add 1 to L2
set vscroll of fld rg to v0 -- important for measuring
select before line L2 of fld rg
put item 2 of the selectedLoc into slc
put cr & (l0,slc) after gPoints -- adjust here the left
if (the vscroll of fld rg) + slc > x then
put line 2 to -2 of gPoints into gPoints
exit repeat
end if
end repeat
return L1,L2-1
end visibleTextLines
-- the simplest recursive method as base for optimization
function findTopLine x,n1,n -- x=top pixel value, n1=start, n=max
put n1+((n-n1) div 2) into m
set vscroll of fld rg to v0
select before line m+1 of fld rg
put the vscroll of fld rg + item 2 of the selectedLoc into vsl
if vsl >= x then
set vscroll of fld rg to v0
select before line m of fld rg
put the vscroll of fld rg + item 2 of the selectedLoc into vsl
if vsl < x then return word 2 of the selectedLine
else
if m <= n1 then return n1
else return findTopLine(x,n1,m-1)
end if
else
if m >= n then return n
else return findTopLine(x,m+1,n)
end if
end findTopLine
## The field's script is again:
on textchanged
updateNbs2
end textchanged
on scrollbardrag
updateNbs2
end scrollbardrag
### END_OF_SCRIPT
> Alex T. wrote:
> Sure, here it is.
>
> I couldn't resist doing some simple benchmarks, just to verify my
> intuition. Very glad I did.
>
> OK - here's the theory :
>
> we're using variations of binary search to find the lowest numbered
> line that is visible, and then again to find the highest numbered line.
>
> So at each stage we have an interval within which the right answer must
> lie.
>
> 1. Simple binary search : the next guess is the middle of the current
> interval.
>
> This is simple, and very consistent for the number of guesses needed:
> the worst case is almost identical to the average/typical.
>
> For my test case (25000 lines of heavily styled text), we need approx 14
> guesses for the first calculaiton, and 8 for the second.
>
> 2. Simple Newton-Raphson (linear interpolation).
>
> This is usually better than simple binary, *if* there is a reasonable
> correlation between the measurable values and the guessable values. In
> this case there - the height of any chunk of lines is reasonably
> correlated to the number of lines, though not exactly in all cases.
>
> Usually this will take significantly fewer guesses (and in this case it
> takes around 8 + 4 compared to the 14+8 above).
>
> BUT - the worst case can be much, much worse :-( Imagine a field with
> 1000 lines - the first 500 are in 5-point text, and the other 500 are in
> 1000 point text, scrolled forward by 400 lines; this makes our guessing
> VERY poor, and will take up to 200 or more guesses.
>
> 3. Use a mix of linear interpolation and binary search. I tried simply
> alternating them - one calculated guess, then one 'average' guess, then
> another calculated one, ....
>
> This should significantly limit the worst case - but makes the average
> somewhat worse.
>
> 4. Use linear interpolation - but disallow very small (or very large)
> estimates.
>
> This again limits the worst case (not quite so well), and makes the
> average only slightly worse.
>
>
> I tried each of these strategies on a range of scroll positions in my
> test case field. Results were
>
> (1) Binary 6283
>
> (2) N-R 3757
>
> (3) N-R/binary 4200
>
> (4) N-R limited 3782
>
>
> So - the script below does (4) above - linear interpolation, with the
> very small or very large percentage guesses disallowed. But, I've left
> in the code (commented out) to also do alternating binary guesses, so it
> can be easily used if your use case is very sensitive to extreme examples.
More information about the use-livecode
mailing list