Has Anyone Got A Directory "Walker" Available
alex at tweedly.net
Sun May 6 19:12:59 EDT 2018
On 06/05/2018 20:12, Richard Gaskin via use-livecode wrote:
> Ah, that makes sense. It's not so much the recursion per se, but the
> more general advantage of inline processing vs handler calls.
> In the test case below you know how many levels deep you're going,
> which seems a good fit for inline loops. But how do we handle cases
> like directory traversal, where we don't know in advance how deep
> we'll need to go?
In this test case I do know how many levels - but the code makes no use
of that knowledge, it simply exits when we get down to 0.
For an example of how to handle directory recursion, see Ben's earlier
post in this thread. Or I've included my current version below - it
returns a 2-level array of folder / file / data rather than a flat list
of paths.
> It's at least comforting to see that while the overhead of handler
> calls is significant, it's mostly in relative comparison: If I read
> that correctly, there are 1000 iterations of 100 operations, for a
> total of 100k operations. If my coffee is working, that means an
> overhead of just 0.00208 ms per operation when called in a separate
> handler, no? (thinking 273ms total for separate handler - 65ms for
> inline, per 100k operations).
Yep - though it gets higher with more parameters. My recursive 'walker'
needed 4 parameters, so the overhead was higher.
Note that if I was writing a recursive tree-walker now, I'd make it a
private function, so I could keep the parameter checks only at the top
(i.e. public) level, and recurse without repeating them.
I might even move most of the parameters into script-local variables,
since I know the code is interrupt-safe.
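The same shape works in any language; here is an illustrative Python sketch (the names `depth_sum` / `_depth_sum` are mine, not from the LiveCode above): validate the arguments once in the public entry point, then recurse through a private helper that skips the checks.

```python
def depth_sum(tree):
    """Public entry point: parameter checks live here, and only here."""
    if not isinstance(tree, (list, int)):
        raise TypeError("tree must be an int or a nested list of ints")
    return _depth_sum(tree)

def _depth_sum(tree):
    """Private helper: recurses freely without repeating the checks."""
    if isinstance(tree, int):
        return tree
    return sum(_depth_sum(child) for child in tree)
```

So `depth_sum([1, [2, 3], [[4]]])` walks three levels but pays the validation cost only once, at the top.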
> With directory traversal, I'd guess the overhead of physically
> touching each inode would outweigh that manyfold.
Yeah - though it's particularly hard to test adequately. When I was
trying to benchmark it, I found that to get consistent results, I had to
run the test 2 or 3 times and ignore the first, highly variable, run.
Which basically means that I was seeing results with the disk-cache
fully engaged, and therefore likely to underestimate the importance of
the disk operations in any real context.
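That warm-up effect is common to benchmarking in any language; a hypothetical Python sketch of the "run several times, ignore the first" approach (with the same caveat: the surviving runs measure a warm cache, so they understate real disk cost):

```python
import time

def bench(fn, runs=3):
    """Time fn several times and discard the first run, which
    typically pays for cold caches (disk cache, compilation, etc.)."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return timings[1:]  # ignore the first, highly variable, run
```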
> Still useful to keep in mind: when inline code can do the job clearly,
> it will be more efficient.
Below is my current non-recursive walker.
NB - it goes to some trouble to allow you to pass in a relative path
name, and maintain that in the result. That is the opposite of what is
commonly done (i.e. using absolute path/file names); I did this to make
it easier to compare multiple trees. If you want the traditional result:

   put the defaultfolder into temp   -- gives the absolute path
   getArrayOfFiles temp, tMyArray

but if you want "my" twist, you'd do:

   getArrayOfFiles ".", tMyArray
# Produce an array of folders + files with the "full" relative paths
command getArrayOfFiles pFolder, @pA, pIgnoreDotFolders, pIgnoreDotFiles
   local tFiles, tFolders, tThisFolder
   if pIgnoreDotFolders is empty then
      put TRUE into pIgnoreDotFolders
   end if
   if pIgnoreDotFiles is empty then
      put TRUE into pIgnoreDotFiles
   end if
   put the defaultfolder into tThisFolder
   local tAbsFolder, tFoldersLeft, tSub
   set the defaultfolder to pFolder
   put the defaultfolder into tAbsFolder -- changes a relative path into an absolute one
   put CR into tFoldersLeft
   repeat until tFoldersLeft is empty
      put line 1 of tFoldersLeft into tSub
      delete line 1 of tFoldersLeft
      set the defaultFolder to (tAbsFolder & tSub)
      if the result is not empty then
         next repeat -- skip folders we cannot enter
      end if
      put folders() into tFolders
      if pIgnoreDotFolders then
         filter tFolders without ".*"
      end if
      -- always drop "." and ".." so we never loop back on ourselves
      filter tFolders without "."
      filter tFolders without ".."
      repeat for each line L in tFolders
         put tSub & "/" & L & CR after tFoldersLeft
      end repeat
      put the detailed files into tFiles
      if pIgnoreDotFiles then
         filter tFiles without ".*"
      end if
      repeat for each line L in tFiles
         put item 2 to -1 of L into pA[pFolder & tSub][item 1 of L]
      end repeat
   end repeat
   set the defaultfolder to tThisFolder -- restore the caller's defaultfolder
end getArrayOfFiles
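For comparison, the same worklist idea (no recursion, relative paths preserved, unreadable folders skipped) can be sketched in Python; the function name and the {relative folder: {filename: size}} result shape here are my own illustration, not part of the LiveCode handler:

```python
import os

def get_array_of_files(folder, ignore_dots=True):
    """Non-recursive walk: keep a worklist of relative sub-paths,
    return {relative_folder: {filename: size}}."""
    result = {}
    worklist = [""]  # start at the root, like the initial empty tSub
    while worklist:
        sub = worklist.pop(0)
        current = os.path.join(folder, sub) if sub else folder
        try:
            entries = list(os.scandir(current))
        except OSError:
            continue  # skip folders we cannot enter
        files = {}
        for entry in entries:
            if ignore_dots and entry.name.startswith("."):
                continue
            if entry.is_dir(follow_symlinks=False):
                worklist.append(os.path.join(sub, entry.name))
            elif entry.is_file(follow_symlinks=False):
                files[entry.name] = entry.stat().st_size
        result[sub] = files
    return result
```

As with the LiveCode version, passing a relative `folder` keeps relative keys in the result, which makes two trees easy to compare key-by-key.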