On the dangers of automated refactoring
andre at andregarzia.com
Tue Apr 13 06:05:52 EDT 2021
I recently read that long thread that almost got people banned, and I will not comment on it. What I want to comment on is the kernel of the activity that was mentioned there: refactoring.
Often in LiveCode (and in most programming languages, to be honest) we go on coding for a long while and then realise that our code needs extensive refactoring. We may have repeated a pattern over and over again and discovered that we need to change every instance of it, or something similar.
There are small cases of refactoring, such as renaming a variable in a single script, that can be done easily (and quite safely) with find & replace tools. Others are much more complex, and attempting them over multiple scripts in a large project will result in crying and maybe needing a drink or a hug.
Examples of really smart IDEs, considered the most advanced in terms of refactoring, are the ones based on JetBrains IDEA, such as IDEA itself, Android Studio, WebStorm, etc. A key part of how these IDEs do refactoring is that they have deep knowledge about the source code being written. The code is constantly parsed and assembled into an AST (abstract syntax tree) that is exposed internally to the refactoring tools. When you refactor code in these IDEs, you’re not really working with text; you’re telling the IDE to manipulate a tree in ways the IDE knows how to manipulate such a tree. That is why when you “rename a symbol” or “extract selected code into method in enclosing scope” or whatever you do in these IDEs, you end up with the expected result.
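To make the idea of "manipulating a tree" concrete, here is a minimal sketch in Python (not LiveCode, which has no such tooling as far as this post is concerned). It uses Python's standard `ast` module and needs Python 3.9+ for `ast.unparse`. The sample source and the names `count`, `counter`, and `tCount` are made up for illustration, and the rename is deliberately simplistic: it is not scope-aware the way a real IDE refactoring is.

```python
import ast

# Hypothetical sample source: a variable, a similarly named variable,
# and a string literal that happens to contain the variable's name.
SOURCE = '''
count = 0
counter = 10
print("count is", count)
'''


class RenameSymbol(ast.NodeTransformer):
    """Rename identifier nodes only; strings and other names are untouched."""

    def __init__(self, old, new):
        self.old = old
        self.new = new

    def visit_Name(self, node):
        # A Name node is an identifier in the tree, never a piece of text
        # inside a string literal or a fragment of a longer identifier.
        if node.id == self.old:
            node.id = self.new
        return node


def rename_symbol(source, old, new):
    tree = ast.parse(source)                   # text -> tree
    tree = RenameSymbol(old, new).visit(tree)  # manipulate the tree
    return ast.unparse(tree)                   # tree -> text


result = rename_symbol(SOURCE, "count", "tCount")
print(result)
```

Because the transformer only visits identifier nodes, `counter` and the text inside the string literal are left alone. A real refactoring tool goes further and also resolves which declaration each identifier refers to.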
Find and Replace dialogs, or even custom plugins in LiveCode, don’t have the same advanced capabilities. You’re usually working with text and hoping that whatever RegEx you’re applying is error-free. And by error-free I don’t mean that it is a “valid regex”, I mean that “it does what you expect, and your expectations are correct”. It is very hard to apply script transformations like that. You can’t be sure they’ll work for every little replacement, and in the cases where they don’t work, the bugs introduced might be too subtle to notice. Let me tell you folks a recent story in which I tried to do exactly that and shot myself in the foot.
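As a small illustration of a regex that is valid but does not do what you expect, here is a sketch in Python using the standard `re` module. The sample text and the names `count`, `counter`, and `tCount` are invented for the example; they are not from the project described here.

```python
import re

# Hypothetical script text: we want to rename the variable `count` to `tCount`.
SOURCE = 'count = 0\ncounter = 10\nprint("count is", count)\n'

# Naive pattern: also rewrites the start of `counter` and the string literal.
naive = re.sub(r"count", "tCount", SOURCE)

# Word boundaries protect `counter`, but the regex still cannot tell an
# identifier from the same word inside a string literal.
bounded = re.sub(r"\bcount\b", "tCount", SOURCE)

print(naive)
print(bounded)
```

Both patterns are perfectly valid. The first one silently renames an unrelated variable, and the second one still changes a message shown to the user. Neither failure produces an error at replace time.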
I’m dealing with a very large LC app. Very large: thousands and thousands of lines spread across a gazillion stacks, behaviors, and libraries. Some of these files needed refactoring. Among the various tasks I needed to do was applying our “variable naming scheme” to the scripts, because there were variables using the wrong prefixes.
Naturally, I tried being smart with find & replace. I even went as far as extracting the scripts into an editor with more features (such as RegEx replacing) and trying my best to identify and replace the names I needed with vast swoops of RegEx.
All the replacements worked like I wrote them.
What I didn’t realise was that there was variable shadowing going on: some handler arguments had the same name as script-local variables. My smart replacing removed those arguments, because it looked like there was no need to redeclare the script-local vars. I didn’t realise at the time that those were real arguments being passed to the handlers. They just happened to have the same name as script-local vars in the same script, and were in fact shadowing them.
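The same trap can be sketched in Python, with a module-level variable standing in for a script-local variable and a function standing in for a handler. All names here (`total`, `report_before`, `report_after`) are hypothetical, and Python's scoping rules are only an analogy for LiveCode's.

```python
total = 100  # module-level variable, standing in for a script-local var


def report_before(total):
    # The parameter shadows the module-level `total`:
    # inside this function, `total` is whatever the caller passed in.
    return total * 2


def report_after():
    # After the "cleanup" removed the seemingly redundant parameter,
    # the very same body now reads the module-level `total` instead.
    return total * 2


before = report_before(5)  # uses the argument
after = report_after()     # uses the module-level variable

print(before, after)
```

The body of the function did not change by a single character, yet it now computes from different data. That is why this kind of breakage is hard to spot by reading the diff.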
I broke all the source code. It took me a long time to work out which handlers needed arguments, and which didn’t need them and were actually using the script-local vars.
I tried being smart fixing broken code, and for a while it became more broken.
Refactoring is hard.
We don’t have a system to create and manipulate an LC AST.
We don’t even have unit testing libraries so that we can make sure our code works as expected.
Avoid large automated refactoring at all costs; it is not worth it. Do it manually. It will be slower, but it will be safer.