Doing user testing in a risk free way (Re: gitter)
Mark Waddingham
mark at livecode.com
Sun Jul 30 01:16:04 EDT 2017
On 2017-07-30 01:04, Monte Goulding via use-livecode wrote:
> I’m actually not sure I see the connection between the ide submodule
> and the 8.1.6-rc-2.
Heh - the connection is indirect, however it is definitely there!
**The goal:
We want to do A/B testing on LiveCode's 'first run' experience. This
experience has to go from finding the site, all the way through to
downloading, installing, activating and actually using the product to
ensure the data that we get is of the highest and most useful quality.
The experience of any new user (who is being used as a guinea pig) has to
be indistinguishable from that of all other new users, except for the
specific test flow they have been allocated to.
**The problem:
Iteration on develop branch is too slow and we can't guarantee with a
high degree of confidence that any one DP would be in any way suitable
for a brand new user to use. (We still reserve the right to make
absolutely blooping mistakes in develop DPs - obviously we try not to,
but we need to be able to test out invasive changes somewhere, and that
is where we currently can!).
We have a well-defined and tightly controlled process for our
maintenance branch. It is really important that this is kept as it is,
and it is definitely *not* the place to do any sort of testing of any
features whatsoever.
For A/B testing to work, tests need to be able to be rolled out quickly,
iterated quickly, and pulled quickly if they are unsuccessful, or
indeed, ill-posed.
**The primary observation:
A/B testing is centered around the outermost, user-visible parts of any
release. Specifically, the IDE and supporting materials. Any engine work
that might be required will either be new (internal) additions to
support user-visible work or bug fixes (almost always the latter).
**The proposed solution:
All A/B tests are built *upon* but not *into* the maintenance builds.
If a specific test *requires* a bug fix, it will be placed in the first RC
of the next maintenance build. If this means a specific test has to be
deferred a short while before it can be deployed, then that is what must
happen.
(We have plenty of tests we could run, after all - it is not like we are
going to run out anytime soon).
If a specific test *requires* an internal engine addition to achieve,
then suitable analysis will be done to make sure it can and does have
zero impact on the potential stability of a maintenance build.
Basically, these things are constrained to internal features which are
entirely 'bolted' on to the existing engine and have no interoperation
with the existing code. Ideally, such things would be a widget/library
or a similar extrinsic addition.
A test will be created by making a fork of the target maintenance
release branch in the IDE repository. All work for the specific test
happens on this branch. All active test branches are kept up to date, by
merging progress of the underlying maintenance branch/release branch
(e.g. develop-8.1 or release-8.1.7) into them.
Tests are not built into the maintenance releases - the entire IDE
bundle for that test will be uploaded as a separate archive.
When a new user who is being put through a test generates a license, the
license will be marked with the name of the test.
When the user activates and launches the IDE for the first time, the
appropriate test (IDE) archive will be downloaded and *that* version of
the IDE will be launched, rather than the one built into the release.
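As a rough illustration of the launch step just described, here is a minimal Python sketch. It is not LiveCode's actual implementation: the `License` class, the `select_ide_bundle` function, the test names and the returned bundle identifiers are all hypothetical. Falling back to the built-in IDE when a license names a test that is no longer active is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class License:
    # Name of the A/B test recorded on the license when it was generated,
    # or None for a license that is not part of any test (hypothetical field).
    test_name: Optional[str] = None


def select_ide_bundle(lic: License, active_tests: Set[str]) -> str:
    """Return an identifier for the IDE bundle to launch on first run."""
    if lic.test_name is not None and lic.test_name in active_tests:
        # Test users get the separately uploaded IDE archive for their test.
        return f"test-ide/{lic.test_name}"
    # Everyone else (assumed to include licenses naming a test that has
    # since been deleted) gets the IDE built into the maintenance release.
    return "builtin-ide"


# Hypothetical test names, for illustration only.
active = {"new-first-run-tutorial"}
print(select_ide_bundle(License("new-first-run-tutorial"), active))
print(select_ide_bundle(License(), active))
```

The point of the sketch is that the maintenance release itself never contains test code: the only thing that changes per user is which IDE archive is picked at launch.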
If a test is successful, it will be merged into the next *suitable*
version.
For small entirely first-run related things this might well be a
maintenance release (e.g. new first run tutorials, new materials, new
default preferences for first-run users).
For larger things (such as autocomplete or handler lists) we will use a
bump in the *middle* version number of semver, and use a DP release to
make sure we got it absolutely right.
Importantly, the latter has no impact on what new users see. We can merge
all successful tests into all future tests so our test profiles build
on what we have done before, i.e. new users will see the combined set of
successful tests, whilst existing users will be able to help us finesse
them and ensure that they are as good as they can be before appearing in
a minor version update rather than a maintenance update.
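To make the "middle version number" bump for larger features concrete, here is a small Python sketch. The function name `next_minor_dp` is hypothetical, and it assumes version strings of the form `major.minor.patch` with DP builds suffixed `-dp-N`, as in the 8.1.6 and 8.2.0-dp-1 names used in this thread.

```python
def next_minor_dp(maintenance_version: str, dp_number: int = 1) -> str:
    """Given a maintenance version such as '8.1.6', return the name of a
    DP build of the next minor version, such as '8.2.0-dp-1'."""
    major, minor, _patch = (int(part) for part in maintenance_version.split("."))
    # Bump the middle (minor) number and reset the patch number to zero.
    return f"{major}.{minor + 1}.0-dp-{dp_number}"


print(next_minor_dp("8.1.6"))  # 8.2.0-dp-1
```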
If a test is unsuccessful, the test just gets deleted and the code for it
is never merged (but of course, we'll keep the code - it is something we
might want to test again in the future).
**Why it works:
The issue which occurred with 8.1.6-rc-2 was due to the fact we were
iterating on the actual releases which existing users rely on. This
approach means that existing users will not see, nor even have the code
in their system for any of the tests which we run. We can keep the
strict process for maintenance releases we have always had, whilst still
using them to iterate rapidly on the first-run A/B testing side.
**What is the connection to the IDE submodule?
If the IDE were rolled into the engine repository, the above would not
work. We would not be able to easily guarantee that an A/B test *was*
actually based on a line-for-line identical version of the engine as the
maintenance release. It *physically* forces us to observe the boundaries
laid out above - which means that the risk of error is diminished to
close to zero - everything can be tested where it should be, in the
proper process.
Of course, it would work *better* if we had our repo structure the right
way round - the IDE should really have the engine as a submodule.
However, nothing is perfect, and the current situation can be made to
work perfectly well, even if it is not as pure as it could be.
**How can all our existing users help with our user testing?
Well, the above approach has one further *really* useful aspect. For any
licenses which are not marked with a test funnel (or indeed are marked
with a dead test funnel) we can show a drop-down list in preferences for
all currently available tests.
Choosing one would reboot the IDE into that test's IDE. The result? Any
of you can play around with the things we are testing and give us
feedback on them without it affecting your workflow or use of the IDE at
all.
**When will this happen?
Well, soon, hopefully. We are already looking to do an 8.2.0-dp-1 with
the autocomplete features as we don't really have time to get this new
process up and running for our next new user testing session - and
neither Ali nor I are in any way comfortable with doing it 8.1.6 style.
We *think* we have time to do an 8.2.0-dp-1 and *hopefully* an
8.2.0-rc-1 in that timescale if you all don't mind helping us ensure the
feature is sound (Monte has done a great job on it so far - so hopefully
the fact it is autocomplete will be enough of an incentive to try out
these builds and rapidly help us get it to RC state).
So, yes - Mark and Dan you were right - I cannot say that you weren't.
Our previous approach was the wrong thing to do, and we only did it that
way because we couldn't see how to do it in another way. Well, I think
we have now, as the above explains.
Warmest Regards,
Mark.
--
Mark Waddingham ~ mark at livecode.com ~ http://www.livecode.com/
LiveCode: Everyone can create apps