Where/When to Test Web App Labels

Photo by World Travel Adventures

Why do we test our code? The simple answer is to gain as much confidence as we can that a) our code works as expected, and b) our users can accomplish the task(s) they expect to.

But there’s a much larger discrepancy (and grey area) between testing back-ends and testing front-ends. While back-ends typically boil down to input/output workflows, UIs tend to be… well, more variable and complicated, given user journeys that are non-linear and frequently compete with each other.

Front-ends have evolved two main types of tests to help combat the increasing complexity of testing as a project grows: integration tests (e.g. Testing Library) and E2E tests (e.g. Playwright). Both have their pros and cons, and deserve a spot in the testing stack. Okie-dokie.

But most development teams have a lingering, unanswered question that inevitably leads to lots of sloppy and/or overlapping tests: WHERE DO WE TEST UI LABELS? (Or any other copy/text.) It’s worth breaking this fundamental question down further (à la the 5 Whys) before we try to solve it.

Q: Why do we want to test labels? Do we need to?

Think about this hard. Yes, in theory it’s great to test EVERYTHING, but there is a high cost associated with testing, in both implementation and maintenance. Think about your car: wouldn’t it be safer to take it to the mechanic every day to test it? Well yes, but no one does because the cost is too high, in both time and money. Even just a daily walk-around inspection would be beneficial, but most of us never do that. If there’s an occasional ding or scratch, then so be it! When we catch it randomly, we’ll deal with it then.
Some text is more important to test than other text. Making a prioritized list might help.

Q: What is the ratio of work between changing a label and updating the test for that label?

If it’s always one-to-one, then is the test useful, or simply a time sink of busy-work?

Q: When do labels change? (How flexible/changeable do labels need to be?)

Let’s assume the answer to this question is usually “anytime” or “frequently”. Because really the fact of the matter is, the Product Manager should be able to do market research, and come back to the dev team with ANY cosmetic change to the product, including any text. Whenever.

Q: How willing are we to break tests (or stop deployment) when labels change?

This is a critical cost to weigh! Seriously. So many test suites begin to fall apart when tests start to fail but nobody on the team cares enough to fix the “non-essential” broken ones, since they’re working on new, critical features after all! Soon after, real failures are hard to suss out from the false-positive noise, trust in the testing system starts to spiral downward, and quality takes a back seat as deployments become a sad ritual of superstition and acting “sick” to avoid troubleshooting a bad deployment. (Yes, I’ve seen this happen many times.) Don’t let labels do this to your project.

OK, I think the point is clear. First think carefully about testing labels/text!

Now, let’s say that you’ve made a prioritized list of the critical labels in the system — for example, CTA buttons, page titles, and navigation buttons. Great! Now… should we test these in E2E (i.e. browser level), or integration (i.e. web component level)?

To answer that, let’s explore one more important question — are labels really: A) system configuration, or B) business logic? Here are a couple of key items to consider carefully:

Most big systems adopt i18n, which means labels are no longer defined inline, but provided via a function call that returns the correct string for the user’s language. For example, in React we might do something like this: <Button label={i18n("Submit")} />
Should we select components to test via the label that our users will actually see, or via a more stable test-id?
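To make the first point concrete, here is a minimal sketch of what an i18n lookup might look like. The catalog contents, the Locale type, and the createI18n helper are all hypothetical names made up for illustration; real i18n libraries have their own APIs and usually load catalogs from files or a translation service.

```typescript
// Hypothetical label catalogs keyed by locale. In a real app these would
// typically live in JSON files or come from a translation service.
type Locale = "en" | "es";

const catalogs: Record<Locale, Record<string, string>> = {
  en: { Submit: "Submit", "Buy Now!": "Buy Now!" },
  es: { Submit: "Enviar", "Buy Now!": "¡Comprar ahora!" },
};

// Build an i18n() lookup bound to one locale. Unknown keys fall back to the
// key itself, so a missing translation never renders an empty label.
function createI18n(locale: Locale): (key: string) => string {
  const catalog = catalogs[locale];
  return (key) => catalog[key] ?? key;
}

const i18n = createI18n("es");
console.log(i18n("Submit")); // prints "Enviar"
```

Notice that the component never contains the rendered string at all. What the user sees depends entirely on which catalog is loaded, which is why it is worth asking whether labels behave more like configuration than logic.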

In my experience, this framework has worked out the best (all considered):

LABELS ARE CONFIGURATION (because they are not logic and not hard-coded; instead they are provided via config data, and will change based upon the user’s system that runs it.)
Therefore, prioritized labels should be tested in E2E; this is where we are testing the user journey/experience, and where the browser can be configured to run in multiple languages. The test suite’s code base should be designed with simple conventions that keep the code DRY and minimize the customization needed for each language’s own “custom” labels.
Integration tests should largely ignore labels/text. (If you want to… you could choose to count how many times your i18n function gets called with a spy. But that’s probably an implementation detail not even worth checking anyways!)
Both E2E and integration tests should only select rendered web components via a stable test-id, to minimize the test suite breakage that comes from selecting on unstable label text (i.e. select('[test-id=buy-now]'), not select({text: 'Buy Now!'})).
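The framework above could be sketched roughly like this. It is a simplified stand-in, not a real test runner: RenderedNode, expectedLabels, selectByTestId, and findLabelMismatches are hypothetical names, and the "page" is a plain array rather than a browser DOM. The idea is that selection always goes through the test-id, while the expected text for the prioritized labels lives in one small per-language table.

```typescript
// A stand-in for a rendered element; a real suite would query the browser.
interface RenderedNode {
  testId: string;
  text: string;
}

// Hypothetical per-locale expectations, covering the prioritized labels only.
const expectedLabels: Record<string, Record<string, string>> = {
  en: { "buy-now": "Buy Now!" },
  es: { "buy-now": "¡Comprar ahora!" },
};

// Select by stable test-id, never by label text.
function selectByTestId(nodes: RenderedNode[], testId: string): RenderedNode {
  const node = nodes.find((n) => n.testId === testId);
  if (!node) throw new Error(`No element with test id "${testId}"`);
  return node;
}

// E2E-style check: return the test-ids whose rendered text does not match
// the expected label for the given locale.
function findLabelMismatches(nodes: RenderedNode[], locale: string): string[] {
  const expected = expectedLabels[locale] ?? {};
  return Object.entries(expected)
    .filter(([testId, label]) => selectByTestId(nodes, testId).text !== label)
    .map(([testId]) => testId);
}

const spanishPage: RenderedNode[] = [
  { testId: "buy-now", text: "¡Comprar ahora!" },
];
// An empty list means every prioritized label matched.
console.log(findLabelMismatches(spanishPage, "es"));
```

When the Product Manager changes the "Buy Now!" copy, only the one table entry changes; every selector keeps working. In real tools the attribute is usually spelled data-testid (the default for both Testing Library and Playwright), so adjust the convention to whatever your stack uses.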

What do you think, do you agree? Hopefully this helps your team understand the pros and cons better. Let me know in the comments.

You can follow me on Twitter and YouTube. Let’s continue interesting conversations about Sr JS development together!
