A/B/C Testing: How to Use Control Groups in Your Split Tests

Unfortunately, many companies reach a point where stagnation becomes the status quo and standard operating procedures run on cruise control. B2B companies are especially susceptible because their market is other businesses, which are also set in their ways, and their revenue is more predictable than a B2C company's.

When revenue is predictable, people get comfortable that everything is working properly and can easily assume, “if it ain’t broke, don’t fix it.”

Every business should allocate something to R&D, to keep pushing, pivoting, and testing everything. If you work in B2B, consider how much of your role relies on "the way it's always been done" versus the way testing found it should be done.

In this article, we introduce A/B testing (or A/B2B testing in this case) and how you can use control groups to ensure your tests are accurate and effective.

What is a Control Group?

A control group is a segment of an experiment that does not receive any exposure to the variables being tested. In pharmaceutical studies, for example, the control group receives a placebo. In software, the control group sees no UI or UX change from the previous version.

If you are testing the impact of a new campaign or policy, you will want to test two options (A and B), but for the best results, you will also want to leave option C (your control group) unchanged. This is why A/B testing should actually be called A/B/C testing.

How B2B Companies Can Use Control Groups in A/B Tests

Let's say your B2B SaaS landing page follows the same journey that has historically driven revenue quite predictably. No need to test what has worked for years, right?

Wrong! There's a slim chance your landing page layout was perfectly designed out of pure luck. If an A/B test of button shape, text color, or any number of other changes could produce even slightly better results, why not run it? It's possible that, all this time, a lack of testing has been limiting your marketing funnel.

Let's say you are testing engagement throughout the user journey, using an A/B test to choose between two options your team hypothesizes will improve engagement. This test will tell you which of the two options performs best, but how does either compare to the original version?

Many people assume they can use data from a previous period as the baseline for an A/B test run in another period, but this produces unreliable results. Suppose that last month the feature had the original UI and received standard site traffic from qualified visitors. This month, however, you are A/B testing two options for a new UI, and some unforeseen publicity drove higher-than-normal traffic from unqualified visitors.

Looking at the raw conversion numbers may lead you to think the original UI outperformed the test variants, but what the data doesn't communicate is how much less likely the unqualified visitors were to convert. This is where a control group would have come in handy.
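To see why the cross-period comparison misleads, here is a toy calculation (all numbers hypothetical): the per-segment conversion rates are identical in both months, yet the blended rate collapses simply because the visitor mix shifted.

```python
# Toy numbers (all hypothetical): per-segment conversion rates are
# identical in both months; only the visitor mix changes.
QUALIFIED_RATE = 0.10
UNQUALIFIED_RATE = 0.01

# Last month: original UI, mostly qualified traffic.
last_month = (9_000 * QUALIFIED_RATE + 1_000 * UNQUALIFIED_RATE) / 10_000

# This month: a publicity spike floods the site with unqualified visitors.
this_month = (9_000 * QUALIFIED_RATE + 11_000 * UNQUALIFIED_RATE) / 20_000

print(f"blended rate last month: {last_month:.1%}")   # 9.1%
print(f"blended rate this month: {this_month:.2%}")   # 5.05%
```

The blended rate nearly halves even though nothing about the UI changed. A concurrent control group absorbs a traffic shift like this equally across every arm, so the comparison stays fair.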

Including a control group within the same testing period as your split test ensures that the best option is clear. In our feature test example, a third of the site traffic would see no UI changes, a third would see option A, and a third option B.
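One common way to implement that three-way split is deterministic hashing on a stable user ID, so each visitor consistently lands in the same arm across sessions. A minimal Python sketch, where the experiment name, function, and bucket labels are illustrative:

```python
import hashlib

def assign_variant(user_id: str, experiment: str = "landing-ui-test") -> str:
    """Deterministically bucket a user into control, A, or B.

    Hashing the user ID together with the experiment name gives a
    stable, roughly uniform three-way split, so the same visitor
    always sees the same variant for this experiment.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 3
    return ["control", "A", "B"][bucket]

# The same user always lands in the same group:
print(assign_variant("user-123"))
print(assign_variant("user-123"))  # identical to the line above
```

Salting the hash with the experiment name means a user's bucket in one test doesn't correlate with their bucket in the next, which matters once you run the iterative tests described below.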

After the test has been completed, the option with the highest level of engagement is the clear winner. From there, you can use it as the control group for a test of two new options. This iterative process keeps your B2B funnel optimized for success.
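"Highest engagement" should also mean statistically significant, not just numerically largest. A minimal sketch of a two-proportion z-test using only the standard library (the conversion counts below are hypothetical):

```python
from math import erf, sqrt

def conversion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Two-proportion z-test: does variant B's rate differ from A's?"""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)          # pooled rate under H0
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # two-sided
    return z, p_value

# Hypothetical counts: control converts 120/2000, option A converts 150/2000.
z, p = conversion_z_test(120, 2000, 150, 2000)
print(f"z = {z:.2f}, p = {p:.3f}")  # roughly z = 1.89, p = 0.059
```

Here the lift looks promising but the p-value sits just above 0.05, so you would keep the test running rather than declare option A the winner.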

Examples of Companies Successfully Split Testing

Some companies are more data-driven than others. Reed Hastings, co-founder and CEO of Netflix, for example, famously established an ethos that everything must be tested before products and features are confidently rolled out company-wide.

Brian Chesky, co-founder of Airbnb, has also inspired his team to take a data-driven approach to most decisions. Airbnb, however, has a unique starting point before scaling ideas: they first do things that don't scale. In fact, Chesky has famously said they do everything by hand until it becomes painful.

Even though large companies can become set in their ways, yours doesn't have to. Netflix and Airbnb are prime examples of large companies that refuse to run their decisions on autopilot or rest on their laurels, because they are committed to continuous improvement.

As Brian Chesky suggests, you can also do things that don't scale, like manually testing variations, to get an idea of which refinements to make before rolling out a larger-scale test. Enterprise-level companies must find a way to stay competitive in the current market, and using control groups in your testing strategy is one way to achieve that.