Mastering Test Doubles: Enhancing Your Testing Strategy for Reliable Code

Learn the five types of test doubles, and the terminology that distinguishes them, to sharpen your testing strategy and write more reliable tests.

Jun 12, 2026 3 min read
[Originally published by Jakub Sobolewski; this article is also available on R-bloggers.]

Test Doubles Taxonomy for R: Dummy, Stub, Spy, Mock, Fake

The term “mock” is often misused in testing. It has become a catch-all for many kinds of test doubles, and that leads to confusion. Whether you’re replacing a database, an API, or a single function, calling every replacement a “mock” does more than obscure the differences between them: it can make your tests less reliable. The wrong type of double produces fragile tests that give you a false sense of security.

Understanding the five distinct categories of test doubles can dramatically improve your testing strategy. Each type has a specific function, and mastering their use is key to writing effective tests. Misapplication can lead to tests that provide the wrong assurances.

Understanding the Code You’re Testing

To illustrate these types, we’ll use a function named process_payment. It charges a card, logs each attempt, and optionally notifies the customer.

process_payment <- function(order, payment_gateway, logger, notifier = NULL) {
  logger$log(paste("Processing order", order$id))
  result <- payment_gateway$charge(order$amount, order$card_token)
  if (!result$success) stop("Payment failed: ", result$error)
  if (!is.null(notifier)) {
    notifier$send(order$customer_id, result$transaction_id)
  }
  result$transaction_id
}

This function depends on three components: payment_gateway, logger, and notifier. Depending on what you're trying to examine in your tests, each of these can be substituted with various types of test doubles.

1. Dummy

💡 Definition: A dummy is an object used merely to fulfill a parameter requirement but is never utilized by the test.

process_payment requires a logger, but when a test only checks the returned transaction ID, what the logger does is irrelevant. A dummy satisfies the parameter without affecting the result.

test_that("returns the transaction ID on successful payment", {
  # Arrange
  order <- list(
    id = "ord-1",
    amount = 100,
    card_token = "tok_visa",
    customer_id = "cust-42"
  )
  dummy_logger <- list(log = function(...) invisible(NULL))
  stub_gateway <- list(
    charge = function(amount, token) {
      list(success = TRUE, transaction_id = "txn-abc")
    }
  )
  # Act
  result <- process_payment(
    order,
    payment_gateway = stub_gateway,
    logger = dummy_logger
  )
  # Assert
  expect_equal(result, "txn-abc")
})
Test passed with 1 success 🥇.

In the example above, dummy_logger exists only to satisfy the parameter: its log method accepts any arguments and does nothing. If a dummy had to do real work for the test to pass, that would be a sign that the code under test genuinely depends on that collaborator, and a dummy would be the wrong double.
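Because the same do-nothing logger appears in nearly every test, it can help to build it in one place. The sketch below is an assumption rather than part of the original article: make_dummy_logger() is a hypothetical helper name, and it only wraps the list used in the test above.

```r
# Hypothetical helper (not from the article): one place to build the dummy logger.
make_dummy_logger <- function() {
  # log() accepts any arguments and deliberately does nothing.
  list(log = function(...) invisible(NULL))
}

dummy_logger <- make_dummy_logger()
dummy_logger$log("Processing order", "ord-1")
```

Each test can then pass logger = make_dummy_logger() instead of repeating the list definition.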

2. Stub

💡 Definition: A stub is a replacement that returns pre-defined responses, letting you control the indirect inputs your code receives from a dependency.

To test how process_payment behaves when a payment fails, you don’t need to call a real payment API. A stub can return exactly the failure response you need.

test_that("throws an error when payment is declined", {
  # Arrange
  order <- list(
    id = "ord-2",
    amount = 200,
    card_token = "tok_declined",
    customer_id = "cust-7"
  )
  dummy_logger <- list(log = function(...) invisible(NULL))
  stub_gateway <- list(
    charge = function(amount, token) {
      list(success = FALSE, error = "insufficient funds")
    }
  )
  # Act & Assert
  expect_error(
    process_payment(
      order,
      payment_gateway = stub_gateway,
      logger = dummy_logger
    ),
    "insufficient funds"
  )
})
Test passed with 1 success 🥇.

A stub steers the code under test down a chosen path, so you can verify how it reacts to each response. Passing the payment gateway into process_payment as an argument is called dependency injection. It matters because it lets a test swap in a double without touching any real infrastructure.

For those practicing test-driven development, you’ll quickly realize that stubs are invaluable. Without dependency injection, you’d struggle to isolate your tests effectively. When you're able to easily substitute components, you gain tidy, manageable tests.
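Since the success and failure stubs differ only in the response they return, a small factory can build either one. This is a minimal sketch under an assumption: make_stub_gateway() is a hypothetical helper introduced for illustration, not something the article or any package provides.

```r
# Hypothetical helper: builds a gateway stub that always returns `response`,
# whatever amount and token it is called with.
make_stub_gateway <- function(response) {
  force(response)  # evaluate now so the stub captures a fixed value
  list(charge = function(amount, token) response)
}

approving_gateway <- make_stub_gateway(
  list(success = TRUE, transaction_id = "txn-abc")
)
declining_gateway <- make_stub_gateway(
  list(success = FALSE, error = "insufficient funds")
)

approved <- approving_gateway$charge(100, "tok_visa")
declined <- declining_gateway$charge(200, "tok_declined")
```

Either stub can be passed as payment_gateway, which keeps the canned response as the only thing that changes between the two scenarios.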

3. Spy

💡 Definition: A spy functions as a stub with the added capability of tracking calls made, allowing for post-test assertions.

When your primary focus is on side effects rather than returned values—like logging or notifications—a spy can be your best bet. It enables you to capture what’s happening in a test scenario.

make_notifier_spy <- function() {
  calls <- list()
  list(
    send = function(customer_id, transaction_id) {
      calls[[length(calls) + 1]] <<- list(
        customer_id = customer_id,
        transaction_id = transaction_id
      )
    },
    calls = function() calls
  )
}

test_that("notifies the customer after successful payment", {
  # Arrange
  order <- list(
    id = "ord-3",
    amount = 50,
    card_token = "tok_visa",
    customer_id = "cust-99"
  )
  dummy_logger <- list(log = function(...) invisible(NULL))
  stub_gateway <- list(
    charge = function(amount, token) {
      list(success = TRUE, transaction_id = "txn-xyz")
    }
  )
  spy_notifier <- make_notifier_spy()
  # Act
  process_payment(
    order,
    payment_gateway = stub_gateway,
    logger = dummy_logger,
    notifier = spy_notifier
  )
  # Assert
  expect_length(spy_notifier$calls(), 1)
  expect_equal(spy_notifier$calls()[[1]]$customer_id, "cust-99")
  expect_equal(spy_notifier$calls()[[1]]$transaction_id, "txn-xyz")
})
Test passed with 3 successes 🌈.

Unlike mocks, spies don’t dictate behavior—they merely record interactions. If you want to validate that an action takes place, spies are a clean solution since they allow you to inspect what actually occurred.
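The recording idea is not specific to notifiers. As a rough sketch, a generic wrapper can record the arguments of any function and then delegate to it. The names make_spy, wrapped, and call are hypothetical and chosen only for this illustration.

```r
# Hypothetical generic spy: records each call's arguments, then delegates
# to `wrapped` so the original behavior is preserved.
make_spy <- function(wrapped = function(...) invisible(NULL)) {
  force(wrapped)
  calls <- list()
  list(
    call = function(...) {
      calls[[length(calls) + 1]] <<- list(...)  # remember the arguments
      wrapped(...)                              # behave like the original
    },
    calls = function() calls
  )
}

spy <- make_spy(wrapped = function(level, message) toupper(level))
returned <- spy$call("info", "Processing order ord-3")
recorded <- spy$calls()
```

The spy adds observation without changing behavior: the return value still comes from the wrapped function, and the recorded calls are inspected only after the code has run.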

4. Mock

💡 Definition: Mocks are sophisticated doubles programmed with exact expectations regarding the calls they should handle. They can throw exceptions if they encounter unexpected inputs and are validated during the verification stage of your tests.

mockery::mock() is not that strict: it records calls and leaves you to verify call counts and arguments yourself after the fact. Mocks are handy, but used indiscriminately they lead to overspecified tests.

test_that("sends exactly one notification with correct arguments", {
  # Arrange
  order <- list(
    id = "ord-4",
    amount = 75,
    card_token = "tok_visa",
    customer_id = "cust-11"
  )
  dummy_logger <- list(log = function(...) invisible(NULL))
  stub_gateway <- list(
    charge = function(amount, token) {
      list(success = TRUE, transaction_id = "txn-def")
    }
  )
  mock_notifier <- list(send = mockery::mock())
  # Act
  process_payment(
    order,
    payment_gateway = stub_gateway,
    logger = dummy_logger,
    notifier = mock_notifier
  )
  # Assert
  mockery::expect_called(mock_notifier$send, 1)
  mockery::expect_args(mock_notifier$send, 1, "cust-11", "txn-def")
})
Test passed with 5 successes 🥳.

When the interaction itself is what you are testing, a mock fits the bill. Use it selectively, though: asserting on every call couples the test to implementation details, so a harmless internal change can break it.
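For comparison, the stricter behavior described in the definition can be approximated by hand in base R. The following is only a sketch under stated assumptions: make_strict_mock() and verify() are hypothetical names, and the mock compares positional arguments with identical(), so a call that names its arguments would not match.

```r
# Hypothetical strict mock: the expectation is set up front, an unexpected
# call fails immediately, and verify() checks the call count afterwards.
make_strict_mock <- function(expected_args) {
  force(expected_args)
  call_count <- 0
  list(
    send = function(...) {
      if (!identical(list(...), expected_args)) {
        stop("Unexpected arguments passed to send()")
      }
      call_count <<- call_count + 1
      invisible(NULL)
    },
    verify = function(times = 1) {
      if (call_count != times) {
        stop(paste0("Expected ", times, " call(s), got ", call_count))
      }
      invisible(TRUE)
    }
  )
}

mock_notifier <- make_strict_mock(expected_args = list("cust-11", "txn-def"))
mock_notifier$send("cust-11", "txn-def")  # matches the expectation
verified <- mock_notifier$verify(times = 1)
# Calling mock_notifier$send("cust-99", "txn-def") would stop with an error.
```

The difference from a spy is where the knowledge lives: here the double itself knows what is acceptable and fails at the moment of the bad call, instead of leaving every check to the assertions at the end of the test.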

5. Fake

💡 Definition: Fakes represent a simplified version of a working implementation, designed for tests rather than production use.

Unlike stubs, fakes simulate real behaviors and manage states across multiple interactions. They offer a solid option when you require test scenarios that involve more than a single transaction.

make_fake_payment_gateway <- function() {
  transactions <- list()
  list(
    charge = function(amount, token) {
      if (amount <= 0) {
        return(list(success = FALSE, error = "invalid amount"))
      }
      if (token == "tok_declined") {
        return(list(success = FALSE, error = "card declined"))
      }
      id <- paste0("txn-", length(transactions) + 1)
      transactions[[id]] <<- list(
        amount = amount,
        token = token
      )
      list(success = TRUE, transaction_id = id)
    },
    find = function(transaction_id) {
      transactions[[transaction_id]]
    }
  )
}

test_that("successful charges are recorded in the gateway", {
  # Arrange
  order <- list(
    id = "ord-5",
    amount = 120,
    card_token = "tok_visa",
    customer_id = "cust-3"
  )
  dummy_logger <- list(log = function(...) invisible(NULL))
  fake_gateway <- make_fake_payment_gateway()
  # Act
  txn_id <- process_payment(
    order,
    payment_gateway = fake_gateway,
    logger = dummy_logger
  )
  # Assert
  recorded <- fake_gateway$find(txn_id)
  expect_equal(recorded$amount, 120)
  expect_equal(recorded$token, "tok_visa")
})
Test passed with 2 successes 🎊.

Fakes excel in scenarios with several related interactions, such as processing an order and then querying the transaction's status. They behave realistically, which suits acceptance tests where a dependency has to work end to end rather than return a few preset responses.

Building and maintaining fakes does require more effort compared to simpler alternatives, but they are worth it for stable or frequently tested interfaces. For one-off unit tests, however, a stub might be sufficient and simpler.
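The state-keeping is easiest to see when the fake is exercised several times in a row. The snippet below repeats a condensed copy of the article's make_fake_payment_gateway() so that it runs on its own; the sequence of charges is an illustrative assumption.

```r
# Condensed copy of the fake gateway defined earlier, repeated here so the
# snippet is self-contained.
make_fake_payment_gateway <- function() {
  transactions <- list()
  list(
    charge = function(amount, token) {
      if (amount <= 0) {
        return(list(success = FALSE, error = "invalid amount"))
      }
      if (token == "tok_declined") {
        return(list(success = FALSE, error = "card declined"))
      }
      id <- paste0("txn-", length(transactions) + 1)
      transactions[[id]] <<- list(amount = amount, token = token)
      list(success = TRUE, transaction_id = id)
    },
    find = function(transaction_id) transactions[[transaction_id]]
  )
}

gateway <- make_fake_payment_gateway()
first <- gateway$charge(120, "tok_visa")       # recorded as the first transaction
declined <- gateway$charge(80, "tok_declined") # rejected, so nothing is stored
second <- gateway$charge(30, "tok_visa")       # recorded as the second transaction

second_record <- gateway$find(second$transaction_id)
missing_record <- gateway$find("txn-3")        # never created, so NULL
```

A stub could not support this scenario without a separate canned response for every step, whereas the fake derives each answer from what has happened so far.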

Choosing the Right Type of Test Double

Summing Up: The Art of Test Doubles

Understanding the nuances of test doubles pays off directly in your tests, and the division between stubs and mocks is the most important one. A stub lets you assert on outcomes that follow from the values it returns, while a mock validates that specific interactions took place. Confusing the two produces fragile tests that are tightly coupled to implementation details and costly to maintain.

Here’s the essence: if your focus is on return values, go with stubs. If you need to confirm interactions, such as a method being called with the right parameters, spies or mocks are your allies. Use fakes when a dependency has to maintain state across multiple calls.

These distinctions are practical tools rather than academic theory. Choosing the right double keeps your tests isolated while still verifying that the code behaves as intended.
For those who want to dig deeper, refer to resources like Martin Fowler's insights on [Test Doubles](https://www.martinfowler.com/bliki/TestDouble.html) and Gerard Meszaros’s comprehensive breakdown of [Test Double Patterns](http://xunitpatterns.com/Test%20Double%20Patterns.html). Their guidance can provide additional context and help solidify your understanding of these critical testing tools.
Source: Jakub Sobolewski · www.r-bloggers.com