Product Experimentation Techniques - The Mechanical Turk


Use the Mechanical Turk experimentation technique to test your solution before you build it. Use human intelligence behind the scenes to simulate the core experience of your product. Get early customer feedback to reduce risk and learn faster.





Introduction

Building complex technical solutions requires a big investment in time and money. Consequently, there’s a lot of business risk involved.

In these types of scenarios, it’s best to test your solution with your target market before building, so that you can start validating your assumptions.

The Mechanical Turk is a type of experiment that allows you to test your solution before you have built the finished product.

Fake it, before you make it, is the theme of the Mechanical Turk experiment. You want customers to think that they’re interacting with the real deal.

Customers are presented with a technical solution, while behind the scenes ‘human labour’ is what makes the solution work.


In this article we’ll discuss the following:

  1. Origins of the Mechanical Turk

  2. Why use Mechanical Turk experiments?

  3. Examples of Mechanical Turk experiments

  4. The challenges with Mechanical Turk experiments


1. Origins of the Mechanical Turk

Overview

The Turk, Mechanical Turk or Automaton Chess Player was a chess playing machine that was constructed in the 18th Century.

The Mechanical Turk was first unveiled in 1770 by Wolfgang von Kempelen to impress Empress Maria Theresa of Austria.

The machine appeared to be able to play a strong game of chess against a human opponent. It could also perform the Knight’s Tour, a puzzle that requires the player to move a knight so that it visits every square of the chessboard exactly once.

The Mechanical Turk won most of its demonstrations around Europe and the Americas over a period of 84 years.

The Mechanical Turk defeated many strong challengers, famously beating Benjamin Franklin and Napoleon.

How it worked

All was not as it seemed. The Mechanical Turk was an illusion.

The machine consisted of a life-sized model of a human head and torso, dressed in Ottoman robes and a turban, positioned behind a large cabinet. On top of the cabinet was the chessboard.

Inside the cabinet were a series of drawers and compartments. The inner workings of the machine were deliberately complex, designed to mislead the audience when viewing it.

Opening the left side of the cabinet revealed a series of gears and cogs. When the Mechanical Turk moved a chess piece, the cogs and gears would move like clockwork machinery, giving the appearance of mechanical automation.

On the right-hand side of the cabinet was an empty compartment. This area was purposefully designed to provide an unobstructed view from front to back, further convincing the audience that no deception was involved.

The design allowed the presenter to open all the drawers and compartments to the audience to maintain the illusion.

A human chess master was positioned inside the large, empty compartment. Hidden compartments in the cabinet enabled the chess master to conceal themselves, avoiding detection.

The chessboard on top of the cabinet was thin enough that the chess master could track the position of every piece from below using a network of magnets and string.

Corresponding numbers on the underside of the chessboard allowed the chess master to follow an opponent’s moves.

This was Artificial Intelligence, 18th Century style.




2. Why use Mechanical Turk experiments?

The challenge

Imagine you’ve got a great idea for a complicated technical solution.

There’s a problem …

Building your new solution will require a lot of money, in addition to a significant commitment of time.

You don’t want to waste a whole lot of time and money on something that customers may not even want.

So, what do you do?

There are so many questions that you need to answer before you build anything.

Are my assumptions correct? Will customers use the product? How do customers use the product? Will customers potentially pay for the product?

Early feedback and learning are important.

You want to test the solution before you develop it.

“Testing customer demand and interest in a solution before development is becoming an increasingly common approach to new product development, particularly for new digital technologies and software solutions”.

The approach for a complex, technical solution shouldn’t be any different.

You should always be aiming to progressively de-risk your assumptions, while at the same time increasing your confidence levels.

If you can’t do this, then maybe your big idea isn’t so great.


The solution

To begin, you want to understand if your target customers will even use your new solution.

Is there any customer demand for your new idea? Are customers interested?

The Mechanical Turk experimentation technique is a great way to achieve this objective with little investment of time or money.

Instead of building out your technology solution, you simulate the core experience of the solution using ‘human labour’.

This allows you to quickly understand if customers will use your solution, and how customers use your solution.

Early feedback is critical.

Using the Mechanical Turk experimentation technique is not about tricking or deceiving your customers.

“You’re still delivering the product or service to customers, it’s just that there’s no hardware infrastructure, integrations, and system inter-connections to support delivery of the service at this stage”.

Humans can fulfil this role in the short-term while you’re still learning.

Customers don’t need to know how you’re delivering the service, so long as they’re still extracting the promised value from your product.

The Mechanical Turk solution is purposefully non-scalable. That’s OK at this early stage.

At scale, algorithmic and artificial intelligence solutions are required to keep costs down.

There’s no reason why you can’t “hide a human” behind an online form or an SMS-based application before you decide to build.
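To make the “hide a human” idea concrete, here’s a minimal sketch in Python. It assumes a hypothetical service where customer requests (from a form or an SMS, say) land in a queue that a team member works through by hand. The class and method names are illustrative only, not taken from any real product or library.

```python
from collections import deque
from typing import Deque, Dict, Optional, Tuple


class HumanBackedService:
    """Hypothetical front end whose 'engine' is a person working a queue."""

    def __init__(self) -> None:
        self._pending: Deque[Tuple[int, str]] = deque()  # requests awaiting a human
        self._answers: Dict[int, str] = {}               # completed responses
        self._next_id = 1

    def submit(self, customer_message: str) -> int:
        """Customer side: looks like calling an automated product."""
        request_id = self._next_id
        self._next_id += 1
        self._pending.append((request_id, customer_message))
        return request_id

    def next_task(self) -> Optional[Tuple[int, str]]:
        """Operator side: pull the oldest unanswered request, if any."""
        return self._pending.popleft() if self._pending else None

    def resolve(self, request_id: int, answer: str) -> None:
        """Operator side: record the manually produced answer."""
        self._answers[request_id] = answer

    def get_answer(self, request_id: int) -> Optional[str]:
        """Customer side: returns None until a human has responded."""
        return self._answers.get(request_id)

    def usage_count(self) -> int:
        """Requests submitted so far: a simple demand signal to track."""
        return self._next_id - 1


service = HumanBackedService()
ticket = service.submit("Translate 'good morning' into French")
task = service.next_task()            # a team member picks up the request
service.resolve(task[0], "Bonjour")   # ...and types the answer by hand
print(service.get_answer(ticket))
```

The customer only ever touches `submit` and `get_answer`, so the human behind the scenes could later be swapped for real technology without changing the experience. In the meantime, `usage_count` gives you a simple measure of whether anyone is using the service at all.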



3. Examples of Mechanical Turk experiments

A. IBM Speech to Text

Background and context:

One of the most popularised examples of the Mechanical Turk experimentation technique is IBM’s early testing of speech-to-text technology.

The experiment was conducted many years ago, before the invention of speech-to-text software as we know it today.

At the time, IBM was known for their mainframe computers and typewriters. Personal computing and the internet were still becoming a thing.

IBM had an idea … If we could combine computer technology and a typewriter it would dramatically increase the speed and efficiency of typing.

The proposed solution would allow people to speak into a microphone, with the words appearing on the computer screen, with no need for typing.

The solution would remove the need for companies to employ expensive, professional typists.

Brilliant. This was going to be the next big revenue line for IBM.

IBM researchers came up with a clever way to test their hypotheses and assumptions for the speech-to-text solution.

How IBM tested the concept:

What customers thought was happening

IBM set up a workstation that contained only a monitor and microphone. There was no keyboard.

Potential customers were invited to come and try out the “new technology”.

Customers were asked to speak into the microphone. The words would then appear on the monitor screen almost instantaneously.

Potential customers were suitably impressed by the prototype product.


What was actually happening?

In the room next door there was a typist listening to the speech from the microphone.

As the words were spoken by the customer, the typist would type the words using a keyboard.

Whatever the typist entered on the keyboard appeared on the computer monitor.

Customers believed that the text on the computer screen was being produced by the new speech-to-text technology.

What IBM learned

IBM took away some valuable learnings from this early Mechanical Turk experiment.

  1. While initially impressed by the speech-to-text technology, customers changed their minds after using it

  2. Using speech to enter large amounts of text into the computer proved to be a clunky and clumsy user experience

  3. Customers had sore throats after dictating for a few hours

  4. Translating speech to text makes for a very noisy office environment

  5. Speech to text was not suitable for confidential information


For the above reasons, IBM decided not to proceed with additional R&D for the speech-to-text technology at the time.


B. Sainsbury’s Ecover

In 2020, the UK’s Sainsbury’s supermarkets announced a trial of Ecover refill stations.

The Ecover refill stations are dedicated dishwashing and laundry detergent refill points located in the cleaning aisles.

The initiative is part of Sainsbury’s pledge to become net zero across all its operations by 2040.

Customers have the option to refill selected Ecover cleaning products. Each bottle contains enough liquid for up to 50 washes.

Sainsbury’s estimates that the initiative has the potential to save over one million tonnes of plastic per year.




What could Sainsbury’s have done differently?

Sainsbury’s could have conducted a Mechanical Turk experiment to understand if customers were interested in participating in the Ecover initiative in the first instance.

Rather than spending hundreds of thousands of dollars to develop high-fidelity refill stations, the supermarket could have quickly and cheaply constructed a lo-fi cardboard pickup and drop-off point in the cleaning aisle - even more lo-fi would be a table, chairs and a paper sign.

The pickup and drop off point could’ve been manned by members of the Sainsbury’s team.

If a customer dropped off bottles to refill their dishwashing or laundry detergent, the Sainsbury’s team members could’ve manually cleaned and refilled the bottles while customers completed their shop.

It would’ve been a great way for the Sainsbury’s team to engage closely with their early adopter customers to learn more about their needs, pains, jobs, and motivations.

Customers could’ve then picked up their refilled bottles and paid for them at checkout.

The number of customers using the refill service could’ve been tracked over a period of days or weeks.

If customer demand data was strong enough to warrant investment in developing the concept further, Sainsbury’s could have incrementally invested further.

Instead, the approach used to validate the concept was typical of the “if you build it, they will come” mindset.

C. Testing a Smart Lift

Well, sort of.

This video goes to show how easy it is to run a fast and inexpensive experiment to see how users would react to using an intelligent, smart lift.

Two Norwegian comedians had some fun creating a “smart lift” that responds to voice commands. :-)



D. Amazon Mechanical Turk

In 2005, Amazon decided to establish its own Mechanical Turk platform, MTurk. There are now more than 100,000 workers on the platform.

The platform is a crowdsourcing website where businesses (Requesters) can hire remote workers (Turkers) to conduct on-demand tasks that computers currently can’t do.

Requesters post Human Intelligence Tasks (HITs) on the platform. Turkers browse and complete open jobs for a fee.

Some of the most common use cases of the MTurk platform include:

  • Processing and screening video/photos

  • Data cleaning and verification

  • Information collection (e.g. completing surveys and academic research)

  • Data processing (e.g. editing and transcribing podcasts)

E. Public Services

Many government and public sector services already have advisers, experts, and support workers built into their delivery.

When improving and re-designing public services, there’s an opportunity to test service design, delivery, and experience in areas such as:

  • Financial advice – Financial aid, student loans, debt refinancing

  • Career advice – Students, return to workforce

  • Social services – Housing, healthcare, counselling

  • Legal – Legal aid, legal counselling

  • Public transport – Information services

  • Military – Recruitment, counselling, repatriation

4. Challenges with Mechanical Turk experiments

Having run Mechanical Turk experiments in large enterprises, I’ve found that executing them can come with a few challenges.

My thoughts below are framed from an enterprise perspective.

Some of the key challenges include:

 

Time:

Mechanical Turk experiments can be more time-consuming and involved than other types of experiment. There are often many more moving parts to consider, as you’re cobbling together backend business processes with duct tape and band aids.

Stakeholder engagement:

The level of stakeholder engagement can be much higher. In a large business there are often multiple teams involved in successfully executing Mechanical Turk experiments.

Scalability:

Mechanical Turk experiments are not scalable in the same way as A/B testing, for example. It’s difficult to have lots of Mechanical Turk experiments running in parallel due to their high-touch, high-complexity nature.

Governance:

In effect, a Mechanical Turk experiment can feel a bit like a mini product launch as you’re delivering the product or service to your customers. Consequently, you’ll need to ensure that you’re adhering to Legal, Risk, Regulatory and Compliance protocols.

Operations:

The Operations teams are the ones who typically pick up much of the “human effort” to help product teams learn. Don’t be unreasonable with your expectations. Make their life as easy as possible.

Duration:

It’s important to time-box the length of the experiment. Be clear on your learning objectives. When your learning objectives have been met, shut the experiment down. You don’t want to be left delivering a non-scalable, high-touch, manual solution indefinitely.

 

While Mechanical Turk experiments can be more involved than other experimentation techniques, it’s worth the effort.

The upside certainly outweighs any execution challenges.

For a small investment of time and money, a Mechanical Turk experiment can potentially save your business tens of millions of dollars in wasted budget.

Conclusion

Building complex technical solutions takes a lot of time and money.

Before you start building anything, you want to be sure that customers will use your solution.

Use the Mechanical Turk experimentation technique to learn faster and de-risk development of your idea.

This experimentation technique allows you to test your solution before you build it.

Customers are presented with a technical solution, while “human intelligence” works behind the scenes to simulate complex technology.

Customers are left with the impression that they’re interacting with the real deal.

The Mechanical Turk experimentation technique allows you to get early feedback on your solution from customers.

Learn if customers will use your new solution, and how they use it.

Make the effort to validate your assumptions before you build anything.












