Optimal Data Collection for Randomized Control Trials

Published on 2 May 2019

In a randomized control trial, the precision of an average treatment effect estimator and the power of the corresponding t-test can be improved either by collecting data on additional individuals, or by collecting additional covariates that predict the outcome variable. To design the experiment, a researcher needs to solve this tradeoff subject to her budget constraint. We show that this optimization problem is equivalent to optimally predicting outcomes by the covariates, which in turn can be solved with existing machine learning techniques applied to pre-experimental data such as other similar studies, a census, or a household survey. In two empirical applications, we show that our procedure can lead to reductions of up to 58% in the costs of data collection, or improvements of the same magnitude in the precision of the treatment effect estimator.
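The tradeoff described above can be illustrated with a small sketch. It is not the authors' procedure: it assumes a simple hypothetical cost model (a fixed cost per individual plus a per-individual cost for each covariate collected), uses residual variance divided by sample size as a rough proxy for the variance of the treatment effect estimator, and replaces any machine learning step with an exhaustive search over covariate subsets fitted by least squares on pilot data. The function name `choose_design` and all parameter names are made up for illustration.

```python
from itertools import combinations

import numpy as np


def choose_design(X, y, costs, base_cost, budget):
    """Pick the covariate subset and sample size minimizing resid_var / n.

    X, y      : pre-experimental (pilot) covariates and outcomes
    costs     : assumed per-individual cost of collecting each covariate
    base_cost : assumed per-individual cost of enrolling one person
    budget    : total data-collection budget
    Returns (subset, n, objective), or None if nothing is affordable.
    """
    n_pilot, p = X.shape
    best = None
    for k in range(p + 1):
        for subset in combinations(range(p), k):
            # More covariates per person means fewer people for the same budget.
            per_person = base_cost + sum(costs[j] for j in subset)
            n = int(budget // per_person)
            if n < 2:
                continue
            # Fit outcome on intercept + chosen covariates using pilot data.
            Z = np.column_stack([np.ones(n_pilot)] + [X[:, j] for j in subset])
            beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
            resid_var = float(np.mean((y - Z @ beta) ** 2))
            # Proxy for estimator variance: unexplained variance over sample size.
            objective = resid_var / n
            if best is None or objective < best[2]:
                best = (subset, n, objective)
    return best


# Synthetic pilot data: only the first covariate predicts the outcome well.
rng = np.random.default_rng(0)
X_pilot = rng.normal(size=(500, 3))
y_pilot = 2.0 * X_pilot[:, 0] + 0.1 * X_pilot[:, 1] + rng.normal(size=500)

design = choose_design(
    X_pilot, y_pilot, costs=[1.0, 2.0, 5.0], base_cost=10.0, budget=10_000.0
)
print(design)
```

In this synthetic setup, collecting a cheap and highly predictive covariate is worth the loss in sample size, while a weakly predictive or expensive covariate is not. Exhaustive search is only feasible for a handful of candidate covariates; with many candidates a greedy or regularized selection method would be needed instead.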

Authors

Sokbae "Simon" Lee

Research Fellow, Columbia University

Sokbae is an IFS Research Fellow and a Professor at Columbia University, with an interest in Econometrics, Applied Microeconomics and Statistics.

Pedro Carneiro

Research Fellow, University College London

Pedro is a Professor of Economics at University College London and an economist in the IFS's Centre for Microdata Methods and Practice (cemmap).

Daniel Wilhelm

Research Associate, LMU Munich

Daniel is a Research Associate of the IFS in cemmap and a Professor of Statistics and Econometrics at LMU Munich.