Pre-Conference Workshops

Aarhus University

In advance of the main IRC meeting, IEA offers five optional, two-day workshops on specialized topics related to large-scale assessment. The workshops are offered in parallel on 24–25 June 2019.

The aim of the workshops is to provide a stimulating and practical learning environment for all those wishing to improve their understanding of, and gain practice in, working with data from large-scale international assessments such as those conducted by IEA. Each workshop varies in its focus and level; the specific topics and prerequisites are described in more detail below. Lectures and discussions will be conducted in English.

The workshop participation fee is 100 euros. Attendance is limited to 25 participants per workshop. Participants must register before 24 May 2019 to secure a place (registration is on a first-come, first-served basis). Registration is open from 3 December 2018.

IEA conducts regular workshops at international research conferences and in collaboration with the IERI Institute. For more information, consult our training page.

1) Using large-scale assessment data to inform policy and practice
Professor David Rutkowski and Dr Sabine Meinck

Over the last 60 years, modern international large-scale assessments (ILSAs) have become influential educational policy tools, moving beyond their historical role as descriptive “snapshots” of educational systems. As both the number of participants and the range of subjects assessed by ILSAs have increased, policymakers have become increasingly interested in how the data collected can help inform policy. It is thus important to present policy-relevant information resulting from ILSAs in a brief and accessible format. This workshop will discuss how ILSA data can be used to inform policymakers and education practitioners. The instructors will provide a solid overview of how to construct a good policy brief, illustrated by practical examples. Participants should leave the workshop with a well-developed outline that they can later develop into a publishable document.

The workshop will focus on the most recent cycles of ICCS and PIRLS, with an emphasis on the following key topics:

  • Challenges and solutions in the construction and development of educational briefs that use ILSA data (defining and addressing the audience, structure and length, tables and graphics);
  • General information about PIRLS and ICCS, including the goals, purposes and intent, theoretical frameworks, target populations, achievement domains and background information collected by these key IEA studies;
  • Introduction to basic statistical analyses using the IEA IDB Analyzer, with the goal of producing relevant research data to inform policy;
  • Interpretation and discussion of findings, and how to translate research into a language accessible to the defined audience.

This two-day workshop comprises a mix of both lectures and group work. Participants will have ample time for group discussion of possible topics for briefs, and to identify and develop research analyses of core relevance to their personal research interests. After a comprehensive introduction to the IEA IDB Analyzer, its application will be demonstrated using practical example analyses. Using simple tools that can handle the methodological issues automatically, participants will gain experience in working with IEA data and a better understanding of the possibilities and constraints for analyses. Participants will inspire each other by exchanging ideas on nationally-focused analysis, while the instructors will provide guidance and advice throughout on further possibilities or potential limitations. The workshop will showcase successful examples of how to analyze the data from IEA studies and communicate the results to policymakers and education practitioners.


Participants should possess a working knowledge of basic statistics, and will need to bring their own laptop PC with Microsoft Office and SPSS 16.0 or higher preinstalled.

About the instructors:

Dr David Rutkowski is an associate professor at Indiana University, USA. He previously worked at the Centre for Educational Measurement (CEMO) at the University of Oslo, Norway, and has also worked for the IEA's Research and Analysis Unit in Hamburg. He earned his PhD in educational policy with a research specialization in evaluation from the University of Illinois at Urbana-Champaign. David is especially interested in how large-scale assessments in education are used to support and inform policy debate. He is a serving member of the IEA's Publications and Editorial Committee and current Editor of the IEA Compass: Briefs in education series.

Dr Sabine Meinck brings a wealth of experience to this workshop, being the Head of both the IEA’s Research and Analysis Unit and its Sampling Unit. For the last decade, she has been involved with sampling, weighting and variance estimation for all of IEA's large-scale assessments. Her main research interests lie with the methodological challenges of complex survey data, and in the best dissemination of the results of IEA studies beyond the core research audience. Sabine is also an Associate Editor for the IERI journal Large-scale Assessments in Education.

2) Multilevel modeling with IEA data
Dr Maria Magdalena Isac and Professor Leslie Rutkowski

Multilevel modeling (MLM) can be used to reflect the hierarchical structure of education systems and the clustered structure of the data. For clustered data, MLM usually provides more reliable and less biased results than traditional analysis methods. This workshop is designed to introduce participants to the basic theory and application of MLM, focusing especially on those features that are particular to large-scale assessment data (such as weighting and scaling). Participants will gain basic practical experience of the application of two-level models to large-scale assessment data.

The workshop begins with a methodological introduction to MLM and its underlying assumptions. Participants will receive practical training in how to use Mplus software for multilevel analysis of IEA data, and learn more about methods for model selection and hypothesis testing. The course considers methodological concepts related to the complex study and sampling design of large-scale assessments, and the course instructors will provide comprehensive advice on selecting the most appropriate multilevel modeling approach for data analysis.

The following topics will be covered:

  • Methodological foundations of MLM;
  • Two-level random coefficient models with L1 and L2 predictors;
  • Short introduction to Mplus;
  • Calculating the compositional effect;
  • Centering approaches;
  • Application of MLM to large-scale assessment data (incorporating weighting and plausible values);
  • Hypothesis testing and model selection.

The models will be presented, and workshop participants will then practice their implementation via a series of practical exercises using Mplus.


This workshop is aimed at individuals who already possess a working knowledge of large-scale assessment and a solid knowledge of intermediate statistics. Although no previous experience of Mplus is required, familiarity with syntax-based statistical software is beneficial. Participants must bring their own PC-compatible laptops with SPSS software (or similar alternative software that can be used for data preparation) preinstalled. A trial version of the Mplus software will be made available and used during the workshop. The workshop will be a mixture of lectures and hands-on training, to ensure participants gain both sound knowledge and practical expertise.

About the instructors:

Dr Leslie Rutkowski is Associate Professor of Inquiry Methodology at Indiana University, USA, and prior to that worked at the Centre for Educational Measurement (CEMO) at the University of Oslo. She earned her PhD in educational psychology, specializing in statistics and measurement, from the University of Illinois at Urbana-Champaign. Leslie’s research interests include latent variables and examining methods for comparing heterogeneous populations in international surveys. In addition to a recently funded FINNUT grant to develop international assessment methods, Leslie was one of the editors of the well-received Handbook of International Large-Scale Assessment (Rutkowski, von Davier, & Rutkowski, 2014), and is currently writing a book on working with international assessment data. Leslie is also one of the Editors of the hugely successful IEA Research for Education handbook series, and an Associate Editor for the IERI journal Large-scale Assessments in Education.

Dr Maria Magdalena Isac is a senior research consultant for the International Large-Scale Assessments Unit of the Italian Institute for the Evaluation of the Education System (INVALSI). Prior to this assignment, she was a researcher on citizenship education and educational effectiveness at the University of Groningen, The Netherlands, and she held research positions on the same topics at the European Commission and the University of Bologna. She earned her PhD in Behavioural and Social Sciences at the University of Groningen, The Netherlands, where she specialized in educational effectiveness and citizenship education. In her work, Magda analyzes international large-scale assessment data (such as ICCS, PIRLS and TIMSS) using various methodological approaches, including multilevel modeling and latent variable modeling techniques.

3) Assessment design, item response theory, and proficiency estimation
Dr Eugenio Gonzalez

This workshop provides an overview of the principles surrounding the design of large-scale assessments, the item response theory (IRT) models used to calibrate items, and the methodology used to assign proficiency estimates, also known as plausible values. Most of the workshop will be devoted to theoretical presentations, but there will be ample time allocated for discussion. At the end of the workshop, participants will undertake practical example analyses with the software used in IEA studies.

Topics covered during the workshop will include:

  • Assessment design principles: examining both the advantages and disadvantages of current designs of large-scale assessments, and consequences for analysis;
  • Principles of item response theory: including the advantages and disadvantages of different models, the limitations of IRT models, and using Parscale to compute item parameters and proficiency estimates; and
  • Principles of population modeling and proficiency estimation: including review of procedures and techniques for carrying out conditioning, multiple imputations in large-scale assessments, plausible values and why they are useful, using DESI (Direct Estimation Software Interactive) to compute proficiency estimates and plausible values.


This workshop is aimed at individuals who already possess a working knowledge of large-scale assessment and some understanding of statistics. Participants must bring their own PC-compatible laptops with SAS or SPSS software preinstalled. Trial versions of software used for the practical examples will be provided during the workshop.

About the instructor:

Dr Eugenio Gonzalez is a Principal Research Project Manager at Educational Testing Service (ETS) in the USA, and the director of the IEA-ETS Research Institute (IERI), a collaborative effort between the IEA and ETS that focuses on improving the science of large-scale assessments. Eugenio was formerly the Director of Quality Control and Field Operations for the National Assessment of Educational Progress (NAEP), and Director of International Operations and Data Analysis at the IEA's TIMSS & PIRLS International Study Center at Boston College. He gained his PhD in educational research, measurement and evaluation while at Boston College. Eugenio is also Editor-in-Chief of the IERI journal, Large-scale Assessments in Education.

4) Response styles in large-scale assessment
Dr Jamis He Jia (in memory of Dr Fons van de Vijver)

This workshop addresses response styles in large-scale assessment. Attitudinal and behavioral self-reports, the common item types, are known to be influenced by response styles, such as acquiescence, midpoint responding, extremity responding, and social desirability.

The workshop will start with an introduction to the topic, addressing the following issues:

  • Response styles in historical perspective: styles as nuisance factors, models of social desirability;
  • Direct and indirect assessments of response styles;
  • Correlates of response styles at individual level;
  • How many response styles are there? Towards an integrative view;
  • The cross-cultural perspective on response styles: distortion or communication style?;
  • Correlates of response styles at country level;
  • Procedures for reducing response styles (such as Likert versus frequency scales, response standardization, anchoring vignettes, and statistical response style corrections); and
  • Influence of corrections on factor structures and individual/cultural differences.

After this introduction, participants will be shown how to use an IEA data set to interactively address the topics, and will learn the answers to such questions as:

  • How can we compute response style indices? Which response styles can typically be computed in IEA data sets?
  • How large are individual and country differences in these styles?
  • Are the response styles correlated at individual and country level? Can these be integrated?
  • What are the correlates at individual level?

Internet sources of country data will be referenced, enabling correlates at country level to be computed.
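
As a rough illustration of the first question above, the sketch below computes three simple response style indices for one respondent as proportions of item responses. It is not the instructors' procedure: the function name, the 1-to-5 coding, and the assumption that higher values mean stronger agreement are all hypothetical choices made for the example, and such indices are usually only meaningful when computed across a broad, heterogeneous set of items.

```python
from typing import Dict, List, Optional


def response_style_indices(
    responses: List[Optional[int]],
    scale_min: int = 1,
    scale_max: int = 5,
) -> Dict[str, float]:
    """Return simple response style indices for one respondent.

    Assumes (hypothetically) that items share one Likert-type scale and
    that higher codes mean stronger agreement. Missing answers are None.
    """
    valid = [r for r in responses if r is not None]
    if not valid:
        raise ValueError("respondent has no valid responses")

    n = len(valid)
    midpoint = (scale_min + scale_max) / 2  # not a valid code on even-length scales

    return {
        # Share of answers on the "agree" side of the scale.
        "acquiescence": sum(r > midpoint for r in valid) / n,
        # Share of answers at either endpoint of the scale.
        "extremity": sum(r in (scale_min, scale_max) for r in valid) / n,
        # Share of answers exactly at the scale midpoint.
        "midpoint": sum(r == midpoint for r in valid) / n,
    }


# Example: eight items, one of them unanswered.
example = response_style_indices([5, 5, 4, 3, 1, None, 2, 5])
```

Individual-level indices of this kind could then be averaged within countries to compare country differences, or correlated with one another to explore whether they reflect a common underlying style.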


SPSS and Mplus software will be used (other packages with similar functionality can also be used). Participants are encouraged to bring their own laptop to conduct all computations in the second part of the workshop.

Suggested preparatory reading:

Van de Vijver, F. J. R., & He, J. (2016). Bias assessment and prevention in non-cognitive outcome measures in PISA questionnaires. In S. Kuger, E. Klieme, N. Jude, & D. Kaplan (Eds.), Assessing contexts of learning world-wide. An international perspective (pp. 229-253). New York, NY: Springer. doi:10.1007/978-3-319-45357-6_9
He, J., & Van de Vijver, F. J. R. (2015). Effects of a general response style on cross-cultural comparisons: Evidence from the Teaching and Learning International Survey. Public Opinion Quarterly, 79, 267-290. doi:10.1093/poq/nfv006
He, J., & Van de Vijver, F. J. R. (2013). A general response style factor: Evidence from a multi-ethnic study in the Netherlands. Personality and Individual Differences, 55, 794-800. doi:10.1016/j.paid.2013.06.017
He, J., Van de Vijver, F. J. R., Fetvadjiev, V. H., Dominguez-Espinosa, A., Adams, B. G., Alonso-Arbiol, I., Aydinli-Karakulak, A., Buzea, C., Dimitrova, R., Fortin Morales, A., Hapunda, G., Ma, S., Sargautyte, R., Schachner, R. K., Sim, S., Suryani, A., Zeinoun, P., & Zhang, R. (in press). On enhancing the cross-cultural comparability of Likert-scale personality and value measures: A comparison of common procedures. European Journal of Personality.

About the instructor:

Jia He is a Marie Skłodowska-Curie fellow at the DIPF Leibniz Institute for Research and Information in Education in Germany and a post-doc researcher in the Department of Methodology and Statistics at Tilburg University (the Netherlands). She obtained her Ph.D. in cross-cultural psychology at Tilburg University under the supervision of Fons van de Vijver. Her current research includes data comparability and validity with innovative designs of item formats and sophisticated psychometric methods in large-scale international educational surveys. She is also interested in modern research methods such as structural equation modeling, multilevel analysis, and Bayesian statistics.

5) Analyzing data from international large-scale assessments using R
Dr Emmanuel Sikali, Dr Paul Bailey and Dr Ting Zhang

Data from international large-scale assessment programs such as TIMSS are valuable resources for researchers studying issues related to education, and such data may be used to inform educational policy. Due to the scope, complexity and special features of large-scale assessment data, researchers often have to use several different software tools to access, manipulate, and analyze the data. Mindful of this, the US National Center for Education Statistics (NCES) has developed an R package called EdSurvey, designed specifically for the analysis of national and international large-scale assessment data. The primary purpose of this workshop is to introduce participants to the currently available functionality of EdSurvey and explain how this free-to-download R package may be used to analyze TIMSS and TIMSS-like data. 

This 1.5-day course will comprise both lectures and practical exercises. The instructors will begin by introducing participants to the unique design features of international large-scale assessment data, providing guidance on the data analysis strategies that researchers can employ, including the selection and use of appropriate plausible values, sampling weights, and variance estimation procedures (such as the jackknife approach).
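
To make the role of plausible values and sampling weights more concrete, the following minimal Python sketch shows one common way of combining results across plausible values (often described as Rubin's rules). It is an illustration only and does not reproduce EdSurvey's interface or internals: the function names are hypothetical, and the per-plausible-value sampling variances are assumed to have been computed beforehand, for example with a jackknife procedure.

```python
from statistics import mean, variance
from typing import List, Sequence, Tuple


def weighted_mean(values: Sequence[float], weights: Sequence[float]) -> float:
    """Sampling-weighted mean of one plausible value across students."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def combine_plausible_values(
    pv_columns: List[Sequence[float]],
    weights: Sequence[float],
    sampling_variances: Sequence[float],
) -> Tuple[float, float]:
    """Combine weighted means over M >= 2 plausible values.

    sampling_variances holds one (hypothetical, precomputed) sampling
    variance per plausible value, e.g. from jackknife replicate weights.
    Returns the point estimate and its standard error.
    """
    estimates = [weighted_mean(pv, weights) for pv in pv_columns]
    m = len(estimates)

    point_estimate = mean(estimates)      # average over plausible values
    within = mean(sampling_variances)     # average sampling variance
    between = variance(estimates)         # imputation variance (divisor M - 1)
    total_variance = within + (1 + 1 / m) * between

    return point_estimate, total_variance ** 0.5


# Toy example: three students, two plausible values each.
weights = [1.0, 1.0, 2.0]
pvs = [[500.0, 520.0, 480.0], [504.0, 524.0, 484.0]]
estimate, std_error = combine_plausible_values(pvs, weights, [4.0, 4.0])
```

In this toy example the two weighted means are 495 and 499, so the combined estimate is 497; with an assumed sampling variance of 4 for each plausible value, the total variance is 4 + 1.5 × 8 = 16 and the standard error is 4. The key point is that the statistic is computed once per plausible value and then combined, rather than averaging the plausible values for each student first.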

Following this introduction, participants will undertake interactive, practical training in analyzing public-use TIMSS data files using the R package EdSurvey. Participants will learn about:

  • R basics, including getting to know the R environment and R language, and package installation;
  • Data processing, including downloading publicly available data and reading data in R;
  • Data manipulation, including subsetting and merging data, and renaming and recoding variables;
  • Cross-tabulation, including unweighted and weighted totals, conditional means, and the percentage of respondents in a category (conditional on an ancillary categorical variable or on the interactions of an arbitrary number of categorical variables), and estimation of scale score means based on plausible values;
  • Calculating the percentiles of a numeric variable or plausible values;
  • Analysis of benchmarks for international assessment data;
  • Correlations, including Pearson, Spearman, polyserial, polychoric, and correlation between plausible values, with or without weights applied;
  • Linear regression, with or without plausible values as the dependent variable;
  • Logistic regression that allows either a discrete variable or dichotomized plausible values as the dependent variable; and
  • Gap analysis, comparing the average, percentile, achievement level, or percentage of survey responses between two groups that potentially share members.

The theoretical knowledge and analytic approach learned during this course may be applied to analyze other international large-scale assessment data with plausible values, such as TIMSS Advanced, PIRLS, ICILS, ICCS, and PISA data.


Participants should have basic knowledge of statistical techniques, including statistical inference and multiple regression analysis. It is helpful, though not required, to have some working knowledge of item response theory and sampling theory, as well as some familiarity with R software (e.g., participants may have already completed entry-level training in R programming, or used R for basic programming, such as simple data cleaning or running a regression).

As this workshop includes extensive practical content, participants must bring their own laptops with the latest version of R and RStudio preinstalled.

About the instructors:

Dr Emmanuel Sikali is a senior data scientist with NCES, part of the Institute of Education Sciences (IES) within the US Department of Education, where he is responsible for research and development. In this capacity, Dr Sikali is the project officer for the development of EdSurvey, an R package that analyzes data from NCES, such as NAEP and international assessment data. Dr Sikali has been with NCES since 2005, and has a PhD in information technology and engineering from George Mason University in Fairfax, Virginia.

Dr Paul Bailey is a senior researcher at the American Institutes for Research (AIR). Dr Bailey is the lead developer for the EdSurvey package and has developed several R packages that are collectively downloaded more than a thousand times per month. He has also worked in the areas of labor economics and the return to the Post-9/11 GI Bill, econometrics for error in variables models, and value-added modeling. Dr Bailey has been at AIR since 2012, and has a master’s in statistics from the University of Chicago and a PhD in economics from the University of Maryland.

Dr Ting Zhang is a senior psychometrician and statistician at the American Institutes for Research (AIR). She serves as project director for the development of the EdSurvey package. Dr Zhang has been the principal investigator of two other research projects studying validity issues pertaining to technology-based assessments. She has been at AIR since 2013, and has a master’s in educational psychology from Texas Tech University and a PhD in human development and quantitative methods from the University of Maryland. Dr Zhang’s research interests focus on complex survey design and measurement validity.