Split Questionnaire Designs as a Clever Way to Make Surveys Shorter

Long questionnaires are a challenge for respondents and researchers alike. Split Questionnaire Designs (SQDs; Raghunathan & Grizzle, 1995) offer a clever way to make surveys shorter: instead of answering every question, respondents receive only parts of the full questionnaire. But how these parts—or “modules”—are constructed makes a difference for data quality. Our study tests strategies for reconciling the questionnaire design perspective with accurate statistical estimation.
DOI: 10.34879/gesisblog.2025.111
When Shorter Surveys Mean More Missing Data
Imagine you’re designing a large-scale survey. You want to capture opinions on a wide variety of topics—politics, social values, digital behavior—but every additional question increases respondent burden. This can have negative consequences not only for the respondent experience, but also for response rates, survey costs, and measurement error. Especially when conducting online surveys, you quickly face the need to shorten your questionnaire.
Split Questionnaire Designs seem to offer a way out: instead of giving every participant the full questionnaire, you distribute its parts across different modules and randomly assign some of them to each respondent. As a result, each respondent answers fewer questions, but together, the responses cover the entire questionnaire.
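To illustrate the mechanics, here is a minimal sketch assuming a hypothetical questionnaire with a three-item core and three topic modules, of which each respondent is randomly assigned two (all variable names are made up):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

# Hypothetical questionnaire: a core answered by everyone plus three modules.
core = ["age", "gender", "education"]
modules = {
    "A": ["pol_1", "pol_2", "pol_3"],  # politics
    "B": ["val_1", "val_2", "val_3"],  # social values
    "C": ["dig_1", "dig_2", "dig_3"],  # digital behavior
}

n = 1_000
items = core + [q for qs in modules.values() for q in qs]
# Pretend complete data existed; in a real SQD, unassigned items are never asked.
full = pd.DataFrame(rng.normal(size=(n, len(items))), columns=items)

# Each respondent answers the core plus two of the three modules.
observed = full.copy()
for i in range(n):
    not_assigned = rng.choice(list(modules))         # one module is dropped
    observed.loc[i, modules[not_assigned]] = np.nan  # planned missingness

print(observed.isna().mean().round(2))  # core items 0% missing, module items ~33%
```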
This convenience, however, comes at a price: many data points are now intentionally missing. To make the data analyzable—especially for multivariate analyses—these planned missing values must later be imputed statistically. And here begins the core challenge: how to split the questionnaire in a way that keeps both respondents and data analysts happy.
The Dilemma: Questionnaire Perspective vs. Imputation Perspective
From a statistical perspective, highly correlated variables should be separated into different modules [1]. In practice, this can be approximated, for example, by randomly allocating questions to different modules [2]. This helps imputation algorithms reconstruct the missing data more accurately.
This can be explained by a short example: think of two variables that are strongly associated with each other—for example, concerns about climate change and support for environmental policies. If the two are placed in different modules, one of them will be observed for many respondents for whom the other is missing, providing useful information for imputing it (assuming enough pairwise-complete observations are realized). But if they are placed in the same module, they will always be observed or missing together, leaving the imputation algorithm with little information to work with.
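To make this concrete, here is a minimal sketch under simplifying assumptions: simulated variables, a two-of-three-module assignment, and simple deterministic regression imputation rather than the multiple-imputation machinery used in practice.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 30_000

# Two strongly associated variables (true r = 0.8), standing in for, say,
# climate concern (x) and support for environmental policies (y).
x = rng.normal(size=n)
y = 0.8 * x + 0.6 * rng.normal(size=n)

# Each respondent is assigned two of three modules (one dropped at random).
dropped = rng.integers(0, 3, size=n)  # 0 = module A, 1 = B, 2 = C

def rmse_of_imputed_x(x_observed, y_observed):
    """Impute missing x (regression on y where y is observed, otherwise the
    observed mean) and return the RMSE against the true simulated values."""
    both = x_observed & y_observed                 # pairwise-complete cases
    slope, intercept = np.polyfit(y[both], x[both], 1)
    miss = ~x_observed
    x_hat = np.where(y_observed[miss],
                     slope * y[miss] + intercept,  # partner variable observed
                     x[x_observed].mean())         # no partner information
    return np.sqrt(np.mean((x_hat - x[miss]) ** 2))

# Same module: x and y both sit in module A, observed or missing together.
print(rmse_of_imputed_x(dropped != 0, dropped != 0))  # ~1.0 (mean imputation only)
# Different modules: x in A, y in B, so y is observed whenever x is missing.
print(rmse_of_imputed_x(dropped != 0, dropped != 1))  # ~0.6 (regression on y)
```

In this toy setup, the reconstruction error for the missing values shrinks from roughly the full standard deviation of x (mean imputation) to the residual standard deviation (regression on the observed partner variable).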
From a questionnaire design perspective, though, this idea can seem odd. As survey methodologists Smyth [3] and Krosnick and Presser [4] note, grouping thematically related questions can help respondents recall relevant information and maintain attention. A faster pace of topic changes, with some questions left out within each topic, might therefore make the questionnaire appear inconsistent or confusing to respondents, calling the effectiveness of the measurement instrument into question. Thus, questionnaire designers may prefer to allocate all questions on a given topic to the same module.
In other words, what’s best for the quality of the imputed data might not be best for the questionnaire. The art of split questionnaire design lies in this tension between statistical efficiency and questionnaire coherence.
A Core to Hold It All Together
To reconcile these perspectives, researchers sometimes include an extended core module in their Split Questionnaire Design—a set of questions that everyone receives. The remaining questions are distributed across different modules, of which only a random subset is administered to each respondent. Such an extended core module strategy, combined with single-topic modules, has been implemented in the European Values Study [5], for example.
Our study examined whether extending this core with the most informative variables from the different survey topics could help single-topic modules perform as well as random modules. The idea is simple: if the core includes key variables that are strongly correlated with the others, perhaps we can enjoy both thematic coherence and reliable imputation.
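One conceivable selection heuristic is sketched below. It assumes that pilot or previous-wave data are available; the function `build_extended_core` and the topic layout are hypothetical, not the selection procedure from our study.

```python
import pandas as pd

def build_extended_core(data: pd.DataFrame, topics: dict, per_topic: int = 1):
    """For each topic, pick the item(s) with the highest mean absolute
    correlation with all other items -- candidates for an extended core."""
    corr = data.corr().abs()
    core_items = []
    for topic_items in topics.values():
        # Mean |correlation| of each topic item with every other item
        # (subtracting the self-correlation of 1.0 from the row sum).
        informativeness = (corr.loc[topic_items].sum(axis=1) - 1.0) / (len(corr) - 1)
        core_items += informativeness.nlargest(per_topic).index.tolist()
    return core_items

# Hypothetical usage with pilot data and the module layout sketched earlier:
# extended_core = ["age", "gender", "education"] + build_extended_core(
#     pilot_data, {"politics": ["pol_1", "pol_2", "pol_3"], ...})
```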
We tested this hypothesis through a Monte Carlo simulation based on data from the German Internet Panel [6, 7], comparing four strategies (a bare-bones simulation skeleton follows the list):
- Single-topic modules (STM) with a simple core (only sociodemographics)
- Single-topic modules with an extended core (including correlated variables from the different survey topics)
- Random modules (RM) with a simple core
- Random modules with an extended core
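The sketch below shows the bare bones of such a simulation, not the authors' actual code; the design functions imposing each strategy's missingness pattern, the imputation routine, and the estimator of interest are placeholders to be supplied.

```python
import numpy as np

def simulate(full_data, designs, impute, estimate, n_reps=500, seed=1):
    """Monte Carlo skeleton: repeatedly impose each design's planned
    missingness on complete data, impute, and record the deviation of the
    estimate from the complete-data benchmark."""
    rng = np.random.default_rng(seed)
    benchmark = estimate(full_data)
    deviations = {name: [] for name in designs}
    for _ in range(n_reps):
        for name, apply_design in designs.items():
            observed = apply_design(full_data, rng)  # plant the missing values
            completed = impute(observed, rng)        # e.g. multiple imputation
            deviations[name].append(estimate(completed) - benchmark)
    # Mean absolute deviation per strategy, e.g. for the four designs:
    # STM/simple core, STM/extended core, RM/simple core, RM/extended core.
    return {name: float(np.mean(np.abs(d))) for name, d in deviations.items()}
```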
What the Simulations Reveal
The results were clear. For single-topic modules, extending the core improved the quality of imputed estimates—the deviations from the benchmark were smaller. However, even with this improvement, single-topic modules did not outperform random module allocation. For random modules, the extended core made little to no difference.
In short:
- Extended cores help single-topic modules, making their estimates more reliable.
- Random modules still perform best overall, irrespective of whether an extended core is included.
In other words, while an extended core can make single-topic modules less problematic, random allocation remains the most robust solution from an imputation perspective.

Source: Axenfeld, J. B., Bruch, C., & Wolf, C. (2025). Composition of core modules and item allocation in split questionnaire designs: impact on estimates from imputed data. International Journal of Social Research Methodology, 1–24. https://doi.org/10.1080/13645579.2025.2561653
Balancing Questionnaire Consistency and Data Quality
For practitioners designing large-scale surveys, these results offer guidance on how to navigate the crucial trade-off between questionnaire consistency and the quality of the imputed data. From a purely statistical perspective, random modules are clearly the best option available. This strategy also has the advantage that no extended core module is needed, which helps keep the questionnaire short.
Yet, when priority is given to a slower pace of topic changes and to avoiding the potential disruption of questions being left out within a topic, single-topic modules with an extended core may offer a compromise. This strategy, however, still falls behind random modules in terms of imputed data quality, and it results in longer questionnaires because the extended core module is presented to every respondent.
It should also be noted that distributing survey topics across different modules does not necessarily mean that topics are spread throughout the entire questionnaire or that the question order changes. Even with random modules, questions can be presented in their original order; only those from non-assigned modules are omitted, as the sketch below illustrates. Hence, the potential advantages of single-topic modules in terms of questionnaire consistency mainly lie in the slower pace of topic changes and the absence of skipped questions within a topic.
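A tiny sketch of this point, with a hypothetical item-to-module mapping:

```python
def administered_items(question_order, item_module, assigned_modules):
    """Present questions in their original order, omitting only those
    belonging to modules the respondent was not assigned."""
    return [q for q in question_order if item_module[q] in assigned_modules]

# Adjacent items may belong to different random modules, but the relative
# order of whatever remains is unchanged.
item_module = {"pol_1": "A", "val_1": "B", "pol_2": "A", "dig_1": "C"}
order = ["pol_1", "val_1", "pol_2", "dig_1"]
print(administered_items(order, item_module, {"A", "B"}))
# -> ['pol_1', 'val_1', 'pol_2']
```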
Looking Ahead
Future research could explore how single-topic modules versus random modules—as well as extended core modules—affect response behavior in practice. Can the statistical drawbacks of single-topic modules be offset by higher response quality due to their thematic coherence? Do the more abrupt topic changes in random modules irritate respondents, or does their diversity help prevent boredom? And perhaps most importantly, how do these strategies affect respondent burden and response rates?
Main Reference
Axenfeld, J. B., Bruch, C., & Wolf, C. (2025). Composition of core modules and item allocation in split questionnaire designs: impact on estimates from imputed data. International Journal of Social Research Methodology, 1–24. https://doi.org/10.1080/13645579.2025.2561653
References
1. Raghunathan, T. E., & Grizzle, J. E. (1995). A split questionnaire survey design. Journal of the American Statistical Association, 90(429), 54–63. https://doi.org/10.1080/01621459.1995.10476488
2. Axenfeld, J. B., Blom, A. G., Bruch, C., & Wolf, C. (2022). Split questionnaire designs for online surveys: The impact of module construction on imputation quality. Journal of Survey Statistics and Methodology, 10(5), 1236–1262. https://doi.org/10.1093/jssam/smab055
3. Smyth, J. D. (2016). Designing questions and questionnaires. In C. Wolf, D. Joye, T. W. Smith, & Y.-C. Fu (Eds.), The SAGE handbook of survey methodology (pp. 218–235). SAGE Publications.
4. Krosnick, J. A., & Presser, S. (2010). Question and questionnaire design. In P. V. Marsden & J. D. Wright (Eds.), Handbook of survey research (pp. 263–314). Emerald Group Publishing Limited.
5. Luijkx, R., Jónsdóttir, G. A., Gummer, T., Ernst Stähli, M., Frederiksen, M., Ketola, K., Wolf, C., Brislinger, E., Christmann, P., Gunnarsson, S. Þ., Hjaltason, Á. B., Joye, D., Lomazzi, V., Maineri, A. M., Milbert, P., Ochsner, M., Pollien, A., Sapin, M., Solanes, I., … Wolf, C. (2021). The European Values Study 2017: On the way to the future using mixed-modes. European Sociological Review, 37(2), 330–346. https://doi.org/10.1093/esr/jcaa049
6. Blom, A. G., Gathmann, C., & Krieger, U. (2015). Setting up an online panel representative of the general population: The German Internet Panel. Field Methods, 27(4), 391–408. https://doi.org/10.1177/1525822X15574494
7. Cornesse, C., Felderer, B., Fikel, M., Krieger, U., & Blom, A. G. (2021). Recruiting a probability-based online panel via postal mail: Experimental evidence. Social Science Computer Review, 40(5), 1259–1284. https://doi.org/10.1177/08944393211006059