Guidelines for Planning Effective Surveys
Release date: March 2000
This is one of a series of guides for research and support staff involved in natural resources projects. The subject-matter here is the planning of effective surveys. Other guides give information on allied topics. Your comments on any aspect of the guides would be welcomed.
1. Introduction
1.1 Aim
This booklet provides guidelines about surveys which may be contemplated as part of the programme of project work. It is intended to provide guidance for those involved in natural resources research or bilateral projects, and to help those who review project activities. They should be able to recognise where a survey may be needed to achieve stated objectives of the project and to check that a survey is realistically formulated.
Until about 15 years ago surveys were often automatically included in research projects and many did not meet expectations. The pendulum has now swung away from survey methods and they are sometimes avoided, even when there is no other technique that can provide equivalent information.
1.2 What do we mean by surveys?
A survey is a research process in which new information is collected from a sample drawn from a population, with the purpose of making inferences about the population in as objective a way as possible. Most surveys use questionnaires as the means by which data are acquired, but a survey should not be seen as just a questionnaire plus its responses; rather, it is a fully integrated part of the research strategy. We deal in these guidelines with the stages involved in deciding whether a survey is appropriate for a project, and then with the planning, design, implementation and analysis of surveys.
1.3 The project planning stage
Research, or the planning phase of a development project, often begins with the formulation of a problem - a constraint to development. The early stages of work then entail elaboration of the problem. This includes collecting information about the setting and the issues; involving all the relevant people, not just the powerful and vocal; facilitating the expression and examination of relevant ideas, often by qualitative and participatory research methods.
In some instances, consensus and an agreed synthesis can be achieved on the basis of information already available from reliable sources and from the qualitative work. If the outcome of this initial work is a clear insight of immediate application, there will be no valid case for gathering new, primary information: the need is for dissemination of results, maybe to farmers, so they apply the methods, or to politicians so they create the required environment for the development.
In other cases, the initial planning stage will identify a need for more information, first to develop solutions and later, once those solutions have been implemented, to assess their impact. In addition, sponsors may reasonably require objective verification that the resulting ideas and methodology have been developed to the point where they can be exploited with confidence in a wider recommendation domain - all or part of an overall "population". This often requires a more quantitative phase of work later in the project. For example, an elaboration exercise may have taken place in a narrow setting: a single community, and a single research team's experience, may lead to an apparently satisfactory outcome at local level. It would be desirable to determine how much can be said about the wider population on the basis of this exercise.
2. Is Survey Methodology Appropriate?
2.1 Is there a place for surveys in your project?
In the initial stages of a project, baseline information may be found from many sources; for example some indication of village sizes may be available from recent aerial photographs. Surveys can be relevant as part of baseline studies, when there is insufficient basic information about target communities. They are also carried out to provide a benchmark against which the impact of a project can be assessed. This type of survey may be needed in settings which lack well-developed systems for routine data collection such as agricultural census or health monitoring, or when information available does not cover the subject of interest in enough detail.
Done at an early stage in a project, the benchmark survey should not try to pre-empt the subsequent project work with an ill-prepared effort to collect a wide range of information. It should instead collect clear information on a small range of items which are expected to be recorded again when the project has made an impact. The very justification for starting a project should suggest key measures on which an impact is anticipated. If impact can be expressed in terms of measurable changes in a few such indicators, impact assessment should be planned into the project. If either the "before" or the "after" information is poor, the measurement of change will be correspondingly weak, so real care and effort are justified to collect a limited range of "before" information to a good standard.
In the later stages of a project, surveys are also relevant if research hypotheses or field solutions have been suggested and developed but lack comprehensive and conclusive supporting evidence, or if there is uncertainty about the range of settings where they apply. Survey methods are of the essence after a process of qualitative work has defined needs for broadly-based population information, after stakeholder participation has already had its input to the selection of issues to be addressed in the survey, and when a broad and consistent picture is needed.
DFID projects should produce clear-cut, sophisticated outputs which can be objectively validated as to quality, relevance and usefulness. In a substantial project, we believe it is obvious that there is a clear place for both qualitative and quantitative elements. Producing demonstrably objective evidence is the strength of the survey method.
2.2 Strengths and weaknesses of surveys
On the plus [+] side, surveys have much to contribute, but only if well thought out. On the minus [-] side a survey can represent substantial effort within a project. Many failures, blamed on the survey method, are due primarily to attempts to take short-cuts. This usually happens where surveys are poorly conceived or managed, or where there are unrealistic expectations of what data can be collected.
[+] A well-organised survey can achieve breadth of coverage with maybe hundreds of respondents so that the wide range of characteristics e.g. types of livelihoods, lifestyles, situations, ethnicities, knowledge, attitudes, and practices in the population can all contribute to the overall picture.
[+] With a well-thought-out sampling procedure, we can stratify to ensure proper coverage of the major characteristics.
[+] With good methodology and an adequate sample size we can expect reasonable representativeness through coverage of important factors that differentiate sections of the population, even if these were not specifically controlled in the sampling scheme.
[+] The sampling procedure, for a worthwhile survey, will involve such elements of randomness as provide an assurance of objectivity. Respondents should be selected according to a procedure which will not give "unfair" prominence to particular groups who may be untypical, e.g. villagers who have cooperated with the researcher previously, whose livelihoods and attitudes have therefore been altered.
[+] If the survey follows on from smaller-scale, though more intensive, qualitative work it may serve to replicate earlier findings on a wider scale, or delimit their range of applicability. If project results are to be of more than local and ephemeral value, it is vital to show their general relevance. An adequate-sized, objectively selected and representative sample is the best basis for a claim to generalisability.
[+] Surveys frequently provide enough data to allow reliability to be assessed in various ways, e.g. analyses demonstrating consistency of findings across subsets of respondents lend support to the contention that the questions are being answered in a systematic and interpretable way. Repeated surveys may show consistent and interpretable patterns of results over time.
[+] A few results from a survey can often be checked against respected independent sources to provide a measure of concurrent validity. If they agree, we have some basis for saying both must be measuring the right thing and correctly, and by extension can claim some added plausibility for other survey items!
[+] If suitably structured, the survey can (i) take account of varying sizes of units, e.g. farms, and (ii) correct for under-enumeration and some sorts of non-response.
[-] Sound surveys require considerable time and effort to plan and run; naive managers often do not think carefully through the tasks entailed; they underestimate time and resources needed, especially for computerisation and analysis.
[-] Unless a survey has a target length, a crisply-defined purpose and a predetermined analysis plan, it tends to grow by accretion and become unwieldy and unrewarding.
[-] Conducting a survey before the project staff have an agreed and settled idea of what is wanted will produce results which are irrelevant by the time they become available. Ill-phrased questions, poorly linked to objectives, lead to results that are neither digestible nor informative. Examples of this type represent a failure of survey management not of the method!
2.3 What are the alternatives to a survey?
In a survey we attempt to gather a relatively small amount of information from a large number of respondents. This implies that informants are not well known to the members of the project team. Indeed, some advantages of the survey method - objectivity of sampling and confidentiality for the respondents - arise from the interest in the population from which informants are a sample, rather than a personal interest in any particular individual or community.
Sometimes, resources may not be sufficient for us to seek information that is sufficiently general, but a cheaper method will give some useful data. For example a survey on household size might be replaced by aerial photos, using area of each compound as a proxy variable.
Participatory methods provide the alternative of concentrating efforts, e.g. into group discussions, to understand a community in detail. The use of visual tools such as matrices, diagrams, maps, time lines, etc., can generate useful quantitative information as well as insights into the community and its dynamics. This is not possible when information is collected from individual, standardised interviews.
In other cases the data that we seek may be difficult to gather from one meeting with a respondent. For example, in a survey on household incomes, we may find that trust needs to be established, which necessitates research staff working within a village for an extended period as is often done in social anthropological studies.
In our view these "alternatives" are "complementary" to the survey. For example, if aerial photos are available, combining their use with a (smaller) survey of "ground truth" might be better than devoting all the effort to a single method. It must be stressed that the selection of methodologies feasible in a given context has to be closely linked with the primary objectives of the research. It is naive to expect that the output from one methodology will provide the same information as that of another. Combinations of methodologies may be helpful, yet pose a challenge in terms of synthesising the information they provide.
3. Setting up a Survey
3.1 Setting objectives
Objectives must be clearly specified. There should also be statements detailing how the survey is intended to contribute to project outputs and how the results will be synthesised with other information. Related to the specification of the objectives is a detailed definition of intended survey outputs, delimiting the target population in space and time, and setting quantity, quality and time requirements on the data. This should have the force of a signed memorandum of understanding, representing the consensus on which survey activity planning is based.
Surveys are often part of large projects with elaborately-developed perceptions of the information need. There is then a risk that one survey is expected to address multiple objectives, with contradictory demands for information quality, quantity and timing. In such cases, linked modules may be better, e.g. a short questionnaire for a large representative sample, linked to some in-depth studies on smaller subsets. The set of surveys is designed so no respondent is overburdened.
There are surveys where the main objectives are to assess frequencies, for example how many farmers use soil conservation techniques, or fertiliser, or what proportion of women practise some form of family planning. An alternative type of objective concerns relationships, for example between erosion and soil fertility, between yields and fertiliser, between education and family size.
3.2 Survey management
A properly organised survey requires well-thought-out managerial procedures. Some technical aspects are discussed on later pages, but developing, or using, them effectively is dependent on aspects of the process that include the following:
Resourcing review
Include a clear commitment of the necessary manpower, equipment, facilities, material and "political" support to ensure the exercise can be carried through.
Time budgeting
Allow for preparatory activity, pilot testing, seasonal delays, holidays, absences, equipment breakdowns, error correction, and contingencies. Use activity analysis, identifying who does what when, e.g. using Gantt charts or critical path analysis.
Training plan for staff
Include manuals of survey procedures, assessment materials, interviewers' and supervisors' instructions, training of interviewers and supervisors for the interviewing process, and the necessary organisation for staff to be sent on appropriate courses, or for trainers or consultants to be brought in.
Schemes for dissemination and utilisation of results
Consider several variant outputs, maybe including (i) feedback to contributors, of local data relevant to them, (ii) discussion materials for a workshop, (iii) an executive summary for policy-makers, (iv) reports, and (v) a computerised archive for those who will use the data in future.
3.3 Some types of survey
Snap-shot survey
Surveys are frequently cross-sectional i.e. constitute a point-in-time "snapshot". To give a representative picture, a descriptive survey sample has to encompass the target population effectively. Other things being equal it should ideally cover well-defined and different sections of the population (strata) proportionately to their size.
Baseline survey
Sometimes it is known or likely that a "before" snapshot will be used as a project baseline, to be compared with an "after" picture, e.g. for impact assessment. Fair comparison depends on collecting information on the same footing before and after, and this makes it important to keep careful records. To take account of changes through time not due to the project, control studies should ideally be done at the same before and after times in locales unaffected by the project - but within the domain to which the survey results are to be generalised.
Longitudinal survey
When surveying the same population just twice, before and after a lengthy period, it often strengthens the comparison if one visits the same panel of respondents.
When looking repeatedly through time to observe features as they evolve, a long sequence of visits to the same informants may be burdensome, causing ever more perfunctory responses, or withdrawal of cooperation. A rolling programme of panel recruitment and retirement may be sensible, so no informant is troubled too often.
Comparative survey
For a comparison between subgroups, e.g. attitudes of rural and peri-urban farmers, it may be more important to ensure the comparison is of like with like e.g. in terms of the age composition of the samples of farmers, than to achieve full representative coverage of the population. To make the comparison effective, ensure the samples from each subgroup are big enough. Often this will mean the samples should be of equal size even if the subgroups are not equally common in the population: sampling strategies need to be sensible in relation to study objectives.
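The case for equal-sized samples in a comparison can be illustrated with a short calculation. The sketch below is ours, not part of the original guidance: it assumes simple random samples and the same spread of responses in both subgroups, and the function name and the split of 200 interviews are hypothetical.

```python
import math

def se_of_difference(n1, n2, sd=1.0):
    """Standard error of the difference between two subgroup means.

    Assumes simple random samples and the same standard deviation (sd)
    in both subgroups; both are illustrative assumptions.
    """
    return sd * math.sqrt(1.0 / n1 + 1.0 / n2)

# Hypothetical total of 200 interviews split between two subgroups of farmers.
equal_split = se_of_difference(100, 100)
proportional_split = se_of_difference(180, 20)

print(f"equal split (100/100):       {equal_split:.3f}")
print(f"proportional split (180/20): {proportional_split:.3f}")
```

Under these assumptions the equal split gives a standard error of roughly 0.14 standard deviations, against roughly 0.24 for the 180/20 split, so the same interviewing effort gives a noticeably sharper comparison. If the subgroups differ greatly in variability or in cost per interview, a different allocation may be preferable.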
4. Questionnaire Design
If one had to choose a single indicator of a successful survey it would be the questionnaire. It is, after all, the means by which the data are acquired. A good questionnaire does not guarantee a useful survey, but unless the questionnaire is well designed there will be little of value from the survey.
The structure of the questionnaire will depend on many factors, including whether the survey is postal or by interview, whether the population is a general one or a specific group, e.g. managers, heads of households, women. However, some general guidelines can be given.
* There is increasing evidence that a thoughtfully-crafted introduction can be very important as it establishes a rapport. For example, it dispels any suspicion that the questioner works for the tax-gatherers, and it introduces the themes and purpose of the survey. The introduction also develops the respondents' mind-set, for example by getting respondents to go over past events and recall situations that will inform the interview. Practice during training, and effective supervision, should ensure interviewers reliably cover the right topics.
* Transparency of intent should be established in the introduction and by following clear lines of questioning e.g. sections on household demographics, land tenure, crops, livestock. Within sections, it may be useful to follow a regular sequence of question types e.g. facts, practices, knowledge, attitudes and beliefs.
* All questions to be included must be consistent with the objectives of the survey. It is often when the questionnaire is being planned that realisation dawns that the objectives have not been specified sufficiently precisely.
* Constructing an effective questionnaire is a time-consuming process. Researchers inexperienced in questionnaire design should recognise that it is easy to construct a questionnaire, but difficult to construct one that is effective. To avoid rambling or obscure questions, put some issues and words in (i) a preamble, (ii) lists of permitted answers, or (iii) reiteration, confirmation, and extension of the first response.
* If questions demand recall, should checklists be given to help memory-jogging? This needs thought: partial lists may bias the response pattern.
* How many alternatives should be given for attitude questions? Often there are five, ranging from "strongly agree" to "strongly disagree", unless one wishes to deny the respondent the lazy choice of a mid-point, which sometimes has no meaning. Careful thought is needed to synthesise the attitude measures into a meaningful indicator or "profile". Often informants ought to participate directly in deciding the importance of profile elements.
* Open questions, which allow freer expression, require disciplined data collection and may be difficult to summarise.
* Translators inexperienced in survey design may not appreciate the precision required in question wording, and with completion instructions and units of measurement. Look out too for formally correct translations that are dialectally or culturally inappropriate.
* There is information from past studies to help with constructive approaches to many problems of questionnaire design. Ask those with relevant experience.
5. Sampling Principles
In early-phase work, delimiting the range of the population entails looking for, e.g., maximum variation or counter-examples, and may be called "qualitative" or "focused" sampling. Such purposeful sleuthing is usually part of the informal elaboration of research issues. This is not the sampling process with which we are concerned.
The need for objectivity
The later-phase, more formal sampling schemes of concern here are for situations where there exists a broad definition of the surveyed domain if not a listing frame of members. Here an essential feature is objectivity, and hence generalisability.
This requires you to show your results are not at all likely to be affected by selection biases - you cannot achieve this if informants are subjectively selected from the population because they are conveniently available, or already known to you, or particularly compliant. If selection is on the basis that the individuals are deemed "typical" the domain is probably not covered properly. Where the sample is purposively spread to be "representative" this is usually only with respect to one or two predetermined characteristics e.g. wealth ranking; the danger is that in other respects not thus controlled the problems of subjectivity remain.
Random sampling and objectivity
Textbook random sampling, as when prize draw tickets are fairly drawn, provides the basis of objectivity when the population comprises a large number of units, e.g. farmers, villages; and key research outputs are supposed to apply to all of them.
For full effectiveness this needs a sample frame listing the population members. Frames are often out-of-date and need revision. Particularly after being synthesised from several lists, there may be multiple records of some units, while others may be omitted. How important this is depends on the proportion being sampled, and on whether the process causing omissions / multiplicities is related to the subject of study. Care must be taken when the sampling unit is not the listing unit, e.g. a list of farms may be usable only with care to sample cows or households.
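As a minimal sketch of an objective draw from a listing frame, the following removes multiple records of the same unit before sampling, so that a unit listed twice does not get a double chance of selection. The frame, the unit labels and the function name are hypothetical, and the sketch assumes the listing unit is also the sampling unit.

```python
import random

def draw_simple_random_sample(frame, n, seed=None):
    """Draw n distinct units at random from a listing frame.

    Duplicate entries are removed first, so a unit that appears on several
    merged lists does not get an inflated chance of selection.
    """
    unique_units = list(dict.fromkeys(frame))  # de-duplicate, keeping list order
    if n > len(unique_units):
        raise ValueError("sample size exceeds the number of distinct units")
    rng = random.Random(seed)  # seeded so the draw can be documented and repeated
    return rng.sample(unique_units, n)

# Hypothetical frame merged from two lists; "farm_03" appears twice.
frame = ["farm_01", "farm_02", "farm_03", "farm_04", "farm_03", "farm_05"]
sample = draw_simple_random_sample(frame, n=3, seed=2000)
print(sample)
```

Recording the seed alongside the frame is one way of letting a reviewer confirm that the selection was not adjusted by hand. The sketch cannot detect units omitted from the frame; that still needs checking in the field.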
Multi-stage sampling in hierarchies as with districts, villages, households is rightly common. Samples of secondary units are clustered in a few primary units; detailed frame-making is only done if needed e.g. full listing of villages in selected districts. Primary units are often well-known and individual to the point where sampling at random is irrelevant. It may be quite undesirable if the sample size is small; if two counties picked from England and Wales - by whatever method - were Cumberland and Clwyd the farming scene would come out weighted towards sheep. Objective sampling is more necessary at the ultimate unit level; here subjects are anonymous to users of the whole survey, and what could be serious bias - if there were subjective selection - will be concealed in the results.
Thought is needed on how to distribute sampling effort over stages e.g. more villages in fewer districts will make some results more and others less "precise".
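A two-stage selection can be sketched as below. This is an illustration under our own assumptions: the district and village names are invented, both stages are drawn at random with equal probability, and in practice the primary units might instead be chosen deliberately for the reasons given above.

```python
import random

def two_stage_sample(frame, n_primary, n_secondary, seed=None):
    """Select primary units, then secondary units within each selected one.

    frame maps each primary unit (e.g. a district) to a list of its
    secondary units (e.g. villages).
    """
    rng = random.Random(seed)
    chosen_primary = rng.sample(sorted(frame), n_primary)
    selection = {}
    for primary in chosen_primary:
        units = frame[primary]
        k = min(n_secondary, len(units))  # cannot take more than exist
        selection[primary] = rng.sample(units, k)
    return selection

# Hypothetical frame: three districts with five listed villages each.
frame = {
    "district_A": ["A1", "A2", "A3", "A4", "A5"],
    "district_B": ["B1", "B2", "B3", "B4", "B5"],
    "district_C": ["C1", "C2", "C3", "C4", "C5"],
}
selection = two_stage_sample(frame, n_primary=2, n_secondary=3, seed=1)
print(selection)
```

Changing `n_primary` and `n_secondary` while holding their product fixed is a simple way to explore the trade-off between more villages in fewer districts and fewer villages in more districts.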
Replication, stability and generalisability
If you do not incorporate appropriate elements of objectivity in sampling, you can expect to get systematic biases in your results. Even if you do, results will fluctuate from one sample to another, and adequate sample size is needed to produce usefully stable results. Without demonstrating stability - without adequate sample size - your conclusions don't generalise!
Replication is necessary in terms of numbers of informants, but it is important to think carefully about what has and has not been replicated. If the survey unit is a focus group and its rapporteur, several focus groups per rapporteur, and several rapporteurs, are needed to generalise findings to other settings where both the focus groups and the rapporteurs will be different.
There is an argument for replicating the whole survey process as several small "cloned" surveys done independently. The variation in a result from clone to clone then incorporates all the sources of variability in the survey and is an overall measure of the stability of findings.
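The idea of using clone-to-clone variation as a measure of stability can be sketched as follows. The four estimates are invented for illustration, and the calculation assumes the clones were run independently and to the same design.

```python
import statistics

def replicate_summary(clone_estimates):
    """Combine independent replicate ("clone") estimates of the same quantity.

    Returns the mean of the estimates and a standard error based on how
    much the estimates vary from clone to clone.
    """
    k = len(clone_estimates)
    if k < 2:
        raise ValueError("at least two replicates are needed")
    mean = statistics.mean(clone_estimates)
    spread = statistics.stdev(clone_estimates)  # clone-to-clone variation
    standard_error = spread / k ** 0.5
    return mean, standard_error

# Hypothetical proportion of adopters estimated by four cloned surveys.
clone_estimates = [0.42, 0.38, 0.45, 0.40]
overall, standard_error = replicate_summary(clone_estimates)
print(f"overall estimate {overall:.3f}, standard error {standard_error:.3f}")
```

With only a handful of clones the standard error is itself imprecise, so it is better read as a rough indication of stability than as an exact figure.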
Stratified sampling
When it is known that the population divides into segments which differ from each other but are internally relatively homogeneous e.g. (i) subsistence farms dependent on maize, and (ii) tobacco plantations, use the knowledge. This requires that strata can be identified when the sample is drawn. Homogeneity means that you can use a relatively small sample within each stratum, yet get a plausible, clear and accurate picture. You need not use identical surveys in different strata; specialised modules can be used and reported within the relevant stratum. Sample sizes need not be proportional to stratum sizes, provided you take them into account in the analysis.
Where relevant an overall picture can be put together, if you have a reasonable idea of the relative numbers in the different strata. Resist multiple stratification by agro-ecological zone, by farmer's age and sex and so on: each combination is like a mini-survey; it is easy to end up with too many combinations. Plan the survey, and sample sizes, using relevant and effective stratification variables. If you record extra stratification variables in the field, post-stratify some results during analysis, but without controlling the numbers in each such segment.
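Putting together an overall picture from strata sampled at different rates can be sketched as a weighted average, with weights given by the relative numbers in each stratum. The stratum sizes and sample values below are invented, and the sketch assumes a simple random sample within each stratum.

```python
def stratified_mean(strata):
    """Overall mean from stratum samples, weighted by stratum population size.

    Each stratum is a dict with "population_size" (number of units in the
    stratum) and "sample" (values observed in that stratum's sample).
    """
    total_population = sum(s["population_size"] for s in strata)
    estimate = 0.0
    for s in strata:
        sample_mean = sum(s["sample"]) / len(s["sample"])
        estimate += (s["population_size"] / total_population) * sample_mean
    return estimate

# Hypothetical cultivated area (ha) per farm; plantations are over-sampled.
strata = [
    {"name": "subsistence farms", "population_size": 900, "sample": [2, 3, 2, 3]},
    {"name": "plantations", "population_size": 100, "sample": [40, 60]},
]
weighted = stratified_mean(strata)

# Naive pooled mean that ignores the unequal sampling rates.
all_values = [v for s in strata for v in s["sample"]]
pooled = sum(all_values) / len(all_values)
print(f"stratified estimate {weighted:.2f}, naive pooled mean {pooled:.2f}")
```

In this made-up example the naive pooled mean is far too high because plantations are one third of the sample but only one tenth of the population; weighting by stratum size corrects for this.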
Stratified sampling is for qualitatively different types of unit, e.g. (i) rice farms, (ii) vegetable farms. It is important to distinguish this from probability proportional to size (PPS) sampling, e.g. the size of a farm may be measured in terms of its area: bigger farms may be quantitatively more important to total yield, and more important to sample if accurate yield estimation is a key target. Correct PPS sampling and analysis take size into account so that overall numerical summaries are estimated correctly. Care should be taken in selecting an appropriate measure of "size", i.e. one that is relevant to the study.
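One common way to carry out PPS sampling, and to allow for it in the analysis, is to draw units with replacement with probability proportional to size and then divide each observed value by its selection probability (the Hansen-Hurwitz estimator). The sketch below uses that approach as an assumption of ours; other PPS schemes exist, and the farm areas and yields are invented.

```python
import random

def pps_sample_with_replacement(sizes, n, seed=None):
    """Return n unit indices drawn with probability proportional to size."""
    rng = random.Random(seed)
    return rng.choices(range(len(sizes)), weights=sizes, k=n)

def hansen_hurwitz_total(indices, sizes, values):
    """Estimate the population total from a with-replacement PPS sample.

    Each sampled value is divided by its selection probability, so large
    units, which are more likely to be drawn, are not over-counted.
    """
    total_size = sum(sizes)
    n = len(indices)
    return sum(values[i] / (sizes[i] / total_size) for i in indices) / n

# Hypothetical farms: area in hectares, and yield exactly 2 t per hectare.
areas = [1, 4, 10, 25]
yields = [2, 8, 20, 50]
chosen = pps_sample_with_replacement(areas, n=3, seed=7)
estimated_total = hansen_hurwitz_total(chosen, areas, yields)
print(f"sampled farms {chosen}, estimated total yield {estimated_total:.1f}")
```

The yields here are deliberately made exactly proportional to area, in which case the estimate reproduces the true total of 80 whichever farms are drawn. Real data will not be so tidy, but the closer the size measure tracks the quantity of interest, the more stable the estimate, which is why the choice of "size" matters.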
6. Statistical Aspects of Running a Survey
6.1 Pilot studies
When surveys fail, it is often because the pilot study phase was omitted, or given too little time or attention. Consider piloting the survey a number of times so that changes are themselves pilot-tested, and so that staff are familiarised with different settings. Pilot studies provide many opportunities e.g. to see how long an interview takes, to check on the viability of the questionnaire as a "script", to assess and to contribute to the training of interviewers. They provide a framework and a little data for working out what analyses will be interesting, and for making sure you are not collecting data that you won't finally analyse and report.
The main purpose of pilot testing is to learn how informants respond to the survey. Look for any clues that there are sensitivities, or difficulties with willingness to give truthful information, remembering items, understanding concepts and words, or carrying out tasks like mental arithmetic. Cut down the information demand if your schedule induces fatigue, boredom or restiveness.
6.2 Non-response
Survey non-response arises when a targeted respondent fails to cooperate with the survey. If this means he is an unusually busy person, is especially secretive, or has a different relationship than other people with the interviewer, then it is hard to work out how he would have responded. If there is a high proportion of non-respondents, a worrying question arises as to whether the remaining sample represents the intended population. Replacing each non-respondent with someone more compliant will not answer this question.
Survey procedures should minimise non-response, e.g. by accommodating busy informants (short questionnaires, flexible times for interview, appointments made in advance), by winning the confidence of the reticent (transparent purpose, confidentiality guarantees), or by careful selection and training of interviewers (subject matter knowledge, ability to portray survey objectives honestly but positively, ability to work objectively).
When some sector of the population clearly cannot be surveyed effectively, it may be best to acknowledge this at the start and redefine the target population which the survey will claim to represent.
6.3 Quality in survey data entry
General considerations of good practice in data management apply here. Some particular features of survey data need special consideration; for example multiple dichotomy and multiple response questions, and ranking data, require careful thought.
Even if the survey is not an in-depth case-study, the human informant, often in conjunction with an interviewer, should have provided data that make up a consistent individual profile. The most effective phase of survey data checking may be when the completed questionnaire is considered holistically by a thoughtful reader.
Frequently survey datasets are so large that data entry is mind-numbing, and it is worthwhile to set up data entry screens, simulating the questionnaire, and to build in checks, so as to avoid entries in the wrong field, to catch out-of-range numbers due to mis-keying, and to match skip instructions in the survey schedule. These procedures speed up data entry, and improve quality compared to the amateur approach of typing into a spreadsheet and assuming the numbers are accurate.
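The kinds of check that a data entry screen can build in are sketched below: a range check to catch mis-keyed numbers and a check that skip instructions have been followed. The field names, the permitted range and the skip rule are hypothetical and would come from the actual questionnaire.

```python
def check_record(record):
    """Return a list of problems found in one keyed questionnaire record."""
    problems = []

    # Range check: catches mis-keyed values such as 250 for 25.
    size = record.get("household_size")
    if not isinstance(size, int) or not 1 <= size <= 30:
        problems.append("household_size missing or out of range 1-30")

    # Coded answers must be one of the permitted values.
    owns = record.get("owns_livestock")
    if owns not in ("yes", "no"):
        problems.append("owns_livestock must be 'yes' or 'no'")

    # Skip instruction: cattle numbers are asked only of livestock owners.
    cattle = record.get("number_of_cattle")
    if owns == "no" and cattle is not None:
        problems.append("number_of_cattle entered although question was skipped")
    if owns == "yes" and cattle is None:
        problems.append("number_of_cattle missing for a livestock owner")

    return problems

good = {"household_size": 6, "owns_livestock": "yes", "number_of_cattle": 3}
bad = {"household_size": 250, "owns_livestock": "no", "number_of_cattle": 4}
print(check_record(good))
print(check_record(bad))
```

Checks of this sort catch values that are impossible, not values that are merely wrong, so they complement rather than replace verification by re-entry.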
Ensuring accuracy is best done by independent re-entry of all the data in a program with a verification facility. In principle, independent double data entry should produce two files where subtracting one from the other produces zeros everywhere when all errors have been fixed. This can be achieved with even quite rudimentary software.
It is worth setting aside three times as much time as required for single data-entry, so as to undertake data-entry, verification, and reconciliation of values that disagree in the two versions of the data file. Relative to the total cost of the survey, it is absurd to risk the quality of its outputs, by false economy or slack procedure at this stage.
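The comparison of two independently keyed versions of the data can be sketched as follows. The questionnaire identifiers and field names are hypothetical; the sketch simply lists every value that differs so that it can be reconciled against the paper form.

```python
def compare_entries(first, second):
    """List disagreements between two independent entries of the same data.

    first and second map a questionnaire id to a dict of field values.
    Each discrepancy is (questionnaire id, field, first value, second value);
    the field is None when a whole questionnaire is missing from one file.
    """
    discrepancies = []
    for qid in sorted(set(first) | set(second)):
        if qid not in first or qid not in second:
            discrepancies.append((qid, None, first.get(qid), second.get(qid)))
            continue
        a, b = first[qid], second[qid]
        for field in sorted(set(a) | set(b)):
            if a.get(field) != b.get(field):
                discrepancies.append((qid, field, a.get(field), b.get(field)))
    return discrepancies

# Hypothetical first and second keying of two questionnaires.
first = {
    "Q001": {"household_size": 6, "main_crop": "maize"},
    "Q002": {"household_size": 7, "main_crop": "rice"},
}
second = {
    "Q001": {"household_size": 6, "main_crop": "maize"},
    "Q002": {"household_size": 1, "main_crop": "rice"},
}
discrepancies = compare_entries(first, second)
print(discrepancies)
```

An empty list corresponds to the "zeros everywhere" condition described above. It shows that the two entries agree, not that both are right: the same misreading of an illegible form by both operators will pass unnoticed.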
Coding - primarily of open-form responses - involves developing a code system in the light of the returned forms, and this usually cannot be complete until most questionnaires have been scrutinised. This may mean revisiting early forms to add codes when certain write-in responses turn out to be common enough to merit computerisation. Achieving consistency requires concentrated thought and effort.
6.4 Survey data analysis
See our forthcoming booklets Informative Presentation of Tables, Graphs and Statistics and Modern Approaches to the Analysis of Survey Data, which apply here. Note that the proposed analysis of the survey has repercussions for its design e.g. sample size, and survey planning can only be fully effective if some forethought goes into analysis objectives and methods e.g. disaggregating data into three-way tables will reveal more than two-way tables, but only if there is enough data.
It is easy to produce cross-tabulations of survey data, but not so simple to produce sets of tables with consistent totals, patterns and percentages, if data are patchy because of item non-response. If the absence of the data is a clue that the missing value was an unusual one, perhaps embarrassing to the informant, any imputation procedure is dubious, and it may be necessary to omit the record for tabulations involving the missing item. Simple computations, e.g. working out a proportion of a proportion, can be misleading if these come from tables based on different subsets of the respondents.
The effective sample size is that net of missing items. If important cross-tabulations are filtered e.g. restricted to currently cohabiting female respondents, the number of qualifying sample members can soon become very low.
Sometimes a "dull" item like "number of employees" is the basis of very many cross-tabulations: losing that item renders the rest of the interview effort useless, which can imply that interviews should be aborted if it cannot be ascertained. Special effort will be needed to elicit such items with care, e.g. developing effective lines of questioning to account for part-time and seasonal employees.
Where overall estimates are derived from structured samples, some effort should be made to convey their accuracy, taking stratification, PPS, or multistage sampling effects into account. Where textbook formulae do not exist, methods based on replicated sampling or design effects may be useful.
Modern statistical procedures commonly involve log-linear modelling, rather than chi-squared tests, which are only very occasionally of much value. Increasingly multi-level modelling suited to hierarchical data is becoming available via major packages. Cluster analysis is a varied body of techniques suitable for dividing respondents into data-derived groups of similarly-responding informants. Interpreting clusters is easier if the responses derive from a well-balanced body of questions, and if somewhat distinct clusters truly exist in the sample. Making timely provision to utilise a competent statistician, or to get a team member trained to utilise the techniques, will ensure greater value can be derived from expensively collected data.
Most analysis procedures and results have to be treated with additional care if non-response affects the datasets analysed.
Often the planned sample is structured, e.g. farm businesses stratified by activity and size. When the achieved sample is differentially affected by non-response, it is informative at least to record and consider the success rates by stratum. It may be desirable to weight actual responses to reflect intended stratum sample sizes - always uncomfortable because the greatest weighting up is required in strata with the highest levels of non-response.
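Recording success rates by stratum, and the weights needed to restore the intended stratum sample sizes, can be sketched as below. The strata and counts are invented, and the weighting assumes that respondents within a stratum resemble the non-respondents, which is exactly the assumption that becomes least comfortable where non-response is highest.

```python
def nonresponse_report(planned, achieved):
    """Response rate and weighting-up factor for each stratum.

    planned and achieved map stratum name to the intended and the actual
    number of completed interviews.
    """
    report = {}
    for stratum, n_planned in planned.items():
        n_achieved = achieved.get(stratum, 0)
        rate = n_achieved / n_planned
        # No weight can rescue a stratum with no respondents at all.
        weight = n_planned / n_achieved if n_achieved else None
        report[stratum] = {"response_rate": rate, "weight": weight}
    return report

# Hypothetical farm business survey, stratified by size.
planned = {"small farms": 100, "large farms": 50}
achieved = {"small farms": 80, "large farms": 20}
report = nonresponse_report(planned, achieved)
for stratum, row in report.items():
    print(stratum, row)
```

In this made-up case each responding large farm would stand for two and a half planned interviews, which illustrates why heavy weighting up deserves explicit comment in the data quality report.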
Eliminating non-response is often impossible, but ignoring substantial levels of non-response, and reporting survey results as if the achieved sample were what was planned, should be evaluated as seriously inadequate performance. Often the effects of non-response have to be inferred from informal reports by interviewers, and clues from outside the survey. A good survey will have a data quality report, where data collection and handling procedures are carefully reported and appraised post hoc: this should include realistic guidance as to how non-response may for example have biased the results, or led to under-estimates of standard errors.
Last updated 23/04/03