Questionnaire Design

Measuring attitudes to domestic violence

Krohn et al. (2012) state that over the years many innovative developments have taken place in the field of Criminology, including the development of reliable and valid research measures. Self-report measures in particular have become commonplace. The Attitudes to Domestic Violence Questionnaire (ADV) is a 10-item measure that aims to capture young people’s normative beliefs about how wrong it is for a man to hit a woman, and for a woman to hit a man, under certain conditions. Given that theories of interpersonal aggression highlight the importance of normative beliefs in justifying such actions, it was deemed appropriate to assess attitudes to domestic violence in Phase 1 of the From Boys to Men project. The ADV questionnaire was developed by the From Boys to Men research team in close collaboration with the charitable organisation, Arch (North Staffordshire), in order to evaluate the effectiveness of a domestic abuse prevention education programme called Relationships without Fear (RwF). Inspiration for the questionnaire came from the Normative Beliefs about Aggression Scale (NOBAGS) developed for elementary age children in the US (Huesmann & Guerra, 1997). The NOBAGS has two main sections of questions, one that assesses general beliefs about aggression and another that examines retaliation beliefs about aggression between two children.

The ADV questionnaire

The ADV questionnaire presents five different situations, for example ‘Do you think it is OK for a man to hit his partner/wife if HE says he is sorry afterwards?’ The other four situations are: if the perpetrator has been embarrassed; has been cheated on; has been hit first; or believes the other person deserves it. For each situation where a man is being violent to a woman, there is a corresponding situation with a woman being violent to a man. A four-point scale invites a response to each situation (10 questions in total):

1 = it’s perfectly OK,    2 = it’s sort of OK,    3 = it’s sort of wrong,    4 = it’s really wrong

Depending on how the question is phrased, the response scale may be presented in reverse order. For those questions that begin ‘Do you think it is OK…’ the scale begins with ‘it’s perfectly OK’. Other questions that are phrased ‘Suppose [x happened] how wrong…’ have the response scale appearing in reverse order, i.e. from ‘it’s really wrong’ to ‘it’s perfectly OK’. This is to counter the tendency of participants to respond in the same way to each question without fully processing what they are being asked (known as ‘response bias’). It was decided to provide four response options, with no neutral mid-point, based on research which suggests that children are often drawn to a neutral mid-point when completing questionnaires. This is known as ‘satisficing’: children choose the easy option without fully working through their decision to select the most optimal response. The questionnaires used in the From Boys to Men research are available in Research Materials.

Scoring the questionnaire responses

The ADV questionnaire is scored so that a high mean score indicates beliefs that are more accepting of domestic violence. Once the questionnaires have been completed and the data entered into a database using computer software (for example, a statistical program such as SPSS) the items are recoded so that 1 = it’s really wrong and 4 = it’s perfectly OK, for all 10 items. Mean scores are then calculated for each person (by totalling their responses and dividing by the number of items). Researchers refer to a set of items such as this as a ‘scale’, with the responses to all of the items generating a final score for each participant – a total (sum) score or, as in this case, an average (mean) score. This can easily be computed using software such as Excel or SPSS.
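As a rough illustration of the scoring step described above, the recoding and mean calculation can be sketched in a few lines of Python. The function name is ours, and we assume the raw codes are entered as printed on the questionnaire (1 = ‘it’s perfectly OK’ to 4 = ‘it’s really wrong’); this is not the project’s actual analysis code.

```python
def score_adv(responses):
    """Score one participant's 10 ADV responses.

    `responses` holds the raw codes as printed on the questionnaire,
    where 1 = 'it's perfectly OK' and 4 = 'it's really wrong'.
    Each item is recoded as 5 - raw (flipping 1<->4 and 2<->3), so that
    1 = 'it's really wrong' and 4 = 'it's perfectly OK', then the mean
    across items is returned. A higher mean indicates beliefs more
    accepting of domestic violence.
    """
    recoded = [5 - r for r in responses]
    return sum(recoded) / len(recoded)
```

A participant who answers ‘it’s really wrong’ (raw code 4) to every item therefore obtains the minimum mean score of 1.0.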

Developing the ADV questionnaire

The ADV was originally piloted as a 20-item questionnaire. Ten situations were chosen through discussions with RwF facilitators about the kinds of situations in which young people tend to condone domestic violence, as well as drawing inspiration from previous studies (e.g. Burman and Cartmel, 2005; Burton et al., 1998). The starting point when developing a measurement scale is always to specify the nature of the concept. Often there are many features which need to be incorporated into the pool of items (Howitt & Cramer, 2011). Researchers then proceed to develop a pool of items to measure the variable of interest. Howitt and Cramer (2011) provide some very useful tips for writing questionnaire items (see p. 254 of their book ‘Introduction to Research Methods in Psychology’; see also the companion website: http://wps.pearsoned.co.uk/ema_uk_he_howitt_resmethpsy3/).

Piloting the questionnaire

Once a set of items has been developed, it needs to be tested on a sample of individuals. The ADV questionnaire was piloted in a study of 542 pupils from eleven primary and two secondary schools, all of whom undertook the RwF programme. All participants completed the ADV questionnaire at pre- and post-test. The reliability statistics were reviewed, as well as the facility indices, to identify items that could be deleted, improving the overall reliability of the four sub-scales. The facility index refers to an item’s mean score (and standard deviation) across respondents; an extreme score with little variation suggests that most respondents are agreeing (or disagreeing) with the item. ‘Internal reliability’ refers to the extent to which items are correlated with each other. Put another way, do people respond in the same way to items that are supposed to be measuring the same construct? Items can be dropped if they are not highly correlated with other items.
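The facility index is straightforward to compute, and internal reliability is commonly summarised with Cronbach’s alpha. The source does not name the exact statistic or software used, so the Python sketch below is purely illustrative of these two checks, not the project’s analysis code.

```python
from statistics import mean, pstdev, pvariance

def facility_index(item_scores):
    """Mean and standard deviation of one item across respondents.

    An extreme mean with a small SD flags an item that most respondents
    answer the same way - a candidate for deletion.
    """
    return mean(item_scores), pstdev(item_scores)

def cronbach_alpha(data):
    """Cronbach's alpha for `data`, a list of respondents, each a list
    of item scores.

    alpha = k/(k-1) * (1 - sum of item variances / variance of totals).
    Values around .7 and above are often treated as acceptable
    internal reliability.
    """
    k = len(data[0])
    items = list(zip(*data))                      # one tuple per item
    item_vars = sum(pvariance(col) for col in items)
    total_var = pvariance([sum(row) for row in data])
    return k / (k - 1) * (1 - item_vars / total_var)
```

When items move in lockstep across respondents, alpha approaches 1; items that barely correlate with the rest drag it down, which is the basis for dropping them.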

Following analysis of the pilot study data, various situations were removed from the original questionnaire: if a man/woman is drunk; if a man/woman is angry with his/her partner; if a man/woman loves his/her partner; if a man/woman gets on his/her partner’s nerves; if a man/woman shouts at his/her partner. These items were removed because they showed low variability in children’s responses, i.e. a large proportion of the children said the behaviour was wrong. The Flesch reading ease score for the 10 items is 82; the corresponding Flesch-Kincaid Grade Level is US grade 7 (12–13 year olds). For this reason the questionnaire should be used with caution with younger children.
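The readability figures quoted above come from the standard Flesch formulas, which depend only on counts of words, sentences and syllables. A minimal sketch of the two formulas (automatic syllable counting is the hard part in practice and is left out here):

```python
def flesch_reading_ease(words, sentences, syllables):
    """Flesch reading ease: higher scores mean easier text
    (scores in the 80s indicate fairly easy reading)."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)

def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid Grade Level: the US school grade whose pupils
    could typically read the text."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
```

For example, a passage of 100 words in 10 sentences with 130 syllables scores roughly 86.7 on reading ease, comfortably in the ‘easy’ band.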

The 10 items selected for the final ADV questionnaire showed an acceptable level of ‘internal reliability’. Furthermore, factor analysis, using principal components analysis, indicated a clear single factor structure. As noted by Howitt and Cramer (2011), a scale can be ‘unidimensional’ – reflecting a single underlying dimension, or ‘multidimensional’ – reflecting two or more underlying dimensions where there are distinct clusters or groups of items. Factor analysis is a statistical technique that examines the underlying structure of a test or questionnaire. It was possible that the ADV was a multidimensional scale – measuring two separate, albeit related constructs such as ‘attitudes to male-female violence’ and ‘attitudes to female-male violence’. However, the results of the factor analysis suggest the presence of a single construct (a unidimensional scale) which we can call ‘attitudes to domestic violence’.

Reliability of the ADV questionnaire

In the same way that a thermometer should provide a reliable assessment of a child’s temperature, we wanted our tool for measuring young people’s attitudes to domestic violence to provide a reliable assessment of that construct. Psychological concepts are not as precisely definable, but we can assess the reliability of a scale in terms of its ‘internal reliability’ and its ‘test-retest reliability’: if we expect a concept such as ‘children’s attitudes to domestic violence’ to be fairly stable over time, then we should expect the measurement of this variable to be relatively stable too. In a second study, the ADV questionnaire was administered to 112 pupils aged 13–15 years from two secondary schools on two occasions, two weeks apart, and the children’s scores from the two points of testing were then correlated. The test-retest correlation of .72 was deemed satisfactory, demonstrating an acceptable level of reliability over time.

In pilot work, there was an improvement in children’s attitudes from pre- to post-test and this change was statistically significant; this indicated that the ADV was fit for purpose in being able to identify subtle attitudinal changes that can otherwise be hard to detect. When evaluating domestic abuse prevention education programmes the challenge is how to capture children’s nuanced perceptions of domestic abuse. Most young people will state that hitting a partner is wrong, which can create a ‘floor effect’ whereby children’s scores in general are already quite low. The inclusion of situations in which young people might be willing to condone violence has enabled a more accurate assessment of their attitudes and the ability to detect subtle shifts in children’s attitudes towards stronger disapproval of violence.

It was decided to use ‘hitting’ and no other examples of abusive behaviours in order to develop a short questionnaire that could be easily understood by young people, completed quickly, and used by practitioners to evaluate the effectiveness of their own interventions. To include other types of abusive behaviours would have resulted in a lengthy questionnaire with many different sub-scales within it. When developing a new measure it is important to specify the construct that you intend to measure: the more tightly the construct is specified, the greater the chance of the measure having acceptable levels of ‘internal reliability’. On the other hand, the advantage of using a wide range of abusive behaviours is that the measure more accurately reflects the construct of interest – often referred to as ‘content validity’. Howitt and Cramer (2011) usefully state that when assessing content validity researchers need to ask, ‘Do the items of the scale cover the important characteristics of the concept being measured?’ (p. 273). Our measure could be criticised for being too narrowly focused. Thus, we would encourage other researchers to expand the ADV and develop it further by including different types of abusive behaviours, as well as different types of relationships (e.g. same-sex).

Previous measures have suffered from limitations in terms of low levels of internal consistency, length/utility, and/or a lack of data on the test-retest reliability of these scales. We would argue that the ADV questionnaire shows great promise for use in evaluations of domestic abuse prevention programmes, enabling those in the field to build up the evidence base in the UK and elsewhere. For further details of the development of the ADV see Fox, Gadd and Sim (2013).