Johnson, Sally A., 1995:

Gender, group identity and variation in the Berlin vernacular: a sociolinguistic study.

Bern: Peter Lang. 223pp.
Paperback, ISBN _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

> Reviewed by

James Milroy

Program in Linguistics, University of Michigan
Copyright Notice:

First published in Web Journal of Modern Language Linguistics in association with the publishers (to be announced). © 1996 James Milroy.

The moral rights of the author(s) to be identified as author(s) of this work are asserted in accordance with ss.77 and 78 of the Copyright, Designs and Patents Act 1988. This work may be reproduced without the consent of the author, in part or in whole in any manner and in any medium subject only to the two following conditions:
(a) no charge shall be made for the copy containing the work or the excerpt,
(b) a copy of this notice shall precede the work or the excerpt.

This book gives an account of a study of the usage of Berlin urban vernacular variants by a friendship group of young people - 13 men and 11 women. The core is Chapter 4, in which the results are presented, and it has the very great virtue of being highly accountable to the data, much more so than most published papers on sociolinguistics. The latter tend to present findings and relate these to some theoretical question, without giving a full account of the methodology used. One desirable consequence of this carefulness is that the reader of this book can judge for himself/ herself the quality of the argumentation that is based on the findings. The demonstration of the close relation between vernacular usage and group identity (or network) seems to me to be a spectacular confirmation of the importance of group identity, and I would judge this finding to be the most reliable in the study. However, this is not the key finding presented on the back cover, which is: "Though some differences emerge in the way men and women appear to use the B(erlin) U(rban) V(ernacular) in order to identify with the group, no differences are found in the actual quantity of dialectal variants used by men and women" (my italics). The last claim is not correct, and it is necessary to look at the statistical methods used in order to show why the author has felt able to come to such a conclusion.

In fact the results presented within the book show that there are differences between speakers and that many of these differences are differences between men and women. The question is whether the gender differences are statistically significant. Table 4.26 on page 184 shows that the very consistent gender-difference figures are likely to be significant (see below). On page 183, the author notes that the differences between men and women on particular variables are sometimes "fairly substantial, reaching over 15%". This is misleading: in relative terms the differences are usually much greater than this. The differences between the sexes are given in Table 4.26 by simply subtracting one percentage score from the other. Thus, the difference between 30.5 and 21.2, for example, is given as 9.3, although the ratio of one score to the other is more than 1.40; i.e., men use the vernacular variant over 40% more than women do. In several cases the relative difference is over 50%. On the "micro-variable" (s-3) a difference reported as 21.1% is actually over 60% in relative terms. The difference is in fact substantial, and as there are 427 tokens of (s-3) including 218 vernacular tokens, it is possible that a level of significance may be reached on these Ns. On macrovariables, which may have thousands of tokens, statistically significant results appear quite likely. However, the main reason why the author's conclusion is unsatisfactory is that for the 14 relevant variables, the results always go in the same direction: men score higher than women in every single case.
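The distinction between a gap in percentage points and a relative difference can be made concrete with the one pair of scores quoted above (30.5 and 21.2). The following is only a minimal sketch; the function names are invented for illustration and nothing here is taken from the book's own calculations.

```python
def percentage_point_gap(male_pct: float, female_pct: float) -> float:
    """Gap obtained by subtracting one percentage score from the other."""
    return male_pct - female_pct


def relative_excess(male_pct: float, female_pct: float) -> float:
    """How much more often (in %) men use the variant, relative to the women's rate."""
    return (male_pct / female_pct - 1.0) * 100.0


# The single pair of scores quoted in the review.
male, female = 30.5, 21.2
gap = percentage_point_gap(male, female)
excess = relative_excess(male, female)
print(f"{gap:.1f} percentage points; {excess:.1f}% relative excess")
```

The same two numbers thus yield a gap of under ten points by subtraction and a relative excess of over 40%, which is the contrast drawn in the paragraph above.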

As the null hypothesis would state that the male scores should be the same as the female, it is violated in every case, but violated in the same direction. The likelihood of this happening by chance is vanishingly low. If, for example, a coin were tossed fourteen times it is very unlikely indeed that it would come out heads every time. Thus, it is clear even without a formal statistical test that this is not likely to be random or accidental. In fact it can be subjected to a sign-test, which would establish overwhelming significance: the chances of this consistent pattern happening by accident over 14 variables are roughly 8,000 to 1 against. The findings establish as they stand that when males and females are grouped for counting purposes, the group scores are different in every case and that the male groups always score higher than females. What is interesting is why the author should choose to ignore such a clear and consistent pattern.
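The sign-test figure can be checked with a few lines of exact arithmetic. This is a sketch under the usual assumption that, if gender had no effect, each of the 14 variables would be equally likely to favour men or women, and that the variables are treated as independent. The function name is hypothetical. Reading "roughly 8,000 to 1" as the two-sided probability is an inference, since the review does not say which version is meant.

```python
from fractions import Fraction
from math import comb


def sign_test_p(n_positive: int, n_total: int, two_sided: bool = True) -> Fraction:
    """Exact sign-test probability under a fair-coin null hypothesis."""

    def upper_tail(k: int) -> Fraction:
        # P(at least k of n_total outcomes are "positive") when p = 1/2.
        count = sum(comb(n_total, i) for i in range(k, n_total + 1))
        return Fraction(count, 2 ** n_total)

    if not two_sided:
        return upper_tail(n_positive)
    # Two-sided: a split at least this lopsided in either direction.
    k = max(n_positive, n_total - n_positive)
    return min(Fraction(1), 2 * upper_tail(k))


# 14 variables, men higher on all 14.
p_one = sign_test_p(14, 14, two_sided=False)
p_two = sign_test_p(14, 14)
print(f"one-sided: 1 in {p_one.denominator}")
print(f"two-sided: 1 in {p_two.denominator}")
```

On either reading the probability is far below any conventional significance level.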

Johnson's conclusion that men and women are not different in favouring vernacular variants depends on the use of a particular statistical test on the data - the Mann-Whitney test. This test applies a rank score to each individual from highest to lowest in vernacular use. As there is considerable overlap of individual speakers according to gender, it is perhaps not surprising that the test should determine that the differences for male and female mean scores are not significant. What it shows (p. 189) is that the distribution in terms of rank order and gender could have happened by chance for each variable singly, and this conclusion can be quickly supported by glancing at the tables and graphs (these are very clearly set out). The scores for individual men and women are interwoven, with some females having higher vernacular scores than some males. The test does not take account of the Group Identity Score, and although it considers gender groupings it does not deal with the 14 vernacular scores for gender when considered simultaneously. The pattern revealed by it is familiar in gender groupings - and in social class groupings for that matter.
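How a rank-based comparison responds to interwoven individual scores can be illustrated with a small sketch. The scores below are invented purely for illustration and are not taken from the book. The function computes the Mann-Whitney U statistics by counting cross-group pairs, which is equivalent to the rank-sum formulation.

```python
def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b): the number of cross-group pairs 'won' by each group.

    A tie contributes one half to each group.
    """
    u_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(group_a) * len(group_b) - u_a
    return u_a, u_b


# HYPOTHETICAL percentage scores on one variable (not data from the study).
men = [62.0, 48.5, 35.0, 71.0, 29.5]
women = [55.0, 41.0, 33.0, 66.5, 20.0]

u_men, u_women = mann_whitney_u(men, women)
mean_men = sum(men) / len(men)
mean_women = sum(women) / len(women)
print(f"group means: men {mean_men:.1f}, women {mean_women:.1f}")
print(f"U (men) = {u_men}, U (women) = {u_women}, max possible = {len(men) * len(women)}")
```

In this invented example the male mean is higher, yet the two U values lie close to the midpoint of their possible range because the individual scores overlap. That is the kind of pattern in which a test on a single variable is unlikely to reach significance, even if the same direction of difference recurs across many variables.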

However, the Mann-Whitney test is, in statistical terms, not a very powerful test, as it deals with percentage scores rather than raw scores, and if individual scores are used in this way, it is quite likely that the differences will not be significant. Log-linear modelling would be more powerful and might determine significance here in some cases. The claim that gender differences are not significant depends on the kind of test used and the kind of assumptions made in the test. Given that the gender group figures always tend in the same direction, the Mann-Whitney tells us only that if individual means are used, they overlap in a pattern that does not conform consistently to sex of speaker. What the general findings seem to show, therefore, is that group identity (measured by a "Group Identity Score") accounts for more of the variance than gender does in this particular study, as high scores on vernacular usage seem to correspond better to high scores on group identity than to maleness. This is evident from the graphs. However, it does not follow that gender has no statistical effect.

Johnson's decision not to use statistical testing on the pattern for all 14 variables considered simultaneously seems to be related to the idea that such grouping for statistical purposes implies that males and females are thought to be "corporate groups" (p. 184 and elsewhere), and she is anxious to dispel any such illusion. However, no one to my knowledge has ever suggested that males and females form separate corporate groups in western societies, and a decision to group them for statistical purposes makes no judgment about their social roles or groupings. The quantificational stage of the research is blind to evaluations of this kind and makes no judgment about gender beyond the judgment that it is reasonable to classify people by sex. What actually happens in large speech community studies is that one independent speaker variable is found to account for more of the variation than another in some particular case. For example, in current work in Newcastle gender is found to be overwhelmingly statistically significant (more so than class) for two vowel variables, but in other cases class accounts for more of the variation than gender. No claim is made in such studies that men and women form "corporate" groups in social reality.

The study is based on a group of 24 persons who meet to play volleyball, and a Group Identity Score (GIS) is calculated on the basis of nine indicators. As always in such calculations it can be questioned whether the indicators are the most appropriate and whether they are correctly weighted. It is good to see that time-depth of acquaintance with other speakers is included, as this is very difficult to incorporate into network measures that (unlike this study) cover different age-groups. Everyone knows that time-depth is important. It is not clear, however, why level of education is included as a minus quantity (points are subtracted for high levels of education). If there is an assumption that high levels of education will encourage standardness of language (as there appears to be), then it is not valid to anticipate this effect on language when standard/ nonstandard language is what is to be measured. To exclude this indicator would have minor effects on the rank order of individual speakers. In general the indicators used are entirely reasonable, and future researchers will learn from them. The fieldwork method is clearly reported and is excellent: the investigator invited groups of people to meals and cooked for them - an ingenious way of guaranteeing that a reasonable amount of casual speech would be recorded.

The hypotheses tested are: (1) that there is a positive correlation between a high GIS and usage of the vernacular variant; (2) that there is a more significant correlation for men than for women between GIS and vernacular usage; and (3) that the vernacular variant is used significantly more frequently by men than by women (the null hypothesis is repeatedly referred to here as the "nil-hypothesis" - this is new to me). For hypothesis (3) the Mann-Whitney is used, and the first two hypotheses are tested by the Spearman Rank Order Correlation test. In all 14 cases (the fifteenth variable (g-3) can be dismissed as the vernacular variant occurs only twice in 640 tokens; thus, the statistical testing reported on pp. 161-2 is otiose), the rank order relation of GIS to the vernacular scores is highly significant or very highly significant. Thus, the first hypothesis is confirmed: group identity correlates closely with heavy vernacular usage. This, in my view, is the main substantive finding, and it is a useful and valuable finding that confirms the importance of solidary relationships in vernacular maintenance.
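For readers unfamiliar with rank-order correlation, the following sketch shows how a Spearman coefficient is obtained from paired scores. The Group Identity Scores and vernacular percentages used here are invented for illustration only and do not come from the study. The helper names are likewise hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 upward; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# HYPOTHETICAL data for six speakers (not from the study).
gis_scores = [12, 7, 15, 9, 4, 11]
vernacular_pct = [58.0, 40.5, 66.0, 31.0, 22.5, 49.0]

rho = spearman_rho(gis_scores, vernacular_pct)
print(f"Spearman rho = {rho:.3f}")
```

A coefficient near 1 means that speakers ranked high on group identity are also ranked high on vernacular usage. The test says nothing by itself about whether one gender uses more vernacular forms than the other.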

As for the second hypothesis, there is no reason to expect that it will be confirmed, as group identity is in principle independent of gender as a speaker variable. Thus, if the GIS indicators are not biased towards one gender, the hypothesis should not be confirmed. In many cases the males show a closer rank order correlation than the females, but in some, the females turn out to be closer. This does not prove anything about gender-based preference for vernacular forms, as the test is not based on gender difference but on the GIS rank order. Where one gender scores higher than another, it means only that the rank order within the gender group is closer to the rank order of the language scores, not necessarily that one gender uses more non-standard language than the other for social purposes. We know already that men use more vernacular forms on average than women: the results of this test do not contradict this finding, no matter how much it is submerged in the discussion. Within the groups, however, it could be shown that certain individuals affirm social identity by high vernacular score more than others do, and in some cases that even when female scores are lower, the vernacular variants have a more positive "network" meaning for certain women than for the higher-scoring men.

The linguistic variables quantified in this study aes, but not normally in the USA. Following Johnston (1983) I have used the term "divergent dialect communities" for these, and they are common in the northern British Isles. The Berlin situation is one in which Low German variants remain in casual usage as High German variants gradually progress. Historically, such communities maintain residues of a strong dialect contact situation in an urban environment, and one of the alternants is normally recessive but extremely salient. The (pull) alternation in Belfast is one such case.

I have discussed the possible social functions of these alternants in J Milroy (1992a: 98-100; 1992b: 154-57) in terms of a "maintenance/ change" model of language change. It is suggested that they persist as long as they are functional as markers of personal closeness and recede to the extent that in-group members contract weak ties outside the immediate community. Interactionally, they can be linked up with accommodation theory and politeness theory. Johnson's work is particularly welcome for the careful analysis of these vernacular alternations. Her variables are often broken down into micro-variables that cover one single frequently occurring word. The variable (s), for example, is broken down into four sub-variables, three of which quantify the results on a single word (es, das, was alternating with et, dat, wat). When the numbers are sufficient it is very important that this should be done, as it is possible that conflation of the results may conceal very different trends in the different items involved. The idea that the micro-variables are sub-categories of a superordinate "variable" must, however, be used with care. If the variance shown in different environments or in different lexical items is substantially different in linguistic terms, we are actually dealing with separate variables, and the investigator has to make a decision on delimiting these separate variables early in the research. In historical terms, of course, these variants are near the end of an S-curve of socio-lexical diffusion - and all the more interesting for that.

As for the rest of the book, there is a useful background to Berlin vernacular, with an assessment of the work of Dittmar and Schlobinski, and a discussion in Chapter 2 of social network, largely in terms of the deficiencies of social network measurements that smuggle in assumptions about gender. My colleague (L Milroy 1987) has been criticised for using an indicator of network strength based on having "the same place of work as to others of the same sex", and this may have some force in it, but some of the other criticisms are odd. The Belfast research project was not comparable to the present study as it was a large speech community study and not a bounded group study, and the indicators used had to be general - for example "voluntary association with workmates in leisure hours". The idea that this is not equitable as between men and women on the grounds that women have less leisure time is disputable (p. 73) and may depend on social status (amongst other things). As most of the men in our Belfast sample were unemployed and therefore did not have workmates (more of the women were in employment), we may equally well be accused of discriminating against men. In fact, many men had low scores as a result of being unemployed. In principle the measure is gender-neutral. Johnson's study differs in that she is studying a bounded group and not a series of representative open-ended communities. It is therefore possible to ascertain whether people go to the same pub once a week and so on. In a large speech community study, the constraints of time and expense require that we use more generalised indicators of social network strength. But what I am in general agreement about is that the socialising patterns of the two sexes are clearly different in ways that might affect the comparability of scores for social network or some similar variable. Women are believed to contract friendships that are qualitatively different from male friendships. There may be fewer of these friendships, and they may last for a longer time and be more intense and socially supportive. How these are to be measured is the first question. How they are to be used statistically as network measures when they make assumptions about gender difference is the second. I agree that there is often a problem with network indicators, largely because network is not a simplex variable that can be directly observed in the way that sex difference can be.

There is a tendency in this book for the author to appear to attack deficiencies in earlier work without apparently being aware that her own work may also have deficiencies and that some of her claims are disputable. Some of the claims in early chapters actually seem to be contradictory. There are comments in the last chapter to the effect that much previous work may be unreliable because it has not used the correct statistical tests. It is by no means clear that she has used all the correct tests herself in view of her interest in gender. I don't know if she is actually implying that there is really no difference in general in male and female use of vernacular variants. So many studies have shown that there is a consistent difference, using statistics to confirm their findings, that it cannot be true in general that there is no difference. Sometimes gender accounts for more of the variance than other social variables do, and sometimes, as in this case, another variable is more important. I come away from this book with the feeling that the author believes that statistical tests have actually explained something, when they are not designed to explain, but to assess the degree of confidence we can have in our results. As the figures reported here on male/ female differences are substantial and move in a consistent direction for 14 variables, she has actually failed to state clearly a highly significant finding that is in her own data. What seems to emerge from repeated patterns on the graphs is that in this study the network (GIS) variable accounts for more of the variation than the gender variable, but the reader has a right to know why the author claims that there is NO gender difference when the male group scores are always higher than the female. The study is of much more value for its treatment of group identity than its treatment of gender, and we cannot generalise from it that gender is unimportant.

Another point that must be made is that the author does not seem to be aware of quite recent work on gender in social dialectology (such as Horvath 1985, L Milroy 1992, Milroy and Milroy 1993), which has made a case for gender as an important independent social variable that plays a part in linguistic change. It may also be helpful to her to consider Milroy and Milroy (1985) and J Milroy (1992a) on the relevance of social network to linguistic change. There are also some questions about the presentation of statistical results: the significance levels should have "more than" or "less than" arrows attached, and in many cases, the exact probability should be reported - not the significance level. Amongst other things, the reader should be able to see whether a non-significant result comes close to the conventional 0.05 level (for all we know it might be 0.051 and therefore not significantly different from 0.05). Similarly, if one significance level is <0.01 and another <0.05, it does not follow that the difference between the two significance levels is itself significant. More care should be taken in statistical work if we are to agree with the author's claims about her findings and on the importance of using statistics.

I have had to be rather critical of this book, and I regret this because it has so many virtues - chiefly the finding on the importance of group identity. The fieldwork method and the linguistic findings are extremely clearly presented. It is a model study in some respects, and we need many more of them. Ironically, it is this clarity that enables a reviewer to ask awkward questions about the claims made on the findings. This book will certainly make an impact and will be much debated.


I am grateful to David Walshaw for his help and advice on statistical matters.


Horvath, Barbara, 1985. Variation in Australian English. Cambridge: Cambridge University Press.

Johnston, Paul, 1983. Irregular style variation patterns in Edinburgh speech. Scottish Language 2, 1-19.

Milroy, James, 1992a. Linguistic variation and change. Oxford: Blackwell.

Milroy, James, 1992b. Social network and prestige arguments in sociolinguistics. In: Bolton, K and H Kwok, Sociolinguistics today: international perspectives. London: Routledge, 146-62.

Milroy, James, forthcoming. A current change in British English: variation in (th) in Derby. Newcastle and Durham Working Papers in Linguistics 3.

Milroy, James and Lesley Milroy, 1985. Linguistic change, social network and speaker innovation. Journal of Linguistics 21, 339-84.

Milroy, James and Lesley Milroy, 1993. Mechanisms of change in urban dialects: the role of class, social network and gender. International Journal of Applied Linguistics 3.1, 57-77.

Milroy, Lesley, [1980] 1987. Language and social networks, 2nd ed. Oxford: Blackwell.

Milroy, Lesley, 1992. New perspectives in the analysis of sex differentiation in language. In: Bolton, K and H Kwok, Sociolinguistics today: international perspectives. London: Routledge, 163-79.
