Florien J. Koopmans-van Beinum*, Caroline E. Schwippert* & Cecile T.L. Kuijpers**

* University of Amsterdam, Institute of Phonetic Sciences/IFOTT, The Netherlands
** Catholic University of Nijmegen, IWTS, The Netherlands


This paper concentrates on a small but essential part of a large national Dutch research program on developmental dyslexia, namely the development of auditory test material for experiments with children from birth onwards. Since it is likely that the basis of any phoneme awareness in children is already laid in their first year of life, it is of great importance to follow the perceptual development of children at risk of dyslexia from birth onwards. In order to investigate the nature and origin of the perceptual deficit that dyslexic people are afflicted with, a number of auditory tests have been designed. The present paper describes the various steps in the development of the definitive set of auditory tests to be used in the actual research program, the rationale underlying these steps, and the first results of the pilot tests.


Within the framework of a national Dutch research program ("Identifying the Core Features of Developmental Dyslexia: A Multidisciplinary Approach"), 300 Dutch children born with a genetic risk for dyslexia and 120 control children will be tested longitudinally for ten years from birth onwards with respect to a number of medical, neurophysiological, visual, auditory, and linguistic aspects. Since no auditory perception tests are available that are suitable for very young Dutch children and that can be used (possibly in adapted form) throughout the whole ten-year period for the children as well as for adults, we are faced with the need to develop a set of tests to measure the auditory sensitivity of the subjects to speechlike stimuli.
The literature on the possible deficiency in auditory perception in dyslexic children points in the direction of less distinct phoneme boundaries: classification as well as discrimination tests show lower consistency in dyslexic than in control children [3, 4, 6]. This lower consistency is especially obvious in (synthetic) speech sound continua for place of articulation, where categorical perception can normally be demonstrated. Results so far have been interpreted quite differently: on the one hand the phenomenon is explained as a general deficit in auditory temporal perception [5], on the other hand as being phonetic in origin for those contrasts that are acoustically similar [2]. The latter authors showed that differences between good and poor readers disappeared for syllables that can be discriminated more easily.
In order to link the development of our auditory tests with this ongoing discussion, we designed sets of stimuli that had to meet a number of requirements:
- based on natural speech
- spoken by a female speaker
- three continua: `basic': /bA-dA/, `difficult': /bA-wA/, `easy': /bA-zA/
- by means of some kind of acoustical interpolation method
- with equal duration of all stimuli
- with varying interstimulus intervals (ISI)
- to be used in a classification test
- to be used in a discrimination test (AX-paradigm)
- combined with reaction time measurements
- supplemented with ERP measurements
- to be tested on adults first, but to be adapted for infants and young children at a later stage.


2.1. Stimulus materials

Choice of stimulus material. Natural speech of a female speaker was considered to be the best starting point for constructing the stimuli, since the stimuli will ultimately be used in tests with infants and children and have to resemble everyday speech as much as possible. Therefore speech material of an adult female speaker was recorded on a DAT recorder in an anechoic room. She pronounced a series of one-syllable words, among which the target words /bAk/, /dAk/, /zAk/, and /wAk/, all being normal Dutch words, meaning `tray', `roof', `bag', and `hole in the ice', respectively. Our selection of the three continua that might differ in ease of discrimination was based on results of Dutch consonant confusion matrices. The selection of acceptable interpolation methods was one of the main subjects of our study and is described in detail below. Since we wanted to avoid durational differences in the stimuli being confounded with difficulties in temporal auditory processing, we preferred all stimuli to have equal duration. By varying the interstimulus interval (ISI) and by measuring reaction times we expected to obtain information on temporal processing.
The /bAk/-/dAk/ continuum. The acoustical analysis data of the words /bAk/ and /dAk/ were our starting point for further manipulations. The original F2 onset values of the transitions were about 1100 and 1800 Hz, respectively, whereas F3 did not show much difference. The F0 range was between 125 and 235 Hz, mean F0 being 165 Hz in both words. The realization of the word /bAk/ was selected as the starting stimulus signal for the /bAk-dAk/ continuum. It was manipulated with the Praat software package [1]. The signal was down-sampled to 11025 Hz so that it could be analysed with Linear Predictive Coding (LPC). The LPC analysis was done with 10 linear prediction parameters, a window width of 25 ms, a time step of 5 ms, and a pre-emphasis frequency of 50 Hz. F2 was interpolated in such a way that the result was always a linear transition. The manipulated part of the signal was a 100 ms interval at the beginning of the vowel. The number of interpolation steps was 10 and the F2 onset ranged from 1100 to 1800 Hz. The occlusion period of [k] had to be reduced by 40 ms, to 55 ms, to obtain equal stimulus duration for all three continua. This did not influence the quality of the signal. The interpolation and additional processing resulted in a 10-point continuum, the total length of each item being 600 ms, consisting of (a) a vocal murmur 170 ms; (b) burst 10 ms; (c) vowel [A] 250 ms, divided into 100 ms transition duration and 150 ms steady state; (d) occlusion period [k] (silence) 55 ms; (e) release [k] 115 ms. The [bAk] starting point was characterized by a constant level of the F2 at 1100 Hz. For each intermediate signal, the F2 onset was situated at a gradually higher frequency, making the fall of the transition steeper in every step. At the [d]-endpoint of the continuum the transition fell from 1800 Hz to 1100 Hz.
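The F2 manipulation described above can be illustrated with a minimal sketch. It is not the Praat script used in the study: the function and constant names are hypothetical, and it assumes that the ten F2 onset values are equally spaced between 1100 and 1800 Hz and that each transition is sampled at the 5 ms analysis time step.

```python
# Illustrative sketch only; names are hypothetical, not taken from the study's Praat scripts.
F2_STEADY_HZ = 1100.0     # F2 of the steady-state vowel (flat F2 at the [b] endpoint)
F2_ONSET_MAX_HZ = 1800.0  # F2 onset at the [d] endpoint
N_STIMULI = 10            # number of stimuli in the continuum
TRANSITION_MS = 100.0     # duration of the manipulated interval
FRAME_MS = 5.0            # assumed sampling of the track (the LPC time step)


def f2_onsets(n=N_STIMULI, lo=F2_STEADY_HZ, hi=F2_ONSET_MAX_HZ):
    """Equally spaced F2 onset values from the [b] to the [d] endpoint (assumption)."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def f2_track(onset_hz, target_hz=F2_STEADY_HZ, dur_ms=TRANSITION_MS, frame_ms=FRAME_MS):
    """Linear F2 trajectory from the onset value to the steady-state value."""
    n_frames = int(round(dur_ms / frame_ms))
    return [onset_hz + (target_hz - onset_hz) * k / n_frames for k in range(n_frames + 1)]


onsets = f2_onsets()
tracks = [f2_track(onset) for onset in onsets]
```

Under these assumptions the step between successive onsets is 700/9 Hz, i.e. slightly under 80 Hz, and the first track is flat at 1100 Hz while the last one falls from 1800 Hz to 1100 Hz over 100 ms.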
The /bAk/-/zAk/ continuum. The perceptual contrast between the Dutch phonemes /b/ and /z/ is very large: they are seldom confused. One might even question whether one can speak of an actual /b/-/z/ continuum. Differences in place and manner of articulation, combined with a completely different temporal structure of the phoneme, seem to be difficult to overcome. First of all, the same formant manipulation was performed as for the /bAk/-/dAk/ continuum, since [b] and [z] show the same F2 contrast as [b] and [d], due to a comparable difference in place of articulation. Secondly, interpolation was necessary between the burst of the [b] and the voiced fricative noise of [z]. We therefore omitted the first part (the vocal murmur, the burst and 2 vocalic periods) and replaced it by the 170 ms fricative part of a /zAk/ utterance, at a very low intensity at the /bAk/-endpoint and at normal intensity near the /zAk/-endpoint. The intermediate forms of the fricative part were made by increasing the intensity by 3 dB in every step going from /bAk/ to /zAk/. The only stimulus without any fricative noise added was the /bAk/-endpoint; this was necessary to obtain a high quality realization of /bAk/.
In summary, two parameters were manipulated going from /bAk/ to /zAk/ in ten steps: in each step the F2 transition onset increased by approximately 80 Hz and the fricative noise intensity doubled. This resulted in signals with a total length of 600 ms, consisting of (a) a fricative noise 170 ms; (b) vowel [A] 240 ms, divided into 90 ms transition duration and 150 ms steady state; (c) occlusion period [k] (silence) 75 ms; (d) release [k] 115 ms.
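The relation between a 3 dB step and a doubling of intensity can be made explicit with a small sketch. The helper names are hypothetical. The level scheme in `noise_gain_db` is only one possible reading of the description: it assumes that stimulus 1 (the /bAk/-endpoint) carries no noise and that stimulus 10 is presented at the original level (0 dB).

```python
STEP_DB = 3.0  # intensity increase per continuum step, as described in the text


def intensity_ratio(db):
    """Ratio of intensities (power) corresponding to a level difference in dB."""
    return 10 ** (db / 10)


def amplitude_ratio(db):
    """Ratio of waveform amplitudes corresponding to a level difference in dB."""
    return 10 ** (db / 20)


def noise_gain_db(stimulus, n_stimuli=10, step_db=STEP_DB):
    """Noise level relative to the /zAk/ endpoint; None means no noise added.

    Assumption: stimulus 1 has no noise and stimulus n_stimuli is at 0 dB.
    """
    if stimulus == 1:
        return None
    return -(n_stimuli - stimulus) * step_db
```

A 3 dB step corresponds to an intensity ratio of about 2 and an amplitude ratio of about 1.41, so in practice each waveform would be scaled by roughly 1.41 per step.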
The /bAk/-/wAk/ continuum. It is important to note that the Dutch contrast /b/-/w/ is different from the English one. In Dutch the initial /w/ is a labio-dental approximant, whereas it is a bilabial approximant in English. Therefore an F1 manipulation can bridge the difference between these two phonemes in English, which is not the case in Dutch. An analysis of our speaker's utterances /bAk/ and /wAk/ revealed that changing the F1 frequencies could not lead to the desired results. To construct a Dutch /b/-/w/ continuum, a different approach was needed. Cutting off increasingly large parts of the [ʋ] did produce some of the desired effect, but did not lead to a very acceptable /bAk/ realization. Furthermore, this strategy would result in a continuum in which each item would have a different length, which we wanted to avoid. We then adopted the following method: the approximant was cut off from the vowel at point "A" in the waveform and was replaced by the vocal murmur and burst of the /bAk/ utterance, which resulted in a very natural /bAk/ realization. The procedure for creating the intermediate signals was to shift point "A" increasingly to the left, first deleting the burst and inserting a period of the original [ʋ] signal, then gradually replacing every period of the vocal murmur of [b] by a period of the original waveform of [ʋ]. This procedure yielded 16 steps. Since we only needed 10, we omitted the four items just before the last one at the [ʋ]-endpoint, for they were all clear [ʋ] items, and two items at the [b]-endpoint for the same reason. The 10 resulting items formed a very smooth continuum from [b] to [ʋ]. To make the signals as long as those of the other continua, some silence was added to the occlusion period, resulting for each signal in a total length of 600 ms, consisting of (a) a voiced labial consonant 180 ms; (b) vowel [A] 250 ms; (c) occlusion period [k] (silence) 55 ms; (d) release [k] 115 ms.
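As a simple consistency check, the segment durations reported for the three continua can be summed to confirm the common stimulus length of 600 ms. The dictionary layout and labels below are merely a convenient, hypothetical representation of the values given in the text.

```python
# Segment durations in ms, copied from the descriptions of the three continua.
SEGMENTS_MS = {
    "bAk-dAk": {"vocal murmur": 170, "burst": 10, "vowel": 250,
                "occlusion [k]": 55, "release [k]": 115},
    "bAk-zAk": {"fricative noise": 170, "vowel": 240,
                "occlusion [k]": 75, "release [k]": 115},
    "bAk-wAk": {"voiced labial consonant": 180, "vowel": 250,
                "occlusion [k]": 55, "release [k]": 115},
}

# Total duration per continuum; each should equal the 600 ms stated in the text.
totals = {name: sum(parts.values()) for name, parts in SEGMENTS_MS.items()}
```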

2.2. Perceptual tests

Design. Apart from a screening test for dyslexia, the experimental design consisted of a discrimination test and a classification test. In the discrimination task two ISIs were used (25 and 400 ms). The three continua consisted of 7 stimulus pairs each. The classification test consisted of 10 stimuli for each of the three continua with an intertrial interval of 1500 ms. Details of both tasks are given below.
Subjects. Twelve adult dyslexic subjects (8 male and 4 female) and twelve adult control subjects (4 male and 8 female) participated in the listening tests. Except for 3 dyslexics all subjects were students; they were paid for their participation. None of the subjects reported auditory problems.
All subjects were administered a battery of tests representing a subset of reading- and spelling-relevant skills. Single-word reading was measured by two standardized Dutch reading tests (EMT and DMT). These tests consist of cards with real words of increasing difficulty. Both appeal to the lexical decoding skill. Phonological decoding skill was measured by two pseudoword reading tests: a standardized pseudoword reading test based on the EMT, and a pseudoword reading test constructed for this purpose, based on the DMT. The measures of single-word spelling were a dictation of 72 isolated real words, and a dictation of 72 isolated pseudowords having the same CV structure as the real words. Phonological awareness was tested by means of a nonword repetition task (40 items). Rapid automatized naming was measured by color, letter, digit, and object naming tests. Eight of the dyslexic subjects performed these tests very poorly, the remaining four performed moderately; the control subjects had no problems.
Experimental procedure. The screening test took three quarters of an hour, the perception test about one hour. Subjects began either with the screening test or with the perception tests. The perception tests were conducted in a room with three sound-attenuated booths. Each booth contained a computer screen, headphones, and a panel of buttons that the subjects had to press to indicate their responses. The buttons were labeled with words and with a corresponding picture or symbol, to avoid lexical confusions. The control panels for the experimental sessions were situated next to the booths. The experimental system NESU was used for real-time stimulus presentation and reaction time registration. The stimuli were presented through headphones. The computer screen in the booth was only used to indicate the beginning of a new block, a time-out, or the end of the experiment. The experimenter could watch the answers given by the subjects, and their reaction times. Each session began with a same-different discrimination task followed by a forced choice classification task. For both discrimination and classification the three continua were presented separately. The order of presentation of the continua was balanced across subjects.
Discrimination. The discrimination task required subjects to discriminate between two stimuli that were always three continuum steps apart (e.g. 1-4, 2-5, 3-6, etc.). The 7 stimulus pairs were presented 12 times. The internal order of a stimulus pair was balanced. In addition to the different trials, ten same trials (1-1, 2-2, etc.) were presented, each pair twice. The stimulus pairs were presented in four blocks of 52 stimuli (6x7 + 1x10), the interstimulus interval (ISI) remaining constant within one block. The first and third block always contained the stimulus pairs separated by a 25 ms ISI; in the second and fourth block the stimuli were separated by a 400 ms ISI. So there was a short-long-short-long ISI pattern for each continuum. Within blocks stimuli were randomized. The task was preceded by 24 practice stimuli. These were always items of the continuum that was presented first and their ISI was short, as in the first block. No direct feedback was given. Subjects were instructed to listen to the two words presented and to determine whether they sounded "the same" or "different". They were urged to react as accurately and as quickly as possible by pushing the corresponding button. In cases where a subject's reaction times were systematically above 1000 ms, the experimenter intervened to make the subject aware of the need to react as quickly as possible. The discrimination task took 45 minutes.
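The block structure of the discrimination task for one continuum can be sketched as follows. This is an illustration, not the NESU configuration used in the experiment; all names are hypothetical. It assumes that the 12 presentations of each different pair and the 2 presentations of each same pair are counted per ISI (which agrees with four blocks of 6x7 + 1x10 = 52 trials), and that order balancing means three presentations in each order per block.

```python
import random

N_STIMULI = 10                      # stimuli per continuum
STEP = 3                            # distance between the members of a different pair
ISI_PATTERN_MS = [25, 400, 25, 400] # short-long-short-long block pattern


def make_block(isi_ms, rng):
    """One block of 52 trials: 7 different pairs x 6 presentations + 10 same pairs."""
    trials = []
    for first in range(1, N_STIMULI - STEP + 1):  # pairs 1-4 up to 7-10
        second = first + STEP
        # Assumed balancing: 3 presentations in each internal order per block.
        trials += [(first, second, isi_ms)] * 3 + [(second, first, isi_ms)] * 3
    trials += [(s, s, isi_ms) for s in range(1, N_STIMULI + 1)]
    rng.shuffle(trials)  # randomization within the block
    return trials


def make_session(seed=0):
    """Four blocks for one continuum, following the fixed ISI pattern."""
    rng = random.Random(seed)
    return [make_block(isi, rng) for isi in ISI_PATTERN_MS]


blocks = make_session()
```

With this layout each different pair occurs 12 times per ISI and each same pair twice per ISI, giving 208 trials per continuum.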
Classification. Subjects were instructed to classify a stimulus as being "bak" or "dak", "bak" or "zak", "bak" or "wak", respectively. Again neither training nor feedback was given, but 15 practice trials were presented to accustom the subjects to the task. Each stimulus was presented 12 times; the 120 stimuli of each continuum were randomized, not allowing more than 3 identical stimuli in a row. Subjects were instructed to label the words as quickly and as accurately as possible. Again, interventions were occasionally made when subjects responded too slowly (>800 ms).
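A constrained randomization of this kind can be obtained by simple rejection sampling: the list is reshuffled until no stimulus occurs more than three times in a row. The sketch below is one possible way to do this, with hypothetical names; it is not a description of how the original stimulus lists were actually generated.

```python
import random


def max_run(seq):
    """Length of the longest run of identical consecutive items in seq."""
    longest = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    return longest


def constrained_order(n_stimuli=10, n_reps=12, max_identical=3, seed=0, max_tries=10000):
    """Shuffle n_stimuli x n_reps trials until no run exceeds max_identical."""
    rng = random.Random(seed)
    trials = [s for s in range(1, n_stimuli + 1) for _ in range(n_reps)]
    for _ in range(max_tries):
        rng.shuffle(trials)
        if max_run(trials) <= max_identical:
            return list(trials)
    raise RuntimeError("no valid order found within max_tries")


order = constrained_order()
```

Since runs of four identical stimuli are rare in a random order of 120 trials, most shuffles are accepted on the first or second attempt.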


3.1 Discrimination

Analysis. Since an ANOVA on all data revealed significant interactions between subject groups and continua, as could be expected because of the large differences in character of the three continua, we conducted an ANOVA on each of the three continua separately, across subject groups (control and dyslexic), ISI (25 and 400 ms), and stimulus pairs (seven 3-step pairs) with the dependent measure being the same-different discrimination scores. Significant main effects (p<.001) were found for subject groups (dyslexics performing worse) and for stimulus pairs (as could be expected), whereas the main effect of ISI was moderately significant (p<.05).

Figure 1: Mean discrimination functions for three continua with short and long ISI for dyslexic and control subjects.

The interaction between ISI and subject groups, however, was not significant, indicating that for both controls and dyslexics a longer ISI caused similarly better discrimination. A preliminary analysis of reaction times confirmed group differences, dyslexics reacting approximately 150 ms slower. In Fig. 1 mean discrimination functions for all continua with short and long ISI are presented for dyslexics and controls.

3.2 Classification

Analysis. An ANOVA on each of the three continua separately, across subject groups (control and dyslexic) and stimuli (10 in each continuum), with the dependent measure being the forced choice classification scores, revealed significant main effects (p<.001) for both subject groups (again dyslexics performing worse) and stimuli (as could be expected). In Fig. 2 the mean classification functions are presented for each of the three continua and both subject groups. If we compare the functions of the three continua, it is obvious that dyslexics and controls are more similar for the /bAk/-/dAk/ than for the /bAk/-/wAk/ continuum, which was to be expected if our assumption concerning the degree of difficulty of the continua was correct. As for the /bAk/-/zAk/ continuum, it turned out that the presence or absence of any noise part caused a phoneme boundary effect for the control group as well as for half of the dyslexic group. The other half situated the phoneme boundary in the middle of the continuum. This observation was confirmed by a post hoc analysis.

Figure 2: Mean classification functions for three continua for dyslexic and control subjects.


The main conclusions from the present experiment can be summarized as follows. Constructing various Dutch speech continua from natural speech by interpolating one or two speech parameters turns out to yield quite satisfactory stimuli. With some small adaptations they are suitable for further use in the dyslexia project. Although the results on the /bAk/-/zAk/ continuum raised questions, it could be demonstrated in our pilot experiments that dyslexic and control subjects behave differently on all three continua in discrimination as well as in classification tests. Longer ISIs turn out to be equally beneficial for both subject groups. It still has to be studied whether other ISI values might differentiate the two subject groups. Although the subjects in our pilot were all adults selected roughly from existing subject pools, and the importance of individual behaviour has to be studied in more detail, our results are promising within the scope of the Dutch dyslexia project.