Missing phonemes are perceptually restored but differently by native and non-native listeners

The perceptual similarity of the ‘added’ phoneme (phoneme + noise) to the original sound, and the perceptual similarity of the ‘replaced’ phoneme (noise only) to the original sound, were evaluated on an eight-point scale (8: very similar, 1: not similar). A subject-wise analysis was performed to examine differences between native and non-native speakers. Table 1 shows the mean similarity scores of English native speakers (NS) (N = 30) and Japanese native speakers who spoke English as a second language (NNS) (N = 30). The mean similarity score was computed for eight different conditions: 2 noise conditions (added vs. replaced) × 2 phonemic conditions (nasal vs. liquid) × 2 lexical conditions (word vs. non-word). For each condition, a mean was first computed for each subject by averaging that subject's scores over the thirty trials, and the grand mean was then computed across subjects. The marginal means were also computed and are displayed in Table 1.
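The two-step averaging described above (a mean per subject, then a grand mean across subjects) can be sketched as follows. This is a minimal illustration, not the authors' analysis script: the function name, the subject labels, and the ratings are hypothetical, and only three trials per subject are shown instead of thirty.

```python
from statistics import mean

def condition_grand_mean(trials_by_subject):
    """Return (per-subject means, grand mean) for ONE condition.

    trials_by_subject maps a subject id to that subject's ratings
    on the eight-point similarity scale (hypothetical data layout).
    """
    # Step 1: average each subject's trials within the condition.
    subject_means = {subj: mean(ratings)
                     for subj, ratings in trials_by_subject.items()}
    # Step 2: average the subject means to get the grand mean.
    grand_mean = mean(list(subject_means.values()))
    return subject_means, grand_mean

# Hypothetical ratings for one condition (e.g. added / nasal / word).
example = {"s01": [7, 8, 6], "s02": [5, 6, 4]}
subject_means, grand_mean = condition_grand_mean(example)
print(subject_means, grand_mean)
```

Averaging within subjects first gives every subject equal weight in the grand mean, even if the number of usable trials were to differ between subjects.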

Table 1

The mean similarity scores on an eight-point scale

NS native speakers of English, NNS non-native speakers of English (native speakers of Japanese who spoke English as a second language)

The restorability of a missing phoneme was observed in the difference between the ‘added’ and ‘replaced’ scores. When the difference between the added score (original phoneme behind noise) and the replaced score (no original phoneme behind noise) is small, the present and absent phonemes were perceived equivalently. When the difference between the added and replaced scores is large, the present and absent phonemes were perceived differently. Figure 1 shows the average scores in bar graphs, with error bars representing the standard deviation. In general, the ‘added’ phoneme (phoneme + noise) yielded higher similarity scores than the ‘replaced’ phoneme (noise only). It seems that listeners differentiated the present phoneme in the added condition from the absent phoneme in the replaced condition, even though the absent phoneme was perceptually restored and perceived as if it were present.

https://static-content.springer.com/image/art%3A10.1186%2Fs40064-016-2479-8/MediaObjects/40064_2016_2479_Fig1_HTML.gif
Fig. 1

The mean similarity scores on an eight-point scale, with error bars representing standard deviations. The scores were evaluated by English native speakers (NS) and non-native speakers (NNS)

Figure 2 shows the difference between the similarity scores of nasals and liquids, computed as “nasals (added) minus liquids (added)” in the solid bar and “nasals (replaced) minus liquids (replaced)” in the striped bar, for both native and non-native speakers. The difference between nasals (added) and liquids (added) was .13 for native speakers and .22 for non-native speakers, while the difference between nasals (replaced) and liquids (replaced) was .38 for native speakers and .50 for non-native speakers. A positive value in Fig. 2 indicates greater perceptibility (added) or restorability (replaced) of nasals than of liquids. It seems that both native and non-native speakers perceived the added and replaced nasals better than liquids, and that this advantage of nasals over liquids was somewhat larger among non-native speakers than among native speakers. Additionally, paired t tests were carried out on the similarity scores of nasals (added) of native and non-native speakers, and on the similarity scores of liquids (added) of native and non-native speakers, to see whether the perceptibility of phonemes behind noise (added) differed meaningfully between the two groups. Native speakers (M = 6.91, SD = .56) perceived nasals behind noise (added) significantly better than non-native speakers (M = 6.40, SD = .90), t (29) = 3.14, p = .004, and native speakers (M = 6.78, SD = .68) perceived liquids behind noise (added) significantly better than non-native speakers (M = 6.18, SD = .97), t (29) = 3.30, p = .003. In general, native speakers perceived phonemes behind noise (added) significantly better than non-native speakers. In addition, paired t tests were carried out on the similarity scores of nasals (replaced) of native and non-native speakers, and on the similarity scores of liquids (replaced) of native and non-native speakers, to see whether the size of restoration differed meaningfully between the two groups. Native speakers (M = 6.21, SD = .67) restored missing nasals (replaced) marginally more than non-native speakers (M = 5.85, SD = .98), t (29) = 1.89, p = .069, and native speakers (M = 5.83, SD = .79) restored missing liquids (replaced) marginally more than non-native speakers (M = 5.35, SD = 1.15), t (29) = 1.98, p = .058; neither difference reached the conventional .05 level. While nasals (replaced) were restored more than liquids (replaced) in general, the restoration size of native speakers seems to be bigger than that of non-native speakers. Taken together, native speakers were significantly better than non-native speakers at perceiving speech signals behind noise (added) and marginally better at restoring missing phonemes behind noise (replaced), while both native and non-native speakers perceived nasals better than liquids.
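For readers unfamiliar with the statistic, a paired t test reduces to a one-sample test on the pairwise differences. The sketch below shows that computation with the standard library only. It is an illustration under stated assumptions: the function name and the data are hypothetical, and the text does not specify the unit over which the native and non-native scores were paired.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(scores_a, scores_b):
    """Paired t statistic and degrees of freedom for two matched lists.

    scores_a[i] and scores_b[i] must belong to the same pair.
    Raises ZeroDivisionError if all pairwise differences are identical.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    # Standard error of the mean difference (sample SD, n - 1 denominator).
    se = stdev(diffs) / sqrt(n)
    return mean(diffs) / se, n - 1

# Hypothetical matched similarity scores, NOT the study's data.
t, df = paired_t([1, 2, 3, 4], [0, 1, 1, 2])
print(f"t({df}) = {t:.3f}")
```

With thirty pairs this yields 29 degrees of freedom, which matches the t (29) values reported in the text.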

https://static-content.springer.com/image/art%3A10.1186%2Fs40064-016-2479-8/MediaObjects/40064_2016_2479_Fig2_HTML.gif
Fig. 2

The difference between the similarity scores of nasals and liquids, computed as “nasals (added) minus liquids (added)” on the left in the solid bar and “nasals (replaced) minus liquids (replaced)” on the right in the striped bar, for native (NS) and non-native speakers (NNS)
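The nasal-minus-liquid differences plotted in Fig. 2 can be recomputed from the group means reported alongside the t tests. The sketch below does only that arithmetic; the dictionary layout and function name are illustrative choices, while the numbers are the means given in the text.

```python
# Mean similarity scores reported in the text (eight-point scale).
reported_means = {
    "NS":  {"nasal":  {"added": 6.91, "replaced": 6.21},
            "liquid": {"added": 6.78, "replaced": 5.83}},
    "NNS": {"nasal":  {"added": 6.40, "replaced": 5.85},
            "liquid": {"added": 6.18, "replaced": 5.35}},
}

def nasal_minus_liquid(group, noise):
    """Difference plotted in Fig. 2 for one group and noise condition."""
    scores = reported_means[group]
    return round(scores["nasal"][noise] - scores["liquid"][noise], 2)

for group in ("NS", "NNS"):
    for noise in ("added", "replaced"):
        print(group, noise, nasal_minus_liquid(group, noise))
```

The four results are .13 and .38 for native speakers and .22 and .50 for non-native speakers, in agreement with the differences stated in the text.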

Figure 3 shows the difference between the similarity scores of words and non-words, computed as “words (added) minus non-words (added)” in the solid bar and “words (replaced) minus non-words (replaced)” in the striped bar, for both native and non-native speakers. The difference between words (added) and non-words (added) was .21 for native speakers and −.05 for non-native speakers. On the other hand, the difference between words (replaced) and non-words (replaced) was .43 for native speakers and .06 for non-native speakers. A positive value in Fig. 3 indicates a lexical influence on the perception of the existing (added) and non-existing (replaced) phoneme behind noise. It seems that native speakers perceived the existing phoneme behind noise (added) slightly better in words than in non-words, while non-native speakers perceived the existing phoneme behind noise (added) slightly better in non-words than in words. In addition, native speakers perceptually restored the non-existing phoneme (replaced) better in words than in non-words, while non-native speakers restored the non-existing phoneme (replaced) in words and non-words equivalently. There seems to be a lexical influence on the perception of the existing and non-existing phoneme among native speakers, but little influence among non-native speakers. It seems that lexical context helped the perceptual sensitivity (added) and phonemic restoration (replaced) of native speakers, while it did not help those of non-native speakers.

https://static-content.springer.com/image/art%3A10.1186%2Fs40064-016-2479-8/MediaObjects/40064_2016_2479_Fig3_HTML.gif
Fig. 3

The difference between the similarity scores of words and non-words, computed as “words (added) minus non-words (added)” on the left in the solid bar and “words (replaced) minus non-words (replaced)” on the right in the striped bar, for native (NS) and non-native speakers (NNS)
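One way to read Fig. 3 is as a difference of differences: how much larger the word-minus-non-word effect is in the replaced condition than in the added condition. The sketch below computes that contrast from the four differences reported in the text. The function name and sign convention (replaced minus added) are illustrative choices, and the non-native added value is taken as −.05, consistent with the statement that non-native speakers scored non-words slightly higher than words in that condition. This is descriptive arithmetic on group means, not the ANOVA interaction test itself.

```python
# Word-minus-non-word differences reported in the text.
lexical_effect = {
    "NS":  {"added": 0.21, "replaced": 0.43},
    "NNS": {"added": -0.05, "replaced": 0.06},  # -0.05: assumed sign, see lead-in
}

def lexical_gain_under_replacement(group):
    """Lexical effect in the replaced condition minus that in the added one."""
    effect = lexical_effect[group]
    return round(effect["replaced"] - effect["added"], 2)

for group in ("NS", "NNS"):
    print(group, lexical_gain_under_replacement(group))
```

The contrast is .22 for native speakers and .11 for non-native speakers, so the lexical effect grows when the phoneme is missing in both groups, but by about twice as much for native speakers.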

Separate ANOVAs were first carried out for English native speakers and for non-native speakers, with noise (added vs. replaced), phoneme (nasal vs. liquid), and lexicality (word vs. non-word) as two-level within-subject factors. For native speakers, the added sound was perceived as significantly more similar to the original sound than the replaced sound was, F (1, 29) = 66.75, p < .001. A nasal consonant in noise was perceived as significantly more similar to the original sound than a liquid consonant in noise was, F (1, 29) = 33.00, p < .001. The target sound in words was perceived as significantly more similar to the original sound than the target sound in non-words was, F (1, 29) = 8.39, p = .007. There was a significant two-way interaction between noise and phonemic factors, F (1, 29) = 15.21, p = .001, a marginally significant two-way interaction between noise and lexical factors, F (1, 29) = 3.96, p = .056, a significant two-way interaction between phonemic and lexical factors, F (1, 29) = 11.48, p = .002, and a significant three-way interaction among noise, phonemic, and lexical factors, F (1, 29) = 11.03, p = .002. For non-native speakers, the added sound was perceived as significantly more similar to the original sound than the replaced sound was, F (1, 29) = 38.15, p < .001. A nasal consonant in noise was perceived as significantly more similar to the original sound than a liquid consonant in noise was, F (1, 29) = 38.66, p < .001. On the other hand, the target sounds in words and non-words were perceived equivalently, F (1, 29) = .01, p = .95. There was a significant two-way interaction between noise and phonemic factors, F (1, 29) = 9.07, p = .005, but no significant two-way interaction between noise and lexical factors, F (1, 29) = 2.23, p = .15, no significant two-way interaction between phonemic and lexical factors, F (1, 29) = 2.15, p = .15, and no significant three-way interaction among noise, phonemic, and lexical factors, F (1, 29) = .03, p = .86. Taken together, both native and non-native speakers perceptually differentiated the added and replaced sounds. In addition, both native and non-native speakers perceived the nasal in noise as significantly more similar to the original sound than the liquid in noise (with a significant two-way interaction between noise and phonemes in both groups). However, while native speakers perceived the target sound in words better than the target sound in non-words, non-native speakers perceived the target sound in words and non-words equivalently (with a marginally significant two-way interaction between noise and lexicality for native speakers, but no significant two-way interaction for non-native speakers). It seems that lexicality supported the perception of native speakers, while it did not support the perception of non-native speakers.

Additionally, an ANOVA was carried out with language group (native vs. non-native) as a between-subject factor, and noise (added vs. replaced), phoneme (nasal vs. liquid), and lexicality (word vs. non-word) as two-level within-subject factors. The results suggested that the speech perception of native speakers was significantly different from that of non-native speakers, F (1, 58) = 5.70, p = .02. However, as was suggested in the previous paragraphs, native and non-native speakers share some commonalities. The added phoneme was perceived as significantly more similar to the original sound than the replaced phoneme was by both native and non-native speakers, F (1, 58) = 101.11, p < .001; there was no two-way interaction between noise and language factors, F (1, 58) = .75, p = .39. In addition, a nasal consonant in noise was perceived as significantly more similar to the original sound than a liquid consonant in noise was by both native and non-native speakers, F (1, 58) = 71.11, p < .001; there was no two-way interaction between phonemic and language factors, F (1, 58) = 1.96, p = .17. On the other hand, while the target sound in words was perceived as marginally more similar to the original sound than the target sound in non-words was, F (1, 58) = 3.80, p = .056, there was a marginal two-way interaction between lexical and language factors, F (1, 58) = 3.46, p = .068. As was suggested in the separate ANOVAs in the previous paragraph, this would suggest that lexical context supported the perceptual sensitivity (added) and phonemic restoration (replaced) of native speakers, while it did not support those of non-native speakers. As for interactions, there was a significant two-way interaction between noise and phonemic factors, F (1, 58) = 22.11, p < .001, and no significant three-way interaction among noise, phonemic, and language factors, F (1, 58) = .06, p = .82. There was also a significant two-way interaction between noise and lexical factors, F (1, 58) = 6.17, p = .016, and no significant three-way interaction among noise, lexical, and language factors, F (1, 58) = .68, p = .41, while the separate ANOVAs in the previous paragraph showed a marginally significant interaction between noise and lexical factors for native speakers and no significant interaction for non-native speakers. There was also a significant two-way interaction between phonemic and lexical factors, F (1, 58) = 11.79, p = .001, and no significant three-way interaction among phonemic, lexical, and language factors, F (1, 58) = 1.86, p = .18, while the separate ANOVAs in the previous paragraph showed a significant two-way interaction between phonemic and lexical factors for native speakers and no significant interaction for non-native speakers. On the other hand, there was a significant three-way interaction among noise, phonemic, and lexical factors, F (1, 58) = 5.13, p = .027, and a marginally significant four-way interaction among noise, phonemic, lexical, and language factors, F (1, 58) = 3.94, p = .052.

As a whole, both native and non-native speakers perceived the difference between the ‘added’ (phoneme + noise) and ‘replaced’ (noise only) sounds. In addition, both native and non-native speakers perceived the present and absent nasals significantly better than liquids. The main difference between native and non-native speakers concerned lexical support for perception: while native speakers perceived the present and absent phoneme better in words than in non-words, non-native speakers perceived the present and absent phoneme in words and non-words equivalently (a significant three-way interaction among noise, phonemes, and lexicality, and a marginally significant four-way interaction among noise, phonemes, lexicality, and language factors). Phonemic restoration seems to take place differently among native and non-native speakers, with a different ratio of bottom-up acoustic processing to top-down lexical processing.