|
Abstract: . . . disambiguation preference (Gibson et al. 1996). A recent study on Dutch shows that Dutch speakers preferred to attach relative clauses to NP1, but that the NP2 attachment construction shown on the left of Figure 1 was more common in a corpus (Mitchell and Brysbaert 1998). Very recent studies, such as Desmet et al. (2001), however, have shown that human preferences do match corpus preferences when the animacy of NP1 is held constant. Since corpora are used to estimate frequencies in most probabilistic models, this is an important result; I will return to this issue in Section 4.3. But since this control factor concerned the semantics of the noun phrases, it means that a purely structure-frequency account of the Tuning hypothesis cannot be maintained. More recently, Bod (2000, 2001) showed that frequent three-word (subject-verb-object) sentences (e.g. I like it) are recognized more easily and faster than infrequent three-word sentences (e.g. I keep it), even after . . .

. . . ways. First, consider that a corpus is an instance of language production, but the frequencies derived from corpora are often used to model or control experiments in comprehension. While comprehension and production frequencies are presumably highly correlated, there is no reason to expect them to be identical. Second, the Brown corpus is a genre-stratified corpus. It contains equal amounts of material from newspapers, fiction, academic prose, etc. But presumably a corpus designed for psychological modeling of frequency would want to model the frequency with which an individual hearer or speaker is exposed to (or uses) linguistic input. This would require a much larger focus on spoken language, on news broadcasts, and on magazines.
Third, the Brown corpus dates from 1961; most subjects in psycholinguistics experiments run in 2001 are college undergraduates and weren’t even born in 1961; the frequencies that would be appropriate to model their language capacity may differ widely from Brown . . .

. . . carefully controlling for lexical frequencies and two-word or three-word bigram frequencies. The results of Bod (2001) clearly point to storage of three-word chunks, but it’s not necessary that it is higher-level structure that is playing a causal role. But of course the frequency of complex constructions is much lower than lexical frequencies, and so we expect frequency effects from larger constructions to be harder to find. This remains an important area of future research.

2.7 Summary of Psycholinguistic Results on Frequency and Probability

Frequency plays a key role in both comprehension and production, but solid evidence exists only for frequency related in some way to lexical items, or the relationship between lexical items and syntactic structure. High-frequency words are recognized more quickly, with less sensory input, and with less interference by neighbors than low-frequency words. High-frequency words are produced with shorter latencies and shorter durations than low-frequency . . .
|