In linguistics, co-occurrence or cooccurrence is an above-chance frequency of ordered occurrence of two adjacent terms in a text corpus. Co-occurrence in this linguistic sense can be interpreted as an indicator of semantic proximity or of an idiomatic expression. Corpus linguistics and its statistical analyses reveal patterns of co-occurrence within a language and make it possible to work out typical collocations for its lexical items. A co-occurrence restriction is identified when linguistic elements never occur together. Analysis of these restrictions can lead to discoveries about the structure and development of a language.
Co-occurrence can be seen as an extension of word counting to higher dimensions: instead of counting single terms, one counts pairs of terms. It can be described quantitatively using measures such as correlation or mutual information.
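As an illustration of such a measure, the following minimal sketch counts ordered adjacent pairs in a toy corpus and scores each pair with pointwise mutual information (PMI), one common form of the mutual-information measure. The function name `bigram_pmi`, the whitespace tokenization, and the sample sentence are assumptions made for this example, not part of any standard library or of the cited sources.

```python
import math
from collections import Counter


def bigram_pmi(tokens):
    """Return {(w1, w2): PMI} for every ordered adjacent pair in tokens.

    PMI = log2( P(w1, w2) / (P(w1) * P(w2)) ); a value above 0 means the
    pair occurs more often than chance would predict from the word counts.
    """
    unigrams = Counter(tokens)                  # single-word counts
    bigrams = Counter(zip(tokens, tokens[1:]))  # ordered adjacent pairs
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)              # number of adjacent pairs

    scores = {}
    for (w1, w2), count in bigrams.items():
        p_pair = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores


# Toy corpus (illustrative only): "new york" appears twice as a unit.
tokens = "new york is big and new york is old".split()
scores = bigram_pmi(tokens)
```

Here `scores[("new", "york")]` is positive, reflecting above-chance co-occurrence, while the reversed pair `("york", "new")` never occurs and so has no entry. On real corpora, PMI estimates from such small counts are unreliable, so frequency thresholds or smoothing are usually applied.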
See also
- Distributional hypothesis
- Statistical semantics
- Idiom (language structure)
- Co-occurrence matrix
- Co-occurrence networks
- Similarity measure
- Dice coefficient
References
- Kroeger, Paul (2005). Analyzing Grammar: An Introduction. Cambridge: Cambridge University Press. p. 20.
External links
- Bordag, Stefan (2008). "A Comparison of Co-occurrence and Similarity Measures as Simulations of Context". pp. 52–63. CiteSeerX 10.1.1.471.5863.