The distributional hypothesis in linguistics is that words that occur in similar contexts tend to have similar meanings (Harris, 1954).
This hypothesis is the justification for applying the VSM to measuring word similarity. A word may be represented by a vector in which the elements are derived from the occurrences of the word in various contexts, such as windows of words (Lund & Burgess, 1996), grammatical dependencies (Lin, 1998; Padó & Lapata, 2007), and richer contexts consisting of dependency links and selectional preferences on the argument positions (Erk & Padó, 2008); see Sahlgren’s (2006) thesis for a comprehensive study of various contexts. Similar row vectors in the word–context matrix indicate similar word meanings.
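As an illustration only, the following minimal sketch builds a word–context matrix from windows of words and compares row vectors by cosine similarity. The toy corpus, the function names (build_word_context_matrix, cosine), the window size, and the use of raw co-occurrence counts without any weighting are all assumptions made for this example; they are not taken from the cited work.

```python
from collections import Counter, defaultdict
import math


def build_word_context_matrix(sentences, window=2):
    """Return a sparse word-context matrix as {word: Counter(context word -> count)}.

    A context word is any other token within `window` positions of the target
    word in the same sentence. Raw counts only; no weighting is applied.
    """
    rows = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    rows[word][tokens[j]] += 1
    return rows


def cosine(u, v):
    """Cosine of the angle between two sparse row vectors (0.0 if either is empty)."""
    dot = sum(count * v[ctx] for ctx, count in u.items() if ctx in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy corpus: "cat" and "dog" occur in similar contexts.
corpus = [
    "the cat drinks milk".split(),
    "the dog drinks water".split(),
    "the cat chases the mouse".split(),
    "the dog chases the ball".split(),
]

rows = build_word_context_matrix(corpus, window=2)
sim_cat_dog = cosine(rows["cat"], rows["dog"])
sim_cat_milk = cosine(rows["cat"], rows["milk"])
print(f"cat~dog: {sim_cat_dog:.3f}  cat~milk: {sim_cat_milk:.3f}")
```

In this toy corpus, the rows for cat and dog share most of their contexts and so receive a higher cosine than the rows for cat and milk, which is the behaviour the distributional hypothesis predicts. A realistic system would use a much larger corpus and would typically weight the raw counts before comparing rows.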
The idea that word usage can reveal semantics was implicit in some of the things that Wittgenstein (1953) said about language-games and family resemblance.
Wittgenstein was primarily interested in the physical activities that form the context of word usage (e.g., the word brick, spoken in the context of the physical activity of building a house), but the main context for a word is often other words.