The University of Edinburgh
Division of Informatics
Forrest Hill & 80 South Bridge

MSc Thesis #92146

Title: A Statistical Approach to Syntax Acquisition
Date: 1992
Abstract: [Finch & Chater 91] presented a statistical approach to syntax acquisition, in which lexical items are clustered according to the similarity of the distributions of the contexts in which they occur. With raw, written Natural Language input, the resulting clusters corresponded well to the standard syntactic categories (noun, verb, etc.). Such results suggest that nativist assumptions regarding the existence of "innate prototypes" for syntactic categories may be unnecessary. Here, the behaviour of Finch and Chater's method was investigated using input generated by simple artificial grammars, by observing the interaction between the properties of the grammar underlying the input, the parameters of the analysis, and the structure of the output. Further experiments, using samples of child-directed adult speech as input, produced a stronger validation of the approach, yielding clusters that closely matched syntactic categories. These results are discussed in the light of possible extensions of the approach from a feasibility proof into a fully-fledged model of the bootstrapping of Natural Language syntax.
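
The idea of grouping words by the contexts in which they occur can be illustrated with a minimal sketch. It is not the procedure used in the thesis or in [Finch & Chater 91]: the toy corpus, the one-word context window, the cosine similarity measure, and the function names `context_vectors` and `cosine` are all assumptions chosen for brevity.

```python
from collections import Counter, defaultdict
from math import sqrt


def context_vectors(tokens):
    """Map each word to counts of its immediate left and right neighbours.

    Assumption: a context is just the word directly before ("L") and the
    word directly after ("R"); the window size is an illustrative choice.
    """
    vectors = defaultdict(Counter)
    for i, word in enumerate(tokens):
        if i > 0:
            vectors[word][("L", tokens[i - 1])] += 1
        if i < len(tokens) - 1:
            vectors[word][("R", tokens[i + 1])] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (assumed measure)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


# Hypothetical toy corpus generated by a tiny artificial grammar.
corpus = (
    "the dog sees the cat . the cat sees the dog . "
    "a dog likes a cat . a cat likes a dog ."
).split()

vecs = context_vectors(corpus)

# Words of the same category should share contexts more than words of
# different categories do.
print("dog ~ cat  :", cosine(vecs["dog"], vecs["cat"]))
print("sees ~ likes:", cosine(vecs["sees"], vecs["likes"]))
print("dog ~ sees :", cosine(vecs["dog"], vecs["sees"]))
```

In this toy setting the two nouns share their contexts and the two verbs share some of theirs, while a noun and a verb share none. A clustering step over such similarities would therefore tend to separate the two categories.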

