Stances in 'Introduction': Info & Library Science - Introduction 3 - Full text

Title: A Machine Learning Approach for Identification of Thesis and Conclusion Statements in Student Essays
Author(s): Jill Burstein and Daniel Marcu
Journal: Computers and the Humanities (37), 2003, 455–467.
Introduction 3: Full text

Move 1: Establish A Territory

Software for automated evaluation of student essays has become a prevalent technology over the past few years. Many colleges, universities, public school districts, and language testing organizations use automated essay scoring technologies to assign grades to student essays (Burstein, 2003; Elliott, 2003; Landauer et al., 2003; Larkey and Croft, 2003; Page, 2003). As educators became more comfortable with automated essay scoring technology, they also became aware of the need for more comprehensive analyses of student writing. For example, they were interested in the evaluation of grammar error detection in essays (Leacock and Chodorow, 2003). They also had a strong interest in automated analysis of essay-based discourse features (Burstein et al., 2003; Burstein and Marcu, 2003). The literature on the teaching of writing suggests that invention, arrangement and revision in essay writing must be developed in order to produce effective writing. Stated in practical terms, students at all levels can benefit from practice applications that give them an opportunity to work on discourse structure in essay writing.

Move 2: Establish A Niche

Teachers’ feedback about students’ writing is often expressed in general terms, which is of little help; to be useful, the feedback must be grounded and must refer to the specific text of the essay (Scardamalia and Bereiter, 1985; White, 1994). If a system can automatically identify the actual text associated with discourse elements in student essays, then feedback like that used in traditional textbook teaching of writing can be directed toward specific text segments in students’ writing. Textbooks often use questions such as the following to encourage students to reflect on the organizational components of their writing: a) Is the intention of my thesis statement clear? b) Does my thesis statement respond directly to the essay question? c) Are the main points in my essay clearly stated? and d) Does my conclusion relate to my original thesis statement? If these questions were presented along with specific text segments from students’ essays, they would help students think about specific parts of their essays.

Move 3: Present the Present Work

This study builds on previous work that reports on the identification of a single sentence associated with the thesis statement text segment, using Bayesian classification (Burstein et al., 2001). It focuses specifically on how well a system recognizes the possibly multiple text segments that correspond to thesis and conclusion statements in student writing. A machine learning decision tree algorithm, C5.0 with boosting, was used for model building and labeling. The results indicate that the system can automatically identify features in student writing and can be used to identify thesis and conclusion statements in student essays. In this article, we address the following questions: 1) Can a system be built that reliably identifies thesis and conclusion statements? 2) How does system performance compare to a baseline and to inter-annotator agreement between human judges? 3) Will the system be able to generalize across genre and grade level to some extent? and 4) How well does the system generalize to unseen essay responses? That is, can the system identify thesis and conclusion statements on essay topics that it has not been trained on?
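As an illustration of question 2, the comparison between a system, a baseline, and human judges could be set up roughly as follows. This is only a minimal sketch under stated assumptions: the excerpt does not say which baseline or agreement statistic the authors used, so the first-sentence baseline, the choice of precision/recall/F1, and Cohen's kappa are all assumptions made here for illustration, and every function name is hypothetical.

```python
from typing import List, Tuple


def precision_recall_f1(gold: List[int], predicted: List[int]) -> Tuple[float, float, float]:
    """Score binary per-sentence labels (1 = thesis sentence, 0 = other)."""
    if len(gold) != len(predicted):
        raise ValueError("label sequences must have the same length")
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == 1 and p == 1)  # true positives
    fp = sum(1 for g, p in pairs if g == 0 and p == 1)  # false positives
    fn = sum(1 for g, p in pairs if g == 1 and p == 0)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def cohen_kappa(judge_a: List[int], judge_b: List[int]) -> float:
    """Chance-corrected agreement between two annotators (assumed metric)."""
    if len(judge_a) != len(judge_b) or not judge_a:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(judge_a)
    observed = sum(1 for a, b in zip(judge_a, judge_b) if a == b) / n
    labels = set(judge_a) | set(judge_b)
    expected = sum((judge_a.count(l) / n) * (judge_b.count(l) / n) for l in labels)
    if expected == 1.0:  # both judges used a single identical label throughout
        return 1.0
    return (observed - expected) / (1.0 - expected)


def first_sentence_baseline(num_sentences: int) -> List[int]:
    """Hypothetical positional baseline: label only the first sentence as thesis."""
    if num_sentences < 1:
        raise ValueError("an essay needs at least one sentence")
    return [1] + [0] * (num_sentences - 1)


if __name__ == "__main__":
    # Invented labels for one eight-sentence essay, for illustration only.
    judge_a = [1, 1, 0, 0, 0, 0, 0, 0]
    judge_b = [1, 0, 0, 0, 0, 0, 0, 0]
    system = [1, 1, 0, 0, 0, 1, 0, 0]
    baseline = first_sentence_baseline(len(judge_a))

    print("baseline vs. judge A:", precision_recall_f1(judge_a, baseline))
    print("system vs. judge A:  ", precision_recall_f1(judge_a, system))
    print("judge A vs. judge B: ", cohen_kappa(judge_a, judge_b))
```

Scoring the system and the baseline against the same human labels, and reporting human-human agreement alongside, gives a floor and a ceiling against which system performance can be read.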