Title: Assessing the Performance of Online Students -- New Data, New Approaches, Improved Accuracy

Abstract: We consider the problem of assessing the changing performance levels of individual students as they go through online courses. This student performance (SP) modeling problem is a critical step for building adaptive online teaching systems. Specifically, we conduct a study of how to utilize various types and large amounts of student log data to train accurate machine learning (ML) models that predict the performance of future students. This study is the first to use four very large sets of student data made available recently from four distinct intelligent tutoring systems. Our results include a new ML approach that defines a new state of the art for logistic-regression-based SP modeling, improving over earlier methods in several ways: First, we achieve improved accuracy by introducing new features that can be easily computed from conventional question-response logs (e.g., the pattern in the student's most recent answers). Second, we take advantage of features of the student history that go beyond question-response pairs (e.g., which video segments the student watched or skipped) as well as information about prerequisite structure in the curriculum. Third, we train multiple specialized SP models for different aspects of the curriculum (e.g., specializing in early versus later segments of the student history), then combine these specialized models to create a group prediction of the SP. Taken together, these innovations yield an average AUC score of 0.808 across these four datasets, compared to 0.767 for the previous best logistic regression approach, and also outperform state-of-the-art deep neural net approaches. Importantly, we observe consistent improvements from each of our three methodological innovations, in each dataset, suggesting that our methods are of general utility and likely to produce improvements for other online tutoring systems as well.
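To make the first idea in the abstract concrete (features computed from a conventional question-response log, such as the pattern in the student's most recent answers), here is a minimal sketch. It is not the paper's feature definition: the function name `history_features`, the `window` parameter, the feature names, and the integer encoding of the recent-answer pattern are all assumptions chosen for illustration. Each row uses only answers given before the one being predicted, and the resulting values could be one-hot encoded or used directly as inputs to a logistic regression.

```python
from typing import Dict, List, Sequence


def history_features(responses: Sequence[int], window: int = 3) -> List[Dict[str, int]]:
    """Build one feature row per answer in a chronological 0/1 response log.

    Hypothetical illustration: every row is computed only from answers that
    came before the answer being predicted, so the label does not leak.
    """
    features = []
    correct_so_far = 0
    for i in range(len(responses)):
        # The (up to) `window` most recent answers before position i.
        recent = responses[max(0, i - window):i]
        # Encode them as a binary number with a leading 1, so histories of
        # different lengths get distinct codes ([] -> 1, [0] -> 2, [1] -> 3).
        pattern = 1
        for r in recent:
            pattern = pattern * 2 + r
        features.append({
            "prior_attempts": i,
            "prior_correct": correct_so_far,
            "recent_pattern": pattern,
        })
        correct_so_far += responses[i]
    return features


# Example log for one student: 1 = correct answer, 0 = incorrect answer.
log = [1, 0, 1, 1]
rows = history_features(log)
```

In this example the last row summarizes the three earlier answers `[1, 0, 1]`: three prior attempts, two of them correct, and a pattern code that distinguishes this sequence from any other recent history of length three or shorter.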
Subjects: Machine Learning (cs.LG); Computers and Society (cs.CY)
Cite as: arXiv:2109.01753 [cs.LG]
  (or arXiv:2109.01753v2 [cs.LG] for this version)

Submission history

From: Robin Schmucker
[v1] Sat, 4 Sep 2021 00:08:59 GMT (550kb,D)
[v2] Tue, 8 Feb 2022 14:39:59 GMT (501kb,D)
