Luigi Libero Lucio Starace, Ph.D.

Assistant Professor @ Università degli Studi di Napoli Federico II, Italy.

When AI codes, Who tests? A teaching case on risk, quality, and governance in LLM-assisted software development

Author: Luigi Libero Lucio Starace.
Journal: Journal of Systems and Software.
DOI: 10.1016/j.jss.2026.112989
Graphical abstract of the paper: the bug investigation board, with several bug reports linked to possible causes and safeguards.

Abstract

Large Language Models (LLMs) are increasingly integrated into software development workflows, promising dramatic productivity gains but introducing distinctive risks. This teaching case explores different bug patterns in LLM-generated code through a realistic organizational scenario. Set in a mid-sized software company piloting AI-assisted coding tools, the case follows a junior engineer tasked with investigating anomalies in LLM-generated code. Students analyze realistic examples of bugs, ranging from misinterpretations and missing corner cases to hallucinated APIs, and classify them using an empirically grounded framework. In addition to classification, students reflect on how these defects differ from traditional human-introduced bugs and evaluate strategies for detecting them through testing, code review, and governance mechanisms. The narrative culminates in a strategic dilemma: should the company scale up LLM adoption, restrict it to low-risk modules, or suspend the initiative? Designed for software engineering courses, the case fosters critical thinking on quality assurance, risk management, and governance in AI-assisted development, bridging technical analysis with managerial decision-making.

Additional material

Teaching notes and associated materials are available online on the publisher’s platform.