Neural Embeddings for Web Testing
[Preprint]| Authors | |
| conference | ICST 2026 - 19th IEEE Conference on Software Testing, Verification and Validation. |
Abstract
Web test automation techniques often rely on crawlers to infer models of web applications for automated test generation. However, current crawlers rely on state equivalence algorithms that struggle to distinguish near-duplicate pages, leading to redundant test cases and incomplete coverage. In this paper, we present a model-based test generation approach that employs transformer-based Siamese neural networks (SNNs) to infer web application models more accurately. By learning similarity-based representations, SNNs capture structural and textual relationships among web pages, improving near-duplicate detection during crawling and enhancing the quality of inferred models, and thus, the effectiveness of generated test suites. Our evaluation across nine web apps shows that SNNs outperform state-of-the-art techniques in near-duplicate detection, resulting in superior web app models with an average F1 score improvement of 56%. These enhanced models enable the generation of more effective test suites that achieve higher code coverage, with improvements ranging from 6% to 21%.