Software Engineering

New submissions

New submissions for Fri, 9 Jun 23

[1]  arXiv:2306.04892 [pdf, other]
Title: X-COBOL: A Dataset of COBOL Repositories
Comments: 5 pages
Subjects: Software Engineering (cs.SE); Programming Languages (cs.PL)

Despite being proposed as early as 1959, COBOL (Common Business-Oriented Language) remains integral to most operations of several financial, banking, and governmental organizations. To support the inevitable modernization and maintenance of legacy systems written in COBOL, it is essential for organizations, researchers, and developers to understand the nature and source code of COBOL programs. However, to the best of our knowledge, no existing dataset provides data on COBOL software projects, which motivates the need for one. Thus, to aid empirical research on comprehending COBOL in open-source repositories, we constructed a dataset of 84 COBOL repositories mined from GitHub, containing rich metadata on the development cycle of the projects. We envision that researchers can utilize our dataset to study COBOL projects' evolution and code properties, and to develop tools to support their development. Our dataset also provides 1255 COBOL files present inside the mined repositories. The dataset and artifacts are available at https://doi.org/10.5281/zenodo.7968845.

[2]  arXiv:2306.04958 [pdf, other]
Title: Towards a Success Model for Automated Programming Assessment Systems Used as a Formative Assessment Tool
Subjects: Software Engineering (cs.SE)

The assessment of source code in university education is a central and important task for lecturers of programming courses. In doing so, educators are confronted with growing numbers of students having increasingly diverse prerequisites, a shortage of tutors, and highly dynamic learning objectives. To support lecturers in meeting these challenges, the use of automated programming assessment systems (APASs), facilitating formative assessments by providing timely, objective feedback, is a promising solution. Measuring the effectiveness and success of these platforms is crucial to understanding how such platforms should be designed, implemented, and used. However, research and practice lack a common understanding of aspects influencing the success of APASs. To address these issues, we have devised a success model for APASs based on established models from information systems as well as blended learning research and conducted an online survey with 414 students using the same APAS. In addition, we examined the role of mediators intervening between technology-, system- or self-related factors, respectively, and the users' satisfaction with APASs. Ultimately, our research has yielded a model of success comprising seven constructs influencing user satisfaction with an APAS.

[3]  arXiv:2306.05032 [pdf, other]
Title: Scalable and Adaptive Log-based Anomaly Detection with Expert in the Loop
Subjects: Software Engineering (cs.SE); Machine Learning (cs.LG)

System logs play a critical role in maintaining the reliability of software systems. Fruitful studies have explored automatic log-based anomaly detection and achieved notable accuracy on benchmark datasets. However, when applied to large-scale cloud systems, these solutions face limitations due to high resource consumption and lack of adaptability to evolving logs. In this paper, we present an accurate, lightweight, and adaptive log-based anomaly detection framework, referred to as SeaLog. Our method introduces a Trie-based Detection Agent (TDA) that employs a lightweight, dynamically-growing trie structure for real-time anomaly detection. To enhance TDA's accuracy in response to evolving log data, we enable it to receive feedback from experts. Interestingly, our findings suggest that contemporary large language models, such as ChatGPT, can provide feedback with a level of consistency comparable to human experts, which can potentially reduce manual verification efforts. We extensively evaluate SeaLog on two public datasets and an industrial dataset. The results show that SeaLog outperforms all baseline methods in terms of effectiveness, runs 2X to 10X faster and only consumes 5% to 41% of the memory resource.
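
The abstract above describes a dynamically growing trie that flags unseen log patterns and accepts expert feedback. A minimal sketch of that general idea follows; it is not SeaLog's actual TDA. The names `LogTrie`, `detect`, and `expert_says_normal`, the whitespace tokenization, and the rule "any unseen token path is anomalous" are all assumptions made for illustration.

```python
class TrieNode:
    def __init__(self):
        self.children = {}     # token -> TrieNode
        self.terminal = False  # True if a known-normal message ends here


class LogTrie:
    """Toy trie over whitespace-separated log tokens (hypothetical)."""

    def __init__(self):
        self.root = TrieNode()

    def add_normal(self, message):
        # Grow the trie along the token path of a message known to be normal.
        node = self.root
        for token in message.split():
            node = node.children.setdefault(token, TrieNode())
        node.terminal = True

    def is_anomalous(self, message):
        # A message is anomalous if its token path is not a complete known path.
        node = self.root
        for token in message.split():
            node = node.children.get(token)
            if node is None:
                return True
        return not node.terminal


def detect(trie, message, expert_says_normal):
    """Return True if the message stays flagged after expert feedback.

    expert_says_normal is a hypothetical callback (a human or a model)
    that returns True when a flagged message is actually benign.
    """
    if not trie.is_anomalous(message):
        return False
    if expert_says_normal(message):
        trie.add_normal(message)  # adapt to evolving logs
        return False
    return True
```

In this sketch, a message overruled by the expert is inserted into the trie, so later occurrences of the same message are no longer flagged.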

[4]  arXiv:2306.05152 [pdf, ps, other]
Title: Towards Autonomous Testing Agents via Conversational Large Language Models
Subjects: Software Engineering (cs.SE)

Software testing is an important part of the development cycle, yet it requires specialized expertise and substantial developer effort to adequately test software. The recent discoveries of the capabilities of large language models (LLMs) suggest that they can be used as automated testing assistants, and thus provide helpful information and even drive the testing process. To highlight the potential of this technology, we present a taxonomy of LLM-based testing agents based on their level of autonomy, and describe how a greater level of autonomy can benefit developers in practice. An example use of LLMs as a testing assistant is provided to demonstrate how a conversational framework for testing can help developers. This also highlights how the often criticized hallucination of LLMs can be beneficial while testing. We identify other tangible benefits that LLM-driven testing agents can bestow, and also discuss some potential limitations.

[5]  arXiv:2306.05336 [pdf, other]
Title: Improving the Reporting of Threats to Construct Validity
Comments: 5 pages. EASE conference, Oulu, 2023
Subjects: Software Engineering (cs.SE)

Background: Construct validity concerns the use of indicators to measure a concept that is not directly measurable. Aim: This study intends to identify, categorize, assess and quantify discussions of threats to construct validity in empirical software engineering literature and use the findings to suggest ways to improve the reporting of construct validity issues. Method: We analyzed 83 articles that report human-centric experiments published in five top-tier software engineering journals from 2015 to 2019. The articles' text concerning threats to construct validity was divided into segments (the unit of analysis) based on predefined categories. The segments were then evaluated regarding whether they clearly discussed a threat and a construct. Results: Three-fifths of the segments were associated with topics not related to construct validity. Two-thirds of the articles discussed construct validity without using the definition of construct validity given in the article. The threats were clearly described in more than four-fifths of the segments, but the construct in question was clearly described in only two-thirds of the segments. The construct was unclear when the discussion was not related to construct validity but to other types of validity. Conclusions: The results show potential for improving the understanding of construct validity in software engineering. Recommendations addressing the identified weaknesses are given to improve the awareness and reporting of construct validity.

Cross-lists for Fri, 9 Jun 23

[6]  arXiv:2306.04930 (cross-list from cs.HC) [pdf, other]
Title: When to Show a Suggestion? Integrating Human Feedback in AI-Assisted Programming
Comments: arXiv admin note: text overlap with arXiv:2210.14306
Subjects: Human-Computer Interaction (cs.HC); Machine Learning (cs.LG); Software Engineering (cs.SE)

AI powered code-recommendation systems, such as Copilot and CodeWhisperer, provide code suggestions inside a programmer's environment (e.g., an IDE) with the aim to improve their productivity. Since, in these scenarios, programmers accept and reject suggestions, ideally, such a system should use this feedback in furtherance of this goal. In this work we leverage prior data of programmers interacting with Copilot to develop interventions that can save programmer time. We propose a utility theory framework, which models this interaction with programmers and decides when and which suggestions to display. Our framework Conditional suggestion Display from Human Feedback (CDHF) is based on predictive models of programmer actions. Using data from 535 programmers we build models that predict the likelihood of suggestion acceptance. In a retrospective evaluation on real-world programming tasks solved with AI-assisted programming, we find that CDHF can achieve favorable tradeoffs. Our findings show the promise of integrating human feedback to improve interaction with large language models in scenarios such as programming and possibly writing tasks.
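
The abstract above describes deciding whether to display a suggestion from a predicted acceptance probability. A minimal sketch of such an expected-utility rule follows; it is not the CDHF model itself. The function names, the `time_saved` and `review_cost` parameters, and their default values are assumptions chosen only to make the arithmetic concrete.

```python
def expected_utility(p_accept, time_saved, review_cost):
    """Expected time gained by showing a suggestion (hypothetical model).

    p_accept:    predicted probability that the programmer accepts
    time_saved:  seconds saved when an accepted suggestion replaces typing
    review_cost: seconds wasted reading a suggestion that is then rejected
    """
    return p_accept * time_saved - (1.0 - p_accept) * review_cost


def should_display(p_accept, time_saved=10.0, review_cost=4.0):
    # Show the suggestion only when the expected gain is positive.
    return expected_utility(p_accept, time_saved, review_cost) > 0.0
```

With these assumed defaults, the break-even acceptance probability is review_cost / (time_saved + review_cost) = 4 / 14, so suggestions predicted below roughly 0.29 would be withheld.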

[7]  arXiv:2306.05057 (cross-list from cs.CR) [pdf, other]
Title: SmartBugs 2.0: An Execution Framework for Weakness Detection in Ethereum Smart Contracts
Subjects: Cryptography and Security (cs.CR); Software Engineering (cs.SE)

Smart contracts are blockchain programs that often handle valuable assets. Writing secure smart contracts is far from trivial, and any vulnerability may lead to significant financial losses. To support developers in identifying and eliminating vulnerabilities, methods and tools for the automated analysis have been proposed. However, the lack of commonly accepted benchmark suites and performance metrics makes it difficult to compare and evaluate such tools. Moreover, the tools are heterogeneous in their interfaces and reports as well as their runtime requirements, and installing several tools is time-consuming.
In this paper, we present SmartBugs 2.0, a modular execution framework. It provides a uniform interface to 19 tools aimed at smart contract analysis and accepts both Solidity source code and EVM bytecode as input. After describing its architecture, we highlight the features of the framework. We evaluate the framework via its reception by the community and illustrate its scalability by describing its role in a study involving 3.25 million analyses.

[8]  arXiv:2306.05078 (cross-list from cs.CY) [pdf, other]
Title: Eliciting the Double-edged Impact of Digitalisation: a Case Study in Rural Areas
Comments: Accepted to IEEE RE 2023, International Conference on Requirements Engineering, 10 pages plus 2 pages of references
Subjects: Computers and Society (cs.CY); Software Engineering (cs.SE)

Designing systems that account for sustainability concerns demands a better understanding of the impact that digital technology interventions can have on a certain socio-technical context. However, limited studies are available about the elicitation of impact-related information from stakeholders, and strategies are particularly needed to elicit possible long-term effects, including negative ones, that go beyond the planned system goals.
This paper reports a case study about the impact of digitalisation in remote mountain areas, in the context of a system for ordinary land management and hydro-geological risk control. The elicitation process was based on interviews and workshops. In the initial phase, past and present impacts were identified. In a second phase, future impacts were forecasted through the discussion of two alternative scenarios: a dystopic, technology-intensive one, and a technology-balanced one. The approach was particularly effective in identifying negative impacts.
Among them, we highlight the higher stress due to the excess of connectivity, the partial reduction of decision-making abilities, and the risk of marginalisation for certain types of stakeholders. The study posits that before the elicitation of system goals, requirements engineers need to identify the socio-economic impacts of ICT technologies included in the system, as negative effects need to be properly mitigated. Our study contributes to the literature with: a set of impacts specific to the case, which can apply to similar contexts; an effective approach for impact elicitation; and a list of lessons learned from the experience.

Replacements for Fri, 9 Jun 23

[9]  arXiv:2304.11384 (replaced) [pdf, other]
Title: Large Language Models are Few-Shot Summarizers: Multi-Intent Comment Generation via In-Context Learning
Comments: Accepted by the 46th International Conference on Software Engineering (ICSE 2024)
Subjects: Software Engineering (cs.SE)
[10]  arXiv:2305.03803 (replaced) [pdf, other]
Title: A Survey of Trojans in Neural Models of Source Code: Taxonomy and Techniques
Subjects: Software Engineering (cs.SE)
[11]  arXiv:2305.17384 (replaced) [pdf, other]
Title: WELL: Applying Bug Detectors to Bug Localization via Weakly Supervised Learning
Comments: (Preprint) Software Engineer; Deep Learning; Bug Detection & Localization
Subjects: Software Engineering (cs.SE)
[12]  arXiv:2306.01788 (replaced) [pdf, other]
Title: Responsible Design Patterns for Machine Learning Pipelines
Comments: 20 pages, 4 figures, 5 tables
Subjects: Software Engineering (cs.SE); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[13]  arXiv:2306.04281 (replaced) [pdf, other]
Title: HornFuzz: Fuzzing CHC solvers
Subjects: Software Engineering (cs.SE)