Software Engineering

New submissions

New submissions for Mon, 6 Dec 21

[1]  arXiv:2112.01538 [pdf, other]
Title: Testing Reactive Systems Using Behavioural Programming, a Model Centric Approach
Authors: Yeshayahu Weiss
Comments: 31 pages, 7 figures
Subjects: Software Engineering (cs.SE)

Testing is a significant aspect of software development. As systems become complex and their use becomes critical to the security and the function of society, the need for testing methodologies that ensure reliability and detect faults as early as possible becomes critical. The most promising approach is the model-based approach, where a model is developed that defines how the system is expected to behave and how it is meant to react. The tests are derived from the model, and an analysis of the test results is conducted based on it. We will investigate the prospects of using Behavioral Programming (BP) for a model-based testing (MBT) approach that we will develop. We will develop a natural language for representing the requirements. The model will be fed to algorithms that we will develop. These include algorithms for the automatic creation of minimal sets of test cases that cover all of the system's requirements, algorithms for analysing the results of the tests, and other tools that support the testing process. The focus of our methodology will be to find faults caused by the interaction between different requirements in ways that are difficult for testers to detect. Specifically, we will focus our attention on concurrency issues such as deadlocks and logical race conditions. We will use a variety of methods that are made possible by BP, such as non-deterministic execution of scenarios and the use of in-code model-checking for building test scenarios and for finding minimal coverage of the test scenarios for the system requirements using Combinatorial Test Design (CTD) methodologies. We will develop a proof-of-concept tool kit which will allow us to demonstrate and evaluate the above-mentioned capabilities. We will compare the performance of our tools with the performance of manual testers and of other model-based tools using comparison criteria that we will define and develop.

[2]  arXiv:2112.01581 [pdf, other]
Title: On the Documentation of Refactoring Types
Comments: arXiv admin note: text overlap with arXiv:2009.09279
Subjects: Software Engineering (cs.SE)

Commit messages are the atomic level of software documentation. They provide a natural language description of the code change and its purpose. Messages are critical for software maintenance and program comprehension. Unlike documenting feature updates and bug fixes, little is known about how developers document their refactoring activities. Developers can perform multiple refactoring operations, such as moving methods and extracting classes, for various reasons. Yet, there is no systematic study that analyzes the extent to which the documentation of refactoring accurately describes the refactoring operations performed at the source code level. Therefore, this paper challenges the ability of refactoring documentation to adequately predict the refactoring types performed at the commit level. Our analysis relies on the text mining of commit messages to extract the corresponding features that better represent each class. The extraction of text patterns specific to each refactoring allows the design of a model that verifies the consistency of these patterns with their corresponding refactoring. Such a verification process can be achieved by automatically predicting the method-level type of refactoring being applied, namely Extract Method, Inline Method, Move Method, Pull-up Method, Push-down Method, and Rename Method. We compared various classifiers, and a baseline keyword-based approach, in terms of their prediction performance, using a dataset of 5,004 commits. Our main findings show that the complexity of refactoring type prediction varies from one type to another. Rename Method and Extract Method were found to be the best documented refactoring activities, while Pull-up Method and Push-down Method were the hardest to identify via textual descriptions. Such findings draw the attention of developers to the necessity of documenting these types more carefully.
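
To make the idea of a keyword-based baseline concrete, the following is a minimal sketch of how commit messages could be mapped to the six method-level refactoring types named above. The keyword lists, the function name `predict_refactoring_type`, and the first-match rule are all illustrative assumptions; the abstract does not say which keywords or matching strategy the authors used.

```python
import re
from typing import Dict, List, Optional

# Hypothetical keyword stems per refactoring type (not taken from the paper).
KEYWORDS: Dict[str, List[str]] = {
    "Extract Method": ["extract"],
    "Inline Method": ["inline"],
    "Move Method": ["move"],
    "Pull-up Method": ["pull up", "pull-up"],
    "Push-down Method": ["push down", "push-down"],
    "Rename Method": ["rename"],
}


def predict_refactoring_type(message: str) -> Optional[str]:
    """Return the first refactoring type whose keyword stem starts a word
    in the commit message, or None if no stem matches."""
    text = message.lower()
    for label, stems in KEYWORDS.items():
        for stem in stems:
            # \b anchors the stem at a word start, so "remove" does not match "move".
            if re.search(r"\b" + re.escape(stem), text):
                return label
    return None


if __name__ == "__main__":
    for msg in ["Renamed getFoo to fetchFoo",
                "Pull up helper into base class",
                "Fix typo in README"]:
        print(msg, "->", predict_refactoring_type(msg))
```

A baseline of this kind only sees explicit vocabulary, which is one plausible reason why less consistently described types would be harder to recover from text than Rename Method or Extract Method.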

[3]  arXiv:2112.01590 [pdf, other]
Title: The Art and Practice of Data Science Pipelines: A Comprehensive Study of Data Science Pipelines In Theory, In-The-Small, and In-The-Large
Subjects: Software Engineering (cs.SE)

An increasingly large number of software systems today include data science components for descriptive, predictive, and prescriptive analytics. The collection of data science stages, from acquisition to cleaning/curation to modeling and so on, is referred to as a data science pipeline. To facilitate research and practice on data science pipelines, it is essential to understand their nature. What are the typical stages of a data science pipeline? How are they connected? Do pipelines differ between their theoretical representations and their form in practice? Today we do not fully understand these architectural characteristics of data science pipelines. In this work, we present a three-pronged comprehensive study to answer these questions for the state-of-the-art, data science in-the-small, and data science in-the-large. Our study analyzes three datasets: a collection of 71 proposals for data science pipelines and related concepts in theory, a collection of over 105 implementations of curated data science pipelines from Kaggle competitions to understand data science in-the-small, and a collection of 21 mature data science projects from GitHub to understand data science in-the-large. Our study has led to three representations of data science pipelines that capture the essence of our subjects in theory, in-the-small, and in-the-large.

[4]  arXiv:2112.01598 [pdf, other]
Title: Faster Multi-Goal Simulation-Based Testing Using DoLesS (Domination with Least Square Approximation)
Comments: 10 pages, 4 figures, 6 tables. Submitted to ICSE 2022
Subjects: Software Engineering (cs.SE)

For cyber-physical systems, finding a set of test cases with the least cost by exploring multiple goals is a complex task. For example, Arrieta et al. reported that state-of-the-art optimizers struggle to find minimal test suites for this task. To better manage this task, we propose DoLesS (Domination with Least Squares Approximation), which uses a domination predicate to sort the space of possible goals to a small number of representative examples. Multi-objective domination then divides these examples into a "best" set and the remaining "rest" set. After that, DoLesS applies an inverted least squares approximation approach to learn a minimal set of tests that can distinguish best from rest in the reduced example space. DoLesS has been tested on four cyber-physical models: a tank flow model; a model of electric car windows; a safety feature of an AC engine; and a continuous PID controller combined with a discrete state machine. Compared to the recent state-of-the-art paper that attempted the same task, DoLesS performs as well as or even better than the state of the art, while running 80-360 times faster on average (seconds instead of hours). Hence, we recommend DoLesS as a fast method to find minimal test suites for multi-goal cyber-physical systems. For replication purposes, all our code is online: https://github.com/hellonull123/Test_Selection_2021.
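
As a rough illustration of the "best" versus "rest" split described above, here is a minimal sketch that uses plain binary Pareto domination over goal vectors to be minimized. This is an assumption made for illustration: the abstract does not specify which domination predicate DoLesS uses, and the names `dominates` and `split_best_rest` are hypothetical.

```python
from typing import List, Sequence, Tuple

Goals = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every goal and strictly better on at
    least one (all goals are assumed to be minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def split_best_rest(examples: List[Goals]) -> Tuple[List[Goals], List[Goals]]:
    """Put non-dominated examples in 'best' and all others in 'rest'."""
    best: List[Goals] = []
    rest: List[Goals] = []
    for i, candidate in enumerate(examples):
        dominated = any(
            dominates(other, candidate)
            for j, other in enumerate(examples)
            if j != i
        )
        (rest if dominated else best).append(candidate)
    return best, rest


if __name__ == "__main__":
    # Each tuple is a hypothetical (cost, runtime) pair for one candidate test suite.
    best, rest = split_best_rest([(1, 5), (2, 2), (3, 3), (5, 1), (4, 4)])
    print("best:", best)
    print("rest:", rest)
```

The pairwise comparison is quadratic in the number of examples, which is acceptable only because the domination step is said to operate on a small set of representative examples.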

[5]  arXiv:2112.01635 [pdf]
Title: A Grounded Theory Based Approach to Characterize Software Attack Surfaces
Comments: This paper has been accepted in the IEEE/ACM International Conference on Software Engineering (ICSE 2022) and is going to be published. Please feel free to cite it
Subjects: Software Engineering (cs.SE)

The notion of Attack Surface refers to the critical points on the boundary of a software system which are accessible from outside or contain valuable content for attackers. The ability to identify the attack surface components of a software system plays a significant role in the effectiveness of vulnerability analysis approaches. Most prior works focus on vulnerability techniques that use an approximation of attack surfaces, and there have not been many attempts to create a comprehensive list of attack surface components. Although a limited number of studies have focused on attack surface analysis, they defined attack surface components based on project-specific hypotheses to evaluate the security risk of specific types of software applications. In this study, we leverage a qualitative analysis approach to empirically identify an extensive list of attack surface components. To this end, we conduct a Grounded Theory (GT) analysis on 1444 previously published vulnerability reports and weaknesses with a team of three software developers and security experts. We extract vulnerability information from two publicly available repositories: 1) Common Vulnerabilities and Exposures, and 2) Common Weakness Enumeration. We ask three key questions: where the attacks come from, what they target, and how they emerge. To help answer these questions, we define three core categories for attack surface components: Entry points, Targets, and Mechanisms. We extract attack surface concepts related to each category from the collected vulnerability information using the GT analysis and provide a comprehensive categorization that represents attack surface components of software systems from various perspectives. The comparison of the proposed attack surface model with the literature shows that, in the best case, previous works cover only 50% of the attack surface components at the network level and only 6.7% of the components at the code level.

[6]  arXiv:2112.01644 [pdf]
Title: Systematically reviewing the layered architectural pattern principles and their use to reconstruct software architectures
Comments: 30 pages
Subjects: Software Engineering (cs.SE)

Architectural reconstruction is a reverse engineering activity aiming at recovering the missing decisions on a system. It can help identify the components within a legacy software application according to the application's architectural pattern. It is useful for identifying architectural technical debt. We are interested in identifying layers within a layered application, since the layered pattern is one of the most used patterns to structure large systems. Earlier component reconstruction work focusing on that pattern relied on generic component identification criteria, such as cohesion and coupling. Recent work has identified architectural-pattern-specific criteria to identify components within that pattern. However, the architectural-pattern-specific criteria that the layered pattern embodies are loosely defined. In this paper, we present a first systematic literature review (SLR) aiming at inventorying such criteria for layers within legacy applications and grouping them under four principles that embody the fundamental design principles underlying the architectural pattern. We identify six such criteria in the form of design rules. We also perform a second systematic literature review to synthesize the literature on software architecture reconstruction in the light of these criteria. We report those principles, the rules they encompass, their representation, and their usage in software architecture reconstruction.

[7]  arXiv:2112.01771 [pdf, other]
Title: Characterizing Performance Bugs in Deep Learning Systems
Subjects: Software Engineering (cs.SE); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)

Deep learning (DL) has been increasingly applied to a variety of domains. The programming paradigm shift from traditional systems to DL systems poses unique challenges in engineering DL systems. Performance is one of the challenges, and performance bugs (PBs) in DL systems can cause severe consequences such as excessive resource consumption and financial loss. While bugs in DL systems have been extensively investigated, PBs in DL systems have hardly been explored. To bridge this gap, we present the first comprehensive study to characterize symptoms, root causes, and introducing and exposing stages of PBs in DL systems developed in TensorFlow and Keras, with a total of 238 PBs collected from 225 StackOverflow posts. Our findings shed light on the implications for developing high-performance DL systems, and for detecting and localizing PBs in DL systems. We also build the first benchmark of 56 PBs in DL systems, and assess the capability of existing approaches in tackling them. Moreover, we develop a static checker, DeepPerf, to detect three types of PBs, and identify 488 new PBs in 130 GitHub projects. Of these, 62 and 18 have been confirmed and fixed by developers, respectively.

[8]  arXiv:2112.02043 [pdf, ps, other]
Title: Multilingual training for Software Engineering
Comments: Accepted at International Conference on Software Engineering (ICSE-2022)
Subjects: Software Engineering (cs.SE); Machine Learning (cs.LG)

Well-trained machine-learning models, which leverage large amounts of open-source software data, have now become an interesting approach to automating many software engineering tasks. Several SE tasks have been subject to this approach, with performance gradually improving over the past several years with better models and training methods. More, and more diverse, clean, labeled data is better for training; but constructing good-quality datasets is time-consuming and challenging. Ways of augmenting the volume and diversity of clean, labeled data generally have wide applicability. For some languages (e.g., Ruby) labeled data is less abundant; in others (e.g., JavaScript) the available data may be more focused on some application domains, and thus less diverse. As a way around such data bottlenecks, we present evidence suggesting that human-written code in different languages (which performs the same function) is rather similar, and particularly preserving of identifier naming patterns; we further present evidence suggesting that identifiers are a very important element of training data for software engineering tasks. We leverage this rather fortuitous phenomenon to find evidence that available multilingual training data (across different languages) can be used to amplify performance. We study this for 3 different tasks: code summarization, code retrieval, and function naming. We note that this data-augmenting approach is broadly compatible with different tasks, languages, and machine-learning models.

Cross-lists for Mon, 6 Dec 21

[9]  arXiv:2112.01796 (cross-list from cs.LG) [pdf, other]
Title: The UniNAS framework: combining modules in arbitrarily complex configurations with argument trees
Comments: a laxly written technical presentation of UniNAS and Argument Trees, the code is publicly available
Subjects: Machine Learning (cs.LG); Software Engineering (cs.SE)

Designing code to be simplistic yet to offer choice is a tightrope walk. Additional modules such as optimizers and data sets make a framework useful to a broader audience, but the added complexity quickly becomes a problem. Framework parameters may apply only to some modules but not others, be mutually exclusive or depend on each other, often in unclear ways. Even so, many frameworks are limited to a few specific use cases. This paper presents the underlying concept of UniNAS, a framework designed to incorporate a variety of Neural Architecture Search approaches. Since they differ in the number of optimizers and networks, hyper-parameter optimization, network designs, candidate operations, and more, a traditional approach cannot solve the task. Instead, every module defines its own hyper-parameters and a local tree structure of module requirements. A configuration file specifies which modules are used, their used parameters, and which other modules they use in turn. This concept of argument trees enables combining and reusing modules in complex configurations while avoiding many problems mentioned above. Argument trees can also be configured from a graphical user interface so that designing and changing experiments becomes possible without writing a single line of code. UniNAS is publicly available at https://github.com/cogsys-tuebingen/uninas

[10]  arXiv:2112.01821 (cross-list from cs.SD) [pdf, other]
Title: Blackbox Untargeted Adversarial Testing of Automatic Speech Recognition Systems
Comments: 10 pages, 6 figures and 7 tables
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Software Engineering (cs.SE); Audio and Speech Processing (eess.AS)

Automatic speech recognition (ASR) systems are prevalent, particularly in applications for voice navigation and voice control of domestic appliances. The computational cores of ASRs are deep neural networks (DNNs), which have been shown to be susceptible to adversarial perturbations that are easily misused by attackers to generate malicious outputs. To help test the correctness of ASRs, we propose techniques that automatically generate blackbox (agnostic to the DNN), untargeted adversarial attacks that are portable across ASRs. Much of the existing work on adversarial ASR testing focuses on targeted attacks, i.e., generating audio samples given an output text. Targeted techniques are not portable, since they are customised to the structure of DNNs (whitebox) within a specific ASR. In contrast, our method attacks the signal processing stage of the ASR pipeline that is shared across most ASRs. Additionally, we ensure the generated adversarial audio samples have no human-audible difference by manipulating the acoustic signal using a psychoacoustic model that maintains the signal below the thresholds of human perception. We evaluate the portability and effectiveness of our techniques using three popular ASRs and three input audio datasets, using three metrics: WER of the output text, Similarity to the original audio, and attack Success Rate on different ASRs. We found our testing techniques were portable across ASRs, with the adversarial audio samples producing high Success Rates, WERs and Similarities to the original audio.
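
For readers unfamiliar with the WER metric mentioned above, the following is a minimal sketch of word error rate under its standard definition: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the ASR output, divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are assumptions for illustration; the abstract does not describe the authors' exact implementation or text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # delete a reference word
                curr[j - 1] + 1,                  # insert a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitute
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("on" -> "off") and one deletion ("kitchen"): 2 / 5.
    print(word_error_rate("turn on the kitchen light", "turn off the light"))
```

In an untargeted attack setting, a higher WER on the perturbed audio indicates that the transcript has drifted further from the original text.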

[11]  arXiv:2112.01955 (cross-list from cs.LG) [pdf, other]
Title: You Can't See the Forest for Its Trees: Assessing Deep Neural Network Testing via NeuraL Coverage
Subjects: Machine Learning (cs.LG); Software Engineering (cs.SE)

This paper summarizes eight design requirements for DNN testing criteria, taking into account distribution properties and practical concerns. We then propose a new criterion, NLC, that satisfies all of these design requirements. NLC treats a single DNN layer as the basic computational unit (rather than a single neuron) and captures four critical features of neuron output distributions. Thus, NLC is denoted as NeuraL Coverage, which more accurately describes how neural networks comprehend inputs via approximated distributions rather than neurons. We demonstrate that NLC is significantly correlated with the diversity of a test suite across a number of tasks (classification and generation) and data formats (image and text). Its capacity to discover DNN prediction errors is promising. Test input mutation guided by NLC results in a greater quality and diversity of exposed erroneous behaviors.

[12]  arXiv:2112.01956 (cross-list from cs.LG) [pdf, other]
Title: Enhancing Deep Neural Networks Testing by Traversing Data Manifold
Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR); Software Engineering (cs.SE)

We develop DEEPTRAVERSAL, a feedback-driven framework to test DNNs. DEEPTRAVERSAL first launches an offline phase to map media data of various forms to manifolds. Then, in its online testing phase, DEEPTRAVERSAL traverses the prepared manifold space to maximize DNN coverage criteria and trigger prediction errors. In our evaluation, DNNs executing various tasks (e.g., classification, self-driving, machine translation) and media data of different types (image, audio, text) were used. DEEPTRAVERSAL exhibits better performance than prior methods with respect to popular DNN coverage criteria, and it can discover a larger number and higher quality of error-triggering inputs. The tested DNN models, after being repaired with findings of DEEPTRAVERSAL, achieve better accuracy.

Replacements for Mon, 6 Dec 21

[13]  arXiv:2010.15738 (replaced) [pdf]
Title: The Agile Coach Role: Coaching for Agile Performance Impact
Comments: 10 pages
Journal-ref: In Proceedings of the 54th Hawaii International Conference on System Sciences, 2021
Subjects: Software Engineering (cs.SE)
[14]  arXiv:2103.12706 (replaced) [pdf]
Title: An Empirical Investigation of Pull Requests in Partially Distributed BizDevOps Teams
Journal-ref: 2021 IEEE/ACM Joint 15th International Conference on Software and System Processes (ICSSP) and 16th ACM/IEEE International Conference on Global Software Engineering (ICGSE) (ICGSE-ICSSP)
Subjects: Software Engineering (cs.SE)
[15]  arXiv:2105.05460 (replaced) [pdf, other]
Title: A Systematic Literature Review on Blockchain Governance
Comments: Submitted to Journal of Systems and Software
Subjects: Software Engineering (cs.SE)
[16]  arXiv:2108.13064 (replaced) [pdf, other]
Title: Trust Enhancement Issues in Program Repair
Comments: To appear in 44th International Conference on Software Engineering (ICSE) 2022. The first two authors contributed equally and are joint "first authors"
Subjects: Software Engineering (cs.SE)
[17]  arXiv:2109.02312 (replaced) [pdf, other]
Title: Linear-time Temporal Logic guided Greybox Fuzzing
Comments: To appear in International Conference on Software Engineering (ICSE) 2022
Subjects: Software Engineering (cs.SE)
[18]  arXiv:2109.14326 (replaced) [pdf, other]
Title: DeepAnalyze: Learning to Localize Crashes at Scale
Subjects: Software Engineering (cs.SE); Artificial Intelligence (cs.AI); Distributed, Parallel, and Cluster Computing (cs.DC); Machine Learning (cs.LG)
[19]  arXiv:2110.10234 (replaced) [pdf, other]
Title: Collaboration Challenges in Building ML-Enabled Systems: Communication, Documentation, Engineering, and Process
Comments: 22 pages, 10 figures, 5 tables
Subjects: Software Engineering (cs.SE); Machine Learning (cs.LG)