Performance

Authors and titles for cs.PF in Nov 2019

[ total of 40 entries: 1-39 | 40 ]
[1]  arXiv:1911.00119 [pdf, other]
Title: ALERT: Accurate Learning for Energy and Timeliness
Subjects: Performance (cs.PF); Machine Learning (cs.LG)
[2]  arXiv:1911.00329 [pdf, other]
Title: A Data-Assisted Reliability Model for Carrier-Assisted Cold Data Storage Systems
Comments: 14 pages, 8 figures, accepted to Elsevier Reliability and Safety Journal, 2019 (unedited)
Subjects: Performance (cs.PF)
[3]  arXiv:1911.02430 [pdf, other]
Title: Graph-based Approach for Buffer-aware Timing Analysis of Heterogeneous Wormhole NoCs under Bursty Traffic
Comments: 21 pages, 22 figures, 5 tables
Subjects: Performance (cs.PF)
[4]  arXiv:1911.02987 [pdf, other]
Title: The Pitfall of Evaluating Performance on Emerging AI Accelerators
Subjects: Performance (cs.PF); Machine Learning (cs.LG)
[5]  arXiv:1911.03282 [pdf, other]
Title: nanoBench: A Low-Overhead Tool for Running Microbenchmarks on x86 Systems
Subjects: Performance (cs.PF)
[6]  arXiv:1911.07449 [pdf, other]
Title: Understanding Open Source Serverless Platforms: Design Considerations and Performance
Journal-ref: Proceedings of the 5th International Workshop on Serverless Computing, Pages 37-42, 2019
Subjects: Performance (cs.PF); Distributed, Parallel, and Cluster Computing (cs.DC)
[7]  arXiv:1911.11642 [pdf, other]
Title: System Performance with varying L1 Instruction and Data Cache Sizes: An Empirical Analysis
Comments: 5 Figures and 3 Tables
Subjects: Performance (cs.PF); Hardware Architecture (cs.AR)
[8]  arXiv:1911.11852 [pdf, other]
Title: Rule Designs for Optimal Online Game Matchmaking
Subjects: Performance (cs.PF); Computational Complexity (cs.CC); Systems and Control (eess.SY)
[9]  arXiv:1911.12877 [pdf, other]
Title: GraphZero: Breaking Symmetry for Efficient Graph Mining
Subjects: Performance (cs.PF); Databases (cs.DB)
[10]  arXiv:1911.13027 [pdf, other]
Title: Using performance analysis tools for parallel-in-time integrators -- Does my time-parallel code do what I think it does?
Comments: 31 pages, 15 figures, CVS Proceedings of the 9th PinT Workshop
Subjects: Performance (cs.PF); Mathematical Software (cs.MS)
[11]  arXiv:1911.13074 [pdf, other]
Title: Efficient method for parallel computation of geodesic transformation on CPU
Comments: © 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works
Subjects: Performance (cs.PF); Distributed, Parallel, and Cluster Computing (cs.DC)
[12]  arXiv:1911.01258 (cross-list from cs.LG) [pdf, other]
Title: SHARP: An Adaptable, Energy-Efficient Accelerator for Recurrent Neural Network
Subjects: Machine Learning (cs.LG); Hardware Architecture (cs.AR); Neural and Evolutionary Computing (cs.NE); Performance (cs.PF)
[13]  arXiv:1911.02373 (cross-list from cs.DC) [pdf, other]
Title: KLARAPTOR: A Tool for Dynamically Finding Optimal Kernel Launch Parameters Targeting CUDA Programs
Comments: 10 pages. arXiv admin note: text overlap with arXiv:1906.00142
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Performance (cs.PF)
[14]  arXiv:1911.02549 (cross-list from cs.LG) [pdf, other]
[15]  arXiv:1911.03011 (cross-list from cs.LG) [pdf, other]
Title: Adaptive Kernel Value Caching for SVM Training
Comments: Accepted by IEEE Transactions on Neural Networks and Learning Systems (TNNLS)
Subjects: Machine Learning (cs.LG); Performance (cs.PF); Machine Learning (stat.ML)
[16]  arXiv:1911.03456 (cross-list from cs.DC) [pdf, other]
Title: Parallel Data Distribution Management on Shared-Memory Multiprocessors
Comments: arXiv admin note: text overlap with arXiv:1703.06680
Journal-ref: ACM Transactions on Modeling and Computer Simulation (TOMACS), Vol. 30, No. 1, Article 5. ACM, February 2020. ISSN: 1049-3301
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Data Structures and Algorithms (cs.DS); Performance (cs.PF)
[17]  arXiv:1911.04200 (cross-list from cs.CE) [pdf, other]
Title: Communication-Efficient Jaccard Similarity for High-Performance Distributed Genome Comparisons
Journal-ref: Proceedings of the 34th IEEE International Parallel and Distributed Processing Symposium (IPDPS'20), 2020
Subjects: Computational Engineering, Finance, and Science (cs.CE); Distributed, Parallel, and Cluster Computing (cs.DC); Performance (cs.PF); Genomics (q-bio.GN)
[18]  arXiv:1911.04610 (cross-list from cs.LG) [pdf, other]
Title: XPipe: Efficient Pipeline Model Parallelism for Multi-GPU DNN Training
Comments: 9 pages
Subjects: Machine Learning (cs.LG); Performance (cs.PF)
[19]  arXiv:1911.04650 (cross-list from cs.DC) [pdf, other]
Title: Throughput Prediction of Asynchronous SGD in TensorFlow
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Machine Learning (cs.LG); Performance (cs.PF)
[20]  arXiv:1911.04946 (cross-list from cs.LG) [pdf, other]
Title: Optimizing Deep Learning Inference on Embedded Systems Through Adaptive Model Selection
Comments: Accepted to be published at ACM TECS. arXiv admin note: substantial text overlap with arXiv:1805.04252
Subjects: Machine Learning (cs.LG); Distributed, Parallel, and Cluster Computing (cs.DC); Performance (cs.PF)
[21]  arXiv:1911.05146 (cross-list from cs.DC) [pdf, other]
Title: HyPar-Flow: Exploiting MPI and Keras for Scalable Hybrid-Parallel DNN Training using TensorFlow
Comments: 18 pages, 10 figures, Accepted, to be presented at ISC '20
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Performance (cs.PF)
[22]  arXiv:1911.05181 (cross-list from cs.LG) [pdf, other]
Title: 92c/MFlops/s, Ultra-Large-Scale Neural-Network Training on a PIII Cluster
Comments: SC '00: Proceedings of the 2000 ACM/IEEE Conference on Supercomputing
Journal-ref: ACM/IEEE SC 2000 Conference (SC00)
Subjects: Machine Learning (cs.LG); Distributed, Parallel, and Cluster Computing (cs.DC); Performance (cs.PF); Machine Learning (stat.ML)
[23]  arXiv:1911.06714 (cross-list from cs.DC) [pdf, other]
Title: Two-level Dynamic Load Balancing for High Performance Scientific Applications
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Instrumentation and Methods for Astrophysics (astro-ph.IM); Performance (cs.PF); Computational Physics (physics.comp-ph)
[24]  arXiv:1911.06922 (cross-list from cs.LG) [pdf, other]
Title: Benanza: Automatic $μ$Benchmark Generation to Compute "Lower-bound" Latency and Inform Optimizations of Deep Learning Models on GPUs
Subjects: Machine Learning (cs.LG); Distributed, Parallel, and Cluster Computing (cs.DC); Performance (cs.PF); Machine Learning (stat.ML)
[25]  arXiv:1911.07617 (cross-list from cs.NI) [pdf, other]
Title: Design and Implementation of Secret Key Agreement for Platoon-based Vehicular Cyber-Physical Systems
Comments: To be published in ACM Transactions on Cyber-Physical Systems (TCPS)
Subjects: Networking and Internet Architecture (cs.NI); Performance (cs.PF)
[26]  arXiv:1911.07738 (cross-list from cs.NI) [pdf, other]
Title: Profile-based Resource Allocation for Virtualized Network Functions
Comments: accepted in IEEE TNSM journal
Journal-ref: IEEE Transactions on Network and Service Management, 2019, Early Access
Subjects: Networking and Internet Architecture (cs.NI); Machine Learning (cs.LG); Performance (cs.PF); Machine Learning (stat.ML)
[27]  arXiv:1911.07967 (cross-list from cs.LG) [pdf, other]
Title: DLBricks: Composable Benchmark Generation to Reduce Deep Learning Benchmarking Effort on CPUs (Extended)
Subjects: Machine Learning (cs.LG); Performance (cs.PF); Software Engineering (cs.SE); Machine Learning (stat.ML)
[28]  arXiv:1911.08031 (cross-list from cs.DC) [pdf, other]
Title: The Design and Implementation of a Scalable DL Benchmarking Platform
Journal-ref: 2020 IEEE 13th International Conference on Cloud Computing (CLOUD), 414-425
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Machine Learning (cs.LG); Performance (cs.PF); Machine Learning (stat.ML)
[29]  arXiv:1911.08779 (cross-list from cs.DC) [pdf, other]
Title: Characterizing Scalability of Sparse Matrix-Vector Multiplications on Phytium FT-2000+ Many-cores
Comments: Accepted to be published at IJPP
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Computation and Language (cs.CL); Performance (cs.PF)
[30]  arXiv:1911.09034 (cross-list from cs.NI) [src]
Title: Rate Maximization in Vehicular uRLLC with Optical Camera Communications
Comments: This paper is updated and has been fully modified starting from the system model and solution schemes
Subjects: Networking and Internet Architecture (cs.NI); Performance (cs.PF); Signal Processing (eess.SP)
[31]  arXiv:1911.09512 (cross-list from cs.LG) [pdf, other]
Title: A Comparative Analysis of Forecasting Financial Time Series Using ARIMA, LSTM, and BiLSTM
Comments: 8 pages, 3 figures, 3 tables, 1 listing, IEEE BigData 2019
Subjects: Machine Learning (cs.LG); Computational Engineering, Finance, and Science (cs.CE); Performance (cs.PF); Machine Learning (stat.ML)
[32]  arXiv:1911.09925 (cross-list from cs.DC) [pdf, other]
Title: Gemmini: Enabling Systematic Deep-Learning Architecture Evaluation via Full-Stack Integration
Comments: To appear at the 58th IEEE/ACM Design Automation Conference (DAC), December 2021, San Francisco, CA, USA
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Hardware Architecture (cs.AR); Machine Learning (cs.LG); Performance (cs.PF)
[33]  arXiv:1911.10735 (cross-list from cs.LG) [pdf, other]
Title: CAMUS: A Framework to Build Formal Specifications for Deep Perception Systems Using Simulators
Authors: Julien Girard-Satabin (TAU, LIST), Guillaume Charpiat (LRI, TAU), Zakaria Chihani (LIST), Marc Schoenauer (TAU)
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE); Performance (cs.PF)
[34]  arXiv:1911.11293 (cross-list from cs.CV) [pdf, other]
Title: Efficient Saliency Maps for Explainable AI
Comments: In submission to ECCV 2020
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Performance (cs.PF); Robotics (cs.RO)
[35]  arXiv:1911.11592 (cross-list from cs.CR) [pdf, ps, other]
Title: Transaction Confirmation Time Prediction in Ethereum Blockchain Using Machine Learning
Subjects: Cryptography and Security (cs.CR); Machine Learning (cs.LG); Performance (cs.PF); Machine Learning (stat.ML)
[36]  arXiv:1911.12162 (cross-list from cs.DC) [pdf, other]
Title: Dynamically Provisioning Cray DataWarp Storage
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Performance (cs.PF)
[37]  arXiv:1911.12898 (cross-list from cs.IT) [pdf, other]
Title: A PHY Layer Security Analysis of Uplink Cooperative Jamming-Based Underlay CRNs with Multi-Eavesdroppers
Comments: 34 pages, 7 figures
Subjects: Information Theory (cs.IT); Performance (cs.PF)
[38]  arXiv:1911.00527 (cross-list from eess.AS) [pdf, other]
Title: Memory Requirement Reduction of Deep Neural Networks Using Low-bit Quantization of Parameters
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Performance (cs.PF); Sound (cs.SD)
[39]  arXiv:1911.03062 (cross-list from physics.comp-ph) [pdf, other]
Title: Digital Blood in Massively Parallel CPU/GPU Systems for the Study of Platelet Transport
Journal-ref: Royal Society - Interface Focus (2020)
Subjects: Computational Physics (physics.comp-ph); Distributed, Parallel, and Cluster Computing (cs.DC); Performance (cs.PF)
