Computer Science > Computer Vision and Pattern Recognition
Title: ScreenSeg: On-Device Screenshot Layout Analysis
(Submitted on 16 Apr 2021 (v1), last revised 21 Apr 2021 (this version, v2))
Abstract: We propose a novel end-to-end solution that performs a hierarchical layout analysis of screenshots and document images on resource-constrained devices such as mobile phones. Our approach segments entities such as Grid, Image, Text and Icon blocks occurring in a screenshot. We provide an option for smart editing by auto-highlighting these entities for saving or sharing. Further, this multi-level layout analysis of screenshots has many use cases, including content extraction, keyword-based image search, and style transfer. We have addressed the limitations of known baseline approaches, supported a wide variety of semantically complex screenshots, and developed an approach that is highly optimized for on-device deployment. In addition, we present a novel weighted NMS technique for filtering object proposals. We achieve an average precision of about 0.95 with a latency of around 200 ms on a Samsung Galaxy S10 device for a screenshot of 1080p resolution. The solution pipeline is already commercialized in Samsung device applications, namely Samsung Capture, Smart Crop, My Filter in the Camera application, and Bixby Touch.
Submission history
From: Manoj Goyal
[v1] Fri, 16 Apr 2021 11:59:13 GMT (392kb,D)
[v2] Wed, 21 Apr 2021 12:28:00 GMT (392kb,D)