# Mathematics > Probability

# Matching strings in encoded sequences

(Submitted on 22 Mar 2019 (v1), last revised 10 Dec 2019 (this version, v2))

Abstract: We investigate the longest common substring problem for encoded sequences and its asymptotic behaviour. The main result is a strong law of large numbers for a re-scaled version of this quantity, which presents an explicit relation with the Rényi entropy of the source. We apply this result to the zero-inflated contamination model and the stochastic scrabble. In the case of dynamical systems, this problem is equivalent to the shortest distance between two observed orbits and its limiting relationship with the correlation dimension of the pushforward measure. An extension to the shortest distance between orbits for random dynamical systems is also provided.
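As a rough illustration of the two quantities named in the abstract, the sketch below computes the length of the longest common substring of two sequences by dynamic programming, and the Rényi entropy of order 2 of a discrete distribution. All function names are hypothetical. The symbol-wise `encode` helper is only an assumption about what "encoded" means here, and the choice of order 2 is likewise an assumption, since the abstract states neither the encoding nor the exact scaling.

```python
import math


def encode(seq, f):
    """Apply a symbol-wise encoding f to a sequence (assumed form of 'encoding')."""
    return [f(s) for s in seq]


def longest_common_substring_length(x, y):
    """Length of the longest contiguous block that occurs in both x and y."""
    prev = [0] * (len(y) + 1)  # match lengths ending at the previous symbol of x
    best = 0
    for a in x:
        curr = [0] * (len(y) + 1)
        for j, b in enumerate(y, start=1):
            if a == b:
                # Extend the common block that ended just before both positions.
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def renyi_entropy_order2(probs):
    """Rényi entropy of order 2, in nats: -log(sum of squared probabilities)."""
    return -math.log(sum(p * p for p in probs))
```

For example, `longest_common_substring_length("abcdef", "zcdez")` finds the block `cde`. Encoding both sequences with a non-injective map, such as `encode(x, lambda s: s % 2)` on integer symbols, can only merge symbols, so it never shortens the longest match.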

## Submission history

From: Rodrigo Lambert

**[v1]** Fri, 22 Mar 2019 17:46:15 GMT (32kb)

**[v2]** Tue, 10 Dec 2019 17:58:58 GMT (24kb)
