Title: Compression and the origins of Zipf's law for word frequencies

Abstract: Here we sketch a new derivation of Zipf's law for word frequencies based on optimal coding. The structure of the derivation is reminiscent of Mandelbrot's random typing model, but it has multiple advantages over random typing: (1) it starts from realistic cognitive pressures, (2) it does not require fine-tuning of parameters, and (3) it sheds light on the origins of other statistical laws of language and thus can lead to a compact theory of linguistic laws. Our findings suggest that the recurrence of Zipf's law in human languages could originate from pressure for easy and fast communication.
Comments: arguments have been improved; in press in Complexity (Wiley)
Subjects: Computation and Language (cs.CL); Data Analysis, Statistics and Probability (physics.data-an); Physics and Society (physics.soc-ph); Neurons and Cognition (q-bio.NC)
Journal reference: Complexity 21, 409-411 (2016)
DOI: 10.1002/cplx.21820
Cite as: arXiv:1605.01326 [cs.CL]
  (or arXiv:1605.01326v2 [cs.CL] for this version)

Submission history

From: Ramon Ferrer i Cancho
[v1] Wed, 4 May 2016 16:00:59 GMT (5kb)
[v2] Wed, 20 Jul 2016 15:14:10 GMT (6kb)