{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,7,30]],"date-time":"2025-07-30T16:52:43Z","timestamp":1753894363829,"version":"3.41.2"},"reference-count":0,"publisher":"Centre pour la Communication Scientifique Directe (CCSD)","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"abstract":"<jats:p xml:lang=\"en\">Some strings -the texts- are assumed to be randomly generated, according to a probability model that is either a Bernoulli model or a Markov model. A rare event is the over or under-representation of a word or a set of words. The aim of this paper is twofold. First, a single word is given. One studies the tail distribution of the number of its occurrences. Sharp large deviation estimates are derived. Second, one assumes that a given word is overrepresented. The distribution of a second word is studied; formulae for the expectation and the variance are derived. In both cases, the formulae are accurate and actually computable. These results have applications in computational biology, where a genome is viewed as a text.<\/jats:p>","DOI":"10.46298\/dmtcs.310","type":"journal-article","created":{"date-parts":[[2021,8,23]],"date-time":"2021-08-23T21:23:38Z","timestamp":1629753818000},"source":"Crossref","is-referenced-by-count":0,"title":["Rare Events and Conditional Events on Random Strings"],"prefix":"10.46298","volume":"Vol. 6 no. 2","author":[{"given":"Mireille","family":"R\u00e9gnier","sequence":"first","affiliation":[{"name":"Algorithms"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4484-4996","authenticated-orcid":false,"given":"Alain","family":"Denise","sequence":"additional","affiliation":[{"name":"Laboratoire de Recherche en Informatique"}]}],"member":"25203","published-online":{"date-parts":[[2004,1,1]]},"container-title":["Discrete Mathematics & Theoretical Computer Science"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dmtcs.episciences.org\/310\/pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dmtcs.episciences.org\/310\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,20]],"date-time":"2023-06-20T19:36:06Z","timestamp":1687289766000},"score":1,"resource":{"primary":{"URL":"https:\/\/dmtcs.episciences.org\/310"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2004,1,1]]},"references-count":0,"URL":"https:\/\/doi.org\/10.46298\/dmtcs.310","relation":{"is-same-as":[{"id-type":"uri","id":"https:\/\/hal.science\/hal-00959004v1","asserted-by":"subject"}]},"ISSN":["1365-8050"],"issn-type":[{"type":"electronic","value":"1365-8050"}],"subject":[],"published":{"date-parts":[[2004,1,1]]},"article-number":"310"}}