Saturday, 3 March 2012

Pilindrome

[Warning: this blog post contains mathematical notation.  Continue at your own risk]

[edit: It's by no means clear that the A(i) are mutually exclusive, which renders this proof dodgy - though you can consider the events B(i), that the sequence up to i is a palindrome but contains no shorter palindromes, which are mutually exclusive, and have (I believe) the same probability, yielding the same result]

I recently rediscovered the truly excellent Abstruse Goose, a webcomic in the same vein as XKCD - reflections on a life more nerdy.  Their (currently) latest comic piqued my interest - of course, π isn't a palindrome, since it's endless, but there are related questions that are intriguing.

I was particularly interested in determining whether π could be definitively stated to contain a palindromic elision - that is, some point at which, if you chopped it off there, the first digits of π (without decimal point) would read 314159265...562951413.  A quick google turned up the following question, involving a reply from someone who clearly knows more about the relevant areas of maths than I do - I never took measure theory, and number theory and probability weren't my strong points.  Still, I thought there would still be something interesting that I could tinker about with.

Since the question of π's normality[1] is apparently both important to the question, and an unsettled matter, I thought I'd transfer the question to a random sequence, since they're a bit more well-behaved.  Note that below I'm investigating the question of whether truncating this sequence at some point will generate a palindrome, rather than the somewhat-looser question of whether the sequence will contain a palindrome at some point[2].

However, the "answer" that I got out doesn't seem intuitively right (for one thing, I was expecting a series that converged to 1), so I'm by no means staking my mathematical reputation on this - if anyone with a better grasp on the mathematics spots an error, please, by all means point it out!

Since this blog is meant to be accessible to non-nerds, I've done my best to provide a line-by-line commentary below so those without a mathematical education can follow along.

(Note that I'm not counting single digits as palindromes here, because that's a) silly and b) boring)

1. Let xi be an infinite random sequence, xi ∈ [9] ∀i
2. Let a finite sequence y1, y2, ... yn-1, yn, be called palindromic if yi = yn-i ∀i

3. We seek the probability that ∃n s.t. xi≤n is palindromic
4. Let A(n) be the event that x1, x2, ..., xn-1, xn is palindromic.  Then we seek ℙ[j=2U A(j)]
5. ℙ[A(j)] = 10 -floor(j/2)
6. 
ℙ[j=2U A(j)] = j=2Σ10 -floor(j/2) = 2 j=1Σ10 -j
∴ Probability sought is 0.2̇


Apologies for the horrible horrible formatting - I haven't yet found a way to write LaTeX or similar on blogger, if anyone knows of one, please let me know!

Right, walkthrough time:

1. The subscript "i" is the index of the sequence.  For instance, x1 would be the 1st number in the sequence, x52 the 52nd, and so on.  [n] is a shorthand notation for "the set of numbers less than or equal to that number" - so [9] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}.  The symbol ∀ means "for all" - so this statement is saying "whichever number you choose for i, xi will be a whole number between 0 and 9 inclusive"
2. This is defining what we mean by "palindromic" in terms of mathematical notation.
3. That symbol ∃ means "there exists" - i.e., we are looking for the probability that there is a number n such that, if we chop off the sequence at the nth number, it will form a palindrome.
4. The double-struck P (this font is sometimes called "Blackboard Bold") stands for "probability".  The "U" symbol stands for "Union" (essentially, "combining" various events), and the sub- and super-script means that we are taking the union from j=2 to infinity.  That is, we are looking for the probability that any truncation of the sequence (of length two or more) will be a palindrome.
5. This is stating the probability of the truncation at the nth digit being a palindrome.  "floor" is a function that gives the smallest whole number less than or equal to the argument - so floor(2.5) is 2, but floor(3) is 3.  The explanation for this probability is as follows; in order for a sequence to be a palindrome, we need the second half of the numbers (rounded down) to match the first half.  The probability of any one number matching its partner is 1/10 (as there are 10 digits to choose from), so the probability of all of them matching is (1/10)floor(j/2) - and since (1/10) can be represented as 10-1, this gives the probability given above.
6. Putting a bar over a symbol or expression in maths is a common way of expressing "not/negation".  In this case, since A(n) is the probability of the n-truncated sequence being a palindrome, A(n) is the probability of it not being a palindrome.  If p is the probability of an event happening, then 1-p is the probability of that event not happening.
7. The spikey-E shape on the right-hand side here is called "Sigma".  It represents repeated addition - in this case, "add up all values of 10 -floor(j/2) from j=2 upwards".  We can do this because the probabilites of A(n) are mutually exclusive - that is, the probability of either one of them occurring is the addition of both of their probabilities.  Since floor(2n/2) = floor(2n+1/2), this is equal to twice the addition with the floor and division-by-two removed.
The dot over the 2 means "repeating" - that is, the probability is 0.22222...

So, in summary, the maths above, in which I can't find a problem, gives us a 0.2222... probability of any given random sequence of numbers having a palindromic.  In hindsight, and having benefitted from the added insight afforded by having to explain my working, I'm less uncomfortable with this answer than I was before - an answer of 0 would have been non-sensical, as clearly some random sequences exist with a palindromic truncation, but so would 1 as there "do" exist infinite sequences not containing palindromes.  In fact, the construction of palindromes is quite artificial, so you would expect them to be reasonably rare.  0.222... therefore seems a sensible value for this probability.

It's amazing how quickly I've lost my fluency with maths, and especially probability - I could still remember all the pertinent details and get it done, but with none of the fluidity that I used to have!  Definitely need to keep my brain in gear somehow...the iTunes U courses that have been recommended might be a good idea!

Anyway, I hope you've enjoyed this jaunt through maths...I'm at home right now, and our dinner guest has just appeared, so I must away!

[1] A "normal" number is one in which every possible digit (from 0 to 9, in the decimal system) occurs with the same density (i.e. has the same "chance" of occurring), every pair of digits has an equal density, every triple has an equal density, every string of four digits has an equal density, and so on.  Confusingly, the vast majority of numbers that we encounter outside of a maths degree are not normal, because they are either integers ("whole numbers"), or terminating decimals (e.g. 5.324, as opposed to a non-terminating decimal like 7.3333333...).  A little thought will show that neither of these can be normal, since they only have a finite number of digits to play with, and so they could not fulfil all of the conditions - trivially, it cannot be true that an integer with n digits contains all (n-1)-length digit strings with equal density, because there is only room for two such strings.

[2] Indeed, this is certain to happen in an infinite random sequence.  Consider the fact that dd is a palindrome for any digit d.  Then we seek the probability that the sequence contains any dd.  The probability that any digit pair is any dd is 1/10 (the first digit can be chosen freely, and there is a one in ten chance that the next digit matches), so the probability they do not match is thus 9/10.  The probability that not a single pair of digits in the sequence match is (9/10)*(9/10)*(9/10)*..., which tends to 0.  So the probability that the sequence contains a matching pair (which are a palindrome) is 1.

3 comments:

  1. Hey Jack!
    This is a nice question, and not one I think I can solve at the moment but I can say something about it.

    Firstly, the non-mutually exclusive thing is a problem, and whilst your B(i) would be mutually exclusive I think the probabilities would be different
    (Pr(B(2))=1/10,
    Pr(B(3))=1/10 -1/100 since nnn not allowed).

    You do get that Pr(U A(i))=<Sum(Pr(A(i)))=2/9, so that's a start.

    Also, Sum(Pr(A(i))) < infinity tells us that only finitely many A(i) can occur (Borel-Cantelli Lemma http://en.wikipedia.org/wiki/Borel%E2%80%93Cantelli_lemma)

    So with probability 1, there will be a finite number of such palindromes. I think that's interesting in itself.

    I really like your blog, Pete (Lowth)

    ReplyDelete
  2. Ladies and Gentlemen, Pete Lowth, pretty much the definition of "someone with a better grasp of mathematics" :)

    I take some solace from the fact that I'd never heard of the Borel-Cantelli Lemma - though, I agree, the result you derive is rather nice!

    ReplyDelete