Collatz numbers

Russell Holt
June, 1996

Also called "hailstone" numbers..

These sequences are

Stated mathematically, a collatz sequence for any starting value n is

which simply means that given any number n, its "collatz" number depends on whether n is even or odd, either n/2 or 3n+1. If we call this function collatz, then for n, the next number is collatz(n), and the next one is collatz(collatz(n)), and so on.

For example, starting at 20, the sequence is

20 10 5 16 8 4 2 1 4 2 1 4 2 1 ...

because 20 is even, the next number is 20/2 = 10. Now because 10 is even, the next is 10/2 = 5. 5 is odd, so the next number is 5*3+1 = 16. etc. Simple.

Who cares?

The problem comes with these comments and questions:
  1. collatz(n) seems to always end in the following repeating sequence: 4,2,1,4,2,1,4,2,1 ... so we consider 1 to be the end, since arrival at 1 always results in this repetition. Does this happen for every integer?

  2. Given any number n, how long is the sequence? That is, how many numbers are in the sequence until we get to 1?

For example, the length of the sequence of collatz(378644) is only 28, but the length of collatz(18) is 21!

What is the correlation between the length of a sequence and its starting value, but more importantly, why does it seem to be so difficult to find in such a seemingly simple function?

In this paper I will demonstrate the self-similar structure of the collatz numbers, show the interconnectedness of all of the collatz sequences, and suggest ways to answer the main questions.

The Structure

In analzying sequences and their interconnectedness, observe the following two sequences:

  1. 13, 40, 20, 10, 5, 16, 8, 4, 2, 1
  2. 12, 6, 3, 10, 5, 16, 4, 2, 1

Notice the interesections here: both "pass through" the number 10, - they merge. The number 10 is sort of an "axis" node. When we're at any number in a sequence, one key question is "how did we get here? What are the possible ways that this number could have been produced?" Instead of generating the sequence by computing n/2 or 3n+1, we can look in the other direction, by computing 2n or (n-1)/3 - are they integers?

For example, consider the number 10: 2*10 = 20, clearly 20 is a collatz "parent" of 10 -- the even parent. How about (10-1)/3 ? This is 3, and so clearly 3 is the odd parent of 10.

What about another number from one of these sequences, say 5? Well, obviously 10 is a parent, (5*2) but what about (5-1)/3 = 4/3 -> not an integer: no.

Now how about another sequence: 160, 80, 40, 20, 10, 5, 16, ...

This fits in with the other two, but not in the same spot. It brances from 40, which we discover is also an axis node. Visually, we have so far (a few more nodes added for symmetry)

What happens if we continue this backward analysis, for every number in this tree? The big question #1, which was, does every collatz sequence end with 4,2,1,4,2,1...? is the same as asking the question: Is every integer in this tree?


These "axis" nodes give a good starting point for analysis. I will examine the structure around the axis nodes and find a pattern that they all fit. Then, The structure of the collatz sequences can be visualized as a tree structure, with the following three node patterns:

The first two, even/odd parents -> root, and even -> root have been seen before. The third one is a most interesting anomaly which occurs regularly in the pattern. See below.

This is the basic structure.


  1. Long even chains

    When a number is divisible by 3, it has no odd parents. This is because if n mod 3 = 0 (ie, n is divisible by 3), then (n+1)/3 mod 3 0 (a "potential" odd parent is not divisible by three). This number n has an even parent, 2n, which is also divisible by 3, as are all multiples of n. By induction, therefore, no parent of n has an odd parent.

  2. Symmetry I:

    Describes the relationship between any two axis nodes on the same level of the tree, which is the relationship between any two starting values whose collatz sequence is of the same length.

    Let e(n) denote the even parent of n, that is 2*n, and o(n) denote the odd parent of n, that is (n+1)/3. A "level" of the tree is the height from 1, that is, the length of a collatz sequence. For any two axis nodes a1 and a2, where:

    1. a1 is on the odd side of the tree
    2. a2 is on the even side of the tree
    3. a1 and a2 are on the same level
    4. a1 and a2 have both odd and even parents (neither a1 nor a2 is divisible by 3)
    Further, let odd(i) denote the ith odd number (the sequence 1,3,5,7,9,...) where odd(0) = odd(1) = 1 then e(a1) = o(a2) - odd(level-7).

    For example,