Two examples: factoring and GCDs
The classical computers that exist today are incredibly fast, and their speed seems to be ever increasing. For this reason, some might be inclined to believe that computers are so fast that no computational problem is beyond their reach.
This belief is false. Some computational problems are so inherently complex that, although there exist algorithms to solve them, no computer on the planet Earth today is fast enough to run these algorithms to completion on even moderately sized inputs within the lifetime of a human — or even within the lifetime of the Earth itself.
To explain further, let's introduce the integer factorization problem.
Integer factorization
Input: an integer
Output: the prime factorization of
By the prime factorization of we mean a list of the prime factors of and the powers to which they must be raised to obtain by multiplication. For example, the prime factors of are and and to obtain we must take the product of to the power and to the power
Up to the ordering of the prime factors, there is only one prime factorization for each positive integer which is a fact known as the fundamental theorem of arithmetic.
A few simple code demonstrations in Python will be helpful for further explaining integer factorization and other concepts that relate to this discussion. The following imports are needed for these demonstrations.
import math
from sympy.ntheory import factorint
The factorint
function from the SymPy
symbolic mathematics package for Python solves the integer factorization problem for whatever input we choose. For example, we can obtain the prime factorization for 12, which naturally agrees with the factorization above.
N = 12
print(factorint(N))
Output:
{2: 2, 3: 1}
Factoring small numbers like is easy, but when the number to be factored gets larger, the problem becomes more difficult.
For example, running factorint
on a significantly larger number causes a short but noticeable delay on a typical personal computer.
N = 3402823669209384634633740743176823109843098343
print(factorint(N))
Output:
{3: 2, 74519450661011221: 1, 5073729280707932631243580787: 1}
For even larger values of things become impossibly difficult, at least as far as we know. For example, the RSA Factoring Challenge, which was run by RSA Laboratories from 1991 to 2007, offered a cash prize of $100,000 to factor the following number, which has 309 decimal digits (or 1024 bits when written in binary). The prize for this number was never collected and its prime factors remain unknown.
RSA1024 = 135066410865995223349603216278805969938881475605667027524485143851526510604859533833940287150571909441798207282164471551373680419703964191743046496589274256239341020864383202110372958725762358509643110564073501508187510676594629205563685529475213500852879416377328533906109750544334999811150056977236890927563
print(RSA1024)
Output:
135066410865995223349603216278805969938881475605667027524485143851526510604859533833940287150571909441798207282164471551373680419703964191743046496589274256239341020864383202110372958725762358509643110564073501508187510676594629205563685529475213500852879416377328533906109750544334999811150056977236890927563
We need not bother running factorint
on RSA1024, it wouldn't finish within our lifetimes.
The fastest known algorithm for factoring large integers is known as the number field sieve. As an example of this algorithm's use, the RSA challenge number RSA250, which has 250 decimal digits (or 829 bits when written in binary), was factored using the number field sieve in 2020. The computation required thousands of CPU core-years, distributed across tens of thousands of machines around the world. Here we can appreciate this effort by checking the solution.
RSA250 = 2140324650240744961264423072839333563008614715144755017797754920881418023447140136643345519095804679610992851872470914587687396261921557363047454770520805119056493106687691590019759405693457452230589325976697471681738069364894699871578494975937497937
p = 64135289477071580278790190170577389084825014742943447208116859632024532344630238623598752668347708737661925585694639798853367
q = 33372027594978156556226010605355114227940760344767554666784520987023841729210037080257448673296881877565718986258036932062711
print(RSA250 == p * q)
Output:
True
The security of the RSA public-key cryptosystem is based on the computational difficulty of integer factoring, in the sense that an efficient algorithm for integer factoring would break it.
Next let's consider a related but very different problem, which is computing the greatest common divisor (or GCD) of two integers.
Greatest common divisor (GCD)
Input: nonnegative integers and at least one of which is positive
Output: the greatest common divisor of and
The greatest common divisor of two numbers is the largest integer that evenly divides both of them.
This problem is easy to solve with a computer — it has roughly the same computational cost as multiplying the two input numbers together.
The gcd
function from the Python math
module computes the greatest common divisor of numbers that are considerably larger than RSA1024 in the blink of an eye. (In fact, RSA1024 is the GCD of the two numbers in this example.)
N = 4636759690183918349682239573236686632636353319755818421393667064929987310592347460711767784882455889983961546491666129915628431549982893638464243493812487979530329460863532041588297885958272943021122033997933550246447236884738870576045537199814804920281890355275625050796526864093092006894744790739778376848205654332434378295899591539239698896074
M = 5056714874804877864225164843977749374751021379176083540426461360945653967249306494545888621353613218518084414930846655066495767441010526886803458300440345782982127522212209489410315422285463057656809702949608368597012967321172325810519806487247195259818074918082416290513738155834341957254558278151385588990304622183174568167973121179585331770773
print(math.gcd(N, M))
Output:
135066410865995223349603216278805969938881475605667027524485143851526510604859533833940287150571909441798207282164471551373680419703964191743046496589274256239341020864383202110372958725762358509643110564073501508187510676594629205563685529475213500852879416377328533906109750544334999811150056977236890927563
This is possible because we have very efficient algorithms for computing GCDs, the most well-known of which is Euclid's algorithm, discovered over 2,000 years ago.
Could there be a fast algorithm for integer factorization that we just haven't discovered yet, allowing large numbers like RSA1024 to be factored in the blink of an eye? The answer is yes. Although we might expect that an efficient algorithm for factoring as simple and elegant as Euclid's algorithm for computing GCDs would have been discovered by now, there is nothing that rules out the existence of a very fast classical algorithm for integer factorization, beyond the fact that we've failed to find one thus far. One could be discovered tomorrow — but don't hold your breath. Generations of mathematicians and computer scientists have searched, and factoring numbers like RSA1024 remains beyond our reach.