Prime Factorization in P Complexity?
\[\text{FACTORS}^* \in P\newline \tiny \text{*conditions may apply}\]
Lately, I've been taking a course called CSC373: Algorithm Design, Analysis, and Complexity. It introduces common programming techniques such as greedy algorithms, dynamic programming, and linear programming, that are ubiquitous in many of today's problem-solving applications. As the course comes to an end, we've been learning about \(\small NP\) and \(\small P\) complexity problems, and how to prove the \(\small NP\)-completeness of algorithms.
To understand \(\small P\) and \(\small NP\) complexity in simple terms, the answer to a \(\small P\) complexity problem can be obtained quickly with relation to the size of its input, while the answer to a \(\small NP\) complexity problem can be verified quickly. In more rigorous terms, \(\small P\) is the set of all decision problems that can be solved in polynomial time on a deterministic Turing machine, while \(\small NP\) is the set that can be solved in polynomial time on a non-deterministic Turing machine. The computers we currently use are all deterministic Turing machines, so problems that are in \(\small NP\) are not guaranteed to be solvable in polynomial time.
A common example of a \(\small NP\) problem is the \(\small \text{SUBSET-SUM}\) problem. That is, given a list of positive integers, can you select a subset of them that adds up to another positive integer \(\small k\)?
The solution to this problem can be easily verified: check if all numbers in the solution are in the list, and then add them up to see if they sum to \(\small k\). However, after many decades, we have yet to come up with a solution for the \(\small \text{SUBSET-SUM}\) problem with polynomial time complexity, i.e. proven that this decision problem is in \(\small P\).
Another example of a \(\small NP\) problem is the Prime Factorization problem. If you are given a positive integer, \(\small n\), determine all of its prime factors. In order to understand it in the context of \(\small P\) and \(\small NP\), we must reframe this into a decision problem, i.e. a "yes-no" question. We define an alternate problem, \(\small \text{FACTORS}\), that decides for positive integers \(\small n\) and \(\small k\), does \(\small n\) have a prime factor less than \(\small k\)? We can solve the original problem using this new decision problem along with the following algorithm:
- First step: perform binary search with different values of \(\small j \leq k\), and look for the boundary where \(\small (n,j)\) is false and \(\small (n,j+1)\) is true. This boundary, specifically \(\small j+1\), is guaranteed to be a prime factor of \(\small n\).
- Second step: Let \(\small n_1=\frac{n}{j+1}\), then repeat the algorithm using the input \(\small (n_1,k)\).
This algorithm keeps decreasing the \(\small n\) value until \(\small n = 1\), which indicates that we have obtained all possible factors, and thus prime factors of \(\small n\). This search takes up to \(\small O(\log{n} \cdot \log{n})\) iterations, before all prime factors are found. If \(\small \text{FACTORS}\) is in \(\small P\), then we have constructed an algorithm for Prime Factorization that runs in total polynomial time. However, to quote Wikipedia, "No algorithm has been published that can factor all integers in polynomial time". So it seems like \(\small \text{FACTORS}\) is not in \(\small P\)...
But is Wikipedia correct? "Yes, but actually no", and what Wikipedia specifically means is that no algorithm has been published in strongly-polynomial time. It turns out that most computer scientists calculate the time complexity for an input in terms of the number of bits required (explained why here). For example, the number \(\small 12\) would be interpreted as \(\small 1010\), which requires \(\small 4\) bits, so \(\small 12 \mapsto 4\). However, the way some may initially think about primes, is in terms of the size of the number itself, i.e \(\small 12 \mapsto 12\). If an algorithm has polynomial runtime with relation to this definition of input size, it is called weakly-polynomial time. This is generally only relevant when considering integers (as opposed to graphs, sets, items, etc.).
So how do we show Prime Factorization is in polynomial time? First, we could consider the definition of time complexity class \(\small P\) to include weakly-polynomial algorithms from above, but this isn't well accepted. In reality, this distinction is taken care of through a different class called pseudo-polynomial time. Even better is if we redefine the decision problem as determining the prime factors for a unary representation of \(\small n\), i.e. \(\small 3 \mapsto 111, 5 \mapsto 11111\), which forces the size of the input to equal the actual number itself. This (re)definition allows us to rigorously claim polynomial time!
Here is some code that can implement \(\small \text{FACTORS}\) in \(\small P\) for a unary representation of \(\small n\):
# Naive Algorithm for Decision Problem FACTORS
def factors(n:int, k:int) -> bool:
for i in range(2, k + 1):
# If the remainder is 0
if n % i == 0:
return True
# No more possible factors
return False
In conclusion, Prime Factorization can be in polynomial time!