# Shor’s algorithm

Although every integer greater than 1 has a unique decomposition into a product of primes, finding the prime factors is believed to be a hard problem. In fact, the security of our online transactions rests on the assumption that factoring integers with a thousand or more digits is practically impossible. This assumption was challenged in 1994 when Peter Shor proposed a polynomial-time quantum algorithm for the factoring problem. Shor’s algorithm is arguably the most dramatic example of how the paradigm of quantum computing changed our perception of which problems should be considered tractable. In this section we briefly summarize some basic facts about factoring, highlight the main ingredients of Shor’s algorithm, and illustrate how it works using a toy factoring problem.

## Complexity of factoring

Suppose our task is to factor an integer $N$ with $d$ decimal digits. The brute force algorithm goes through all primes $p$ up to $\sqrt{N}$ and checks whether $p$ divides $N$. In the worst case, this would take time roughly $\sqrt{N}$, which is exponential in the number of digits $d$. A more efficient algorithm known as the quadratic sieve attempts to construct integers $a, b$ such that $a^2 - b^2$ is a multiple of $N$. Once such $a, b$ are found, one checks whether $a \pm b$ have common factors with $N$. The quadratic sieve method has asymptotic runtime exponential in $d^{1/2}$. The most efficient classical factoring algorithm, known as the general number field sieve, achieves an asymptotic runtime exponential in $d^{1/3}$.
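The brute force method can be sketched in a few lines of Python. This is only an illustration: the function name `trial_division` is ours, and for simplicity the loop tries every integer up to $\sqrt{N}$ rather than only primes (composite candidates never divide, because their prime factors have already been removed).

```python
from math import isqrt

def trial_division(N):
    """Factor N >= 2 by brute force; returns the prime factors in increasing order."""
    factors = []
    p = 2
    while p <= isqrt(N):
        while N % p == 0:      # p divides N: record it and strip it off
            factors.append(p)
            N //= p
        p += 1
    if N > 1:                  # whatever remains is itself prime
        factors.append(N)
    return factors

print(trial_division(15))      # [3, 5]
print(trial_division(2021))    # [43, 47]
```

The number of loop iterations grows like $\sqrt{N}$ in the worst case, which is why this approach is hopeless for numbers with hundreds of digits.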

The exponential runtime scaling limits applicability of the classical
factoring algorithms to numbers with a few hundred digits; the world
record is $d = 232$ (which took roughly 2,000 CPU years). In
contrast, Shor’s factoring algorithm has runtime *polynomial* in
$d$. The version of the algorithm described below, due to Alexey
Kitaev, requires roughly $10d$ qubits, and has runtime roughly
$d^3$.

**Figure 1: classical vs. quantum factoring algorithms**

## Period finding

It has been known to mathematicians since the 1970s that factoring becomes easy if one can solve another hard problem: find a period of the modular exponential function. The period-finding problem is defined as follows. Given integers $N$ and $a$, find the smallest positive integer $r$ such that $a^r - 1$ is a multiple of $N$. The number $r$ is called the period of $a$ modulo $N$. Recall that in modular arithmetic the remainder of the division of $x$ by $N$ is called the value of $x$ modulo $N$ and denoted $x \pmod{N}$. For example, $17 \pmod{15} = 2$. Thus the period of $a$ modulo $N$ is the smallest positive integer $r$ such that $a^r = 1 \pmod{N}$. For example, suppose $N = 15$ and $a = 7$. Then

$$7^2 = 4 \pmod{15}, \qquad 7^3 = 4 \cdot 7 = 13 \pmod{15}, \qquad 7^4 = 13 \cdot 7 = 1 \pmod{15}.$$

That is, $7$ has period $4$ modulo $15$. Note that computing the higher powers of $a$ would give rise to a periodic sequence: $a^{x+r} = a^x \pmod{N}$ for any integer $x$. Thus $r$ is the period of the modular exponential function $x \mapsto a^x \pmod{N}$. In general the period-finding problem is well-defined if $a$ and $N$ are co-prime (have no common factors).
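For small numbers the period can be found classically by simply multiplying until the value $1$ reappears. The sketch below does exactly that; the function name `period` is ours, and the toy values $a = 7$, $N = 15$ are used for illustration.

```python
from math import gcd

def period(a, N):
    """Smallest r > 0 such that a**r = 1 (mod N). Requires gcd(a, N) == 1."""
    if gcd(a, N) != 1:
        raise ValueError("a and N must be co-prime")
    r, power = 1, a % N
    while power != 1:          # keep multiplying by a until we return to 1
        power = (power * a) % N
        r += 1
    return r

print(period(7, 15))                          # 4
print([pow(7, x, 15) for x in range(8)])      # [1, 7, 4, 13, 1, 7, 4, 13]
```

The loop may run for up to $N$ steps, so this brute force search is exponential in the number of digits of $N$; it only serves to make the definition concrete.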

## From factoring to period finding

Assume for a moment that we are given a period-finding machine that takes as input co-prime integers $a, N$ and outputs the period of $a$ modulo $N$. Let us show how to use the machine to find all prime factors of $N$. For simplicity, assume that $N$ has only two distinct prime factors: $N = pq$.

First, pick a random integer $a$ between $2$ and $N - 1$ and compute the greatest common divisor $\gcd(a, N)$. This can be done very efficiently using Euclid’s algorithm. If we are lucky, $a$ and $N$ have some common prime factors, in which case $\gcd(a, N)$ equals $p$ or $q$, so we are done. From now on, let us assume that $\gcd(a, N) = 1$, that is, $a$ and $N$ are co-prime. Let $r$ be the period of $a$ modulo $N$ computed by the machine. Repeat the above steps with different random choices of $a$ until $r$ is even. It can be shown that a significant fraction of all integers $a$ has even period (see Table 1 for examples), so on average one needs only a few repetitions. At this point we have found some pair $a, r$ such that $r$ is even, and $r$ is the smallest positive integer such that $a^r - 1$ is a multiple of $N$. Let us use the identity

$$a^r - 1 = (a^{r/2} - 1)(a^{r/2} + 1).$$

Since $r$ is the smallest such integer, $a^{r/2} - 1$ is not a multiple of $N$ (otherwise the period of $a$ would be $r/2$). Assume for a moment that $a^{r/2} + 1$ is not a multiple of $N$. Then neither of the integers $a^{r/2} \pm 1$ is a multiple of $N$, but their product is. This is possible only if $p$ is a prime factor of $a^{r/2} - 1$ and $q$ is a prime factor of $a^{r/2} + 1$ (or vice versa). Thus we can find $p$ and $q$ by computing $\gcd(N, a^{r/2} \pm 1)$; see Table 1 for examples. In the remaining “unlucky” case, when $a^{r/2} + 1$ is a multiple of $N$, we give up and try a different integer $a$. For example, $a = 14$ is the only unlucky integer in Table 1. In general, it can be shown that the unlucky integers are not too frequent, so on average only two calls to the period-finding machine are sufficient to factor $N$.

Table 1: periods of integers $a$ modulo $15$
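The reduction from factoring to period finding can be sketched as follows, assuming the toy modulus $N = 15$. Here the “period-finding machine” is replaced by a classical brute force search, and the names `period` and `factor_from_period` are ours. The loop at the end prints, for every $a$ co-prime to $15$, its period and the factors obtained from it.

```python
from math import gcd

def period(a, N):
    """Stand-in for the period-finding machine (classical brute force)."""
    r, power = 1, a % N
    while power != 1:
        power = (power * a) % N
        r += 1
    return r

def factor_from_period(a, N):
    """Try to split N = p*q using the integer a.
    Returns a pair of factors, or None if a has odd period or is 'unlucky'."""
    g = gcd(a, N)
    if g != 1:
        return g, N // g                 # lucky guess: a shares a factor with N
    r = period(a, N)
    if r % 2 == 1:
        return None                      # odd period: try another a
    half = pow(a, r // 2, N)             # a^(r/2) mod N
    if (half + 1) % N == 0:
        return None                      # unlucky: a^(r/2) + 1 is a multiple of N
    return gcd(N, half - 1), gcd(N, half + 1)

for a in range(2, 15):
    if gcd(a, 15) == 1:
        print(a, period(a, 15), factor_from_period(a, 15))
```

For $N = 15$ every co-prime $a$ has an even period, and only $a = 14$ fails to produce the factors $3$ and $5$.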

## Shor’s algorithm

Let us now show that a quantum computer can efficiently simulate the period-finding machine. As in the case of the Deutsch-Jozsa algorithm, we shall exploit quantum parallelism and constructive interference to determine whether a complicated function has a certain global property that cannot be learned by evaluating the function only at a few points. However, instead of detecting the property of being a balanced function, we seek to detect and measure periodicity of the modular exponentiation function. The fact that interference makes it easier to measure periodicity should not come as a big surprise. After all, physicists routinely use scattering of electromagnetic waves and interference measurements to determine periodicity of physical objects such as crystal lattices. Likewise, Shor’s algorithm exploits interference to measure periodicity of arithmetic objects.

Suppose we are
given co-prime integers $a, N$. Our goal is to compute the period
of $a$ modulo $N$, that is, the smallest positive integer $r$
such that $a^r = 1 \pmod{N}$. The basic idea is to
construct a unitary operator $U_a$ that implements the modular
multiplication function $x \to ax \pmod{N}$. It can be shown that
eigenvalues of $U_a$ are closely related to the period of
$a$. Namely, each eigenvalue of $U_a$ has a form
$e^{i\phi}$, where $\phi = 2\pi k/r$ for some integer
$k$. Furthermore, as we saw in the previous section, eigenvalues
of certain unitary operators can be measured efficiently using the phase
estimation algorithm. Unfortunately, inferring $r$ from the
measured eigenvalues of $U_a$ is only possible if the eigenvalues
are measured *exactly* (or with an exponentially small precision). For
example, factoring a 1000-digit number would require measuring the
eigenvalue of $U_a$ with a precision $10^{-2000}$. Such
accuracy cannot be achieved by a direct application of the phase
estimation algorithm, as this would require too large a pointer
system. Here comes the main trick: we shall estimate the eigenvalue of
$U_a$ by applying the phase estimation algorithm to a family of
unitary operators $U_b$ with $b = a, a^2, a^4, a^8$, etc. We stop
at $b = a^{2^s}$ with $2^s \approx N^2$.
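The relation between eigenvalues and the period can be checked numerically for a toy example. The sketch below assumes $a = 7$, $N = 15$ and represents quantum states as Python dictionaries mapping integers $x$ to amplitudes; the helper names `apply_U` and `eigenvector` are ours. It builds the states $|\psi_k\rangle = r^{-1/2} \sum_{j=0}^{r-1} e^{-2\pi i jk/r} |a^j \bmod N\rangle$ and verifies that each one is an eigenvector of $U_a$ with eigenvalue $e^{2\pi i k/r}$.

```python
import cmath
from math import pi, sqrt

def period(a, N):
    r, power = 1, a % N
    while power != 1:
        power = (power * a) % N
        r += 1
    return r

def apply_U(a, N, state):
    """Apply U_a, i.e. |x> -> |a*x mod N>, to a state given as {x: amplitude}."""
    out = {}
    for x, amp in state.items():
        y = (a * x) % N
        out[y] = out.get(y, 0) + amp
    return out

def eigenvector(a, N, k):
    """|psi_k> supported on the powers of a modulo N."""
    r = period(a, N)
    return {pow(a, j, N): cmath.exp(-2j * pi * j * k / r) / sqrt(r) for j in range(r)}

a, N = 7, 15
r = period(a, N)
for k in range(r):
    psi = eigenvector(a, N, k)
    image = apply_U(a, N, psi)
    eigenvalue = cmath.exp(2j * pi * k / r)      # e^{i*phi} with phi = 2*pi*k/r
    assert all(abs(image[x] - eigenvalue * psi[x]) < 1e-12 for x in psi)
```

These are only the eigenvectors supported on the powers of $a$; they are the ones relevant when the target register starts in a state built from such powers.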

Why does it work? The first observation is that all operators $U_b$ are integer powers of $U_a$. Namely, if $b = a^t$, then $U_b = (U_a)^t$. This implies that the operators $U_b$ have the same eigenvectors. In particular, eigenvalues of the entire family can be measured simultaneously. Second, implementing $U_b$ is as easy as implementing $U_a$: one just needs to precompute the powers $b = a^{2^j} \pmod{N}$ by the repeated squaring method. Finally, even if the eigenvalues of $U_b$ are measured with a poor precision (say 10%), each squaring of $b$ reduces the error in the estimated eigenvalue of $U_a$ by a factor of $2$. Indeed, consider an eigenvector of $U_a$ with an eigenvalue $e^{i\phi}$. If $b = a^2$, then the eigenvalue of $U_b$ is $e^{2i\phi}$. If $b = a^4$, then the eigenvalue of $U_b$ is $e^{4i\phi}$, etc. Thus we can estimate $\phi, 2\phi, 4\phi, \ldots, 2^s\phi$ with a constant precision (say 10%). We shall see that this is enough to estimate $\phi$ with a precision roughly $2^{-s}$. For example, one can achieve a precision $10^{-2000}$ by a sequence of fewer than $10^4$ lousy measurements of the phases $2^j\phi$ with an error of 10%. Furthermore, it can be shown that estimating a few randomly picked eigenvalues with an error of order $1/N^2$ is enough to determine the period $r$ exactly (the idea is to find the best rational approximation to the estimate of $\phi/2\pi$ using continued fractions).
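The last step, recovering $r$ from an estimate of $\phi = 2\pi k/r$, can be illustrated with Python’s standard `fractions` module, whose `Fraction.limit_denominator` method returns the closest fraction with a bounded denominator. This is a minimal sketch: the function name `period_candidate` and the toy values are ours.

```python
from fractions import Fraction
from math import pi

def period_candidate(phi, N):
    """Guess r from an estimate of phi = 2*pi*k/r: take the best rational
    approximation of phi/(2*pi) whose denominator is smaller than N."""
    return Fraction(phi / (2 * pi)).limit_denominator(N - 1).denominator

N = 15
exact = 2 * pi * 3 / 4                   # eigenvalue with k = 3, r = 4
print(period_candidate(exact + 1e-3, N))     # 4, despite the small error
print(period_candidate(2 * pi * 2 / 4, N))   # 2: k = 2 shares a factor with r = 4
```

When $k$ and $r$ have a common factor, the denominator is only a divisor of $r$, which is one reason why a few randomly picked eigenvalues are needed rather than a single one.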

In order to use the phase estimation algorithm, we need to construct a
quantum circuit implementing the modular multiplication operator. By
analogy with classical algorithms that can link standard library
functions, a quantum algorithm is allowed to call classical subroutines;
for example, a subroutine for computing the modular multiplication.
Importantly, before such classical subroutines are incorporated into a
quantum circuit, they must be transformed into a *reversible
form.* More precisely, a quantum algorithm can call a classical
subroutine only if it is compiled into a sequence of reversible logical
gates such as the CNOT or Toffoli gates (in particular, the number of input
and output wires in each gate must be the same). The subroutine is
allowed to use a scratch memory similar to local variables used by the
standard library functions. However, once the subroutine is completed,
the scratch memory must be totally clean (say, all zeros). The reason is
that a quantum algorithm operates on coherent superpositions of
different classical states. Leaving information about the inputs or the
outputs in the scratch memory could potentially destroy quantum
coherence and prevent the algorithm from seeing interference between
different states. Since the notion of reversible classical circuits
plays an important role in Shor’s algorithm and many other quantum
algorithms, below we briefly discuss methods for constructing such
circuits.

## Reversible classical circuits

An important insight made in 1973 by our IBM colleague Charles Bennett is that any classical computation can be transformed into a reversible form. How does it work? Suppose $f$ represents some classical computation that takes as input $n$-bit strings $x$ and outputs $m$-bit strings $f(x)$. The first observation is that the answer $f(x)$ can be computed without erasing any intermediate data if we are allowed to use some extra memory. Indeed, let us write down an algorithm for computing $f(x)$ and compile it into a sequence of elementary logical gates such as AND, OR, etc. For concreteness, assume that each gate has two input wires and one output wire. Let $L$ be the total number of gates. We shall extend the $n$-bit memory storing the input $x$ by adding $L$ bits initialized by zeros. These extra bits will serve as a scratch memory for storing intermediate data. We shall write the output of the $j$-th gate to the $j$-th bit of the scratch memory and keep the values of the input bits. Once the computation is completed, the final answer $f(x)$ is contained in some designated output register within the scratch memory. The remaining part of the scratch memory contains some “garbage” bit string (intermediate data). Below we illustrate how it works for the example when $f$ computes the 3-bit Majority function.

At this point the circuit is reversible as a whole, but its individual gates are still irreversible. The next step is to transform each gate into a reversible form. Consider as an example the AND gate with input wires $a, b$ and output wire $c'$ such that $c' = a \wedge b$. Let us define its reversible version R-AND. One of the output wires of R-AND must carry the output bit of the standard AND gate. To avoid losing information, R-AND must have at least two other output wires (note that in the case $c' = 0$ there are three possible input strings: $ab = 00, 01, 10$). The simplest version of R-AND has three input wires $a, b, c$ and three output wires $a, b, c'$ with $c' = c \oplus (a \wedge b)$, as shown below.

Here $c$ is a dummy input wire and $\oplus$ denotes the XOR operation (addition modulo two). The gate expects to receive inputs with $c = 0$, in which case $c' = a \wedge b$. If $c = 1$ then the output data bit is flipped. Note that all inputs of R-AND can be computed from its outputs since $c = c' \oplus (a \wedge b)$. Thus R-AND indeed acts reversibly (technically, R-AND realizes a permutation on the set of 3-bit strings). Note also that R-AND coincides with the Toffoli gate.

The same construction can be applied to any other gate with two input wires and one output wire. Namely, if a gate F computes some Boolean function $f(a, b)$, then its reversible version R-F would map inputs $(a, b, c)$ to outputs $(a, b, c')$ where $c' = c \oplus f(a, b)$; see below. Note that applying R-F twice implements the identity gate, that is, R-F coincides with its own inverse.
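The construction of R-F is easy to check exhaustively, since there are only eight 3-bit input strings. The sketch below (the helper name `reversible` is ours) builds R-AND and R-OR and verifies that each is a permutation of the 3-bit strings and is its own inverse.

```python
from itertools import product

def reversible(f):
    """Turn a 2-input Boolean function f into the 3-bit gate
    (a, b, c) -> (a, b, c XOR f(a, b))."""
    def gate(a, b, c):
        return a, b, c ^ f(a, b)
    return gate

r_and = reversible(lambda a, b: a & b)    # coincides with the Toffoli gate
r_or = reversible(lambda a, b: a | b)

inputs = list(product((0, 1), repeat=3))  # all 3-bit strings, in sorted order
for gate in (r_and, r_or):
    outputs = [gate(*bits) for bits in inputs]
    assert sorted(outputs) == inputs                   # a permutation of 3-bit strings
    assert [gate(*out) for out in outputs] == inputs   # applying the gate twice is the identity
```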

Suppose the original circuit is described by a sequence of gates $F_1, F_2, \ldots, F_L$. Replace each gate $F_j$ by its reversible version R-$F_j$ constructed above. We shall connect the dummy input wire of R-$F_j$ and its output wire to the $j$-th bit of the scratch memory such that the gate always receives inputs with $c = 0$. The new circuit has $n + L$ input and output wires and is composed from reversible $3$-bit gates. The final state generated by the circuit can be written as $(x, f(x), g(x))$, where $f(x)$ is the final answer stored in the output register somewhere within the scratch memory and $g(x)$ represents “garbage” (intermediate data). Here we assumed that the scratch memory is initially clean (all zeros). Thus we have constructed a reversible circuit that maps $(x, 0^L)$ to $(x, f(x), g(x))$. The final step is to get rid of the garbage without erasing any information (which would render the circuit irreversible). A solution is to copy the answer $f(x)$ to a clean ancillary register of $m$ bits and then “uncompute” by applying the circuit backwards in time. Below we sketch how this works.

Ignoring for simplicity all ancillary bits that are initialized and returned in the zero state, we obtained a reversible circuit on $n + m$ bits that maps input strings $(x, 0^m)$ to output strings $(x, f(x))$. In the special case when $f$ is invertible, one can use similar tricks to construct a reversible circuit that maps input strings $x$ to output strings $f(x)$. In practice, one would never use the method described above, since it requires too large a scratch memory. Several optimization techniques for constructing reversible circuits have been proposed (such as uncomputing partial results more often and reusing scratch memory bits).
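The compute, copy, uncompute recipe can be sketched for the 3-bit Majority function. The particular decomposition into three AND gates and two OR gates, the memory layout, and the names `CIRCUIT`, `run` and `majority_reversible` are our own choices for illustration, not necessarily those of the original figure.

```python
from itertools import product

AND = lambda a, b: a & b
OR = lambda a, b: a | b

# Memory layout: bits 0-2 hold x, bits 3-7 are scratch (one per gate),
# bit 8 is the clean ancillary register that receives the answer.
# Each gate is (function, first input, second input, target scratch bit).
CIRCUIT = [
    (AND, 0, 1, 3),
    (AND, 0, 2, 4),
    (AND, 1, 2, 5),
    (OR, 3, 4, 6),
    (OR, 6, 5, 7),     # bit 7 now holds Majority(x)
]

def run(gates, mem):
    for f, i, j, target in gates:
        mem[target] ^= f(mem[i], mem[j])   # reversible version R-F of gate F

def majority_reversible(x):
    mem = list(x) + [0] * 6
    run(CIRCUIT, mem)                # compute: scratch holds answer and garbage
    mem[8] ^= mem[7]                 # copy the answer with a CNOT
    run(reversed(CIRCUIT), mem)      # uncompute: scratch returns to all zeros
    return mem

for x in product((0, 1), repeat=3):
    print(x, majority_reversible(x))
```

Because every R-F gate is its own inverse, running the same gates in reverse order undoes the computation, leaving only $x$ and the copied answer.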

## Quantum circuits for modular multiplication

Suppose now that $f(x) = ax \pmod{N}$ is the modular multiplication function. Let $n$ be the number of binary digits in $N$. Using $n$-bit strings to represent integers modulo $N$, one can implement $f$ by a classical circuit $C$ composed of 3-bit reversible gates with $n$ input and output wires, as described above. The circuit may also use ancillary bits that are initialized and returned in the 0 state. The state-of-the-art implementation would require roughly $n^2$ gates and a number of ancillary bits proportional to $n$. For simplicity, below we shall often ignore the ancillary bits. Let us convert $C$ to a quantum circuit by replacing each classical gate with its quantum counterpart. This is possible because, by construction, each gate of $C$ implements some permutation on the set of input bit strings $\{0,1\}^3$. The corresponding quantum gate implements the same permutation on the set of basis states $|x\rangle$ with $x \in \{0,1\}^3$. We obtained a quantum circuit $U_a$ acting on a register of $n$ qubits that maps a basis state $|x\rangle$ to $|ax \pmod{N}\rangle$. An example for $a = 7$ and $N = 15$ is shown below. The period-finding algorithm requires modular multiplication circuits $U_b$ for $b = a, a^2, a^4, \ldots, a^{2^s}$, where $2^s \approx N^2$.
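At the level of basis states, the operator $U_a$ is just a permutation of $n$-bit integers. The sketch below tabulates it for $a = 7$, $N = 15$, $n = 4$, and also precomputes the repeated squares of $a$. The function names are ours, and leaving the values $x \geq N$ untouched is an assumption made here so that the map is a permutation of all $2^n$ bit strings.

```python
def modmul_permutation(a, N, n):
    """Permutation of n-bit integers realised by U_a: x -> a*x mod N for x < N.
    Assumption: values x >= N (not valid residues) are left unchanged."""
    return [(a * x) % N if x < N else x for x in range(2 ** n)]

def repeated_squares(a, N, s):
    """Return [a, a^2, a^4, ..., a^(2^s)] modulo N using s squarings."""
    powers = [a % N]
    for _ in range(s):
        powers.append(powers[-1] ** 2 % N)
    return powers

perm = modmul_permutation(7, 15, 4)
assert sorted(perm) == list(range(16))          # a genuine permutation
print(perm[1], perm[7], perm[4], perm[13])      # 7 4 13 1
print(repeated_squares(7, 15, 3))               # [7, 4, 1, 1]
```

The cycle $1 \to 7 \to 4 \to 13 \to 1$ has length $4$, the period of $7$ modulo $15$.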

Some basis states representing integers modulo $15$.

Modular multiplication operator maps $|x\rangle$ to $|7x \bmod 15\rangle$. This quantum circuit implements $U_7$ (see Markov and Saeedi 2012).

## Controlled operations and phase estimation

Let $U_a$ be the modular multiplication operator. At this point we know how to construct a quantum circuit implementing $U_a$, as well as repeated squares of $U_a$ such as $U_{a^2}, U_{a^4}$, etc. We also know that eigenvalues of $U_a$ reveal information about the period of $a$ modulo $N$. The final step is to measure the eigenvalues. For that we shall need a controlled version of $U_a$. A controlled unitary operator is a quantum analog of classical conditional statements such as if-then-else. In general, suppose $U$ is a quantum circuit acting on $n$ qubits. A controlled version of $U$ is a unitary operator acting on a larger system control+target, where control is a single qubit and target is a register of $n$ qubits. Controlled-$U$ applies $U$ to the target register if the control qubit is in the $|1\rangle$ state, and does nothing if the control qubit is in the $|0\rangle$ state.

Like their classical counterparts, controlled quantum operations are used in almost any quantum algorithm. We note that if $U$ can be realized by a short quantum circuit, then so can controlled-$U$. Indeed, one can take the circuit realizing $U$ and replace each gate by its controlled version (with the same control qubit). The main distinction from the classical if-then-else construct is that the control qubit can be in a superposition of the states $|0\rangle$ and $|1\rangle$. One could say that in the quantum world two branches of a conditional statement can be executed “at the same time”.

Consider now a special case when the target register is prepared in some state $|\psi\rangle$ which is an eigenvector of $U$, that is, $U|\psi\rangle = e^{i\phi}|\psi\rangle$. The only difference between the two branches of the controlled-$U$ operation is the phase shift $e^{i\phi}$. In other words, the control qubit is mapped from $\alpha|0\rangle + \beta|1\rangle$ to $\alpha|0\rangle + e^{i\phi}\beta|1\rangle$, while the target register remains in the state $|\psi\rangle$. Thus we can describe the action of controlled-$U$ on the composite system control+target by a single-qubit phase shift gate acting on the control qubit.
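This phase kickback can be verified directly on a small example. The sketch below assumes the toy operator $U_7$ modulo $15$, its eigenvector with $k = 1$ (so $\phi = \pi/2$), and an equal superposition on the control qubit; states are dictionaries mapping `(control_bit, x)` to amplitudes, and the name `controlled_U` is ours.

```python
import cmath
from math import pi, sqrt

def controlled_U(apply_U, state):
    """state maps (control_bit, x) -> amplitude.
    U acts on the target value x only in the branch where control_bit == 1."""
    out = {}
    for (c, x), amp in state.items():
        branch = apply_U({x: amp}) if c == 1 else {x: amp}
        for y, amp_y in branch.items():
            out[(c, y)] = out.get((c, y), 0) + amp_y
    return out

a, N, r, k = 7, 15, 4, 1                 # 7 has period 4 modulo 15
phi = 2 * pi * k / r
psi = {pow(a, j, N): cmath.exp(-2j * pi * j * k / r) / sqrt(r) for j in range(r)}
U = lambda st: {(a * x) % N: amp for x, amp in st.items()}

alpha = beta = 1 / sqrt(2)               # control qubit in (|0> + |1>)/sqrt(2)
state = {(0, x): alpha * amp for x, amp in psi.items()}
state.update({(1, x): beta * amp for x, amp in psi.items()})
out = controlled_U(U, state)

# The target is unchanged; only the |1> branch of the control picks up e^{i*phi}.
assert all(abs(out[(0, x)] - alpha * psi[x]) < 1e-12 for x in psi)
assert all(abs(out[(1, x)] - cmath.exp(1j * phi) * beta * psi[x]) < 1e-12 for x in psi)
```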

Below we focus on what happens with the control qubit only (keeping in mind that it is part of the larger system control+target). We shall measure the eigenvalue $e^{i\phi}$ using a pair of phase estimation circuits shown below.

One can easily check that the probability of observing the measurement outcome $0$ is $(1 + \cos\phi)/2$ for the first circuit and $(1 \pm \sin\phi)/2$ for the second circuit, where the sign depends on the direction of the extra quarter-turn phase gate used in the second circuit. Keep in mind that the phase shift gate in these circuits represents the controlled-$U$ operator, so the circuit extracts information about the phase $\phi$ by measuring interference between two branches of controlled-$U$, where one branch accumulates a phase factor $e^{i\phi}$ and the other branch accumulates no phase. By repeating each circuit several times and collecting the measurement statistics, we can estimate the probabilities, which gives us an estimate of $\phi$. For concreteness, assume that we are willing to perform at most 100 measurements. The statistical error in our estimate of $\phi$ is thus roughly 10%. To factor a number with 1000 decimal digits, the phase has to be estimated with a very high precision $10^{-2000}$. To this end, we shall perform the phase estimation for a family of unitary operators $U_b$, where $b = a, a^2, a^4, a^8$, etc. We stop at $b = a^{2^s}$ such that $2^s \approx N^2$. Recall that we can efficiently implement $U_b$ for very large values of the exponent $2^j$ by classically computing $b = a^{2^j} \pmod{N}$ and using the identity $U_b = (U_a)^{2^j}$. Since all operators $U_b$ have the same eigenvector $|\psi\rangle$, we can do all phase estimations with the same target register (initialized in the eigenvector $|\psi\rangle$).
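The two measurement circuits can be simulated on the control qubit alone. The sketch below assumes a common layout: Hadamard, phase shift by $\phi$, an optional $S^\dagger$ gate, Hadamard, measurement. With this convention the second circuit gives $(1 + \sin\phi)/2$; using $S$ instead would flip the sign. The function names `prob_zero` and `estimate_phase` are ours.

```python
import cmath
import random
from math import atan2, cos, sin, sqrt

def prob_zero(phi, quadrature=False):
    """P(outcome 0) for |0> -> H -> phase(phi) [-> S^dagger] -> H -> measure."""
    amp0 = 1 / sqrt(2)                       # amplitudes after H and the phase gate
    amp1 = cmath.exp(1j * phi) / sqrt(2)
    if quadrature:
        amp1 *= -1j                          # S^dagger (assumed sign convention)
    return abs((amp0 + amp1) / sqrt(2)) ** 2 # amplitude of |0> after the final H

def estimate_phase(phi, shots=100, rng=random):
    """Estimate phi from `shots` simulated runs of each of the two circuits."""
    p_cos = sum(rng.random() < prob_zero(phi) for _ in range(shots)) / shots
    p_sin = sum(rng.random() < prob_zero(phi, True) for _ in range(shots)) / shots
    return atan2(2 * p_sin - 1, 2 * p_cos - 1)   # angle in (-pi, pi]

phi = 1.0
assert abs(prob_zero(phi) - (1 + cos(phi)) / 2) < 1e-12
assert abs(prob_zero(phi, True) - (1 + sin(phi)) / 2) < 1e-12
print(estimate_phase(phi, shots=100))        # a noisy estimate of 1.0
```

Having both the cosine and the sine of $\phi$ is what allows the angle to be determined on the whole circle rather than only up to a sign.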

For simplicity, let us assume that the phase estimations are performed sequentially, in which case only one control qubit is needed. The controlled-$U_{a^2}$ operator gives rise to a phase shift by angle $2\phi$ on the control qubit. Thus we can estimate $2\phi$ with a precision 10% by performing roughly 100 measurements. This gives an estimate of $\phi$ with a precision 5%. More precisely, since the phase lives on the unit circle, we get a pair of candidate angles $\phi'$ and $\phi''$ such that one of them approximates $\phi$ with a precision 5% and the other is very far from $\phi$ (it differs from $\phi$ by approximately $\pi$). However, we have already estimated $\phi$ itself with a precision 10%. This is enough to select one of the candidate angles $\phi'$ and $\phi''$. Applying this argument inductively several times shows that estimating $\phi, 2\phi, 4\phi, \ldots, 2^s\phi$ with a constant precision (say, 10%) is enough to estimate $\phi$ with a precision roughly $2^{-s}$. Overall we would need approximately $100s$ measurements, which translates to $100s$ controlled modular multiplication operators. In general, the number of measurements scales as $n$, the number of binary digits in $N$, with some extra factors doubly logarithmic in $N$. Since each controlled modular multiplication operator requires a quantum circuit of size roughly $n^2$, the overall complexity of the factoring algorithm scales as $n^3$.
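The inductive argument can be turned into a short classical post-processing routine. The sketch below takes noisy estimates of $2^j\phi$ (modulo $2\pi$) and at each step keeps the candidate angle closest to the current estimate. The function name `refine_phase`, the noise level of $0.3$ radians and the test phase are our assumptions; the selection step is reliable as long as each error stays below $\pi/3$.

```python
import random
from math import pi

def refine_phase(estimates):
    """Combine estimates[j] ~ (2**j * phi) mod 2*pi, each with a constant error,
    into one estimate of phi whose error shrinks roughly like 2**-j."""
    phi = estimates[0] % (2 * pi)
    for j in range(1, len(estimates)):
        step = 2 * pi / 2 ** j                      # spacing between candidate angles
        base = (estimates[j] % (2 * pi)) / 2 ** j   # one candidate; others differ by multiples of step
        m = round((phi - base) / step)              # pick the candidate closest to the current estimate
        phi = (base + m * step) % (2 * pi)
    return phi

rng = random.Random(1)
phi_true = 2.0
# Simulated lousy measurements of phi, 2*phi, 4*phi, ..., 2**20 * phi.
estimates = [(2 ** j * phi_true + rng.uniform(-0.3, 0.3)) % (2 * pi) for j in range(21)]
print(abs(refine_phase(estimates) - phi_true))      # below 1e-6
```

Twenty doublings with an error of $0.3$ radians each already pin down $\phi$ to better than $10^{-6}$, illustrating how the precision improves exponentially with the number of lousy measurements.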

We have not explained yet how to initialize the target register in the eigenvector $|\psi\rangle$ of $U_a$. Fortunately, all eigenvectors are equally good for our purposes: we are not interested in any particular eigenvalue, but rather want to measure a random eigenvalue drawn from the uniform distribution. Thus one can initialize the target register in an arbitrary state that has equal weight on each eigenvector of $U_a$. For example, one can choose the initial state as the basis vector encoding the integer $1$.
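That the basis vector $|1\rangle$ has equal weight on the relevant eigenvectors can be checked for the toy example $a = 7$, $N = 15$. The sketch below sums the eigenvectors $|\psi_k\rangle = r^{-1/2} \sum_j e^{-2\pi i jk/r} |a^j \bmod N\rangle$ with equal coefficients $r^{-1/2}$ and confirms that the result is $|1\rangle$; the helper name `eigenvector` is ours.

```python
import cmath
from math import pi, sqrt

a, N, r = 7, 15, 4          # 7 has period 4 modulo 15

def eigenvector(k):
    """|psi_k> as a dictionary {x: amplitude} over the powers of a modulo N."""
    return {pow(a, j, N): cmath.exp(-2j * pi * j * k / r) / sqrt(r) for j in range(r)}

# Equal-weight superposition of all r eigenvectors.
total = {}
for k in range(r):
    for x, amp in eigenvector(k).items():
        total[x] = total.get(x, 0) + amp / sqrt(r)

for x in sorted(total):
    print(x, round(abs(total[x]), 12))   # amplitude 1 on x = 1, 0 elsewhere
```

Measuring the eigenvalue on this state therefore returns each $\phi = 2\pi k/r$ with equal probability $1/r$.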

**Multi7x1Mod15**

**Multi7x7Mod15**

**Multi7x4Mod15**

**Multi7x13Mod15**

**PhaseEstimationTgate**