\documentclass[10pt]{article}

\setlength{\textwidth}{6.5in}
\setlength{\oddsidemargin}{0in}
\setlength{\evensidemargin}{0in}

\usepackage{amsmath,amscd,amsthm,amsfonts,amssymb,enumerate}

\theoremstyle{definition}
\newtheorem{rem}{Remark}
\newtheorem{com}{Comment}
\newtheorem{defn}{Definition}
\newtheorem{exam}{Example}
\newtheorem{alg}{Algorithm}
\theoremstyle{plain}
\newtheorem{thm}{Theorem}
\newtheorem{lem}{Lemma}
\newtheorem{cor}{Corollary}
\newtheorem{prop}{Proposition}

\renewcommand{\P}{\mathbb{P}^1}
\renewcommand{\k}{\kappa}
\renewcommand{\H}{\mathbb{H}}
\newcommand{\F}{\mathbb{F}}
\newcommand{\M}{\mathfrak{M}}
\newcommand{\R}{\mathbb{R}}
\newcommand{\C}{\mathbb{C}}
\newcommand{\CC}{\mathcal{C}}
\newcommand{\Z}{\mathbb{Z}}
\newcommand{\Q}{\mathbb{Q}}
\newcommand{\G}[1]{\Gamma(#1)}
\newcommand{\GO}[1]{\Gamma_0(#1)}
\newcommand{\Gb}[1]{\bar{\Gamma}(#1)}
\newcommand{\PSL}[1]{\mathbf{PSL}_2(#1)}
\newcommand{\z}{\zeta}
\renewcommand{\t}{\theta}
\renewcommand{\r}{\rho}
\newcommand{\sfrac}{\genfrac{}{}{0pt}{}{}{+}}
\DeclareMathOperator{\deck}{Deck}
\DeclareMathOperator{\aut}{Aut}
\DeclareMathOperator{\id}{id}
\DeclareMathOperator{\gal}{Gal}
\DeclareMathOperator{\ord}{ord}
\DeclareMathOperator{\GL}{GL}
\DeclareMathOperator{\Hom}{Hom}
\DeclareMathOperator{\End}{End}
\DeclareMathOperator{\im}{Im}
\author{Bryden R. Cais}
\title{Math 575}
\begin{document}

In this lecture we will show how elliptic curves over finite fields may be used to factor large composite numbers.
More specifically, we illustrate a simplified version of Lenstra's algorithm, which exploits the group structure
on an elliptic curve to produce a factor of a given number.  It is important to keep in mind that this is a \textit{probabilistic}
algorithm in that it is not guaranteed to produce a nontrivial factor of a given composite number in any finite amount of time.
However, ``in general'' one can expect the algorithm to produce a nontrivial factor given the ``right'' circumstances,
and it is significant in its own right that elliptic curves have this application.

\section{Computational complexity considerations}

The (efficient) factoring of large numbers is a problem of central interest today.  Many encryption schemes (most notably RSA)
rely on the tested maxim that factoring takes a lot of time.  Indeed, the most ``obvious'' factoring method--trial division--
is pretty bad because one has to perform about $\sqrt{n}/\log{\sqrt{n}}$ divisions (roughly the number of primes less than
$\sqrt{n}$).  What we would really like is a factoring algorithm that works in something like $\log n$ operations, but this may
be too much to hope for.  When considering problems of computational complexity, algorithms are usually divided into two classes.
There are those algorithms (like trial division, above) that take about $n^a$ operations (for some fixed positive a), and there are those
algorithms that can be executed in $C\log n$ operations for some fixed constant $C$.  The former algorithms are said to take
\textbf{exponential time}, while the latter take \textbf{polynomial time}.  In general, polynomial time algorithms are considered
efficient, while exponential time algorithms are considered inefficient.

Since we will be investigating computational algorithms that make heavy use of Euclid's algorithm and arithmetic modulo $n$
it is good to know that these computations are quite efficient.  Indeed, it is not difficult to
determine \cite{NZM}[pg. 15] that in practice, the GCD of two integers $b,c$ can be computed in about $3\log c$ iterations
of Euclid's algorithm.  Since each iteration of the algorithm requires only division and subtraction, we see that
Euclid's algorithm works in polynomial time.  We shall also need to compute powers $a^d$ of some integer $a$ modulo $n$.
In order to implement this efficiently, we compute $a^d$ by repeated squaring, working modulo $n$.  For example,
if we want to compute $3^{20}\mod 50$ we compute
\begin{align*}
3^2&\equiv 9\mod 50\\
3^4&\equiv 9^2\equiv 31\mod 50\\
3^8&\equiv 31^2\equiv 11\mod 50\\
3^{16}&\equiv 11^2\equiv 21\mod 50\\
3^{20}&\equiv 3^{16}\cdot 3^4\equiv 21\cdot 31\equiv 1\mod 50.
\end{align*}
Since we are really working with the binary expansion of $d$, the number of steps to compute $a^d\mod n$ is roughly the
number of binary digits of $d$, or about $\log d$, so this is a polynomial time algorithm.

\section{Pollard's $p-1$ method}

In order to motivate Lenstra's algorithm for factoring using elliptic curves, we take a close look at another, simpler
factoring algorithm.

Let $n$ be some integer with a prime factor $p$.
The basis of Pollard's algorithm is the assumption that it is ``likely'' that
the prime factors of $p-1$ are small in comparison to $p$.  One fixes a bound, $B$, as a guess for the size of the largest
possible prime factor of $p-1$, and forms the product
$$k=\prod_{q\leq B} q^{\left\lfloor \log_q B\right\rfloor},$$
which is taken over all primes $q\leq N$.
The idea is that $k$ is very divisible by all the small primes that we expect to divide $p-1$, and hence we expect
$p-1$ to divide $k$.  Fermat's theorem then tells us that we have
$$p|a^k-1$$ for any nonzero $a\mod p$ and hence that $p|d$ where
$$d=\gcd (a^k-1,n),$$ so that we have a nontrivial factor $d$ of $n$.  Since our goal is to compute
$\gcd (a^k-1,n)$, we only care about the value of $a^k-1$ modulo $n$, and we have seen that this is a ``cheap'' operation
in terms of computational time.  Similarly, computing the GCD by Euclid's algorithm is also inexpensive.

\begin{exam}
    Using Pollard's method, factor $n=6380651$.
\end{exam}

We try $a=2$, $B=9$ so that
$$k=2^3\cdot 3^2\cdot 5\cdot 7=2520=2^{12}+2^8+2^7+2^6+2^4+2^3.$$
We readily compute (by repeated squaring, working modulo 6380651)
\begin{align*}
    2^1&=2 & 2^2 &=4 & 2^4 &=16 & 2^8 &=256\\
    2^{16} & =65536 & 2^{32} &=789173 & 2^{64} &=4202423 & 2^{128} &= 5994431\\
2^{256} & =5409973 & 2^{512} &=4188467 & 2^{1024} &=440743 & 2^{2048} &= 1853005,
\end{align*}
so that
$$2^k\equiv 256\cdot 65536 \cdot 420243\cdot 5994431\cdot 5409973 \cdot 1853005\equiv 6347879\mod 6380651. $$
We now use Euclid's algorithm to determine the GCD of $n=6380651$ and $2^k-1=6347879$.
We have
\begin{align*}
    6380651 &=6347878+32773\\
    6347879 &=193\cdot 32773+22689\\
    32773   &= 22689 +10084\\
    22689 &= 2\cdot 10084+2521\\
    10084 &= 4\cdot 2521+0,
\end{align*}
so that $2521|6380651.$  This leads to the factorization
$$6380651=2521\cdot 2531.$$

In many instances, pollard's method is quite efficient.  In some cases, though, notably when all the $p$ factors of $n$
have $p-1$ divisible by large primes, it can become very slow, as the next example illustrates.

\begin{exam}
    Let $n=491389$.  It so happens that $n=383\cdot 1283$ and
    $382=2\cdot 191$ while $1282=2\cdot 641$.  Since both $191$ and $641$ are primes, we would have to take $B=191$
    before obtaining a nontrivial factor of $n$ by Pollard's method.  Needless to say, this would involve a lot of computation.
\end{exam}

The key feature of Pollard's method is that it relies on the structure of the multiplicative groups
$$(\Z/(p))^{\times}$$ for factors $p$ of the number $n$ that we want to factor.  This is a major limiting factor in the flexibility and
efficiency of the algorithm since once $n$ is specified, the groups $\Z/(p)$ that the algorithm exploits are fixed, and if it so happens that
all prime factors $p$ of $n$ have $p-1$ divisible by some large prime, then the computations involved are impractical and we are stuck.
Lenstra's algorithm sidesteps this difficulty by using the group of points on an elliptic curve to factor $n$, and as we shall see,
there are many different groups for a given $n$ corresponding to different elliptic curves.

\section{Lenstra's Algorithm}

Suppose that we want to factor the composite integer $n$.
Let $E:=E(q,r)$ be an elliptic curve over $\F_p$ given by the equation
$$y^2=x^3+qx+r,$$
and let $P=(x,y)$ be any point on $E(q,r)$.  Since the number of points $N_p:=N_p(q,r)$ of $E(q,r)$ over $\F_p$ is finite,
any point of $E$ has order dividing $N_p$.
In particular, if $N_p|k$ then we have $kP=O$.  As with Pollard's method, we make a guess as to the primes
that can divide $N_p(q,r)$ and form an appropriate $k$ that is highly divisible by these primes.  We are hoping that
we will have $N_p(q,r)|k$.  We then work modulo $n$ and attempt to compute $kP$ (in a manner that we shall describe).
Either everything goes smoothly or we are unable to continue the process because for some $m|k$ we have $mP=O$ and
when we try to compute $mP$, we are trying to find the coordinates for the point at infinity, which we think of as having
infinite $y$ coordinate.  If this latter case happens, then we will be trying to ``divide by 0'' modulo $n$, or equivalently,
we will be trying to invert some $d$ with $(d,n)\neq 1$.  We have then found a nontrivial factor of $n$.  Otherwise, we change
the elliptic curve and initial point and repeat the process.  We summarize this formally in the following algorithm.

\begin{alg}[Lenstra] Let $n$ be a composite number that we want to factor.

\begin{enumerate}
    \item Pick some $B>0$ and let $\pi$ be the largest prime less than $\sqrt{n}$ so that any prime factor of $n$
    is at most $\pi$.

    \item Set $M=\lfloor \pi+1+2\sqrt{\pi} \rfloor$.  this may seem somewhat arbitrary, but in fact, $M$ is an upper
    bound for the number of points on $E(q,r)$ over $\F_p$ for any $p|n$.  This is essentially a famous theorem of Hasse.

    \item Set
    $$k=\prod_{q\leq B} q^{\lfloor\log_q M\rfloor},$$
    where the product is taken over all primes at most $B$.  We are picking $k$ so that it will be likely that $N_p|k$
    for some $p|n$.  The larger we take $B$, the more likely this is, on the other hand, the larger
    we take $B$, the more computations we will have to do.

    \item Pick an elliptic curve $E(q,r)$ together with a point $P=(x,y)$ of $E(q,r)$ (say over $\Q$).  We must first check
    that $E$ is nonsingular over $\F_p$ for every prime $p|n$.  This amounts to the condition $(n,4q^3+27r^2)=1$.  If this fails,
    we either have a nontrivial factor of $n$, in which case we are done,
    or $n|4q^3+27r^2,$ in which case we pick a different elliptic curve.

    \item Attempt to compute the coordinates of $kP$ modulo $n$.  Any time a division is necessary--for example in computing
    the inverse of $y$ modulo $n$ in the formula
    $$\left(\frac{3x^2+q}{2y}\right)-2x,$$ we use the Euclidean algorithm.  One computes $kP$ by writing
    $$k=l_1l_2\cdots l_s,$$ where $l_i$ are primes and $l_1\leq l_2\leq\cdots\leq l_s$, and then computing $l_1P,l_2(l_1P),\ldots,kP$
    using the method of repeated squaring and the binary expansion of each $l_i$, just as in Pollard's method.

    \item Either we are able to compute $kP$ modulo $n$ or at some point in the process we cannot divide modulo $n$ because
    for some $t\leq s$ and some $p|n$ we have $N_p|l_1l_2\cdots l_t$, so that if $L_t=l_1l_2\cdots l_t$ then $L_tP=O$ modulo $p$
    and we are trying to compute the inverse of some $a$ modulo $n$ with $(a,n)>1$.  The Euclidean algorithm then gives us a nontrivial
    factor of $n$, as desired.  If we are in the former case, or if it so happens that $a=n$ then we pick a different elliptic curve
    and begin again.  Alternately, it may be necessary to choose a larger value of $B$.
\end{enumerate}

The ability to just pick another elliptic curve and begin again is what makes Lenstra's method so much more flexible than pollard's algorithm.
We now look at some examples of Lenstra's algorithm.

\begin{exam}
    Using Lenstra's algorithm, find a nontrivial factor of $1449533$.
\end{exam}
We use the family of elliptic curves $E(a,-a)$, which all have the point $P=(1,1)$ on them.
Since $n=1449533,$ we have $\pi=1201$ so that $M=1271$.  We begin by trying $B=3,a=1$.  Then we have
$k=2^{10}\cdot 3^6.$  Working modulo $n$ we compute
$$2P,4P,\ldots, 2^{10}P,3(2^{10}P),\ldots kP$$
using the Euclidean algorithm whenever we must calculate inverses modulo $n$.
We find
\begin{align*}
    P&=(1,1) & 2P &= (2,1449530)\\
        2^2P&=(765032,208035) & 2^3P &= (35660,378226)\\
            2^4P&=(1290470,840473) & 2^5P &= (663017,645623)\\
                2^6P&=(1051890,1120876) & 2^7P &= (268555,1155381)\\
                    2^8P&=(774755,989803) & 2^9P &= (851862,449326)\\
                        2^{10}P&=(382293,1119221) & 2^{10}3P &= (326716,22564)\\
                            2^{10}3^2P&=(938879,748249) & 2^{10}3^3P &= (1013556,385270)\\
                            2^{10}3^4P&=(179101,1077013) & 2^{10}3^5P &= (432909,801568)\\
\end{align*}
However, when we try to compute $kP$ we run into problems.  Let $Q=2^{10}3^5 P=(432909,801568)$.  Then we have
$$2Q=(1093968,1064687)$$
and therefore when we try to compute
$$2Q+Q=(1093968,1064687)+(432909,801568)$$
we must compute the inverse of
$$1093968-432909=661059\mod 1449533.$$
Using Euclid's algorithm we have
\begin{align*}
    1449533 &= 2\cdot 661059+127415\\
    661059 &= 5\cdot 127415+23984\\
    127415 &= 5\cdot 23984+7495\\
    23984 &= 3\cdot 7495 + 1499\\
    7495 &=5\cdot 1499+0,
\end{align*}
so that $1499|1449533.$  This leads to the factorization
$$1449533=967\cdot 1499.$$
\end{alg}
\begin{thebibliography}{10}
    \bibitem{NZM} Niven, I., Zuckerman, H., and Hugh Montgomery.  ``An Introduction to the Theory of Numbers.''  John Wiley \& Sons, Inc.
    New York, 1991.
\end{thebibliography}
\end{document}
