# Basics of Cryptography Part I: RSA Encryption and Decryption

Cryptography and computer security have always been side interests for me. Recently, while I was reading about RSA encryption, I thought about how well I understood the algorithm. For me, the true test of understanding of something is always whether I can code it from scratch. While this goes against the "don't reinvent the wheel" principle of software development—a principle which I hold very dear when it comes to professional work, reinventing the wheel can be a great way to explore and learn more.

What follows is a basic walk-through of RSA encryption, accompanied by a simple (but relatively efficient) Python implementation of it.

Some basic group theory and number theory knowledge is needed. I provide some of the basics of the required group theory in the appendix. Most of it is there for the mathematically minded, and can be skipped by anyone not too interested in the mathematical proofs.

## An Overview of the RSA Cryptosystem

The RSA cryptosystem is an asymmetric cryptosystem that relies on the following intuitive observations, where by easy we mean asymptotically fast algorithms for the mentioned problem exist, and by difficult we mean no asymptotically fast algorithm is known:

1. It is easy to find two distinct large primes $$p$$ and $$q.$$
2. It is easy to multiply two large primes together, calculating $$n=pq.$$
3. However, given $$n=pq,$$ it is difficult to factor out $$p$$ and $$q.$$
4. Given $$a, n$$ and $$e$$, with $$0<a<n$$ and $$e > 1,$$ calculating $$a^e \pmod{n}$$ is easy.
5. But inversely, given $$a^e \pmod{n},$$ $$e$$ and $$n,$$ it is difficult to calculate $$a.$$
6. However, if the multiplicative inverse of $$e$$ modulo $$\phi(n)=(q-1)(p-1),$$ labelled $$d,$$ is given, then the previous calculation is easy. (The multiplicative inverse satisfies $$e \cdot d = 1 \pmod{\phi(n)}$$.)
7. And finally, calculating $$d$$ from only $$n$$ and $$e$$ is difficult, but easy if the factorization of $$n=pq$$ is known.

Putting all of this together, we get a scheme for an asymmetric cryptosystem as follows.

Key Generation: Generate distinct primes $$p$$ and $$q$$ and let $$n=pq.$$ Also let $$m=\phi(n)=(p-1)(q-1)$$ and pick any $$1<e<m.$$ Calculate $$d$$ as the multiplicative inverse of $$e$$ modulo $$m$$ using the extended Euclidean algorithm. The private key is then $$(n, d)$$ and the public key is $$(n, e).$$

Encryption: To encrypt plaintext, first encode the plaintext (by padding it for example) so that it consists of blocks of size less than $$n.$$ Then for block $$a,$$ define $$E(a)=a^e \pmod{n}$$ as the encryption function.

Decryption: To decrypt ciphertext given by $$c = E(a),$$ define $$D(c) = c^d \pmod{n}.$$ We then have $$D(c)=D(E(a))=a^{ed} \pmod{n}.$$ For this to work, we need $$a^{ed}=a \pmod{n}.$$ The following theorem then shows that the encryption and decryption algorithms satisfy $$D(c)=D(E(a))=a^{ed}=a \pmod{n}.$$

To prove correctness the main key is to show that $$a^m = a \pmod{n}.$$

Theorem

Let $$p,q,n,m,e$$ and $$d$$ be given as above. Then for any $$0<a<n,$$ $$a^{ed} = a \pmod{n}.$$ Proof First, assume that $$\gcd(a, n) = 1.$$ This means that $$a$$ is in $$Z^*_n,$$ that is, the multiplicative group of nonzero integers modulo $$n.$$ Since $$|Z^*_n| = m = \phi(n),$$ we have $$a^m = 1 \pmod{n}$$ as for any group $$G,$$ and any element $$a\in{G}$$ we have $$a^{|G|} = e_G$$ and $$e_G=1$$ in this case. (For a proof of this, see the appendix on group theory basics.) Since $$ed \equiv 1 \pmod{m}$$ there exists some $$t$$ such that $$ed = tm + 1.$$ This gives $$a^{ed}=a^{tm+1}= a^{mt}a=(a^m)^ta \equiv a \pmod{n}.$$

The case where $$\gcd(a, n) \not= 1$$ means either $$p$$ divides $$a$$ or $$q$$ does, but not both since then $$a=pq=n$$ which can not be since we picked $$0<a<n.$$ Let's assume $$p$$ does, and let $$a=hp,$$ with $$\gcd(q, h) = 1.$$ Then $$a^{ed} \equiv p^{eq}h^{eq} \equiv 0 \equiv ph \equiv a \pmod{p}.$$ Also, $$a^{ed} \equiv p^{ed}h^{ed} \equiv 1 \equiv ph \equiv a \pmod{q}$$ because $$q$$ does not divide either $$p$$ or $$h.$$ So we have that $$a^{ed}$$ and $$a$$ have the same remainder modulo both $$p$$ and $$q.$$ By the Chinese remainder theorem, we get the two are then equivalent modulo $$n=pq.$$ The case for $$q$$ dividing $$a$$ is symmetric. $$\blacksquare$$

## The Fun Part: Implementation and Practicalities

Let's put the pieces the puzzle together now and implement a simple RSA cryptosystem in Python. As we will see, each piece of the implementation here ends up being related to interesting and well-studied areas of mathematics and computer science.

### Key Generation and Primality Tests

To start, our key generation requires us to generate two primes $$p$$ and $$q.$$ We note above that if $$n=pq$$ is small enough then an attacker can simply factor $$n$$ to get the two primes, use them to calculate $$\phi(n)$$ and from there calculate $$d=e^{-1}.$$ Hence we need $$n$$ and therefore $$p$$ and $$q$$ to be large.

To generate a large prime $$p,$$ we start by deciding the number of bits we require in the prime. This number is one of the key parameters that determines the strength of our cryptosystem. Let's call this number $$b$$ and let's assume for simplicity that $$b>8$$ for all our intended purposes. In practice, $$b$$ is usually between $$512$$ and $$2048.$$ As I typed this, I decided to check Facebook and GMail to see what key strength they use in their secure TLS connections. Let's look at Facebook first:

OpenSSL> s_client -connect facebook.com:443
CONNECTED(00000003)
[...]
New, TLSv1/SSLv3, Cipher is AES128-GCM-SHA256
Server public key is 2048 bit


So $$2048$$ bit keys are used here. This means each prime is $$1024$$ bits long. And GMail:

OpenSSL> s_client -connect mail.google.com:443
CONNECTED(00000003)
[...]
New, TLSv1/SSLv3, Cipher is ECDHE-RSA-AES128-GCM-SHA256
Server public key is 1024 bit


So GMail is using $$1048$$ bit keys, meaning each prime is $$512$$ bits long.

To generate a $$b-\text{bit}$$ prime $$p$$ we start by generating $$b$$ bits of random data, and then manually making sure both the leading and trailing bits are ones to ensure the number is of size greater than $$2^{b-1}$$ (we don't want to generate, say, $$3$$ by chance, no matter how small the probability) and that it is an odd number. (Side note on entropy of the primes generated: forcing these bits to be $$1$$ here reduces the true entropy of the primes, more on this in a bit.) Let $$t$$ denote the number given by the manipulated random bits as described above. Then we proceed to check $$t+i$$ for primality starting from $$i=0$$ and onwards. As soon as a prime is reached, we let $$p$$ be equal to it. It is generally a good idea to give up on an interval after a certain number of attempts. To do this effectively, we need to consider the Prime Number Theorem which, letting $$\pi(n)$$ denote the number of primes less than or equal to $$n,$$ tells us:

\begin{equation*} \lim_{n=1}^{\infty}\frac{\pi(x)}{\frac{n}{\ln(n)}}=1. \end{equation*}

In other words, there are roughly $$\frac{n}{\ln(n)}$$ primes smaller than or equal to $$n$$ and the larger $$n$$ gets, the better this estimate becomes. This translates directly to the heuristic that around $$2^b,$$ for $$b>8$$ or so, a prime appears in about every $$b$$ consecutive numbers. However, this is only a heuristic, and large gaps in prime numbers exist. To avoid being trapped in one of these large gaps, we decide to give up on $$t$$ after say, $$2b$$ numbers consecutive to it have been tested and failed. In such a case, we simply generate $$b$$ new random bits and repeat the process all over. The end result ends up looking like this:

def generate_random_prime(bits, primality_test):
"""Generate random prime number with n bits."""
get_random_t = lambda: random.getrandbits(bits) | 1 << bits | 1
p = get_random_t()
for i in itertools.count(1):
if primality_test(p):
return p
else:
if i % (bits * 2) == 0:
p = get_random_t()
else:
p += 2  # Add 2 since we are only interested in odd numbers


This function gets in an infinite loop if there are no $$b-\text{bit}$$ prime numbers. This can only happen with small $$b,$$ so we ignore this possible source of bug at this point in favour of an efficient implementation.

Next, we notice in the above that the primality test is the very core of the algorithm. Let's first start by trying a very simple primality test and seeing its performance. To measure the performance, the logged decorator from the Python decorators post is used. First the primality test:

@logged("%b %d %Y - %H:%M:%S")
def simple_is_prime(n):
"""Returns True if n is a prime. False otherwise."""
if n % 2 == 0:
return n == 2
if n % 3 == 0:
return n == 3
k = 6L
while (k - 1) ** 2 <= n:
if n % (k - 1) == 0 or n % (k + 1) == 0:
return False
k += 6
return True


This is the somewhat more evolved version of the algorithm that checks $$n$$ against divisibility by all numbers smaller than or equal to $$\lfloor \sqrt{n}\rfloor.$$ Our version first tests $$2$$ and $$3$$ and afterwards against numbers of the form $$k=6i\pm{}1$$ since these are the only numbers not divisible by $$2$$ and $$3.$$ Let's look at the efficiency:

>>> for b in [8, 16, 32, 44, 45, 46]:
...     print "Generating %d-bit prime: " % b
...     print generate_random_prime(b, simple_is_prime)
...
Generating 8-bit prime:
- Running 'generate_random_prime' on Aug 21 2013 - 13:53:42
- Finished 'generate_random_prime', execution time = 0.000s
337
Generating 16-bit prime:
- Running 'generate_random_prime' on Aug 21 2013 - 13:53:42
- Finished 'generate_random_prime', execution time = 0.000s
129737
Generating 32-bit prime:
- Running 'generate_random_prime' on Aug 21 2013 - 13:53:42
- Finished 'generate_random_prime', execution time = 0.027s
6088203911
Generating 44-bit prime:
- Running 'generate_random_prime' on Aug 21 2013 - 13:53:42
- Finished 'generate_random_prime', execution time = 0.876s
34009683800003
Generating 45-bit prime:
- Running 'generate_random_prime' on Aug 21 2013 - 13:53:43
- Finished 'generate_random_prime', execution time = 1.139s
44586027757781
Generating 46-bit prime:
- Running 'generate_random_prime' on Aug 21 2013 - 13:53:44
- Finished 'generate_random_prime', execution time = 2.215s
87012671597869


On my computer, around $$b=42$$ is when the time required to generate a prime becomes noticeable. Of course, one run is not a very accurate measurement at all because of the randomness in the generation algorithm. However, we can easily see that while the algorithm above is of order $$O(\sqrt{n})$$ in $$n,$$ it is of order $$O(2^b)$$ in $$b.$$ In other words, it is exponential in $$b.$$

Fortunately, the problem of determining whether a number is prime or not is in $$P.$$ This means there is a deterministic polynomial time (polynomial in $$b$$ and not $$n$$ as shown above) algorithm for it. The first deterministic algorithm to accomplish this is called AKS which is named after the authors of the seminal paper "PRIMES is in P" by Agrawal, Kayal, and Saxena. You can read the original paper here.

However, a practical implementation of this algorithm ends up being quite inefficient, though still better than our simple algorithm above for large numbers. One of the better algorithms is Rabin-Miller which is also polynomial time in $$b,$$ but unlike AKS, Rabin-Miller is not deterministic. This means that the algorithm makes use of random numbers, and there is a chance it might not return the correct result at the end. With the proper choice of parameters, Rabin-Miller can reduce the probability of a false positive to astronomically small though, while still being quite efficient.

The basic idea of Rabin-Miller algorithm is to notice that if $$n$$ is prime then for any $$0<x<p$$ we have $$x^{n-1}\equiv 1 \pmod{n},$$ as we already saw. Let $$n-1=2^sm$$ where $$m$$ is odd. Now consider the finite sequence of $$s+1$$ numbers

\begin{equation*} x_i = x^{2^im} \mod n \quad \text{ for } 0 \le i \le s. \end{equation*}

Notice that the last number in the sequence $$x_s=x^{2^sm}=x^{n-1}=1 \mod n.$$. The sequence was defined based on the idea that each number is the previous number, squared, modulo $$p$$. That is, $$x_{i+1}=x_i^2 \mod n.$$ Because of this, we notice that if $$x_0=1$$ or $$x_0=-1$$ then the rest of $$x_i=1$$ for $$i>0$$ and there is not much information we can gain from this. Suppose that is not the case. Then $$2$$ divides the multiplicative order of $$x$$ (because $$x^{2^is} \equiv 1 \pmod{p}$$ for some $$i>0$$) which means that $$-1$$ is in the multiplicative subgroup generated by $$x.$$ This is given by the fact that $$-1$$ is the unique element in $$\mathbb{Z}_p$$ with order $$2$$ (because $$x^2-1$$ can not have any more than two roots) and Cauchy's theorem. Hence we must have $$x_i=-1$$ for some $$i.$$ If no such $$i$$ exists, then $$p$$ is certainly not a prime. This is the main condition used in Rabin-Miller.

If $$x$$ gives us a sequence $$x_i$$ that does not satisfy the above requirement, then $$x$$ is said to be a witness to $$n\text{'s}$$ compositeness, because we can use $$x$$ to prove $$n$$ is composite. That is, $$x$$ testifies that $$n$$ is a composite, hence the choice of word for it.

In algorithm form we get the following.

Rabin-Miller Algorithm
Given $$n$$ first calculate $$s$$ and $$m$$ such that $$n-1= 2^sm$$ and $$m$$ is odd. Then pick a random $$0 < x < p$$ and calculate $$x_0=x^m \mod n.$$ If $$x_0$$ is either $$1$$ or $$-1$$ then $$x$$ is not a witness. Try another $$x.$$ Otherwise, calculate $$x_1=x_0^2 \mod n,$$ and $$x_2=x_1^2 \mod n$$ up to $$x_s=x_{s-1}^2 \mod n$$ by squaring the previous term modulo $$n.$$ If $$1$$ is encountered before $$-1$$ then $$x$$ is a witness and $$n$$ is a composite. Otherwise if $$x_s=1$$ and $$-$$ is encountered before then $$x$$ is not a witness. Try another $$x.$$ Once "enough" $$x$$'s have been tried without any witnesses being found, the we can conclude that $$p$$ is "most likely" a prime.

Why is it that at the end we can't conclude with certainty that $$p$$ is a prime? Suppose we pick an $$x$$ and the generated sequence $$x_i$$ satisfies the requirements. Does that mean $$n$$ is a prime? Answer is no, because for some $$x$$ composites will also satisfy the requirement above too. However, if we pick many possible witnesses and try them all, and they all fail to testify to $$n$$ then we have more and more evidence toward $$n$$ being prime. If we check all such $$x$$ then we can be certain, but then the algorithm would be exponential in $$b$$ again which is what we are avoiding. However, if we could have some idea of the probability of $$x$$ being a witness for a composite $$n$$ then we can come up with a reasonable number of random attempts at witnesses before we can say that $$n$$ is almost certainly a prime.

And this is the key part of Rabin-Miller's paper, where they prove for a given composite $$n,$$ at least $$^3/_4$$ of numbers $$0<x<n$$ are witnesses to $$n$$ being a composite. (The proof for this claim is rather involved, but have a look at the original paper if you are feeling brave!) This means that if we pick $$k$$ random possible witnesses given that $$n$$ is composite, we will have a probability of $$(\frac{1}{4})^k$$ of none of them being a witness. So picking a $$k$$ value of say $$50$$ will give us a probability of $$(\frac{1}{2})^{100}$$ of outputting a false positive, that is, outputting PRIME when the number is a composite. This is good enough for our purposes here.

Also, in practice, one can make the Rabin-Miller algorithm much more efficient by implementing a basic test of divisibility by the first, say, $$100$$ primes before moving on to Rabin-Miller.

Let's look at a Python implementation below.

def rabin_miller_is_prime(n, k=20):
"""
Test n for primality using Rabin-Miller algorithm, with k
random witness attempts. False return means n is certainly a composite.
True return value indicates n is *probably* a prime. False positive
probability is reduced exponentially the larger k gets.
"""
b = basic_is_prime(n, K=100)
if b is not None:
return b

m = n - 1
s = 0
while m % 2 == 0:
s += 1
m //= 2
liars = set()
get_new_x = lambda: random.randint(2, n - 1)
while len(liars) < k:
x = get_new_x()
while x in liars:
x = get_new_x()
xi = pow(x, m, n)
witness = True
if xi == 1 or xi == n - 1:
witness = False
else:
for __ in xrange(s - 1):
xi = (xi ** 2) % n
if xi == 1:
return False
elif xi == n - 1:
witness = False
break
xi = (xi ** 2) % n
if xi != 1:
return False
if witness:
return False
else:
return True


In the code above, basic_is_prime simply checks the number against divisibility by the first K numbers. Here's the source for it, which relies on a list primes_list.less_than_hundred_thousand to have been set already.

def basic_is_prime(n, K=-1):
"""Returns True if n is a prime, and False it is a composite
by trying to divide it by two and first K odd primes. Returns
None if test is inconclusive."""
if n % 2 == 0:
return n == 2
for p in primes_list.less_than_hundred_thousand[:K]:
if n % p == 0:
return n == p
return None


Another part of the algorithm above that we get for free in Python but is worth studying a bit further is the pow(x, m, n) function which calculates $$x^m \pmod{n}$$ efficiently. How? The naive algorithm would simply multiply $$x$$ by itself $$m$$ times, but Python's $$pow$$ function is smarter than that and uses a method known as "exponentiation by squaring", which runs in $$O(\log m)$$ time. The basic idea is simple:

\begin{equation*} x^m= \begin{cases} (x^{\frac{m}{2}})^2 & \mbox{ if } m \mbox{ is even} \\ x \cdot (x^{\frac{m-1}{2}})^2 & \mbox{ if } m \mbox{ is odd}. \end{cases} \end{equation*}

Which leads to a divide and conquer algorithm. This is what a simple iterative version of this algorithm would look like in Python. However, use the built-in pow(x, m, n) function in Python instead of this, since it's going to be faster.

def power(x, m, n):
"""Calculate x^m modulo n using O(log(m)) operations."""
a = 1
while m > 0:
if m % 2 == 1:
a = (a * x) % n
x = (x * x) % n
m //= 2
return a


Now, let's try generating some primes using this test instead:

>>> for b in [2 ** i for i in xrange(3, 12)]:
...    print "Generating %d-bit prime: " % b
...    print generate_random_prime(b, rabin_miller_is_prime)
...
Generating 8-bit prime:
- Running 'generate_random_prime' on Aug 28 2013 - 20:32:32
- Finished 'generate_random_prime', execution time = 0.000s
503
Generating 16-bit prime:
- Running 'generate_random_prime' on Aug 28 2013 - 20:32:32
- Finished 'generate_random_prime', execution time = 0.001s
129757
Generating 32-bit prime:
- Running 'generate_random_prime' on Aug 28 2013 - 20:32:32
- Finished 'generate_random_prime', execution time = 0.002s
5483745739
Generating 64-bit prime:
- Running 'generate_random_prime' on Aug 28 2013 - 20:32:32
- Finished 'generate_random_prime', execution time = 0.003s
20207866191915625427
Generating 128-bit prime:
- Running 'generate_random_prime' on Aug 28 2013 - 20:32:32
- Finished 'generate_random_prime', execution time = 0.008s
378992665877954075225130181634050346399
Generating 256-bit prime:
- Running 'generate_random_prime' on Aug 28 2013 - 20:32:32
- Finished 'generate_random_prime', execution time = 0.013s
169109964058021108360022059334779755034470159052847134197148573143017688247123
Generating 512-bit prime:
- Running 'generate_random_prime' on Aug 28 2013 - 20:32:32
- Finished 'generate_random_prime', execution time = 0.062s
25984244777989501997155542913156869882081307629782603141084479047977114447740881200477139623732813907010625928930923928392846989307780602640371132948572817
Generating 1024-bit prime:
- Running 'generate_random_prime' on Aug 28 2013 - 20:32:32
- Finished 'generate_random_prime', execution time = 0.908s
243882388542316536634494263003432392747710195101405655638525548586460245284788605889744383018972173790715282433484881304387337823469782781741768460522379459266510277700837051129455157916910581124086417098371511903523124351405930407269823757127633787628818261404275424247249951449105500355258546878369871013997
Generating 2048-bit prime:
- Running 'generate_random_prime' on Aug 28 2013 - 20:32:33
- Finished 'generate_random_prime', execution time = 5.533s
39776036539901478897149714325867816317685970320648758660437997010070791355113972360487704959106715808827982189462166321507780789030474348830734451482712904952797349414199357350610796728394943895699307545779410349704150772133800816078777678410296457461265358640627126060911629419050162680462367824593593599185377995313654142814587795430128247012676213933608425620507345628348437045547639588191826106308640857920441463747225809423632900984055482095205841726987368122970422928048300140432181860090536264169603189938581906988427448587564353276787633192654461220520187517077846414138298337591988489586417244684890216725019


As you can see, primes up to $$512$$ bits are generated in negligible time. Even a $$2048$$ bit prime is generated in about five seconds which is quite impressive for a rather simple Python implementation.

Finally, let's put it all together to implement RSA, for which we will need an implementation of the extended greatest common divisor algorithm to determine the modular multiplicative inverse of a number.

def extended_gcd(a, b):
"""Returns pair (x, y) such that xa + yb = gcd(a, b)"""
x, lastx, y, lasty = 0, 1, 1, 0
while b != 0:
q, r = divmod(a, b)
a, b = b, r
x, lastx = lastx - q * x, x
y, lasty = lasty - q * y, y
return lastx, lasty

def multiplicative_inverse(e, n):
"""Find the multiplicative inverse of e mod n."""
x, y = extended_gcd(e, n)
if x < 0:
return n + x
return x

def rsa_generate_key(bits):
p = generate_random_prime(bits / 2)
q = generate_random_prime(bits / 2)
# Ensure q != p, though for large values of bits this is
# statistically very unlikely
while q == p:
q = generate_random_prime(bits / 2)
n = p * q
phi = (p - 1) * (q - 1)
# Here we pick a random e, but a fixed value for e can also be used.
while True:
e = random.randint(3, phi - 1)
if fractions.gcd(e, phi) == 1:
break
d = multiplicative_inverse(e, phi)
return (n, e, d)

def rsa_encrypt(message, n, e):
return modular.power(message, e, n)

def rsa_decrypt(cipher, n, d):
return modular.power(cipher, d, n)


And let's look at it in action:

>>> n, e, d = rsa_generate_key(8)
>>> print n, e, d
589 299 419
>>> message = 123
>>> cipher = rsa_encrypt(message, n, e)
>>> print cipher
309
>>> decrypted_message = rsa_decrypt(cipher, n, d)
>>> print decrypted_message
123


And same thing but with a much larger key, giving a more secure encryption scheme:

>>> n, e, d = rsa_generate_key(64)
>>> print n, e, d
49386969680342799959 39516096781065686051 14613413672463512651
>>> message = 123
>>> cipher = rsa_encrypt(message, n, e)
>>> print cipher
19163520560172629876
>>> decrypted_message = rsa_decrypt(cipher, n, d)
>>> print decrypted_message
123


And there it is. A very basic implementation of RSA that is still capable of handling rather large keys. Notice that the encryption and decryption algorithms are basically just modular exponentiation. Because of its importance in RSA's efficiency, modular exponentiation has been studied quite a bit in applied cryptography. A more efficient implementation would be using Montgomery reduction which I might go into in a future post.

## Appendix: Basic Number Theory Disguised as Group Theory

A few results from group theory that we use above are reviewed and proved here. Most of these can also be expressed as results in number theory in less abstract forms, but let's be honest, they look way cooler and are much easier to remember when they are written in the language of group theory.

Remember that for element $$a$$ in group $$G,$$ we define the order of $$a$$ as $$|a|$$ to be the smallest positive integer such that $$a^{|a|}=e_G$$ if such a number exists, and $$\infty$$ otherwise.

Theorem
For any element $$a$$ in group $$G,$$ $$a^k=e_G$$ iff $$k$$ divides $$|a|.$$ Proof If $$k=t|a|$$ for some integer $$t$$ then $$a^k=a^{t|a|}= (a^{|a|})^t=e_G^t=e_G.$$ Conversely, suppose $$a^k=e_G.$$ Let $$q, r$$ be such that $$k=q|a|+r$$ and $$0\le{r}<|a|.$$ Then $$e_G = a^{k}=a^{q|a|+r}=a^{q|a|}a^r=e_Ga^r =a^r.$$ Hence $$a^r=e_G$$ which means $$r=0,$$ giving $$k=q|a|.$$ $$\blacksquare$$
Theorem
Given group $$G,$$ if $$a\in{G}$$ has finite order then so does $$a^k$$ for any integer $$k$$ and furthermore
\begin{equation*} |a^k| = \frac{|a|}{\gcd(|a|, k)}. \end{equation*}

Proof Let $$n = |a|,$$ $$t = |a^k|,$$ $$d=\gcd(n, k)$$ and $$m=\frac{n}{d}.$$ We need to show $$t=m.$$ First, since $$d$$ divides both $$n$$ and $$k,$$ we have $$n=md$$ and $$k=ld$$ giving

\begin{equation*} (a^k)^m = a^{mk} = a^{\frac{n}{d}ld}=a^{nl}=(a^n)^l = e_G^{l} = e_G. \end{equation*}

This means $$m$$ divides $$t.$$ On the other hand, since $$(a^k)^t= a^{kt} = e_G$$ by definition. This also means that $$kt$$ divides $$n.$$ So $$kt = ns$$ for some integer $$s.$$ This gives $$ldt = mds$$ which simplifies to $$lt = ms$$ by cancelling $$d$$ from both sides. Now, $$m$$ and $$l$$ are relatively prime since we already cancelled out their greatest common divisor $$d.$$ Since $$m$$ is a divisor $$lt$$ then it must divide $$t.$$ So we have have $$m$$ dividing $$t$$ and $$t$$ dividing $$m$$ giving equality as needed. $$\blacksquare$$

We only state the following important result from group theory and use it, without proof. A proof can found in most basic group theory text books and is achieved by considering the cosets of the subgroup $$H.$$

Lagrange's Theorem
For any finite group $$G$$ and any subgroup $$H$$ of $$G,$$ we have $$|H|$$ divides $$|G|.$$
Corollary
For any finite group $$G$$ and any $$a\in{G},$$ we have $$a^{|G|}=e_G.$$ Proof By Lagrange's theorem $$|a| = |\langle{a}\rangle|$$ divides $$|G|.$$ The result then follows from the first theorem in this section. $$\blacksquare$$