Understanding Two-Step Verification With An Example Using Python and Google Authenticator

I have been using Google Authenticator to secure most of my online accounts, well the ones that support it anyway, for a couple of years now, and have been encouraging friends to use it as well. So when I had to pick a project to work on for my cryptography course, I decided to do it on Time-Based One-Time Passwords (TOTP) which are used by Google Authenticator for two-step authentication. This post is based on my project's report, and is an introduction to two-step authentication, its advantages, and its implementations using HOTP and TOTP algorithms. An example of how to enable Google Authenticator based two-step verification on a website is given in Python. The example uses Flask, pyotp, and qrcode and is hosted on Heroku.

The slides for this project are available here. The source code for the example demo is available on GitHub at https://github.com/sahands/python-totp and you can access a live version of the example at http://python-totp.herokuapp.com/.

The first few sections provide an introduction to authentication methods and a bit of theory, so if you are already familiar with the terminology and the basics, jump to the HMAC-Based One-Time Password Algorithm (HOTP) section.

Definitions and Motivation

One-time passwords are used as ownership authentication factors. First, let's define the problem of authentication. For simplicity I will use the basic scenario of Alice sending a message to Bob. Alice is called the originator and Bob the receiver. The problem of authentication is that of Bob, the receiver, wanting to confirm that the originator of a message he receives, claiming to be from Alice, is truly Alice.

Authentication is often done by attaching information to transmitted messages. Each such attached piece of information is called an authentication factor. The most common authentication factor used (at least on the web) is a password.

Of course, in practice, Alice would not attach her password to every message she sends Bob. Instead, she will send a login message to Bob containing her ID and a hash of her password, and after Bob verifies the authenticity of the login message, he generates a login token which will usually be valid for an amount of time which Bob decides on based on security needs, and sometimes based on whether Alice included a "remember me" option in her login message. But for the purposes of this article, the exact process is besides the point and we can simplify things to Bob needing to authenticate a single message from Alice. In practice, that usually ends up being the login message, and everything after will be authenticated using the token instead while the login token is still valid.

There are three general types of authentication factors:

Knowledge Factors: Something the originator knows. Examples are passwords, birth dates, first car's model, friend's faces, and amount of last payment made to a credit card.
Ownership Factors: Something that the originator owns. Examples are badges, ID cards, cellphones, keys, etc.
Inherent Factors: Something inherent to the originator. Examples are DNA, facial features, fingerprints, etc.

Knowledge factors can in turn be categorized as static and dynamic. Static factors are knowledge factors that do not naturally change over time. Passwords are static knowledge factors because they only change when the receiver and originator change the shared password. The last payment made to a credit card is dynamic, in contrast to passwords, because every time a new payment is made, this shared knowledge changes. Another example of dynamic knowledge factors used for authentication would be Facebook asking you identify your friends based on their profile pictures. Since your friends and their profile pictures change over time, this is also a dynamic knowledge factor.

Throughout this article, assume that the originator and receiver can establish a secure channel of communication. This can be done through use of TLS/SSL, for example.

Static Knowledge Factors: Passwords and Secret Questions

A password is a secret key that is shared between Alice and Bob. Typically, Alice sends Bob her password along with the message, and Bob then compares a hash of the password with the hash of the correct password that it has stored and if they match, Alice's message is authenticated.

Passwords are simple to implement, and require no specialized hardware. Because of that, they are the most commonly used form of authentication on the internet. However, as static knowledge factors, passwords are susceptible to brute-force attacks, guessing, and social engineering (e.g. phishing), among other security issues.

Another issue with password-only based authentication is that once an attacker has the password, he gains full access to the account. This often means that the attacker can change the password, effectively cutting off Alice's access to her account until she can convince Bob, through means other than knowledge of her old password, that her account is compromised.

And of course, all of these problems are exacerbated by password reuse. Since many users will use the same password for various accounts, once an attacker gains access to one account, he can then gain access to many others.

One way to strengthen a password-based authentication system is two-factor authentication. This means attaching two authentication factors to a message instead of one. One of the most common ways this is often implemented in practice is through the use of secret questions. These are questions such as "What was your first car's model?"

Secret questions are also static knowledge factors and are hence susceptible to the same forms of attacks. Worse, knowledge of the question reduces the entropy of the answers heavily, thereby making them very poor authentication factors. Just think about how easy it is to find out a person's birth date, first car model, or where they grew up by, say, getting to know them, or getting them to fill in a "survey" online for $10, or simply having access to their social networking account.

To conclude, passwords alone are poor authentication factors means, and secret questions are poor choices of second factors.

One Time Passwords

An alternative solution, and one that I would argue is much more secure, is the use of one-time passwords.

The basic protocol for one-time passwords is as follows:

Alice sends Bob a request for authentication.
Bob generates a random one-time password and sends it to Alice through a trusted channel.
Alice attaches the one-time password to her message and sends it to Bob.
Bob checks to see if the attached one-time password is the same that he sent to Alice.

Here, the security relies on the channel of communication for the one-time password being trusted, in the sense that Bob has reason to believe that Alice (and only Alice) will receive the one-time password. This is often achieved by using cellphones as the channel of communication. That is, Bob will send Alice the one-time password by sending a text message to, or calling, Alice's cellphone.

A different scheme, which removes the need for communicating the one-time password on every authentication attempt is:

Alice and Bob establish a secure and authenticated channel of communication.
Bob generates a random strong secret key and shares it with Alex, both storing it.
For future authentication requests, Alice and Bob pass the shared secret key plus a "counter" value to a cryptographic pseudo-random function and then extract a one-time password from the result
Alice sends the generated one-time password to Bob along with the authentication request.
Bob checks the sent one-time password with the one he generated using the same "counter". If they match, authentication succeeds.

This is the scheme that is implemented in HOTP and TOTP, which only differ by their choice of counter value.

HMAC-Based One-Time Password Algorithm (HOTP)

The basic idea behind the Hash-Based One-Time Password (HOTP), as described in RFC 4226, is to use a shared secret key (separate from the password) and a counter as an input to a cryptographic hash function and use the output as the one-time password. The shared secret key is exchanged at some point when Alice is already authenticated, and is then stored by both Alice and Bob. Alice will have to store the key on her device that will generate the one-time passwords, and Bob will have to store it on his server that will do the verification of the one-time passwords. A counter value $C$ is initialized to some initial value as well, usually $C=0$ .

Since Alice does not have to memorize or type the shared secret key, this key can be much longer and complicated, and hence much stronger than a password. It is also much less likely to be stolen through social engineering for example, since Alice will not even know it without having to extract it out of the file system, for example. The key will be stored, in encrypted form most likely, on her device.

The cryptographic hash function that is used in the HOTP standard is HMAC-SHA-1. After HMAC-SHA-1 is applied, a truncation algorithm is run on the result to extract D digits from the result to be used as the one-time password. Hence the one-time password is calculated as

k_C = Truncate(Hash(k \| C)).

(Here $\|$ means bitstring concatenation.)

After each authentication attempt, the counter value is incremented by both Alice and Bob.

There are few issues with HOTP. First is that the one-time passwords generated for a given counter value are valid until they are used. This means that someone can use the Alice's device to generate a key and then use the key to authenticate with Bob. Such an attacker will have until Alice's next authentication attempt to use the code, which can potentially be days or months.

The second problem is that of the counter going out of synchronization. The standard provides a process to deal with Alice and Bob's counter values going out of synchronization. However, dealing with this is bit of a technical hurdle, and potentially an inconvenience for the user.

Time-Based One-Time Password Algorithm (TOTP)

The Time-Based One-Time Password (TOTP) algorithm described in RFC 6238 is very similar to HOTP but uses a time-based value instead of a counter value, which solves the aforementioned. The time-based value $T$ is simply calculated as

T = \left \lfloor \frac{T_{current} - T_0}{X} \right \rfloor

where $T_{current}$ is the number of seconds since epoch and $T_0$ is some agreed upon initial time, usually taken as $T_0=0$ . The parameter $X$ determines how long the one-time passwords are valid for. The TOTP standard recommends choosing $X=30,$ allowing the one-time passwords to be valid for 30 seconds each. This provides a nice balance of usability and security.

Provided that Alice and Bob have (almost) synchronized clocks, then they will produce the same $T$ value in any given 30-second window of time. Then the following is calculated as the one-time password, using the same truncation algorithm and hash function as in HOTP:

k_T = Truncate(Hash(k \| T)).

(Again, reminder that $\|$ means bitstring concatenation.)

Implementation in Python

A demo implementation of TOTP's using Python (packages Flask, pyotp, and qrcode in particular) running on Heroku was done as part of the project and is accessible at http://python-totp.herokuapp.com/. The source code is available at https://github.com/sahands/python-totp.

Sources

RFC 4226 - HOTP: An HMAC-Based One-Time Password Algorithm: https://www.ietf.org/rfc/rfc4226.txt
RFC 6238 - TOTP: Time-Based One-Time Password Algorithm: https://tools.ietf.org/html/rfc6238
Python pyotp Package: https://github.com/nathforge/pyotp

Math ∪ Code

by Sahand Saba