How TLS Certificates work

Introduction

This document provides a brief introduction to how TLS certificates work and how they help encrypt the traffic between two parties over the internet.

How HTTP works

HTTP (Hypertext Transfer Protocol) is a protocol used for communication over the internet. Communication between clients and servers is achieved by one party sending HTTP requests and the other party responding with HTTP responses.

HTTP, on its own, is not a secure communication channel. The data sent over the network is not encrypted, so anyone who is able to listen on the communication channel can read it. This is fine for simple web servers or for development purposes, but for systems that deal with financial transactions, medical records or any other data that requires privacy, plain HTTP will not suffice.

There are simple tools like Wireshark that allow us to “sniff” the traffic passing over a specific network interface. If the data is not encrypted, nothing stops anyone from reading it (and possibly even altering it for malicious reasons).

This is where HTTPS comes in.
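To make this concrete, here is a minimal sketch of what a plain HTTP request looks like as bytes on the network. The host, path and credentials are made up for illustration; the point is only that decoding the captured bytes needs no key.

```python
# Hypothetical login request; host, path and credentials are invented.
body = "user=alice&password=hunter2"
request = (
    "POST /login HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    f"Content-Length: {len(body)}\r\n"
    "\r\n"
    f"{body}"
)

# These are the bytes that would travel over the network with plain HTTP.
wire_bytes = request.encode("ascii")

# Anyone who captures the bytes can decode them without any key.
sniffed = wire_bytes.decode("ascii")
print(sniffed.splitlines()[-1])   # user=alice&password=hunter2
```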

What is HTTPS?

HTTPS is HTTP Secure. This means that, with HTTPS, the data sent over the internet is encrypted.

What is encryption?

Encryption is a way of altering the original content in such a way that only an intended party can properly decrypt the information and obtain the original content.

There are multiple ways of encrypting data. Let us consider two popular ways:

Symmetric Key encryption

In case of symmetric key encryption, both the sender and the receiver will have a shared key. This key will be used for encrypting the data on the sender’s side and the same key is used to decrypt the data on the receiver’s side.

Consider that you want to ship a trunk full of gold to your brother in another city. First, you buy a lock that has two keys. You send one of the keys to him by courier. Then, once he has received the key, you lock the trunk with the key you have and ship it. No one will be able to open the trunk and get its contents, since only you and your brother have the required keys. Everyone involved in transporting the trunk will see only a trunk; they cannot open it, obtain its contents or replace them with something else. Once your brother gets the trunk, he can open the lock with the key you shared with him earlier.
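The same idea can be sketched in a few lines of Python. This is a toy cipher built from SHA-256 purely for illustration (the function names are made up, and there is no nonce or integrity check); real systems use vetted algorithms such as AES-GCM or ChaCha20-Poly1305. What it does show is the defining property: one shared key both locks and unlocks the data.

```python
import hashlib
import secrets


def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key (SHA-256 over a counter)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_with_key(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: XOR-ing twice with the same keystream restores the data."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


shared_key = secrets.token_bytes(32)   # the key that both sides hold
plaintext = b"a trunk full of gold"

ciphertext = xor_with_key(shared_key, plaintext)    # sender locks the trunk
recovered = xor_with_key(shared_key, ciphertext)    # receiver unlocks it

# Someone holding a different key gets only garbage.
wrong_key = secrets.token_bytes(32)
garbage = xor_with_key(wrong_key, ciphertext)
```

The hard part, as in the trunk story, is getting the shared key to the other side safely in the first place.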

Asymmetric Key encryption

Asymmetric Key encryption is different in the sense that the keys used to encrypt and decrypt the data are different. The RSA (Rivest-Shamir-Adleman) algorithm is one of the most widely used algorithms in this space. Here, instead of having a single key at both ends, there is a pair of keys: a private key and a public key. If I am a web server that accepts medical data, I always ensure that my private key is kept safe and never disclosed. My public key, though, is published to the world. Thus, if someone wants to send some data to me, they use my public key to encrypt it. Only I, with my private key, am able to decrypt this data.

The public and private keys always form a pair. I cannot keep a private key and change its public key. Also, data transformed with the private key can be reversed only with the matching public key, and vice versa; this property is what makes digital signatures possible. In principle the RSA math treats the two keys symmetrically, but in practice the roles are fixed when the pair is generated: the public exponent is usually a small, well-known value, so the two keys cannot safely be swapped.

A simple (but probably not the best) analogy here is that of a lock that doesn’t require a key to lock. Thus, anyone can put this lock on something but only I, with my key, can open up this lock.
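A minimal sketch of RSA with tiny textbook primes shows the two keys at work. The numbers are far too small to be secure and there is no padding, so treat this only as an illustration of the arithmetic, not as how a real library does it.

```python
# Toy RSA with tiny textbook primes. Real keys use primes of 1024 bits or
# more plus a padding scheme such as OAEP; never use this for real data.
p, q = 61, 53
n = p * q                      # modulus, part of both keys
phi = (p - 1) * (q - 1)
e = 17                         # public exponent
d = pow(e, -1, phi)            # private exponent (modular inverse, Python 3.8+)

public_key = (e, n)            # published to the world
private_key = (d, n)           # kept secret by the owner


def apply_key(value: int, key: tuple) -> int:
    """RSA's core operation: raise to the key's exponent modulo n."""
    exponent, modulus = key
    return pow(value, exponent, modulus)


message = 65                                     # must be smaller than n
ciphertext = apply_key(message, public_key)      # anyone can encrypt
plaintext = apply_key(ciphertext, private_key)   # only the owner can decrypt
print(ciphertext, plaintext)                     # 2790 65
```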

In today’s internet communication, symmetric algorithms are used to encrypt the data that is sent between the client and the server (since they are much faster than asymmetric ones). Asymmetric algorithms like RSA are used during session setup to authenticate the server and to agree on the symmetric keys.

What is TLS?

TLS (Transport Layer Security) is the current standard for encrypting the communication between web servers and clients. TLS has replaced SSL (Secure Sockets Layer) as the default protocol for securing such internet communication. SSL evolved into TLS, which is why the two terms are sometimes used interchangeably.

HTTPS makes use of TLS for securing the internet communication. TLS mainly provides three facilities:

1. Data encryption – Ensure that only the intended parties can read and understand the data.

2. Authentication – Ensure that the communicating parties are actually who they claim to be.

3. Data integrity – Ensure that the data has not been altered or tampered with during the transfer over the internet.
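As a rough illustration of how a client asks for these guarantees, the sketch below uses Python's standard `ssl` module. The helper name `negotiated_tls_version` is made up for this example, and it is only defined here, not called, because it needs network access.

```python
import socket
import ssl

# The default client context requires a valid certificate chain and
# checks that the certificate matches the host name we asked for.
context = ssl.create_default_context()
print(context.verify_mode == ssl.CERT_REQUIRED)   # True
print(context.check_hostname)                     # True


def negotiated_tls_version(host: str, port: int = 443) -> str:
    """Open a TLS connection and return the protocol version, e.g. 'TLSv1.3'.

    Raises ssl.SSLCertVerificationError if the server cannot be authenticated.
    """
    with socket.create_connection((host, port), timeout=5) as raw_sock:
        with context.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            return tls_sock.version()
```

Calling `negotiated_tls_version("www.google.com")` on a machine with internet access should return the version that the client and server agreed on.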

What is an SSL / TLS Certificate?

An SSL / TLS certificate is a file, usually hosted on a web server, which contains the website’s public key and other related information.

The SSL certificate enables anyone communicating with the web server to obtain the web server’s public key and also to authenticate the web server.

The SSL certificate contains at least the following information:

1. Domain name which was certified (this can be a single domain, a wildcard on a domain or multiple domains)

2. The person or organization the certificate was issued to

3. Which certificate authority authorized it

4. The certificate authority’s digital signature

5. Issue date

6. Expiration date

7. Public key

There can be more details present as well.
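Most of the fields listed above can also be read programmatically. The sketch below works on a hypothetical certificate written in the dictionary layout that Python's `ssl.SSLSocket.getpeercert()` returns; every value in `sample_cert` is made up, and the helper names are only for illustration.

```python
import ssl
from datetime import datetime, timezone

# Hypothetical certificate in the layout returned by getpeercert();
# all names and dates below are invented for this example.
sample_cert = {
    "subject": ((("commonName", "www.example.com"),),),
    "issuer": (
        (("countryName", "US"),),
        (("organizationName", "Example Trust Services"),),
        (("commonName", "Example Intermediate CA"),),
    ),
    "notBefore": "Jan  1 00:00:00 2024 GMT",
    "notAfter": "Apr  1 00:00:00 2024 GMT",
    "subjectAltName": (("DNS", "www.example.com"), ("DNS", "*.example.com")),
}


def name_to_dict(name) -> dict:
    """Flatten the nested tuples used for subject/issuer into a plain dict."""
    return {key: value for rdn in name for key, value in rdn}


def to_utc(cert_time: str) -> datetime:
    """Convert a certificate timestamp string into an aware datetime."""
    return datetime.fromtimestamp(ssl.cert_time_to_seconds(cert_time), tz=timezone.utc)


def summarize(cert: dict) -> dict:
    return {
        "issued_to": name_to_dict(cert["subject"]).get("commonName"),
        "issued_by": name_to_dict(cert["issuer"]).get("commonName"),
        "valid_from": to_utc(cert["notBefore"]),
        "valid_until": to_utc(cert["notAfter"]),
        "domains": [value for kind, value in cert["subjectAltName"] if kind == "DNS"],
    }


summary = summarize(sample_cert)
print(summary["issued_to"], "issued by", summary["issued_by"])
```

Note that this dictionary form does not expose the public key or the signature; for those you would need the raw certificate (`getpeercert(binary_form=True)`) and a parsing library.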

To explore this further, you can open google.com in Chrome and click on the lock icon that appears next to the name in the address bar:

Next, click on the “Certificate” option and then on the “Details” section. You will see the details listed above here.

Also, click on the Certificate Path. You will see a tree of names here.

For a certificate to be globally accepted as valid, it needs to be issued by a Certificate Authority (CA). A CA is a globally trusted organization that can generate and issue SSL certificates. The CA also signs the certificates it issues with its private key, which allows clients to verify their authenticity.

A self-signed certificate, on the other hand, is something anyone can create directly with tools like OpenSSL. It will not be trusted globally, since it is not issued by a trusted CA.

Chain of trust

The end user certificate that the web servers use is part of a certificate chain. The chain starts at something called a root CA certificate, then can have multiple intermediate CA certificates and then finally ends at the end user certificate:

Here, you can see that Google Trust Services, the Root CA certificate, is at the top. Open this certificate:

Now, if you are on Windows, open Certmgr and check for the same name under the Trusted Root Certification Authorities / Certificates:

You can see that the names match and so do the expiry dates. This is the Root CA certificate that is shipped with the OS (or installed through some updates). Thus, the system knows that these are trusted Root CAs.

One other thing to note about Root CA certificates is that the issuer and the issued to names are the same, because a root certificate is self-signed: the Root CA signs it with its own private key. Since these certificates are installed with the OS, the system knows to trust them even though they are self-signed.

Next in the chain is the intermediate CA certificate (GTS CA 101). You might not find this in your certificate store. Neither will the www.google.com certificate be present there. Then how does the web browser trust these certificates? It does so through a process called the chain of trust.
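The trusted roots can also be listed from code instead of Certmgr. The sketch below uses Python's standard `ssl` module; what it prints depends entirely on the platform, and on some systems the list can be empty because certificates kept in a directory are only loaded when they are first needed.

```python
import ssl

# create_default_context() loads the CA certificates the platform trusts.
context = ssl.create_default_context()
trusted = context.get_ca_certs()   # list of dicts, one per loaded CA certificate

for cert in trusted[:5]:
    subject = {key: value for rdn in cert["subject"] for key, value in rdn}
    issuer = {key: value for rdn in cert["issuer"] for key, value in rdn}
    name = subject.get("commonName") or subject.get("organizationName")
    # For a root certificate the issuer and the subject are the same name.
    print(name, "| expires:", cert.get("notAfter"), "| self-issued:", subject == issuer)
```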

Let us see how a chain of trust is built using the chain of certificates.

Consider the following diagram:

As we discussed earlier, the Root CA signs its own certificates. The intermediate CA certificates are signed by the private key of the Root CA. The end user certificate is signed by the private key of the intermediate CA.

The Intermediate CA requests a certificate from the Root CA. The Root CA generates the certificate with all the details provided by the Intermediate CA, signs it and hands it back to the Intermediate CA. An end user who wants to obtain a certificate sends the request to the Intermediate CA. The Intermediate CA performs the same set of steps and signs the end user certificate with the Intermediate CA’s private key.

Signing a certificate means creating a hash of the certificate and encrypting this hash with the issuer’s private key.
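The hash-then-encrypt idea can be sketched with the same kind of toy RSA key used for illustration elsewhere. The primes are tiny and there is no padding (real CAs use 2048-bit or larger keys with a scheme such as PKCS#1 v1.5 or PSS), and the certificate text is made up, so this only demonstrates the principle.

```python
import hashlib

# Toy RSA key pair for the issuer (tiny textbook primes, illustration only).
p, q, e = 61, 53, 17
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))   # private exponent, known only to the issuer


def digest(data: bytes) -> int:
    """SHA-256 hash, reduced modulo n so that it fits the toy key."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


def sign(data: bytes) -> int:
    """Issuer side: transform the hash with the private key."""
    return pow(digest(data), d, n)


def verify(data: bytes, signature: int) -> bool:
    """Client side: undo the transform with the public key and compare hashes."""
    return pow(signature, e, n) == digest(data)


certificate_body = b"subject=www.example.com;issuer=Example Intermediate CA"
signature = sign(certificate_body)

print(verify(certificate_body, signature))             # True
print(verify(certificate_body, (signature + 1) % n))   # False: forged signature

# A modified certificate no longer matches the signature
# (False unless the tiny modulus happens to cause a collision).
tampered = certificate_body.replace(b"example.com", b"evil.test")
print(verify(tampered, signature))
```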

How are the certificates validated?

When an HTTPS connection is established, the web server sends all of the certificates in its certification path (except for the Root CA certificate). Thus, when we access google.com, the Google servers send the www.google.com and the GTS CA 101 certificates.

The following steps are used to verify the certificate chain:

1. Open the end user certificate and verify that the current date is between the issue date and the expiry date of the certificate. If so, proceed. If not, the verification process fails.

2. Next, look at the issuer info and try to find the certificate (among the ones received via the HTTPS session) that has the same issued to info. This will be an Intermediate CA certificate.

3. Obtain the public key from the matching Intermediate CA certificate and decrypt the signature field in the end user certificate. Compare the decrypted hash to the hash value computed over the end user certificate (the hash function used is also specified in the certificate). If these match, we can be sure that the end user certificate was indeed issued by the Intermediate CA. If not, the end user certificate is not valid.

4. Next, repeat the same process for the Intermediate CA certificate. Take its issuer info and look for a certificate with matching issued to info among the other certificates received. If none is found, look for it in the certificate store of the OS. Then verify that the signature can be decrypted with the matching Root CA’s public key and that the result matches the computed hash. If it does, the Intermediate CA’s certificate is valid and can be trusted.

5. Since the root CA certificate is already trusted, we can now trust the Intermediate CA’s certificate and thus, also the End user certificate becomes trusted.
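The steps above can be sketched end to end with toy certificates. Everything here is an assumption made for illustration: the `Cert` class, the helper names, the certificate names and the tiny RSA keys are invented, and a real validator also checks things this sketch ignores (host name, revocation, key usage and so on).

```python
import hashlib
from dataclasses import dataclass
from datetime import date


def make_keys(p: int, q: int, e: int = 17):
    """Return (public, private) toy RSA keys built from two small primes."""
    n = p * q
    return (e, n), (pow(e, -1, (p - 1) * (q - 1)), n)


@dataclass
class Cert:
    subject: str          # "issued to"
    issuer: str           # "issued by"
    not_before: date
    not_after: date
    public_key: tuple
    signature: int = 0

    def body_hash(self, modulus: int) -> int:
        """Hash of every field except the signature, reduced to fit the toy key."""
        body = "|".join(map(str, (self.subject, self.issuer, self.not_before,
                                  self.not_after, self.public_key)))
        return int.from_bytes(hashlib.sha256(body.encode()).digest(), "big") % modulus


def issue(cert: Cert, issuer_private: tuple) -> Cert:
    """Sign the certificate with the issuer's private key."""
    d, n = issuer_private
    cert.signature = pow(cert.body_hash(n), d, n)
    return cert


def in_validity(cert: Cert, today: date) -> bool:
    return cert.not_before <= today <= cert.not_after


def validate(leaf: Cert, received: list, trust_store: dict, today: date) -> bool:
    cert = leaf
    for _ in range(len(received) + 1):            # bounded walk up the chain
        if not in_validity(cert, today):          # step 1: date check
            return False
        # steps 2 and 4: find the issuer among the received certs, else in the store
        issuer = next((c for c in received if c.subject == cert.issuer), None)
        trusted = False
        if issuer is None:
            issuer = trust_store.get(cert.issuer)
            trusted = issuer is not None
        if issuer is None:
            return False
        e, n = issuer.public_key                  # step 3: signature check
        if pow(cert.signature, e, n) != cert.body_hash(n):
            return False
        if trusted:                               # step 5: reached a trusted root
            return in_validity(issuer, today)
        cert = issuer
    return False


root_pub, root_priv = make_keys(61, 53)
inter_pub, inter_priv = make_keys(89, 97)
leaf_pub, _ = make_keys(67, 71)

valid = (date(2024, 1, 1), date(2034, 1, 1))
root = issue(Cert("Toy Root CA", "Toy Root CA", *valid, root_pub), root_priv)
inter = issue(Cert("Toy Intermediate CA", "Toy Root CA", *valid, inter_pub), root_priv)
leaf = issue(Cert("www.example.com", "Toy Intermediate CA", *valid, leaf_pub), inter_priv)

trust_store = {root.subject: root}                # what the OS ships with
today = date(2025, 6, 1)
print(validate(leaf, [inter], trust_store, today))              # True
print(validate(leaf, [], trust_store, today))                   # False: missing link
print(validate(leaf, [inter], trust_store, date(2040, 1, 1)))   # False: expired
```

The second call fails because, without the intermediate certificate, there is no way to connect the end user certificate to a root in the trust store.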

Once the chain of trust is built and verified, we usually see the lock icon in the address bar of the browser, indicating that this web page can be trusted.

One thing to note is that the end user requesting a certificate needs to provide proof of actually being in control of the domain for which the certificate is requested. This is done in a number of ways that are currently out of the scope of this document. We can be sure that I cannot ask an Intermediate CA to provide me with a certificate for google.com and, similarly, that no one else can request a certificate for a domain I own. Thus, if the certificate is trusted, we can also trust that the website is actually who it claims to be.

Next things to update in this document:

1. How the TLS session enables creating and sharing of the symmetric key that shall be used for encrypting the actual application data

Sources / References:

https://www.udemy.com/course/ssl-complete-guide/

https://www.cloudflare.com/learning/ssl/what-is-an-ssl-certificate/

https://www.cloudflare.com/learning/ssl/transport-layer-security-tls/