Learn PKI in 30 Minutes
This guide explains the terminology behind Public Key Infrastructure (PKI). My target audience is programmers. I assume that you already have a basic understanding of how public key cryptography works. Alright, let's get started.
Before continuing, make sure you are familiar with how encryption and signing work in public key cryptography.
Consider this scenario:
Charles: Hey Bob, can you send a confidential email to Alice?
Bob: Who is that?
Charles: She's the product manager from ABC Ltd.
Bob: OK
So Bob tries to find Alice's public key. He searches the Web and finds several of them. How can Bob make sure he is using the public key of the Alice he actually wants to contact?
To solve this problem, we have what we call certificates. A certificate is essentially a document that contains two things:
- A person's name, email addresses, etc.
- A person's public key
So if Bob can find a certificate that binds public key A to Alice from ABC Ltd., he can be confident that public key A is indeed the correct key to use.
PKI is essentially a whole field that revolves around juggling these things called certificates.
But wait, what if Bob also finds several certificates that claim to have the public key of Alice? Which one can he trust?
This is where Certificate Authorities (CAs) come in. Basically, there is a small set of organizations in the world that we trust to be authoritative. Most operating systems and browsers ship with the certificates of these CAs preinstalled. If a CA says a certificate is authentic, we'll believe it.
How does a CA "say" that another certificate is authentic? By signing it. Because every system in a Public Key Infrastructure must be seeded with the public keys of trusted CAs, everyone should be able to verify that something is signed by a trusted CA.
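To make "signing" concrete, here is a minimal sketch using the openssl command line. It assumes OpenSSL 1.1.1 or later is installed; the file names and the statement text are made up for illustration. A real CA signs a structured certificate, not a text file, but the sign-then-verify mechanics are the same.

```shell
set -e
work=$(mktemp -d)   # scratch directory so nothing existing is overwritten

# The "CA" key pair (hypothetical file names).
openssl genrsa -out "$work/ca_key.pem" 2048 2>/dev/null
openssl rsa -in "$work/ca_key.pem" -pubout -out "$work/ca_pub.pem" 2>/dev/null

# The claim the CA vouches for.
printf 'Public key A belongs to Alice from ABC Ltd.\n' > "$work/statement.txt"

# Sign: hash the file with SHA-256, then sign the hash with the private key.
openssl dgst -sha256 -sign "$work/ca_key.pem" \
    -out "$work/statement.sig" "$work/statement.txt"

# Verify: anyone holding only the public key can check the signature.
openssl dgst -sha256 -verify "$work/ca_pub.pem" \
    -signature "$work/statement.sig" "$work/statement.txt"

# A tampered statement no longer matches the signature.
printf 'Public key M belongs to Alice from ABC Ltd.\n' > "$work/tampered.txt"
if openssl dgst -sha256 -verify "$work/ca_pub.pem" \
    -signature "$work/statement.sig" "$work/tampered.txt" >/dev/null 2>&1; then
    echo "unexpected: tampered statement verified"
else
    echo "tampered statement rejected"
fi
```

The private key never leaves the CA; the public key is the only thing everyone else needs to be seeded with.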
So, let us refine our view of what a certificate is:
- A person's name, email addresses, etc.
- A person's public key
- The Issuer of this certificate.
- The issuer's signature: a hash of this certificate's contents, encrypted with the issuer's private key. (So that a receiver can verify the signature of the issuer.)
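The four parts listed above can be seen in a real certificate. The following is a rough sketch, assuming OpenSSL 1.1.1 or later with its default configuration file; the subject and file names are invented for the example. Because the throwaway certificate is self-signed, its issuer is the same as its subject.

```shell
set -e
work=$(mktemp -d)

# Throwaway certificate for a hypothetical Alice (unencrypted key, 30 days).
openssl req -x509 -newkey rsa:2048 -nodes -days 30 \
    -subj "/O=ABC Ltd./CN=Alice/emailAddress=alice@example.com" \
    -keyout "$work/alice.key" -out "$work/alice.pem" 2>/dev/null

# Parts 1 and 3: who the certificate is about, and who issued it.
openssl x509 -in "$work/alice.pem" -noout -subject -issuer

# Part 2: the public key bound to that name.
openssl x509 -in "$work/alice.pem" -noout -pubkey

# Part 4: the signature algorithm; the signature bytes follow it in the
# full text dump (drop the grep to see everything).
openssl x509 -in "$work/alice.pem" -noout -text | grep 'Signature Algorithm'
```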
Let's look at an example scenario:
- Bob receives certificate X. He looks at it: it claims to be Alice's certificate and to be issued by A.
- In order for Bob to verify that X is valid, he downloads A's certificate (which contains A's public key) from somewhere, decrypts the encrypted hash inside X with that key, and checks whether it matches the hash he computes over X himself. If it matches, then X really was issued by A.
- Now the question is, Bob just downloaded A's certificate from somewhere, so how does he know it's valid? A's certificate is in turn signed by B, which is a trusted CA that was seeded when his computer was installed. So he just uses B's public key to verify that the copy of A's certificate that he used was indeed issued by B.
- Since B issued A and A issued X, we can say B trusts A and A trusts X. Since B is a trusted CA, we'll trust what B trusts. So Bob can safely trust the authenticity of X.
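The scenario above can be reproduced on one machine. This is only a sketch under a few assumptions: OpenSSL 1.1.1 or later with its default configuration, made-up names for B, A and Alice, and unencrypted throwaway keys. B plays the trusted root, A the intermediate CA, and X is Alice's certificate.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Extension file that marks a certificate as a CA (used for B and A only).
printf 'basicConstraints=critical,CA:TRUE\n' > ca_ext.cnf

# Root CA "B": key, request, then a self-signed certificate.
openssl req -new -newkey rsa:2048 -nodes -keyout b.key -out b.csr \
    -subj "/CN=Example Root CA B" 2>/dev/null
openssl x509 -req -in b.csr -signkey b.key -extfile ca_ext.cnf \
    -days 30 -out b.pem 2>/dev/null

# Intermediate CA "A": its certificate is signed by B.
openssl req -new -newkey rsa:2048 -nodes -keyout a.key -out a.csr \
    -subj "/CN=Example Intermediate CA A" 2>/dev/null
openssl x509 -req -in a.csr -CA b.pem -CAkey b.key -CAcreateserial \
    -extfile ca_ext.cnf -days 30 -out a.pem 2>/dev/null

# Alice's certificate "X": signed by A, no CA extension.
openssl req -new -newkey rsa:2048 -nodes -keyout x.key -out x.csr \
    -subj "/O=ABC Ltd./CN=Alice" 2>/dev/null
openssl x509 -req -in x.csr -CA a.pem -CAkey a.key -CAcreateserial \
    -days 30 -out x.pem 2>/dev/null

# Bob trusts only B. A is handed over as an untrusted helper certificate,
# so openssl has to check B -> A -> X itself.
openssl verify -CAfile b.pem -untrusted a.pem x.pem
```

Leaving out `-untrusted a.pem` makes the last command fail, because openssl then has no way to connect X to the trusted root B.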
The exact method that Bob uses to obtain A's certificate is not specified by RFC 5280. What that basically means is that people are free to use whatever protocol makes sense. For example, in PDF documents, it's common practice to embed the whole chain of certificates required to verify the signer, so a viewer does not need to go downloading every certificate in the chain.
But who signs the certificate of B, the trusted CA at the top? It's a chicken-and-egg problem. It turns out that CAs can sign their own certificates. These are called self-signed certificates. Everyone is supposed to obtain them through secure means (going to the CAs' offices and meeting them personally, for example), because PKI alone cannot ensure the validity of a self-signed certificate. We usually refer to these "top level" CAs as Root CAs. And we usually call this "going to their offices" type of thing out-of-band (e.g. out-of-band distribution of certificates means they are distributed through some channel other than the one PKI is protecting).
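A small sketch of why a self-signed certificate proves nothing by itself, assuming OpenSSL 1.1.1 or later with its default configuration; the CA name and file names are invented. Anyone can create such a certificate with any name, and it only "verifies" once we have decided to trust it. The fingerprint is the kind of value one could compare out-of-band.

```shell
set -e
work=$(mktemp -d)

# Anyone can mint a self-signed "root" carrying any name they like.
openssl req -x509 -newkey rsa:2048 -nodes -days 30 \
    -subj "/CN=Totally Legit Root CA" \
    -keyout "$work/root.key" -out "$work/root.pem" 2>/dev/null

# Subject and issuer are the same entity.
openssl x509 -in "$work/root.pem" -noout -subject -issuer

# It verifies only because we explicitly named it as a trusted CA.
openssl verify -CAfile "$work/root.pem" "$work/root.pem"

# A fingerprint that could be read over the phone or printed on paper
# and compared against the copy installed on a machine.
openssl x509 -in "$work/root.pem" -noout -fingerprint -sha256
```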
Most of the time, though, it is less rigorous than that. For example, your Web browser comes seeded with dozens of Root CAs, often well over a hundred depending on the vendor and version. The browser vendors have decided who is trustworthy, and so by using the browser, you trust their decisions.
I think the above pretty much explains the general theme of a PKI. The rest is only big words and complicated details to achieve what we just described. Here's a short list of explanations of what is actually behind most of those big words:
- X.509
- This is the standard (published by the ITU-T) that specifies most of these things, most importantly the format of a certificate.
- ASN.1
- Abstract Syntax Notation One. It's the syntax used to describe the things in a certificate. If certificates were written in XML, then ASN.1 would be the schema's syntax. (This is over-simplified. ASN.1 data can in fact be encoded as XML. Wikipedia's page on ASN.1 actually sums it up quite well.)
- PKIX
- Public-Key Infrastructure (X.509). The IETF working group that wrote the RFCs on these things (RFC 5280, for example).
- Algorithm
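- When used in a PKI context, this means things like "RSA with SHA1", "DSA with SHA1", etc. If you've read up on cryptographic signing, you'll know that we need to 1) hash something and 2) encrypt the hash with the private key. "RSA with SHA1" would mean that we hash with SHA1, and then encrypt with RSA. (SHA1 is no longer considered safe for signatures; new certificates typically use something like "RSA with SHA256".)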
- Object Identifiers (OIDs)
- OK this is messy. They figured out that we don't want to use English to describe something in an X.509 certificate. So they came up with numbers. For example, when we want to refer to "RSA" in a certificate, we don't put in the string "RSA". Instead, we'll use its OID "1.2.840.113549.1.1.1". You can read about registered OIDs here.
- DER
- Distinguished Encoding Rules. It's the binary format that a certificate described in ASN.1 is actually serialized to. It's easier to understand when you know there is another set of encoding rules called XER (XML Encoding Rules).
- PEM
- Privacy Enhanced Mail. This started out as a standard for securing email, and it defined a way to encode things like certificates in clear text. Nowadays, though, when people say PEM they usually mean DER further encoded in Base64 and wrapped between "-----BEGIN CERTIFICATE-----" and "-----END CERTIFICATE-----" lines (it uses only printable characters, so it is suitable for distribution through email).
- DN
- Distinguished Name. A unique name to identify someone. For example, "Karen" alone is probably not enough to identify someone, so we'll say something like "CN=Karen Berge,CN=admin,DC=corp,DC=Fabrikam,DC=COM", which looks self-explanatory. This MSDN page puts it quite well.
- SPKI
- Simple PKI. The PKI we just described (the RFC 5280 family) binds a public key to a distinguished name. SPKI describes certificates that bind a public key to a set of permissions instead. Not many people actually use this as far as I know.
- PKCS
- Public Key Cryptography Standards. You usually see people say PKCS #7 or PKCS #12. These are different documents in the same family of standards. For example, PKCS #7 describes how signed and encrypted messages should be packaged; PKCS #12 describes the format that stores a certificate and private key together, etc. I've listed some common and important PKCS standards below.
- PKCS #1
- An RSA public key contains two numbers (the modulus and the public exponent); a private key adds one more secret number (the private exponent). Among other things, this standard describes how to store those numbers in a file.
- PKCS #3
- Describes the Diffie-Hellman Key Exchange mechanism.
- PKCS #7/CMS
- Cryptographic Message Syntax. Describes the actual message that gets signed and/or encrypted. Think of this as the specification of a TCP packet -- some header information wraps the actual data.
- PKCS #9
- A standard on the "meta-data" (attributes) that the other PKCS standards carry around. For example, a PKCS #7 signed message can record its signing time, and a PKCS #10 request can carry a challenge password or a list of requested certificate extensions. PKCS #9 is the standard format of how to specify those attributes.
- PKCS #10
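- When someone sends their public key and name to a CA in order to get a certificate issued, they are said to be sending a certificate signing request. The request wraps the public key together with some meta-data and is signed with the requester's own private key; its format is specified by this PKCS.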
- PKCS #12
- The file format that stores a certificate and a private key together in one file.
- CRL
- Certificate Revocation List. These lists are published by CAs on their servers to say which certificates have been revoked, for whatever reason. This works like credit card revocation. For example, someone might have their private key stolen and want the certificate revoked. If you think "that would put huge loads on those central servers", you're right. People are still trying to come up with better strategies.
- OCSP
- Online Certificate Status Protocol. An attempt to improve on the CRL approach: instead of downloading a whole list, a client asks a server about the revocation status of one specific certificate.
- CSR
- Certificate Signing Request. See PKCS #10 above.
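A few of the terms above (PEM, DER, ASN.1, OIDs) are easier to grasp when you see them on disk. The following is a sketch only, assuming OpenSSL 1.1.1 or later with its default configuration; the subject and file names are invented for the example.

```shell
set -e
work=$(mktemp -d)

# A throwaway certificate; openssl writes PEM by default.
openssl req -x509 -newkey rsa:2048 -nodes -days 30 -subj "/CN=Example" \
    -keyout "$work/key.pem" -out "$work/cert.pem" 2>/dev/null

# PEM -> DER: same certificate, raw binary encoding.
openssl x509 -in "$work/cert.pem" -outform DER -out "$work/cert.der"

# The PEM body is just Base64 of the DER bytes: strip the BEGIN/END
# lines, decode, and compare with the DER file.
grep -v '^-----' "$work/cert.pem" | openssl base64 -d > "$work/decoded.der"
cmp "$work/cert.der" "$work/decoded.der" \
    && echo "PEM body decodes to the same DER bytes"

# Dump the ASN.1 structure. OIDs show up as OBJECT entries, printed by
# name where openssl knows one (e.g. sha256WithRSAEncryption).
openssl asn1parse -inform DER -in "$work/cert.der" | head -n 12
```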
# Generate a key pair
openssl genrsa 2048 > my_private_key.pem # 2048-bit RSA
openssl rsa -in my_private_key.pem -pubout > my_public_key.pem
# Now generate a CSR
openssl req -new -key my_private_key.pem > my_cert.csr
# Now act as our own CA and sign the CSR (self-signing)
openssl x509 -req -in my_cert.csr -signkey my_private_key.pem > signed.cer
# Display the information in that certificate
openssl x509 -in signed.cer -text -noout
# Put our key and certificate in a PKCS #12 container
openssl pkcs12 -export -in signed.cer -inkey my_private_key.pem -out p12cert.p12
# Generate a self-signed certificate and key in one command
openssl req -new -x509 -newkey rsa:2048 -out another_cert.cer -keyout another_key.pem
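The commands above prompt for a subject and pass phrases. For scripting, a non-interactive variant could look like the sketch below. It assumes OpenSSL 1.1.1 or later with its default configuration; the subject, the file names and the password "example" are placeholders. It also shows how to look inside a CSR and a PKCS #12 container.

```shell
set -e
work=$(mktemp -d)

# Key pair and CSR, with the subject given on the command line.
openssl genrsa -out "$work/key.pem" 2048 2>/dev/null
openssl req -new -key "$work/key.pem" -subj "/O=ABC Ltd./CN=Alice" \
    -out "$work/cert.csr"

# A CSR is signed with the requester's own key; check that signature
# and print who the request is for.
openssl req -in "$work/cert.csr" -noout -verify -subject

# Self-sign the CSR, valid for 30 days.
openssl x509 -req -in "$work/cert.csr" -signkey "$work/key.pem" \
    -days 30 -out "$work/signed.cer" 2>/dev/null

# Bundle key and certificate into PKCS #12 (placeholder password).
openssl pkcs12 -export -in "$work/signed.cer" -inkey "$work/key.pem" \
    -passout pass:example -out "$work/bundle.p12"

# Read the certificate back out of the container, without the key.
openssl pkcs12 -in "$work/bundle.p12" -passin pass:example -nokeys \
    | openssl x509 -noout -subject
```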