CS50, Personal Development

CS50x - Problem Set 2 - Substitution.

encryption, decryption, substitution

Published September 20th, 2024

Problem to solve

In a substitution cipher, we “encrypt” (i.e., conceal in a reversible way) a message by replacing every letter with another letter. To do so, we use a key: in this case, a mapping of each of the letters of the alphabet to the letter it should correspond to when we encrypt it. To “decrypt” the message, the receiver of the message would need to know the key, so that they can reverse the process: translating the encrypt text (generally called ciphertext) back into the original message (generally called plaintext).

A key, for example, might be the string NQXPOMAFTRHLZGECYJIUWSKDVB. This 26-character key means that A (the first letter of the alphabet) should be converted into N (the first character of the key), B (the second letter of the alphabet) should be converted into Q (the second character of the key), and so forth.

A message like HELLO, then, would be encrypted as FOLLE, replacing each of the letters according to the mapping determined by the key.

In a file called substitution.c in a folder called substitution, create a program that enables you to encrypt messages using a substitution cipher. At the time the user executes the program, they should decide, by providing a command-line argument, on what the key should be in the secret message they’ll provide at runtime.

int main(int argc, string argv[ ])

Let's start with the command-line argument and error handling. Here's the start to our main function:


            // Store the length of the key in a constant variable

            const int KEY_LEN = 26;


            int main(int argc, string argv[ ])

            {

               // If the argument count is incorrect OR the key isn't the correct length, print error
               if (argc != 2 || strlen(argv[1]) != KEY_LEN)
               {
                   printf("Usage: ./substitution key\n");
                   return 1;
               }

Next I want to check that each letter is only present once. To do this I will initialize a boolean array with a true/false for each letter. Then I can loop through 26 times, each time converting the letter in the key to uppercase to make the check case-insensitive. When I see a letter, I mark it as true, and if the matching index in the checking array is already true, I can return an error.


               // Initialize a boolean array to track which letters are present
               bool alphabet[KEY_LEN] = {false}


               // Loop through the key length
               for (int i = 0; i < KEY_LEN; i++)
               {
                   // Check character is NOT alphabetical
                   if (!isalpha(argv[1][i])
                   {
                       printf("Error: key must contain only letters.\n");
                       return 1;
                   }


                   // Get the corresponding index for the letter in the alphabet
                   int j = toupper(argv[1][i]) - 'A';


                   // Check if the letter is already in the key
                   if (alphabet[j])
                   {
                       printf("Error: key must only contain each letter once.\n");
                       return 1;
                   }
                   // Otherwise letter hasn't been seen before, so mark it as seen
                   else
                   {
                       alphabet[j] = true;
                   }
               }

Now all our checks are complete. If the key isn't provided at the command line, isn't exactly 26 characters long, isn't only alphabetical characters, or has any repeating letters, it will throw an error and return an exit status of 1. Next, we want to prompt the user for some plaintext, encrypt it with a key, and print the corresponding ciphertext. To do this we'll initialize two variables, plaintext, received from the user, and ciphertext, which will be an empty character array of the same length. Then we'll loop through the plaintext one character at a time, assigning the corresponding ciphertext character after running it through a new encrypt( ) function. Which looks like this:


               // Prompt the user for a string of plaintext
               string plaintext = get_string("Plaintext:  ");


               // Find the length of the string to create a matching ciphertext array
               int len = strlen(plaintext);
               char ciphertext[len + 1];
               ciphertext[len] = '\0';


               // Loop through plaintext, converting then assigning matching ciphertext
               for (int i = 0; i < len; i++)
               {
                   ciphertext[n] = encrypt(plaintext[n], argv[1]);
               }


               // Print ciphertext and return
               printf("Ciphertext: %s\n", ciphertext);
               return 0;
            }

And that's the end of our main function. Now I just need to create the encrypt( ) function.

char encrypt(char c, string key)

This function takes 2 inputs, the character we want to encrypt char c, as well as the key needed to encrypt it string key, it returns as output the corresponding char in ciphertext. To make sure the encryption is case sensitive, we need to handle 3 possibilities, if the character is uppercase, lowercase, or non-alphabetical. Here is the code:


            char encrypt(char c, string key)

            {

               // Return uppercase character
               if (isupper(c))
               {
                   int i = c - 'A';
                   c = toupper(key[i]);
                   return c;
               }
               // Return lowercase character
               else if (islower(c))
               {
                   int i = c - 'a';
                   c = tolower(key[i])
                   return c;
               }
               // Return all non-alphabetical characters
               else
               {
                   return c;
               }
            }

One note about the conversion: the reason I subtract 'A' or 'a' from the given character is to get the correct index of the letter from 0-25. As per the problem's instructions, the letter in the key[0] spot would substitute the letter 'A', key[1] would replace 'B', key[2] 'C', and so on. The way ASCII works is each letter has a number assigned to it, so 'A' - 'A' is effectively 65 - 65, which equals 0. 'B' - 'A' is 66 - 65, which equals 1, 'C' - 'A' = 2, etc. And that's it, my substitution program is complete and works as intended.

CS50x - Problem Set 2 - Substitution.

encryption, decryption, substitution

Problem to solve

int main(int argc, string argv[ ])

char encrypt(char c, string key)

Other Posts.