Functions in C

Functions are the building blocks of programs in C.

Posted 15 January 2020 at 1:26 PM

By Joseph Mellor

This in the tenth article in the Making Sense of C series. In this article, we're going to come up with ways to group together statements in code to make it possible to reuse code without copying it, which will improve the legibility of our code and reduce the spread of bugs.

So far, we've

determined that we're going to give the compiler a file with a bunch of statements ending in semicolons,
established that we can use comments with // for single line comments and /* and */ for multiline comments,
reserved the symbols +-*/% for arithmetic,
set up variables [type] [variable] = [expression] which will allow us to store values for later use,
come up with the integral types (char, short, int, and long long) and the floating point types (float and double),
figured out a way to represent characters using the char type and invented the '\0' character, which indicates that we're ending a string,
decided to use single quotes around a character to represent the ASCII value for that char,
explained how the program uses memory addresses to identify variables,
came up with a way to access the memory address of a variable using the address of operator (&),
came up with a way to access the value stored at a memory address using the dereference operator (*),
created pointer variables to allow us to store memory addresses using the syntax type * variable_name;,
came up with a way to tell the computer to get us a block of memory (a.k.a. an array or buffer) using the syntax type array[num_elements];,
came up with a way to initialize an array with an initializer list,
came up with a way to initialize a char array using double quotes ("Hello!"),
came up with a way to access elements of an array using the syntax variable_name[offset],
added the ability to use conditional branches through if else statements,
and added the ability to use loop through something with for and while.

In the last article, we established control flow, which is the collective term for conditional branches and loops.

Technically, functions are also included in control flow, but I didn't include them in the control flow article because you could replace the function with the code in the function and get the same exact result.

We also used control flow to come up with some code that can do different things depending on whether two strings match:

int i = 0;

while (str1[i] && str2[i] && (str1[i] == str2[i])) {
    i += 1;
}

if (str1[i] == str2[i]) {
    // Do stuff you would want to do if the two strings match
} else {
    // Do stuff you would want to do if the two strings don't match
}

We'll modify it in this article, but not significantly.

With just what we have up to this point, we have enough functionality in the language to make any program, but doing so would be tedious for several reasons, most notably the lack of functions and the lack of a standard library. In this article, we're going to add functions to C.

What is a Function?

A function in mathematics takes in an input and produces an output, with the rule that the same exact input produces the same exact output. Functions in C work just like functions in math, though the input and output are slightly more complicated. To the computer, a function is a set of instructions that the computer will jump to, execute, and jump back from.

Function Syntax

Since we used the general syntax

keyword (input) {
    code to execute;
}

for the other elements of control flow, let's keep it with modifications. We know that we're going to put the code to execute between the curly braces, but functions also have inputs and an output. Like with if statements, for loops, and while loops, we put the input between the parentheses but we need somewhere to put the output. We also need to know the types of everything and we need to be able to provide multiple inputs. Inside the function, we'll need to tell the computer what we want to return from the function, so we'll use the keyword return with the syntax return output;. Finally, when we want to execute the function, we just need to specify the name and the inputs.

The syntax for calling a function is

function_name(first_input, second_input);

If we want to store the output, we can use

variable = function_name(input);

The syntax for declaring a function with one input is

output_type function_name(type input) {
    first statement;
    second statement;
    // and so on
    return output;
}

The syntax for declaring a function with two inputs is

output_type function_name(first_type first_input, second_type second_input) {
    first statement;
    second statement;
    // and so on
    return output;
}

I'll let you guess the syntax for three inputs.

We can also have functions that take in no input using the syntax:

output_type function_name(void) {
    first statement;
    second statement;
    // and so on
    return output;
}

We can have functions that return no output by making the output type void:

void function_name(void) {
    first statement;
    second statement;
    // and so on
    return;
}

The `void` Type

The void type is different from all other types because you cannot make a variable of void type and you cannot use a void type in any of the operations we've established. It might seem pointless, but it has a few specific usages. Returning nothing from a function and telling the compiler that the function doesn't take inputs are two uses. Later, we'll cover the other main use of void.

As you would guess we can use the inputs to our function in our function. Inputs and the output can be any of the types we've established up to this point, including pointer types such as char * and int *.

`return` Statements Exit the Function Immediately

I want to emphasize that your function will exit as soon as it hits a return statement. For example:

unsigned long long smallest_fibonacci_above(unsigned long long number) {
    // 12200160415121876738 is the largest Fibonacci number we can represent in
    // 64 bits, so if someone gives us a number greater than it, we will have an
    // infinite loop below, so we just return some value to indicate that we
    // cannot calculate the required Fibonacci number
    unsigned long long max_fibonacci_number = 12200160415121876738;

    if (number > max_fibonacci_number) {
        return 0;
        // Anything after this point inside the surrounding curly brackets will
        // not be executed since the program has already left the function
    }
    unsigned long long previous, current, next;
    previous = 0;
    current = 1;
    while (current < number) {
        // The next three lines are the definition of the Fibonacci sequence
        next = previous + current;
        previous = current;
        current = next;
    }
    return current;
    // Anything after this point inside the surrounding curly brackets will not
    // be executed since the program has already left the function
}

If the value of number is greater than 12200160415121876738, then the function will immediately exit with output 0 without even looking at the other code, otherwise, it will calculate the smallest Fibonacci number above the number you give the function.

By the way, here's the code without comments so it's easier to read:

unsigned long long smallest_fibonacci_above(unsigned long long number) {
    unsigned long long max_fibonacci_number = 12200160415121876738;
    if (number > max_fibonacci_number) {
        return 0;
    }
    unsigned long long previous, current;
    previous = 0;
    current = 1;
    while (current < number) {
        next = previous + current;
        previous = current;
        current = next;
    }
    return current;
}

Pass by Value

Lastly, I want to make sure that it's clear that when you call a function, you create copies of the inputs and store them into whatever variables you use for the parameter.

unsigned long long example = 100;
unsigned long long result = smallest_fibonacci_above(example);

Notice that I passed in example to the function even though the function definition uses number as the parameter, and the variables will refer to different memory. The value in example will be copied into number, meaning that if you had a function like

void change_value_to_four(int number) {
    number = 4;
    return;
}

and you use it like so

int random_number = 73;
change_value_to_four(random_number);

random_number is still 73, because random_number is copied into number, and, just as modifying a copy of something doesn't affect the original, neither does modifying number. You should have two questions:

Why does C copy the input into a variable?
How can I give a function some data and have it modify the data?

We can answer both of them easily.

Why Does `C` Copy Input into Variables?

I can actually show you with one line:

change_value_to_four(73);

Remember that plain numbers are rvalues, which means they neither have a memory address nor can they be assigned values. If C didn't copy 73 into a variable, this program would make no sense because we can't store things without somewhere to store it.

Modifying Data Passed Into Functions

Functions will always copy their inputs into their own local variables, so we can't pass the value as is. In other words, if we want to pass in an int to a function that we can modify, we cannot have it come in as a parameter as an int like so

void example_function(int input) {
    // input is a copy of whatever you passed in, so you can't modify it
}

You're going to need to modify the int input section of the code. If you look through everything we've come up with so far in C, you should be able to figure out some way we can tell the computer what memory we want to modify. For now, I'm going to leave this as an exercise for you to figure out on your own. I'll cover it in the future, but just go through all the features we've covered up to this point and figure out which feature can help us. For example, comments cannot help us because the compiler will ignore them and arithmetic operations cannot help us because any arithmetic we do within the function will be done on the local copy, not the original data.

Functions in Our Programs

For us, we'll need functions to:

check if two words match,
read a file word by word,
print the output,
get the word the user wants to be counted,
and get the filename from the user.

Since reading from a file is dependent on your file system and printing output is dependent on your terminal/console/command prompt, you as a programmer shouldn't have to write these functions for yourself, so we'll include them in the standard library, which we'll discuss more in the next few articles.

Checking if two words match is common enough that the standard library also has some functions to handle it for you, but since almost all computers use the same character encodings, we can actually write the function ourselves. Of course, the authors of the standard library are going to write platform-specific code that takes advantage of how the computer works at a low level for each platform, CPU architecture, etc., so we probably shouldn't use our code for checking if strings match in code that we're going to publish. We'll write the function ourselves just to get some practice working with functions.

In our case, we'll convert our code to check if two words are the same into a function.

We'll need the two strings as input and a name for the function. Since we're checking if two strings match/differ, let's call it check_if_strings_differ. We could have also done check_if_strings_match, but check_if_strings_differ is more consistent with how the standard library works, so we won't have to change our code in the future once we replace our function with the standard library.

We'll also need to figure out what exactly we want the function to do so we can figure out exactly what code we should write for the function. In our case, we want to follow the Single Responsibility Principle, which states that everything in your code should do one and only one thing.

Single Responsibility Principle

To understand why the Single Responsibility Principle is so important, imagine if you had to go to the store whenever you pick up your children because you had to buy groceries and pick up your children once in the same day. Such a situation never happens in real life because we treat these as separate tasks, but novice programmers often combine tasks into one giant task unnecessarily because they don't follow the Single Responsibility Principle.

As an example, let's say you're writing a program that has a screen that allows users to log in at some point in the program. You'll need to display a screen that shows a box for a username and a box for the password and you'll need to check if the password given matches the username given. You should probably not have a singular function that checks if the username and password match and displays what the user sees because now if you need to change either the code for logging in or the code for displaying what the user sees, you will need to waste time figuring out which lines of code are relevant to what you want to do. Furthermore, changing the login checking shouldn't change what the user sees after attempting to log in and changing what the user sees after logging in shouldn't change the login process, but by putting them in the same function, you're coupling them together.

They should be two separate functions passing relevant information back and forth between each other. To check if a username and password match, all you need is the username, the password, and some way of checking if they match. You do not need to know where the username and password input boxes are on the screen, the color of the background, etc. Likewise, the function to display the screen doesn't need to know anything about how to determine whether the username and password match, just whether or not the username and password match.

To be clear, there might be some security things you need to take into account that would make you write a program differently, but you would still follow the Single Responsibility Principle.

For example, we'll want to check if two strings match and do different things depending on whether they match or not. We have four responsibilities:

figuring out whether the strings match,
doing something if the strings match,
doing something else if the strings don't match,
and making sure that the other three responsibilities have everything they need.

The last responsibility is a little bit harder to explain, but it's the easiest to code: it's the if statement that takes the output for responsibility 1 and decides what to do based on the output.

// Responsibility 4
if (check_if_strings_differ(word1, word2)) {
    // Responsibility 3
    // Do stuff you would want to do if the words differ
} else {
    // Responsibility 2
    // Do stuff you would want to do if the words match
}

And check_if_strings_differ will take care of responsibility 1. So now, we've decided that check_if_strings_differ will return 0 if the strings don't match and 1 if the strings do match. If we kept our version with the if statement at the end, we would be combining responsibilities 1 and 4, which leads to code that's harder to maintain.

Anyway, we'll keep our original code but output str1[i] != str2[i] instead of using the if statement.

// Responsibility 1
int check_if_strings_differ(char * str1, char * str2) {
    int i = 0;
    while (str1[i] && str2[i] && (str1[i] == str2[i])) {
        i += 1;
    }
    return str1[i] != str2[i];
}

We've written our first function!

`if (condition_is_true) { return true; } else { return false; }`

Generally, when people first learn a programming language and they want to return a true or false value from a function, they write code like

if (a == b) {
    return 1;
} else {
    return 0;
}

which is valid, but verbose because what you want to return and the result of a == b are identical. You could just as easily write the code as

return a == b;

The title of this aside is how you would read the novice code with the if statement, which should make it clear why you can just use the return statement.

Summary

In this article, we set up the basic syntax for functions in C, we introduced the void type and two ways to use it, and we introduced a way to call functions. We also wrote our own function to check if two strings differ.

What's Next

We're almost to the point where we can write our first program! Furthermore, because of the way we've written these articles, once we can write a complete program, we'll be able to write other programs quickly and easily.

We just need to know how to

establish some form of an entry point,
set up and use a compiler,
print something out to the terminal,
and read from a file.

Remember that we can write functions that can be executed at any time, but we need some way to tell the program which function to run first, i.e. establish some form of an entry point. We also need some way of getting user input from the command line (later, we will likely cover different graphical user interfaces). In the next article, The main Function in C, we'll discuss the main function in C and how we can use it to take in user input.