This is the eleventh article in the Making Sense of C series. In this article, we're going to come up with some way of telling the computer where to start executing commands and some way of getting user input somewhere into your program.
In the last article, we established the syntax for defining and calling functions in C, but we still need some way of getting user input from the command line and some way of telling the computer which function to start from.
In this article, we'll establish the main function, which serves as the entry point in C, and we'll write the basics of our first program.
Since I want us to get the first program written soon, I'm going to gloss over a few things for now, such as the distinction between the stack and the heap, how your computer actually allocates variables, and dynamic memory allocation.
These topics are important for understanding how to program well in C, but they aren't strictly necessary for the first program.
Anything more complicated than a simple program will require you to understand them.
There are several possible ways to tell the computer the first statement to execute.
Generally, interpreted languages start at the top of the source code file, read through it, and execute each line from top to bottom.
For example, an oversimplified Python file would look like
print("Alphabet")
print("Soup")
and the Python interpreter would start at the top, see print("Alphabet"), print Alphabet to the terminal, see print("Soup"), print Soup to the terminal, then exit the program.
You would execute the program using
user@computer:~/dev$ python3 basic.py
where basic.py is the name of the file.
Just reading the code from top to bottom won't work in C, since C is a compiled language with multiple compilation units (source files), which I'll explain later.
Instead, we need some label that tells the computer to start with a specific block of code.
The main Function
In C, we use the main function to start the program, and there are two standard ways to write it.
If you don't want to take in any user input on the command line, you can use
int main(void) {
    // Do stuff
}
On the other hand, if you want to take in user input on the command line, you can use
int main(int argc, char ** argv) {
    // Do stuff
}
C
For now, we're working with C on its own and not as part of a framework.
Furthermore, we aren't taking in any environment variables.
If either of those assumptions changes, the entry point in C can look different.
For example, many Unix-like systems (such as Linux and macOS) and Windows support a third, non-standard way of writing main:
int main(int argc, char ** argv, char ** envp) {
    // Do stuff
}
where the last argument is a list of all the environment variables. Environment variables hold things like your current directory and the list of directories that are searched for your command line executables.
Furthermore, other programs may use a different name for the entry point, such as WinMain for some Windows programs.
Once again, we are just going to use the standard main function since we're writing standalone C programs.
Let's break these functions apart, starting with the easiest to explain: main.
Why main?
Since we need a name for our entry point function, we have to choose a name that makes sense.
The shortest word we can use to indicate that the function is the center of our program is main.
We could have made it start or begin or something else, but we decided to stick with main.
Why int?
Let's say that you need a file to exist or else your program cannot run.
For example, let's say that you provide the name of a file that doesn't exist to our program to count words.
We can't count words in files that do not exist, so what should we do?
In our case, we can print out an error saying "Error: File does not exist!" and the user can figure out what to do from there, but what happens if we have a program whose output cannot be read by a user, such as a server that needs to run automatically or an embedded system that doesn't have a screen?
If we knew the error, we could write some more code that handles it for us.
For example, if the file does not exist, then we might run a different program to figure out why the file doesn't exist.
We need some way of telling the other program that the file did not exist.
The simplest way to do so is to return an int with some value that indicates either that everything went smoothly or what went wrong.
For example, let's say that main will return a 0 if nothing went wrong and a 1 if our file was missing.
If we have other errors, we can use other numbers.
For example, if someone gives us a value that we cannot use as one of our inputs, let's return 2.
Eventually, we'll get to some standard error codes in errno.h, but don't worry about it for now.
In our case, we just need to put return 0; at the end of the main function, since we're not going to use error codes that often.
In short, the value returned from main tells the computer how the program ended.
What is argc?
When you execute a command on the command line, it generally looks like
user@computer:~/dev$ command first_argument second_argument third_argument
Without going into too much depth on the syntax of the command line, argc is the number of words you type in, including the command itself, where the words are separated by whitespace.
If we were to type in the command command a b c d, argc would be 5.
If we were to type in the command command, argc would be 1.
For our first program, we just need to make sure that our users have given us the program name, the file the user wants to read, and the word that the user wants to check, so we need to make sure that we have at least three arguments.
If we don't have at least three arguments, we should print out something like a usage message and exit the program.
Since we now have something our program needs to do, we can start writing our main function.
int main(int argc, char ** argv) {
    if (argc < 3) {
        // TODO: Print usage message
        return -1;
    }
    // TODO: Count the number of words
    return 0;
}
We still don't have the functionality to print out the usage message and we haven't written the code to count the number of words, but we'll get there.
What is char **?
As you know, we're using a char * to represent strings, so what is a char **?
When we represent lists, we generally allocate a contiguous block of memory and refer to it by the memory address of the first element of the block. For example, we represent a string as a list of characters, so we use the syntax
char string[] = "This is an example.";
In most expressions after that, though, string is treated as a char * that holds the memory address of the first character.
We access each individual character using the syntax string[offset], which is a shorter version of *(string + offset).
Likewise, we said that if we wanted a list of numbers, we could use the syntax
int list_of_integers[] = { 32, 75, 85, 44 };
In most expressions after that, though, list_of_integers is treated as an int * that holds the memory address of the first number.
We access each individual number using list_of_integers[offset], which is a shorter version of *(list_of_integers + offset).
So what if we want a list of strings?
Well, the general syntax for declaring a list of anything is type array_name[count]; and the type of a string is a char *, so we can declare a list of strings using the syntax
char * list_of_strings[] = { "First string", "Second String", "Third String" };
But remember that after you declare an array using type array_name[count];, the variable array_name is treated as a type * in most expressions, including when you pass it to a function.
So if you were to declare an array of char *, the array variable would be treated as a char **.
In short, argv is a char ** because it is an array of strings, specifically each word on the command line.
For example, if you were to type
user@computer:~/dev$ command first_argument second_argument third_argument
argv would contain { "command", "first_argument", "second_argument", "third_argument" }.
You can then use it like a normal array, where argv[0] would be "command", argv[1] would be "first_argument", etc.
In general, argv[0] in main is the name of the executable.
The main Function in Our Program
As we discussed earlier, we've already come up with a check to make sure that we have enough arguments, but we also need some standard way to read the arguments. In our case, let's make the first argument the file and the second argument the word we want to count.
int main(int argc, char ** argv) {
    char * program_name = argv[0];
    if (argc < 3) {
        // TODO: Print Usage Message
        return -1;
    }
    char * file_name = argv[1];
    char * word = argv[2];
    // TODO: Count number of occurrences in a file
    return 0;
}
All we did was give names to the arguments passed to main, but that takes care of everything related to the main function. Once we write functions for the usage message and for counting the number of occurrences in a file, we're done.
In this article, we learned about the main function, which allows us to get user input on the command line and serves as the entry point in the program.
Here's our complete To Do List with everything we've completed up to this point:
- // for single line comments and /* and */ for multiline comments
- +, -, *, /, and % for arithmetic
- [type] [variable] = [expression], which will allow us to store values for later use
- [...] (char, short, int, and long long) and the floating point types (float and double)
- [...] char type and invented the '\0' character, which indicates that we're ending a string
- [...] char
- [...] (&)
- [...] (*)
- [...] type * variable_name;
- [...] type array[num_elements];
- [...] char array using double quotes ("Hello!")
- [...] variable_name[offset]
- if else statements
- for and while
- [...] C
- [...] void type and two ways to use it
- [...] main function as the entry point of our program and the way for us to take in command line arguments
- [...] main function
As you can see, we've covered quite a lot of ground. We can now write complete programs that can work with user input, but we currently have no way to see the output of any of our programs. We can't print our output to the terminal and we can't save our output to a file, so we'll need to implement those features soon.
In the next article, Header Files in C, we're going to discuss how your compiler uses the symbol table and how you can use header files to include code in your project.