Debugging
This short explains several ways of debugging code in the CS50 IDE. First, there is the help50 command. Below, I’ve written some code that has a compilation bug. This is the output when we run the make command without help50:
Using the help50 command:
Now we see a human readable message. On line 10 it suggests using data type string instead of int.
eprintf, a function included in the cs50.h library, works just like printf but includes the line number with whatever you output. It is very helpful for what is referred to as a “sanity check” in the course: you expect some output and just want to make sure that is actually what is happening.
Example:
I went over running debug50 in the previous post but wanted to mention something that Doug goes over in this video short. When you start walking through your program you can go line by line and “dive” inside of functions and loops, or even “jump” out of them.
Here is a gif demonstrating the step-over button in action:
Now the step-into button:
Also try the step-out button, which will send you back to the calling function. So if we’re in say_bye and we step out, we’re back inside of the main() function.
Functions
Functions, otherwise known as methods, subroutines, or procedures, are self-contained pieces of code that can be reused with a much shorter command, or “call.”
Here is an example:
#include <stdio.h>

void greet_family(void);

int main(void)
{
    greet_family();
    // imagine there is some code here
    // and we have to greet the family again
    // for some reason
    greet_family();
}

void greet_family(void)
{
    printf("Hi Mom!\n");
    printf("Hi Dad!\n");
    printf("Hi Brothers!\n");
    printf("Hi Sisters!\n");
    printf("Hi Uncles!\n");
    printf("Hi Aunts!\n");
    printf("Hi Cousins!\n");
}

The greet_family() function is going to run twice in our program. We’re pretending there is some code in between that requires us to greet the family again at the end.
Notice how main() is calling on greet_family() to do some work. Also notice how the name greet_family() describes its function: it says hi to everyone in the family. If we ever need to make a change, we now only have to change it in one place, even if it’s used in many places within our program.
Functions can also be given inputs by using arguments. An argument works like this:
add_two_numbers(5,5);
So I have a function add_two_numbers(5, 5), and instead of having nothing between the parentheses I have 2 arguments, 5 and 5. By the function name you can guess what is going to happen: somehow the function is going to output 10. Functions are able to give something back to the caller by using a return statement.

#include <stdio.h>

int add_two_numbers(int x, int y);

int main(void)
{
    printf("%d\n", add_two_numbers(5, 5));
}

int add_two_numbers(int x, int y)
{
    return x + y; // returning the result back to main
}
It’s important to keep functions simple and not overcomplicate what they do. A function with too many responsibilities can be difficult to modify later on.
#include <stdio.h>

void add_x_mult_y(int, int);

int main(void)
{
    int x = 10;
    int y = 5;
    add_x_mult_y(x, y);
}

void add_x_mult_y(int x, int y)
{
    int number = 100;
    printf("%d\n", x + number);
    printf("%d\n", y * number);
}
This program has a function that is doing too much: adding to one number and then printing it, then multiplying another number and printing it. In many cases functions just return a value back to the caller, so the printf would live in main(). Also notice that both expressions use the same variable, number. What if I just want the result for x to change? If I change the value of number, it’s going to change both results. When we change a function, it’s usually a good sign when the modification only changes a single thing in the program and doesn’t have far-reaching consequences throughout it. I haven’t created a program that is 10,000 lines long yet, but I’m sure it would be a nightmare if it was full of giant functions.
I also want to add that the function above has a less than desirable name. What if I just want to add? What would work better is one function for adding and one for multiplying, each taking 2 numbers and just returning a number:
#include <stdio.h>

int add(int, int);
int mult(int, int);

int main(void)
{
    printf("%d\n", add(5, 5));
    printf("%d\n", mult(10, 10));
}

int add(int x, int y)
{
    return x + y;
}

int mult(int x, int y)
{
    return x * y;
}
Here we have simple functions that are easy to use and modify. Also note the return types have changed from void to int. When we declare the function at the top and when we describe the implementation at the bottom, we have to include the return type, just like we do with main().
Also note the function declaration at the top lists the parameters as int, int, whereas the implementation at the bottom uses int x, int y. Variable names are not required when we declare the function at the top (this declaration is known as a function prototype).
Variables and Scope
It’s important to understand how variables behave when you’re working with functions. Let’s make a simple sandwich maker program.
When you are at some point in your program, let’s say inside of main(), and you get to the order_turkey(meat) line, this will send you out of main, or outside of the brackets { } where the main code lives, and into the code that lives between the order_turkey brackets. When we put the variable meat inside of order_turkey’s parentheses, we didn’t send meat itself to the function. Hidden from our view is a process where meat is copied, which is why I’ve named the order_turkey parameter m instead of meat. Once we are in the order_turkey function, m starts with the same value as meat, “ham,” and the function changes m to “turkey.” Meat never gets changed. It stays ham because the function works on a copy, not the original variable.
This is the idea of scope. Each function is its own walled-off space. If you pass something, you pass it by value in our case, which is essentially making copies and passing them around.
So how do we get turkey in the meat variable to change the sandwich?
We need to either assign the original variable to the function’s return value, or create a new variable and assign what’s returned to it.
When we talk about the relationship of passing these values we say, in this situation, main() is the function caller and order_turkey() is the function callee. The function order_turkey() returns a value to the caller.
This relationship always applies. So if I had a function call inside of order_turkey, then order_turkey would be the caller of that function, and so on.
Let’s fix our program:
Finally there is something known as a global variable. If you declare a variable outside of all functions including main() like this:
I want to really drive this point home: on line 25 of the example above, we return global. This is actually doing nothing for us in this program. On line 16 we call my_function(), but if a function returns something, the result is supposed to be assigned to something, and we’re not assigning it to anything. It’s not like we said my_variable = my_function(). So what happens here? Because global is a global variable, when we assign “BATMAN” to it, the change is visible within every function’s scope, so we don’t actually need to return anything back to the main() function. The code below has the exact same outcome as the one above. Can you see how removing the return changes nothing in this situation?
This might be confusing, but although every function can access global variables, they still behave the same way as variables inside of functions when passed as an argument. Notice above we didn’t pass an argument to my_function(). What if we pass global as an argument? An argument in this case is passed by value, so you can think of it as a copy that gets handed to the function.
I would definitely recommend going into either the CS50 IDE or an online C REPL and playing around with the code. Try tweaking it to see if you can predict the outcome of the code before you run it. This will help make these concepts stick and feel more natural to you as you code.
Arrays
Arrays are a structure used to group together data of the same type in a contiguous block of memory. Within this block, the equal-sized slots are known as the elements of the array. Each element stores one value, and all elements have to be of the same type, such as int, float, or char.
Just like a block of houses has a mailbox number for each house, elements in an array have an index number. So if we want to get the value of an element, instead of trying to get to the data directly, we use the element’s index to access the data.
Some example code:
Notice array indexing starts at zero. The index of the last element is always the length of the array minus 1. If we have 10 elements in some_array, the last element is some_array[9] or some_array[some_array_length-1].
There are strong warnings about the ability to go beyond the bounds of arrays in C programs. I haven’t been bitten by this, probably because I haven’t written any large or important programs. You can touch memory you’re not supposed to touch in C. If I have an array that has allocated 4 locations in memory and I say give me some_array[4], since the array index starts at 0, I’m really asking for a 5th element that doesn’t exist. This post clearly explains the dangers of going out of the bounds of an array. I didn’t know in some cases you can actually break your hardware!
You can create as many dimensions to an array as you want. We can create a 2d array, similar to a board of chess:
int my_chess_board[8][8];
In memory this board is really just 64 ints stored one after another, but the two-index abstraction helps us represent an 8x8 grid of values.
You can’t copy one array to another with the assignment operator, even if they are the same type with the same number of elements. The basic way to copy one to another is to loop over each index, giving the nth value to the array you are copying into. (The standard library’s memcpy does the same job in one call.)
Arrays don’t follow the function rules we talked about in the previous article. While ordinary variables are passed by value, or copied, when given to a function, arrays are effectively passed by reference: the function receives the location of the original array rather than a copy, so changes made inside the function call modify the actual array. One reason is that arrays are usually much larger than single variables, and copying an array on every call can be expensive in terms of computer resources.
Command Line Arguments
Our programs use the main() function, which, like printf(), can take arguments. We start the program from the command line, so that is where we have to pass those arguments. The first parameter of main gives us the total count of arguments, including the program’s own name. We’ll call it argc (argument count). We can name this anything, but argc is the common name to use. Next, each argument is stored in an array whose elements are pointers to char. Each one points to a sequence of chars, which is a string. This parameter is called argv (argument vector, where vector means a one-dimensional array). It holds one entry for the program name plus one for each argument we give on the command line. So now we have:
int main(int argc, char *argv[])
Let’s create a program that does something like this:
./myprogram john jane bob
you are running the program "myprogram"
3 arguments have been passed
the first argument is john
the last argument is bob
In this case I named the program “test.”
One thing to keep in mind: argv holds all arguments as strings, which means if I pass 42 from the command line, argv holds the value as the string "42", not the integer 42.
The last real argument is argv[argc - 1]. The entry at argv[argc] is a null pointer that marks the end of the list, so treating it as a string is an off-by-one error, and anything past it is outside of the bounds of our array, in some unknown space of memory in our computer. This can lead to the program crashing or to random pieces of data being returned in some cases. A segmentation fault can also occur, which is when the computer’s hardware detects a program trying to access memory that is restricted. In some cases this is called a general protection fault. The OS will send a signal that usually causes the program to terminate.
Magic Numbers
Sometimes our programs contain a value that never changes and is used constantly throughout the program. We don’t want to have to change it in every place if it ever needs to be changed. A bare number is also ambiguous. For example, pi is roughly 3.1415, and if we used the number everywhere it might not be completely obvious what that value represents. Using the #define directive (or macro) we can replace the number with a name that represents it, like PI. Using all caps is common practice for symbolic constants. In fact we can use this as a find and replace for other data types as well, including strings:
#define BEST_BURGER "5 guys"
Here is what a symbolic constant that replaces a magic number might look like in our program:
#define PI 3.14
Without it, if we needed to calculate the radius of a circle from its circumference, we would have to do this:
float circumference, radius, pi;
pi = 3.14;
circumference = 15.5;
radius = circumference / (2 * pi);
printf("Radius is equal to %.2f\n", radius);
You might be wondering why we would create a symbolic constant when we could just use a variable. The issue is that a variable is mutable, so at any point in our program we can change its value. A name created with #define is not a variable at all: the preprocessor swaps in the value before the code is compiled, so the running program cannot assign to it, and if the value ever has to change, we only have to change it in one place.
Using #define:
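It’s important to remember that PI is not a variable. Every PI is replaced with 3.14 before the code is compiled, so you cannot do something like this, which tries to increment a literal number and fails to compile: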
printf("Adding one to pi: %.2f", PI++);
Use macros when the same fixed value appears all over your program. Macros add context, helping you understand what is going on when you revisit your code.
Finally finished another article. I have been slacking the past 2 weeks. I’ve since updated my calendar to put aside more time for writing. Creating content has been so valuable in solidifying what I’ve learned in the course. The last GitHub gist in this article was written by me in one pass, with no errors!
Onwards through to the challenges!