Function, Recursion, and Call Stack

Share this post:

Modular Program Development

  • The best way to build a large program is to build it from small independent pieces that can be joined together. These pieces are called modules.
  • Each module can be coded and maintained separately from other modules. This is the basis of modular program development. The technique used for modular program development is aptly described as divide and conquer.
  • Modularity allows for easy management of complexity by creating separation of concerns because each module is concerned with a distinct problem (or set of problems).
  • Poor management of complexity leads to a large over-budget program that is of substandard quality.
  • A function can be a module, and this allows for modularization of C programs. This language supports this type of modularization by providing prebuilt C Standard library functions that can be easily linked to user-created functions. This Standard Library is part of the C Standard, and is normally implemented by all standard C compilers.
  • The Standard Library functions are securely coded, rigorously tested, and have been optimized for performance. For this reason, it is not recommended that one writes a new function to replicate an already existing function.
  • In the Standard Library is the math library functions that perform mathematical operations such as calculating the square root (its function is named sqrt), cube root (cbrt), exponentiation (exp), natural logarithm (log), floating-point modulus (fmod),and the absolute value of a fractional number (fabs).
  • Related functions can be grouped together under a single library header. This shows that the library header contains a collection of functions.
  • The rand function in the Standard C library is located in the <stdlib.h> header, and is used to introduce the element of chance into a program.
  • The programmer can create his/her own custom library header, and include it in the program using the right preprocesser directive. However, this custom library header is enclosed in double quotes instead of angle brackets <> which are reserved for standard library headers. This translates the syntax of the custom header as:

#include “custom_header.h”

  • Functions are portable, and this allows them to be used in multiple programs.
  • The key utilities of a function are:
  1. Support for Modular programming.
  2. Module portability. If the module is a function, then it can be copied into another program file. Instead of copying the function code, it can be invoked/called to perform a function is the program.
  3. Software re-usability: The function is a software code that can used in different program files.
  4. Logic abstraction.

Function Declaration

  • Like a variable, the function must be defined before it can be used.
  • The function must first be given an identification known as a function name or function identifier.
  • By convention, the name of the function must express or describe the task that the function is required to perform. This convention is based on the idea that a reusable function must be limited to performing only a single task.
  • If a single function performs multiple tasks and it needs to be reused in the program, then the programmer is expected to decompose this function.
  • Function decomposition is the process by which a function that performs multiple tasks is broken down into different separate functions with each function performing only one well-defined task.
  • The ability of a function to perform a task means that it has a set of executable statements that process input data and produce output data. This is the basis of abstraction of functions.
  • Abstraction describes the process or procedure of hiding processes (or details of these processes) so that multiple tasks can be treated as a unit. For example, the sqrt function calculates the square root of a value through a process of division (and several division may need to take place before the square root is found). The programmer does not need to know how the square root of the value has been calculated, but (s)he knows that the square root is correct.
  • Understanding the task that the function is to perform allows one to appropriately name the function and identify the data types that this function will work with.
  • The data type that the function can work with is identified before the function name as follows: data_type function_name.
  • The function needs to be given data to process. This data is called the parameter. The data type of this parameter must be identified as follows: data_type parameter.
  • The data that the function is to process is encased in parenthesis positioned to the right of the function name as follows: data_type function_name (data_type parameter), and this is called the function header. If this header is terminated with a semicolon then it is becomes a function prototype. Exclusion of the semicolon from the prototype results in a syntax error, unless the semicolon is replaced with curly braces.
  • The syntax of the function prototype is:

data_type function_name (data_type parameter);

  • The function prototype allows for function calls to be validated, thus explaining why syntax errors like a missing semicolon can be identified by the compiler.
  • A library header e.g stdio.h contains a collection of function prototypes.
  • The parameter needs to named so as to give a generic identification that can be associated with its data type. For example, in the sqrt function, the data provided is usually whole numbers (whose data type is int) and this data can be generically named as square, and thus the parameter and its data type can be written as: int square. The function prototype of the sqrt function can therefore be written as:

int sqrt (int square);

  • A function can take a list of parameters, and these parameters need to be separated from each other by commas (,) as follows: (data_type parameter1, data_type parameter2, data_type parameter3). Each parameter in the list must include its data type, and exclusion of this type causes a compilation error.
  • A function with many parameters may need to be decomposed.
  • The parameter of a function is described as its local variable. As expected, the variable must have an identifier, and it is for this reason that the parameter is given an identifier/name. The parameter is considered a local variable because it can only be accessed and processed inside the function e.g the parameter square in the sqrt function above can only be accessed inside this function, thus square is a local variable of the function sqrt.
  • The naming conventions of a variable identifier apply to the parameter identifier.
  • The compiler normally ignores the parameter name, but is strict on the parameter type. Giving the parameter the same name as its local variable minimizes ambiguity during code debugging. For example, the parameter named square should be the same variable named square in the body of the main function.
  • The function is required to complete a task or set of related tasks. This task is executed by an executable statement. This statement is enclosed in a pair of curly braces that is appended to the function header as follows:

data_type function_name (data_type parameter) {executable_statement}

  • The executable statement or set of statements form the body of the function.
  • If a function has only one executable statement, then this statement is the return expression of the function. It is called so because it starts with the return keyword and then the executable statement or expression. The expression can be an operand or a combination of operator(s) and operands. For example, the return expression of a function that squares its parameter is: return parameter_name * parameter_name;. The return expression is terminated with a semicolon just like any other executable statement.
  • The return expression uses the output of the expression as the value of the return.
  • The return keyword can be a standalone single-word statement, and it is written as return; which implies that the return does not return any value to the function. If the body of the function does not contain any return expression, return; is considered the default return.
  • If the body of the function has multiple executable statements, then the statements are written before the return. Likewise, the executable statement can be written before the return. The output value of the executable statement can be used as the expression in the return expression.
  • The data type of the return is the data type of the function. This means that the return data type is the data_type of the function_name.
  • Function definition is the combination of the function header and the body of the function. This means that function definition is written syntactically as:

return_data_type function_name (parameter_data_type parameter) {executable_statement; return expression;}

  • Definition of a function is also known as function declaration.
  • If the function does not receive any parameter value, then the keyword void is used as a substitute for the parameter. If the function does not return any value, then the keyword void is the return type (i.e return_data_type). A function with a void return does not need to have a return statement/expression in its body.
  • Function prototype can be written before the function declaration is made.
  • The definition of a function is hidden from other function during function calls, and this is due to abstraction.

Function Call

  • A function is invoked by its name (or identifier), thus availing this function to perform a task. This invocation is called a function call.
  • The function call must be supplied with data in form of arguments, which it must process to produce an output.
  • This means that function call is made of 2 components – the identifier of the function, and the arguments of the function. The arguments are enclosed in parenthesis which are placed to the left of the identifier as follows:

function_identifier (argument1, argument2)

  • When this function call is made as a statement, then it is syntactically written as:

function_identifier (argument1, argument2);

  • A function can make a call to another function. The function that makes the call is designated as the caller or calling function. The function that has been called by the caller is designated as the called function.
  • The called function is part of one of the arguments of the caller’s statement. This can be represented as:

caller_function_identifier (argument1, argument2, called_function(argument3));

  • The argument can be an expression, a variable, a constant, or a called function containing its argument(s).
  • When the called function completes its task, it must notify the caller about this task completion. This notification is known as the return. Usually, if this return was a return expression, then it delivers the output of the expression as the return to the called function.
  • The bulk of calls to called functions is performed by the main function, which makes it the principal module of the program. Its return expression is normally return 0; upon successful execution of tasks.
  • When the parameter of the caller is used in the called function, then it becomes an argument in the called function. This means that the parameter of the caller is passed as an argument to the called function.
  • If there are several parameters in the function, then they form a parameter list. This list has parameters separated by commas, hence it is described as a comma-separated list. The order, type, and number of parameters in this list must match accurately with the arguments passed to this function. For example, if the function prototype is: function_identifier (int parameter1, float parameter2), then the only arguments that can be passed to this function are argument1 which must be an integer and argument2 which must be a float value.
  • A function cannot be defined in the body of another function. Doing so results in a syntax error. This also means that a called function cannot be declared in the body of a caller function. Equally, called functions cannot have their declarations nested inside a calling function.
  • If these called functions are custom functions, then their function prototypes must be written before the caller.
  • During function call, the function header of the called function determines the (data) type of arguments that it can receive.
  • In argument coercion, the data type of the argument that has been passed to the function is changed to the data type of the parameter in the function header. For example, an int argument is changed to a double argument by the sqrt function. This ensures that the parameter type matches with the argument type.
  • Argument coercion can introduce errors e.g a double argument type being converted to an int argument type leads to data loss due to truncation. To mitigate against this, the programmer should follow the arithmetic conversion rules of the latest C Standard.
  • If an expression contains values of different data types, then it is called a mixed-type expression.
  • Arithmetic conversion rules are applied to the values in the mixed-type expression so that they have the same data type. The compiler handles the usual arithmetic conversion rules which demand that all the values are converted to the highest data type in the expression. This implies that data types have a hierarchy, and they are as follows:
HIERARCHYDATA TYPECONVERSION SPECIFICATION
Floating Point Number


Long Double%Lf

Double%lf

Float%f
Integer


Unsigned Long Long Int(eger)%llu

Long Long Int%lld

Unsigned Long Int%lu

Long Int%ld

Unsigned Int%u

Int%d

Unsigned Short%hu

Short%hd

char(aracter)%c
  • The conversion of a data from a lower type to a higher type is called promotion or type promotion. The rules used to perform argument coercion that results in type promotion are thus designated as argument promotion rules.
  • To convert an argument from a higher type to a lower type, then a cast operator is used.
  • The values in the arguments are converted to the data type of the function parameter, and this can lead to a higher data type being converted to a lower data type e.g a double value is converted into a int value. As mentioned, this leads to errors.
  • When converting a value, the compiler first makes a temporary copy of the value and then converts the type of this copy. This leaves the initial/original value intact.

Call Stack, Stack Frames, and Stack Overflow

  • Data Structure is a collection of related data items that have been grouped together in an organized format to allow for easy access of the stored data, as well as addition or removal of data items.
  • Function calls are items that can be grouped together to form a data structure called a function call stack. In this stack, the first call made by the caller is forms the base of the stack and the next function call made by the called function is placed atop the first call. This ordering of calls in the stack allows for sequential execution of the functions in a program. For this reason, the function call stack is also called a program execution stack.
  • In the call stack, a new function call is placed at the top of the stack, and this is called pushing a call onto the function-call stack. When the called function executes its statements and returns control to the caller function, its (i.e called function) call is removed from the stack, and this is described as popping the call off the function-call stack. This ensures that the most recent called function executes its statements before program control is returned to its caller function. In effect, it ensures that the last function call is the first call that is popped off the stack, which makes the call stack as Last-In, First-Out (LIFO) data structure.
  • Each called function has its own local variables that need to be created and maintained when the function is being executed, and then destroyed upon completion of execution.
  • When a new function call is made, a new entry called the stack frame is made and added to the call stack. The stack frame contains the return address of the caller which ensures that when the called function completes its execution, it returns control to the caller function. It is this stack frame that is popped off the stack when the call function completes its task and successfully returns program control to the caller. This ensures that the function that is currently executing a task is at the top of the call stack.
  • The stack frame reserves memory for the local variables of a function. This ensures that local variables of the calling function are maintained when the called function is executing its task. When the called function returns control to the caller and its stack frame is popped off, then its local variables are lost and are not known to the caller. This explains why local variables in the called function are not recognized by the caller function. This is because the variables are located in a different stack frames.
  • The memory space allocated to the function-call stack is finite. If many function calls are made so that many stack frames are added to the call stack until they exceed the allocated memory, then a fatal error called stack overflow occurs and the program terminates prematurely.
  • A variable in the calling function can be passed to the called function as an argument. This argument can be passed in two ways:
    • Pass by Value: This allows only the value of the variable to be supplied to the called function. To do this, the compiler first makes a copy of the variable’s value and then gives this copy to the called function. If the called function changes this value, then this change is not reflected in the value of the caller’s variable. By default, all calls are passed-by-value unless explicitly declared as pass-by-reference.
    • Pass by Reference: The identifier of the caller’s variable is supplied to the called function. This allows the called function to modify the value of the caller’s variable. This can introduce errors described as side effects in the program, and for this reason, this method of passing arguments must only be used for trusted and reliable called functions.

Storage Class

  • The variable has 8 attributes: identifier(or name), value, (data) size, storage class, linkage, (data) type, scope, and storage duration. The identifier, type, and value attributes have been explained.
  • Storage duration is the period from initialization of a variable into memory until its removal from memory i.e the period when the variable exists in its memory space. The initialization of the variable and its occupation of memory space is described as value creation because it is the value of the variable that is stored in the memory. The removal of this variable from memory is described as value destruction. The storage duration is also called the lifetime of the variable.
  • Storage duration is specified as either automatic storage duration or static storage duration.
  • Automatic storage duration describes period when the variable exits in the memory space of a stack frame (of a function). This means that the variable is initialized when the function is called, and its value destroyed when the function returns control to its caller. It is specified by the keyword auto which is written to the left of the (variable) identifier as follows:

auto data_type identifier = initialization_value;

  • Static storage duration means that the variable is initialized and exists throughout the period of program execution, and is only destroyed when the program is terminated. It is specified by the keyword static or extern. By default, its initialization value is 0.
  • Scope describes where the program can reference the identifier of the variable. If the variable is created inside a function, it can only be referenced inside that function, and this scope makes the variable to be described as a local variable. By default, it is assigned automatic storage duration, and the keyword auto is normally omitted during variable declaration.If the variable is created outside the functions of the program, then it can be referenced by any function throughout the program, and this variable is described as a global variable. By default, it is assigned static storage duration, and its keyword is extern, which is normally omitted during variable declaration. If the program has multiple files, the identifier located in a specific file can be referenced to in another file through the use of linkage. Linkage uses declarations to ensure that the identifier is properly referenced in the program.
  • If the local variable needs to exists throughout the execution of the program, then it needs to be given static storage duration. This is done by assigning it the keyword static, which is written to its left as follows:

static data_type variable_identifier = initialization_value;

  • The static local variable retains its value when its function is terminated, and it does not have a global scope. The value of this variable is then passed to its function when the variable is invoked again.
  • The storage duration and scope (and linkage) form the storage class. The storage class uses one of 5 specifiers: static, auto, extern, register, and _Thread_local.
  • The specifiers static and extern refer to static storage duration, while the specifier auto refers to automatic storage duration.
  • The specifier register is assigned to a variable that needs to be stored in the memory space of the central processing unit known as registers. Like auto, it has a local scope, but unlike auto which is stored in the RAM, register is stored in the CPU which allows for faster processor access and voids the need for an initialization value. Therefore, the register variable can be declared without any initialization value (which results in it taking the garbage value). Likewise, this (register) variable must be declared at the beginning of their function. Its keyword is register, and is considered archaic.
  • Variable that is to be used inside only one function should be created as a local variable, while a variable that needs to be used by multiple functions needs to be created as a global variable.
  • The function has the above-mentioned 8 attributes. By default, functions are assigned static storage duration, which makes them their identifiers to have a global scope in the program.

Identifier Scopes

  • The storage scope is normally designated as the scope of the identifier which reveals that the identifier of the variable or function is used for referencing in the program.
  • Any variable in the program can be referenced only after it has been defined or declared.
  • The program portion where the variable can be referenced is the scope of its identifier.
  • The are 4 identifier scopes are:
    • Function Scope: The variable can only be referenced inside a function. In the switch statement, the identifier is written syntactically as identifier: (identifier & colon) and this is called a label. The label is the only identifier that has a function scope, thus it can be described as the only strict local variable. This strict local variable conforms to the principle of least privilege which states that the variable should only be given privileged access to program data that it needs to complete its task and nothing extra. For this reason, a function does conceal its labels from other functions, and this is called information hiding.
    • Function-Prototype Scope: In this scope, the parameter is defined in the header of the function. This parameter must have a type, but not necessarily a name. This allows the parameter to be accessed by all the contents of the function, including nested statements.
    • Block Scope: The variable that is defined inside the body of a function is described as having a block scope. This is because the body of the function is also described as the block of this function. The parameters of the function are considered to have block scope.
    • File Scope: The identifier is declared outside the function, and can therefore be accessed by all functions below its definition till the end of the program. Global variable, function prototypes, and function definitions/declarations have file scope.

Recursion

  • A function that calls itself is called a recursive function. The process by which this functions makes the call to itself is described as recursion.
  • Recursion can be either direct or indirect.
  • In direct recursion, the caller and the called function are the same function.
  • In indirect recursion, the caller that invokes the recursive function is a different function that was initially called by the recursive function. In the stack, the stack frame of the caller function is placed between the stack frames of the recursive function. This can be written syntactically as follows:

return_data_type recursive_function (parameter_type)

{

return_data_type caller_function (parameter_type)

{

return_data_type recursive_function (parameter_type) {body of function; return expression;}

}

}

  • The call that invokes the recursive function is known as the recursive call.
  • The recursive function can solve a simple problem called the base case, and then return its result. For this reason, the recursive function must have a return statement. The base case is also called the stopping case. If the problem satisfies the base case, then an output is returned to the function, and this is called completing the call or satisfying the call. If the problem does not satisfy the base case, then no return is made and in turn another call is made and the caller (calling function) is suspended.
  • If there is no base case, the recursion continues infinitely until the memory space is exhausted. This error is called an infinite recursion and it is a runtime error. Infinite recursion can also occur if the base case cannot be reached, and this error is known as the bad base case. This can create a situation where the problem does not get simpler with each recursion step, and this problem is known as bad decomposition.
  • Recursion is applied to complex problems. When the recursive function encounter a complex problem, it divides the problem into 2 conceptual parts:
    • A simple problem that it can solve. This is its base case. The function returns the result back to the caller.
    • A complex problem that it cannot solve. This problem resembles the original complex problem (though a bit simpler in terms of complexity), and this allows for the function to be invoked again. This call is the recursive call and it constitutes a complete recursion step. The function then divides this problem into 2 parts – a simple problem and a complex problem that it is unable to solve. To do this, the function needs the a recursive case. Therefore, the recursive case is a statement or expression that contains the recursive function in at least one of its arguments or operands.
  • The base case returns a value, while the recursive case decomposes the problem until it converges to the base case.
  • The base case is also called the trivial case because it a simple, non-recursive case that produces an output value. Likewise, the recursive case is called the complex case because it is created using the base case. Basically, this means that the recursive case cannot be developed unless the base case has been determined.
  • Likewise, because the base case does not decompose the problem, it is said to deal with the immutable problem; while the complex case is said to deal with the mutable problem.
  • The problem that the function cannot solve is divided by the function into 2 parts, and this occurs multiple times until the complex problem is decomposed into 2 simple problems (or it converges into the base case) that the function solves and exits the block. This means that as long as the complex problem has not been solved, recursion occurs and the function keeps breaking the problem into 2 conceptual parts.
  • The return from the called function is used by the caller to produce its return, and this process is called composition. The first composition is made when the base case returns its value to the recursive case.
  • Execution trace is the path through which the complex problem is reduced to the base case. This execution trace is graphically represented using a tree diagram. This diagram is called the recursion tree.
  • Recursive functions can return very large values that exceed the maximum integer limit of the integer type in C.
  • C is a procedural language that cannot be extended. C++ is an extensible language that allows for creation of new data types to hold extremely large values or non-standard data. The new data type is created through a feature called classes, which makes C++ an object-oriented language.
  • In object-oriented languages, the function is described as the method.
  • In each recursion step, the function breaks the complex program into 2 pieces, and the function evaluates each piece. If the same function is invoked more than once in the recursion step, then the number of function calls increases in an exponential manner in each subsequent recursion step. For example, if 2 function calls were made in the first recursion step, then the number of function calls made in the second, third, fourth, and fifth recursion steps are 4, 8, 16, and 32 respectively. This increase in the number of function calls in each subsequent recursion step results in exponential complexity. Exponential complexity can burden the processor and slow down the computer. Because each recursion step loads the variables of the function into memory, then multiple recursion steps load the memory with the many stack frames which consumes large amounts of memory. Likewise, if the recursive function contains an iterative statement in its body, then iteration occurs during each function thus the recursion ends up consuming significant processor and memory resources. This is the overhead of the recursion function.
  • Double selection statements are used in the body of the recursion function. This differentiates this function from the iteration function which is based on an iteration statement. Nonetheless, both functions achieve repetition through control structures which are the loop statement in the iteration function and recursion in the recursion function. As expected, both functions need a termination test to terminate their repetition and this is achieved through a base case in the recursive function, and the loop continuation condition in the iterative function. Failure to execute the terminate test results in an infinite program execution i.e infinite recursion for the recursive function and infinite iteration for the iterative function.
  • If the recursive function is called at the end of the main sub-program, then the resulting recursion is called tail recursion. It is usually the last statement executed in the procedure. Tail recursion improves software performance as the stack frames created by the recursive function are at the top of the stack and thus do not need to be persistent as compared to stack frames at the bottom of the stack. This is because its stack frames are the first to be popped out.
  • If the recursion is invoked at the start of the sub-program, then the recursion becomes a head recursion. Head recursion exacts significant resource usage as it creates stack frames that are located at the bottom of the stack, and thus these frames are the last to be popped off.

Factorial Calculator


#include <stdio.h>

int factorial (int value);

int value = 0;

int main (void)
{
    printf("Enter a value: ");
    scanf("%d", &value);
    for (int count = 1; count < value; count++)
    {
        printf("The factorial of %d is %llu.\n", count, factorial(count));
    }


    printf("The factorial of %d is %llu.\n", value, factorial(value));
}

int factorial (int value)
{
    if (value == 1)
    {
        return 1;
    }
    else
    {
        return value * factorial (value - 1);
    }
}

Fibonacci Value Calculator

/* Kagirison Fibonacci Calculator */

#include <stdio.h>

int fibonacci (int value);

int value = 1;

int main (void)
{
    printf ("Enter value: ");
    scanf("%d", &value);
  
    printf("The Fibonacci Value of %d is %llu.", value, fibonacci(value));
}

int fibonacci (int value)
{
    if (value == 0 || value == 1)
    {
        return value;
    }
    else
    {
        return fibonacci(value-1) + fibonacci(value-2);
    }
}

Leave a Reply

Discover more from Kagirison

Subscribe now to keep reading and get access to the full archive.

Continue reading