Before giving any formal definition of Recursion, I will like to discuss a scenario…

Browing your files on local disk is a very common task. Let us assume you have to write a procedure that performs an action (like calculating the size) on every file on your C:
The algorithm will be something like follows:

Apply the action on all files in the root directory. Then pick up its sub-diectory & apply the action on all its files. Then pick up the sub-sub directory & continue the procedure until there’s no more nested sub-directories.
The next step is to backtrack to the parent directory & perform the entire action again on another sub-directory… ( I hope u understand what’s going on)
The entire procedure goes on until all of the files have been processed.

This is the ideal scenario where Recursion can kick in.

The entire concept of recursion is as simple as this:
Recursion can be applied to any problem that can be divided into sub-problems that are similar in nature to the original problem.

Coming back to the above problem, the original problem was to Browse the root directory.
It could be divided into sub-problems as performing the action on root directory files plus Browsing the sub-directory files…

This is Recursion.

Let’s take another scenario. Lets assume you have to calculate the factorial of a number (say 5).
The technical definition of a factorial is (remember factorials apply only to integers >=0):
fact(n) = n*fact(n-1);  n>0
            = 1                 ; n=0

This again is a recursive definition. The original problem is divided into sub-problem (n*fact(n-1)) that is exactly similar to the original problem….

That’s Recursion in its simplest terms…

 

Recursive Conditions

The last example of Recursion brought us to some very important conclusions. To implement any recursive definition, it should satisfy some conditions.

In a nut-shell, any such definition should invariably satisfy the following three conditions:

1. A recursive function should contain a call to itself (thus, the task should be divisible into sub-tasks, similar in nature to the original one).
2. There must be a condition which at some point of time becomes true, after which no further recursive calls are made.
3. Every recursive call should take us a step closer to satisfying that condition.

In the case of factorial definition, the three conditions are satisfied as follws:
1. Calculating factorial of a number first requires calculating factorial of a number smaller than it.
2. The condition for no further recursive calls is
       fact(n) = 1; n=0

So 5! = 5 * 4!
         = 5 * 4 * 3!
         = 5 * 4 * 3 * 2!
         = 5 * 4 * 3 * 2 * 1!
         = 5 * 4 * 3 * 2 * 1 * 0!
         = 5 * 4 * 3 * 2 * 1 * 1

Notice that in the last step, when n=0, instead of making further calls to itself, the procedure ends returning a value & making no further calls.

3. The third condition is obvious, after every call, the number is decremented… which ultimately takes us to the condition when n becomes 0 (satisfying it so that no further calls are made).

The third point above also gives an idea, why factorials of negative numbers don’t exist. They can be decremented to infinity in each step, & so the condition of n becoming 0 will never be satisfied.

 

Detailed Example

Now, lets move from the theoretical to the practical aspects of Recursion.

Lets define a recursive function fact() to calculate factorial of a given positive number.
The function would be as follows:

{syntaxhighlighter brush: cpp;fontsize: 100; first-line: 1; }int fact(int n)
{
if (n==0)
return(1);
else
return(n*fact(n-1));
}{/syntaxhighlighter}

The above function implements the already discussed recursive definition of factorial.

This brings us to another aspect. You may ask, why should I use Recursion when the same task could be performed with a simple loop…

fact=1;
for(i=0;i<n;i++)
  fact=fact*i;

Agreed….
In theory, every problem that is defined recursively can be implemented using loops.
This means that there is always a work around to Recursion.

But there are situations where a recursive procedure makes more sense, and is easy to implement.
I will gve two examples to support this statement:

1. First consider the above problem of Browing the directory structure.
Its not hard to imagine the complexity had the problem been solved with loops.
Implementing it with recursion is as easy as follows:

(The listing is given in VB for understanding purposes as collections in VB make Browing folders easier. Anyone having a problem with this code can request a c equivalent of the same)

 

Sub scanfolder(thisFolder as Folder)
  Dim allFolders as Folders
 
  Set allFolders=thisFolder.SubFolders 'Extract all sub folders

  For Each Folder in allFolders
  ' Apply the process here
scanfolder(thisFolder.path)
  Next
End Sub

2. We all have studied ‘Trees’. These data structures play a very crucial role in any data management.
I assume u all have studied the ‘Tree Traversal’ algorithms. (If u dont, u can ask me to reproduce them here).
If u recall, they are implemented via stack if we dont use recursion & are quite complex at the first instance (especially the Post-Order traversal).

So, here’s the Recursive implementation of Post-Order tree traversal in C.

{syntaxhighlighter brush: cpp;fontsize: 100; first-line: 1; }void PostOrder (Node xx)
{
if (xx = NULL)
return;

if (xx->left != NULL)
PostOrder(xx->left);

if (xx->right !=NULL)
PostOrder(xx->right);

‘Apply Process u want to node xx here
}{/syntaxhighlighter}

 

Recursion – Finer Points

And now, I will discuss some finer points about Recursion.

First, a recursive definition is always less efficient than when the same procedure is implemented with loops. This is because the overhead related with the function calling itself again & again.

Second, it’s very easy to goof-up recursive implementations. That’s primarily because one of the above three points a recursive definition should satify are violated.
The second & third points are very important in this context.
Remember to include a condition that terminates further recursive calls. Also see, that each recursive call takes a step further to satisfy that condition.
Violating any of these would result in any infinite loop, ultimately causing a stack overflow (i believe everyone of us has encountered this error, when developing our first recursive procedure ).

Just for a quick recall, I would reiterate how the above function satisfied all the three conditions for Recursion…

1. Clearly, it calls itself in its body… So its recursive.
2. The condition to terminate further calls is:
‘When no more sub-folders of current folder exist, dont make further calls & return’.
As u can see, this statement is implemented via the for loop in the above code.

3. All of us know, that folders do not have sub-folders endlessly. At some point, a folder will have no further sub-folder. So, in each step, we are coming close to that situation. This is where there will be no further calls….

 

Closing Remarks

Some final points from my side…

The above example underlines an advantage of Recursion. Suppose u had to implement it via loop without Recursion.

The major problem u would have had was to handle the allFolders collection.
In principle, u would have two problems.

1. You never not know how many levels of nesting folders, a client’s computer have. So, u would have problem deciding how many times to run the loop, & also how to keep the track of nesting. This would leave u no choice, but dynamic memory management for the allFolders collection, which we all know can get quite messy.

With Recursion, this is not a problem, as u dont have to decide either loop count or the level of nesting. Everytime a new recursive call is made, a new collection alFolders is created. So a new data structure is created on every call & there’s no need of dynamic memory management.

This point would be further clear, if u consider the example of tree traversal presented earlier.
Without recursion, u would have to use a stack, which had to be resized dynamically.
With Recursion, there’s no such thing (Internally, all recursive calls occur on memory stack u know!!!)

2. You would have problem in backtracking. With Recrsion, backtracking occurs automatically.

I am making a couple of further points, u can safely ignore if u r not that well accustomed to Recursion.
You may ask, how do i pass information between successive recursive calls.
Simple, either use parametres…
or for another approach, u can declare Static variables in the function & manipulate it accordingly to pas* information back & forth between the recursive calls…

As i discussed, every recursive definition can be implemented non-recursively.
Remember, u would always have to use a stack anytime u do so…