Jun 6, 2012

Segmentation Fault correction with GNU Debugger

One of the reasons I prefer Linux for programming is the GNU Debugger. Every programmer has to use a debugger at some point, so it is better to get started sooner rather than later. In our previous stone, “Understanding Segmentation Fault with simple C code snippets”, we dug into the various reasons for the annoying segmentation fault in our programs. Next, we learn how to use the power of the GNU Debugger to track those problems down quickly.

Before reading on, I recommend learning the basics of the GNU Debugger. There are plenty of tutorials in the gdb help documentation and on the internet. I would recommend http://betterexplained.com/articles/debugging-with-gdb/ to get you started. Now that you hopefully know the basics, let's see how to correct segmentation faults with gdb.

A general way to track down a segmentation fault with gdb is:

-> Run the program inside the GNU Debugger ( run or r )

-> Do a backtrace to print all the stack frames ( backtrace or bt )

-> Select the frame which caused the problem ( frame 0 )

-> Print the local variables to see their values ( info locals )

Not every fault needs all of these steps, and some faults need more: we may have to add a few steps or use a few more commands. Still, this outlines the basic procedure for most cases.

Below we go through a few C snippets which result in a segmentation fault and try to detect and correct each error with the GNU Debugger ( gdb ).

1.

#include <stdio.h>
int main(void)
{
    char *p = "Fortystones";
    p[ 3 ] = 'Y';    /* line 5: writes into a string literal */
    printf("%s",p);
}

gdb reports the fault on line 5, the assignment p[ 3 ] = 'Y'. When we print the value of p we still get “Fortystones” instead of the “ForYystones” we intended, so the write never took place. We conclude that we are not allowed to change a character in a string declared this way: p points to a string literal, which is typically placed in read-only memory.

Reason for the error : Trying to write to or modify the read-only portion of the memory


2.

#include <stdio.h>

int fact ( int num )
{
    return num * fact( num - 1 );    /* no base case: the recursion never stops */
}

int main(void)
{
    int n;
    printf("Enter the number\n");
    scanf( "%d" , &n );
    int ans = fact( n );
    printf("The factorial is %d " , ans );
}

In a program like this, which has no base case, the backtrace shows the same function, fact(), called over and over again. We conclude that it is a recursion without a base case and fix it by adding one.

Reason for the error : Recursion applied without a base case causing Stack Overflow


3.

#include <stdio.h>
#include <string.h>
int main(void)
{
    int i;
    char arr1[ 15 ] = "Fortystones";
    char arr2[ 20 ];
    for ( i = 0; i < 20; i++ ) {
        arr2[ i ] = 'A';    /* fills all 20 bytes, leaving no room for '\0' */
    }
    strcpy( arr1 , arr2 );    /* copies more than arr1 can hold */
    printf("%s",arr1);
}

gdb reports that the fault occurs inside the function strcpy(), at memory address 0xb7ede818. When we print the registers ( info registers ) we see that this is the value of the extended instruction pointer ( EIP ), i.e. the register that holds either the address of the instruction being executed or the address of the next instruction to be executed. When we disassemble that address ( disassemble ) we get the dump of the assembler code for strcpy(). Looking back at our code, arr2 holds 20 characters with no terminating '\0' while arr1 has room for only 15, so strcpy() runs past the end of both arrays. We have most likely caused a buffer overflow through this strcpy() call.

Reason for the error : Buffer Overflow


4.

#include <stdio.h>
#include <stdlib.h>
int main(void)
{
    char *a = "hey";
    free( a );    /* line 6: a was never returned by malloc() */
    a = "her";
    printf("%s",a);
}

gdb reports the problem on line 6, where a is freed. So we guess that we should not be freeing the memory at this point and correct the error. Frame 1 of the backtrace also shows the call to free(), which makes it a lot easier to decide that we should not be calling free() here.

Reason for the error : Calling free() on a pointer that was not returned by malloc() ( here a string literal ).


5.

#include <stdio.h>
int main(void)
{
    int a = 0;
    int x = 5 / a;    /* integer division by zero */
    printf("%d",x);
}


Reason for the error : Division by zero

Here gdb reports SIGFPE ( the name stands for Floating Point Exception, but the signal is also raised for integer division by zero ) rather than a segmentation fault. We use bt ( the backtrace command ) to list the functions on the stack. As our program has only one function, the problem is in frame 0. We can select it with the command frame 0, though this is unnecessary because gdb already selects frame 0 when the program stops. The command info locals then prints the local variables in the scope of frame 0. We get a really big, arbitrary value for x, because the division never completed and x was never assigned. Thus we conclude that the way we compute x is somehow erroneous. We are dividing 5 by a, and 5 is a constant, so the problem must be with a. Checking the value of a shows that it is 0, which is what triggers the error. The cause is therefore division by a, that is, division by zero.

About the Author: Raju Khanal

An IIIT student pursuing his B.Tech degree in IT, he is one of the aspiring authors of fortystones. He is a passionate learner and a wanna-be coder. He loves Linux and always plays with it in his spare time.
