SE250:HTTB:WIP:Linked Lists

From Marks Wiki
Jump to navigation Jump to search

<html><embed src='http://img372.imageshack.us/img372/4655/mynewbanner2uq4.swf' quality='high' pluginspage='http://www.macromedia.com/go/getflashplayer' type='application/x-shockwave-flash' width='640' height='86'></embed></html>

Back to Work in Progess

Basic Information

Introduction

A linked list is a data structure that stores data as separate elements in memory, called cons cells. A cons cell has two elements - a piece of data and a pointer. The pointer points to the address of the next cons cell. This means when a cons cell is added or deleted there is no need to change the location of the cons cells in memory. All that is required is to change where the adjacent pointers point.

This course only deals with single linked lists, ie. they only point to the next element. Linked lists do exist that point to the both the next cons cell and the previous cons cells.

Real Life Example: An Absolute n00b's Guide

<html>

<embed src="http://www.orlybird.com/SE250/HTTB/linked_lists.swf" height="400" width="500" />

</html>

Created By: Sshi080 18:18, 6 June 2008 (NZST)

Simple Examples in Code

Pseudocode

Structure Node {
    data /* The data being stored in this node */
    next /* A reference (pointer) to the next node or null if this is the last node */
}

C Struct

typedef struct node
{
        int data; /* Storing int data in this case, can be other data-types. */
        struct node *next; /* pointer to next element in list */
} node;

Visual Diagrams

A Single Cons Cell <html><img src="http://i298.photobucket.com/albums/mm247/Phoenix100_album/HTTB/SimpleConsCell.jpg" height=120 width=280></html>

A Linked List

A linked list comprises of one or more cons cell linked together via pointers. <html><img src="http://i298.photobucket.com/albums/mm247/Phoenix100_album/HTTB/LinkedList.jpg"</html>


A Nil Cons Cell

A nil cons cell is a special type of cons cell that has a pointer to itself. It is always the last cons cell in a linked list. <html> <img src="http://s298.photobucket.com/albums/mm247/Phoenix100_album/HTTB/NillConsCell.jpg" height=120 width=180> </html>

Advantages of a nil cons cell

  • nil is a real object
  • you can refer to both the head and tail nil
  • when writing code you don't have to test for nil.

This code:

static Cons NIL = {0,NIL};
Cons* nil = &NIL;

Produces this situation: <html><img src="http://img56.imageshack.us/img56/7899/nilpointer2km6.jpg"></html>



NIL is the terminating Node for a linked list, and nil is a pointer to that node, which is stored in the last true node of a linked list. nil is deliberately placed to be tested for, which is why the simple condition: while (list->tail != nil) is used. The nil pointer, which holds the address of NIL, provides an effective catch situation for a linked list when the user has not checked for nil - the NIL node is accessed, and it simply calls itself repeatedly. This is probably(?) preferable to corrupting a segment of memory unintentionally. Using a nil pointer also eliminates the need to check for a NULL pointer at the end of a list. (I think) The NIL is referred to as a sentinal (or dummy) node, and is a fairly common implementation amongst linked lists.

Introductory Screencast

Below: A screencast which examines some fundamental code for the singularly linked list used in Lab-4. <html> <embed src="http://www.resisty.com/linkedLists.swf" height="431" width="639"/> </html> Bvic005 22:43, 7 April 2008 (NZST)


It's important to note how the Linked-list is stored in memory. The add() function above uses a call to malloc(), reserving Cons cell sized blocks within the "heap". memory allocated in this fashion is must be dealt with in (if the programmer chooses) a deleting or node removal function. A call to free() will realease this memory to be overwritten in successive calls to malloc(), and so limit (or stop altogether) memory leaks - where memory is not freed and so gradually fills the heap unnecessarily until there is no more left to allocate - perhaps causing errors or (hopefully) crashing the program.

Pros and Cons of Linked Lists

Pros

  • Existing elements in the list do not have to be moved when a new element needs to be inserted into the front or the middle of the list.
  • New elements can be inserted into the list indefinately as long as there is memory left.
  • A Linked list only requires enough memory to store its data (plus a slight overhead due to other fields within its struct).
  • Provide the means for Hash tables to handle collisions, performance of a LL's don't suffer too greatly when small in length.

Cons

  • Every element requires an explicit pointer.
  • Linked lists can only be transvered in one direction, meaning it is not as useful when random elements want to be accessed. Also, they can only be transvered sequentially.*
  • Linked lists jump around in memory, which can be more time consuming than say, an array based list, where memory is stored right next to each other.


(*)This means the list does not allow random access - the list must be traversed from the beginning to find the required node (taking somewhere around O(n) time). Linked lists have sacrificed this ability (which arraylists have) for faster (O(1)) insertions and removals.


Link List Functions

Calculating the Length of Linked List

int length ( Cons* list ) {
    int l = 0;
    for ( ; list != nil; list = list->tail )
        l++
    return l;
}

<html><embed width="448" height="361" type="application/x-shockwave-flash" wmode="transparent" src="http://i298.photobucket.com/player.swf?file=http://vid298.photobucket.com/albums/mm247/Phoenix100_album/HTTB/Length_code.flv"></html>

Returning the nth element

element_t nth ( int i, Cons* list ) {
    while (i-- > 0 )
        list = list->tail;
    return list->head;
}

<html><embed width="448" height="361" type="application/x-shockwave-flash" wmode="transparent" src="http://i298.photobucket.com/player.swf?file=http://vid298.photobucket.com/albums/mm247/Phoenix100_album/HTTB/n.flv"></html>

Checking if two linked lists are equal

int equal ( Cons* listA, Cons* listB ) {
    while ( listA != nil  &&  listB != nil ) {
        if ( listA->head != listB->head ) 
            return 0;
        listA = listA->tail; 
        listB = listB->tail;
    }
    return listA == listB;
}

<html><embed width="448" height="361" type="application/x-shockwave-flash" wmode="transparent" src="http://i298.photobucket.com/player.swf?file=http://vid298.photobucket.com/albums/mm247/Phoenix100_album/HTTB/Equal.flv"></html>

Finding an element in the list

Cons* find( element_t elt, Cons* list ) {
  for( ; list != nil; list = list->tail )
    if( list->head == elt )
      break;
  return list;
}

<html><embed width="448" height="361" type="application/x-shockwave-flash" wmode="transparent" src="http://i298.photobucket.com/player.swf?file=http://vid298.photobucket.com/albums/mm247/Phoenix100_album/HTTB/Finding.flv"></html>

Appending one cons cell to another

Cons* append( Cons* xs, Cons* ys ) {
    Cons* result;
    Cons** cpp = &result;
    for ( ; xs = !nil; xs = xs->tail ) {
        *cpp = cons( xs->head, ( Cons* )0 );
        cpp = &( *cpp )->tail;
    }
    *cpp = ys;
    return result;
}

<html><embed width="448" height="361" type="application/x-shockwave-flash" wmode="transparent" src="http://i298.photobucket.com/player.swf?file=http://vid298.photobucket.com/albums/mm247/Phoenix100_album/HTTB/Append.flv"></html>


An attempt at an explanation

  • Cons* result (a pointer to a cons cell, in this case the first node of the resulting list).
  • Cons** cpp (a pointer, which holds the address of another pointer, which points to a Cons cell). It can be visualised like this:
Part1 Part2 Part3
cpp(pointer to pointer to Cons)--> pointer to Cons--> Cons cell (in malloc’d memory)

Being initialised with the address of result means when the loop has ended and cpp no longer points to the beginning of the list we want to return to the user, result can be given instead.

Loop:

for ( ; xs = !nil; xs = xs->tail )

No initialisation, condition: xs must not hold the address of nil, increment: xs = xs->tail at the end of each loop. This means the loop traverses the linked list xs.

*cpp = cons( xs->head, ( Cons* )0 )

Part 2 and 3 of cpp are altered. A cell is created with malloc() for part 3 and the pointer to this memory is stored within cpp at part 2. This will be the first cell in the new list. The interesting point here is that the tail field of the new cell, which should hold the address of the next cell, is (Cons*)0, that is, the value 0 cast as a pointer to Cons. We’ll come back to this.

cpp = &( *cpp )->tail

cpp (part 1) is given the address stored in the tail field of part 3 (0), overwriting its current value, the address of result.

Second loop:

Following the increment of xs = xs->, a call to xs->head will give the head value of the second cell.

Right, now because cpp (part 1) was given the address in the tail field of the first new cell in the previous loop, the command *cpp = cons(xshead,(Cons*)0) will modify this value to hold the address of the new Cons cell made in this loop, meaning the original (Cons*)0 is overwritten. This will continue until xs is completely copied into a new list.

Finally:

Part 1 of cpp will end the loop holding the address within the tail field of the last new cell. Using the command *cpp = ys, this is overwritten with the address of the first cell in the linked list ys. Result, now pointing to the beginning of this new list, is returned to the caller.

Reversing the order of a linked list

Cons* nreverse( Cons* list ) {
  Cons* prev = nil;
  Cons* next;
  for( ; list != nil; list = next ) {
    next = list->tail;
    list->tail = prev;
    prev = list;
  }
  return prev;
}

<html><embed width="448" height="361" type="application/x-shockwave-flash" wmode="transparent" src="http://i303.photobucket.com/player.swf?file=http://vid303.photobucket.com/albums/nn134/duvduvduv/screencastwmv.flv"></html>

This function takes a pointer to the beginning of a Linked List (list) and returns a new pointer (prev) to the same list, only the address stored in prev belongs to what used to be the final cons cell in list. Starting from the first step:

  1. prev is initialised with the value nil, next is only left declared.
  2. There is no initialisation step in the for loop (;), the condition is checked first, then the loop begins.
  3. next is given the address of the second cons cell in list to hold.
  4. The address stored in list->tail (that of the second cell) is overridden with the address stored in prev, (nil).
  5. The address list is holding (a pointer to the first cell, whose tail field now holds the address nil) is given to prev .

This just happened. Within list, next was given the address of the next cell in list. List's first element had its tail field changed to nil (so it will become the end of the list), instead of holding the address of the second element. prev then pointed to that 'last' cell, and the loop was ready to begin again, with the next cell waiting to be modified in the next field. As the address stored in prev increased towards the end of list, the malloc’d memory within each list cell had its the tail field modified, so where it once held the address of the next cell, it then held the address of the previous cell. Nreverse.

Itterating Through Linked Lists

Not sure who's this was supposed to be but since it's empty, I'm filling it in
Rbha033 19:14, 15 June 2008 (NZST)

  • This function iterates through the list without crashing the system. On finishing it returns i, the number of cells in the list, to show that the system did not crash before the iteration was complete ie iteration was a success. If list was empty, it returns 0. If unsuccessful, it doesn't return anything as it just keeps going ( NIL cell points to itself ).


  • Code:
 int iterate( Cons* list) {
 
     int i=0;
     while ( list->tail != &NIL ){
           list = list->tail;
           i++;
     }
     return i;
}

Rbha033 19:14, 15 June 2008 (NZST)

Removing the Nth Element from a Linked List

<html><img src="http://www.geocities.com/racbhat13/removell.jpg"></html>

Here's the psuedocode implementation of how you would delete the nth element from a Linked List.

  • Iterate through the list to find the element n-1.
  • To free the memory off element n, store n into temp, i.e. temp = (n-1)->tail.
  • (n-1)->tail = (n+1). So (n-1)->tail = (n-1)->tail->tail.
  • Now free the memory. Free(temp);
  • return linked list

Here's the code:

//This function is passed a Linked List "list" and an int "n" signifying the n'th 
//element to be deleted.
Cons* remove(int n, Cons* list){
 int i; //for the iteration.
 Cons* temp; //The pointer for later storing the n'th element to free the momory.
 for(i=1; i<n ; i++){
   list=list->tail;  //if i<n then move to the next cons cell.
  }
 temp = list->tail; // Storing n'th (n-1+1) cell into temp to free memory.
 list->tail=list->tail->tail; // (n-1)->tail  = (n+1).
 free(temp); // free element n.
return list; // return the new list.
}


Rbha033 14:03, 10 June 2008 (NZST)

Adding an element at position N

Psuedocode:

  • Check list is not empty and n is not less than 0.
  • If n = 0, then dereference the new cell's tail to point to the beginning of the list
  • Else, Iterate to position n-1 and take its tail and make that new cell's tail, and add new cell's address to n-1's tail.


Code:

Cons* add_at_n(Cons* list, int n, element_t elt){

Cons* newList = list;
Cons* tempTail;
int i;

 if ( (list == nil) || (n < 0) ){
    printf (" Error ");
    return list;

 } else if (n == 0) {
     elt->tail = &list;
     return elt;

 } else {
     for (i = 1 ; i < n ; i++) {
          list = list->tail;
     }
     
     tail = list->tail;
     list->tail = &elt;
     list->tail->tail = tail;
     return newList;
 }  
 

Rbha033 14:03, 10 June 2008 (NZST)


Splitting a Linked List at element N (Resizing)

This function takes a LL, list, and breaks a link at position n to make the LL smaller. On top of that it also returns the rest of the elements of the LL in a different LL called list_b.

Code:

Cons* splitLL ( Cons* list, int n ) {

    Cons* list_a = list;
    Cons* list_b;
    int i;

    if (list == nil) || (n<=0) {
        printf ("Error");
        return list;

    } else {
        for (i = 1; i < n; i++) {
             list = list->tail;
        }
        list_b = list->tail;
        list->tail = nil;
        list = list_a;
        return list_b;
    }
}

Rbha033 14:59, 10 June 2008 (NZST)

Doubly linked lists

<html><img src="http://img519.imageshack.us/img519/712/doublylistol1.gif"></html>


Doubly linked lists are a type of structure that allows you to use backward pointers as well as traditional forward pointers. This feature helps when comparing things in the whole list while you are part way through the list, this is possible as there are links from each node and point to both directions, forward and backward. To implement it follow the code below:

struct node
 { string name;
   node *next;    // Pointer to next node
   node *prev;    // Pointer to previous node
 };

Where *next is a pointer to the next element in the list and *prev points to previous element in the list. As for the comparison of Doubly linked lists and a Singly-linked list The Doubly linked list has a definitive advantage as data can be easily read and analyse backward and forward pointers, but the down point is that they take more thinking to compile and construct. There is now also a greater overhead for each cell - 4 bytes (or more) for the extra pointer field. While the shorter traversal time required for the 2-way pointers gives a performance increase, this comes as a trade-off between memory usage.

Screencast directory for LinkedLists

Table with screencast descriptions shown below:

User Description
hlin079 The difference between Array Based List and Linked List
vpup001 Explanation of functions that return element in 1st, 2nd, 3rd and 4th position
sdal039 Iterating through a linked list
rthu009 Creating a simple linked list (step by step guide)