Rushikesh

Garbage Collection 101: Understanding Memory Management

Memory management is one of the most critical aspects of programming, and garbage collection has revolutionized how developers handle memory allocation and deallocation. In this comprehensive guide, we’ll explore the fundamentals of garbage collection, examine popular algorithms, and understand why some languages choose manual memory management over automatic collection.

What is Garbage Collection (GC)?

Garbage Collection is a form of automatic memory management. Programs allocate memory dynamically (e.g., for objects or data structures), and GC finds and frees memory that’s no longer in use, preventing memory leaks and crashes.

Can’t We Do Garbage Collection Manually?

Yes, we absolutely can! However, garbage collection was introduced so that developers can spend more time on core logic rather than worrying about freeing memory. Manual memory management requires careful tracking of object lifetimes and explicit deallocation, which can be error-prone and time-consuming.

Languages Without GC

Some languages that don’t have garbage collection include:

  - C
  - C++
  - Rust

Why Don’t These Languages Have GC?

  1. Performance Control: Manual memory management gives precise control over when and how memory is allocated and freed (C/C++)
  2. Ownership Model: Rust replaces GC with a borrow checker that ensures memory safety at compile time
  3. Simplicity: Avoiding GC complexity in low-level software and embedded systems

Famous Garbage Collection Algorithms

Here are some of the most well-known GC algorithms:

  - Reference Counting
  - Mark-and-Sweep
  - Tricolor Marking
  - Green Tea (Go’s upcoming collector)

Let’s dive into each of these algorithms.

Reference Counting

Reference counting is conceptually similar to a waitgroup in Go. Each object carries its own counter: whenever a new reference to the object is created, the counter is incremented; when a reference is dropped or set to null, it is decremented. The moment the count reaches zero, the object is freed.
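The idea can be sketched in a few lines of Go. This is only an illustration, not how any real runtime does it: the names `Object`, `Retain`, and `Release` are invented here, and Go itself does not expose reference counts.

```go
package main

import "fmt"

// Object simulates a reference-counted allocation.
type Object struct {
	name string
	refs int
}

// Retain records a new reference to the object.
func (o *Object) Retain() { o.refs++ }

// Release drops one reference and reports whether the count hit zero,
// meaning the object can be freed.
func (o *Object) Release() bool {
	o.refs--
	if o.refs == 0 {
		fmt.Printf("freeing %s\n", o.name)
		return true
	}
	return false
}

func main() {
	obj := &Object{name: "buffer", refs: 1} // created with one reference
	obj.Retain()  // a second reference is taken
	obj.Release() // one reference dropped; count is now 1
	obj.Release() // last reference dropped; count is 0, object is freed
}
```

Note that the counter lives on each object, not in one global place; this is what makes reclamation immediate the instant an object’s own count hits zero.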

Advantages:

  - Simple to implement and reason about
  - Memory is reclaimed immediately and deterministically, with no collection pauses

Disadvantages:

  - Cannot reclaim reference cycles: two objects that point at each other never reach a count of zero
  - Every pointer assignment pays the cost of updating a counter

Mark-and-Sweep: The Classic Algorithm

Mark-and-Sweep is the most famous algorithm for garbage collection. It operates in two distinct phases:

Two Phases:

  1. Mark: Start from root objects (stack/global variables) and mark every reachable object using a basic DFS algorithm
  2. Sweep: Traverse the heap and free all unmarked objects

Detailed Working of Mark-and-Sweep

Consider this simple struct:

struct Node {
    struct Node* left;
    struct Node* right;
};

Suppose we have a single root, Node* root.

The GC process works as follows:

  1. Visit the root and mark it as reachable
  2. Recursively mark children (left, right pointers)
  3. After marking phase completes, sweep through the heap and free all unmarked objects
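The steps above can be sketched in Go. The `Node` type mirrors the C struct from the article with an added mark bit, and the slice standing in for the allocator’s object list is an assumption of this sketch; a real collector walks its own heap metadata instead.

```go
package main

import "fmt"

// Node mirrors the C struct: a heap object with left/right pointers,
// plus a mark bit used by the collector.
type Node struct {
	left, right *Node
	marked      bool
}

// mark is the DFS of the mark phase: starting from the root, set the
// mark bit on every reachable node.
func mark(n *Node) {
	if n == nil || n.marked {
		return
	}
	n.marked = true
	mark(n.left)
	mark(n.right)
}

// sweep walks the whole heap and counts unmarked nodes as garbage,
// clearing mark bits so the next cycle starts fresh.
func sweep(heap []*Node) (live, garbage int) {
	for _, n := range heap {
		if n.marked {
			n.marked = false
			live++
		} else {
			garbage++ // a real collector would free n here
		}
	}
	return
}

func main() {
	root := &Node{left: &Node{}}
	orphan := &Node{} // allocated but unreachable from root
	heap := []*Node{root, root.left, orphan}

	mark(root)
	live, garbage := sweep(heap)
	fmt.Printf("%d reachable, %d garbage\n", live, garbage) // 2 reachable, 1 garbage
}
```

Because sweep visits every heap object while the program is paused, the pause grows with heap size, which is exactly the problem tricolor marking tackles next.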

Advantages:

  - Handles reference cycles that reference counting cannot
  - No per-assignment bookkeeping while the program runs

Disadvantages:

  - The classic form stops the world: the program pauses while marking and sweeping run
  - Sweeping can leave the heap fragmented, since freed objects are scattered between live ones

Tricolor Marking Algorithm

Tricolor marking is an optimization over mark-and-sweep designed to address the stop-the-world issue. It enables concurrent garbage collection while the program continues running.

Object Categorization:

Objects are categorized into three colors:

  - White: not yet visited; anything still White when marking finishes is garbage
  - Grey: known to be reachable, but its references have not been scanned yet
  - Black: fully processed, along with all of its references

The GC maintains this coloring system to incrementally collect garbage while the program runs.

How Tricolor Marking Works:

1. Start with all root objects marked Grey
2. While Grey set is not empty:
   a. Pick a Grey object
   b. Move it to Black (mark as processed)
   c. For every object it references:
      - If the referenced object is White, mark it Grey
3. When done, all unreachable objects are still White (garbage)
4. Collect all White objects
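Those four steps can be sketched in Go. This is a minimal, single-threaded illustration of the coloring loop, not a concurrent collector: the `object` type, the `collect` function, and the explicit heap slice are all assumptions of the sketch.

```go
package main

import "fmt"

type color int

const (
	white color = iota // not yet seen: garbage if still white at the end
	grey               // discovered, references not yet scanned
	black              // fully processed
)

type object struct {
	color color
	refs  []*object
}

// collect runs the tricolor loop: the grey set is the work list, and a
// cycle ends when no grey objects remain.
func collect(roots []*object, heap []*object) (collected int) {
	var greys []*object
	for _, r := range roots {
		r.color = grey
		greys = append(greys, r)
	}
	for len(greys) > 0 {
		obj := greys[len(greys)-1] // pick any grey object
		greys = greys[:len(greys)-1]
		obj.color = black // mark as processed
		for _, child := range obj.refs {
			if child.color == white {
				child.color = grey
				greys = append(greys, child)
			}
		}
	}
	// everything still white is unreachable
	for _, o := range heap {
		if o.color == white {
			collected++ // a real collector would free o here
		}
	}
	return
}

func main() {
	a := &object{}
	b := &object{}
	a.refs = []*object{b}
	orphan := &object{}
	fmt.Println(collect([]*object{a}, []*object{a, b, orphan})) // prints 1
}
```

A real concurrent collector additionally needs write barriers: if the running program stores a reference into a Black object mid-cycle, the target must be re-greyed so it is not collected by mistake.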

Advantages:

  - Marking can run incrementally or concurrently with the program, keeping pauses short
  - The grey set makes collector progress explicit, so work can be split into small chunks

Green Tea Algorithm: The Future of Go

Green Tea GC is still being explored and is expected to land in newer Go versions. The algorithm aims to be fast, concurrent, and low-pause, representing the next evolution in garbage collection technology.

You can read more about Green Tea GC in the official Go proposal.

Summary

Understanding these different approaches to memory management helps you make informed decisions about language choice and system design based on your specific requirements.

References

  - My implementation of Tricolor GC in Go
  - Green Tea GC Proposal

