Solutions K&R 2nd Edition

https://img.shields.io/badge/language-C17-blue https://img.shields.io/badge/compiler-gcc-orange https://img.shields.io/netlify/d704dae8-20ba-4fe2-9dba-0c3564ef402e https://img.shields.io/github/license/Mr-Io/c-language-solutions

This is a comprehensive solution guide for the exercises of the classic book The C Programming Language - 2nd Edition, usually referred to as K&R.

Even though the last edition of K&R (2nd edition) was published in 1988, it is still condidered today, by many, the very best single book on C, both for learning, and for reference.

The authors Brian W. Kernighan and Dennis M. Ritchie not only presented C in an elegant and pragmatic style, but also designed a series of exercises on each chapter to help reinforce its knowledge.

The exercises of the book are solved staying true to comtemporary K&R C style while applying modern coding practices.

This repository is just a tribute to K&R.

C Code style

There are lots of C style guides, and usually, each project follow its own style, for example the linux kernel code style or the style guide for the official python implementation.

¿Which one to follow? Well, the most important rule regarding code style is: check the surrounding code and try to imitate it.

Following this principle, we will follow the style implicit in K&R, which is known as K&R style. However, in certain situations it became overly terse, so we will allow ourselves certain liberties and follow some modern “good practices”:

  • Prefer slightly more descriptive names than K&R.

  • Use braces even when C permits them to be omitted.

  • Avoid the use of typedef.

Unsafe Functions

There are some functions in the standard library that are vulnerable to buffer overflow attacks.

If you go to the man pages (as you should whenever you use a function) there is a warning if the particular function is unsafe (for example the first phrase in gets description section is literally “Never use this function” ) but there are others that may be vulnerable and have no warning whatsoever, for example the seemingly correct scanf("%s", s) is also vulnerable to buffer overflow and there is no warning on scanf man pages.

In the solutions, the following unsafe functions are forbidden:

  • gets, getwd

  • strcpy, strcat, sprintf

And the following functions must specify a maximun field width when formatting string input:

  • scanf, fscanf, vscanf, vfscanf

For example:

char s[MAXSIZE];

scanf("%*s", MAXSIZE-1, s);

Note

Modern compiler and OS implement techniques to prevent buffer overflow attacks:

  • Stack randomization (although brute force is still possible)

  • Limiting executable code regions (although return oriented attacks may work)

  • Smash stack detection (although it is implemented only for the functions that the compiler consider, for example gcc 11.3.0 use it for scanf but not for gets)

Even if buffer overflow attacks were impossible to mount (it is not) it is still a bug, so its better to just avoid it.

Compiler, C version and OS

Currently, there are many ISO versions of C, in this repository the code is compiled using C18 with GNU-specific extensions (-std=gnu18), which is the default in gcc. Anyway, the code will compile just fine in any C ISO and, my recommendation is just to use the C version that your compiler uses by default (which is generally the newest one).

Regarding compilers, the 2 most popular ones for C are GCC and Clang. Both of them supports a huge number of extensions and features over the language standards. GCC is the default compiler for the linux kernel and is even considered the defacto-standard open source compiler. Both of them would be fine but the one used in this repository is GCC. This is important because in some exercises of chapter 8 we use some specific features of it.

GCC is shipped with almost every Linux distribution, it can be easily installed in macOS and its also supported in Windows (with MinGW).

The target OS also became important in chapter 8, because system calls are used in the solutions. In this repository, Linux is used to run the code, but any Unix-like OS like macOS will do just fine. Windows is not a Unix-like OS but it can support Linux natively with WSL.

C Code optimization

The most important thing of a program is to work correctly, then to be readable and modular (DRY - Dont Repeat Yourself, easy maintenance, etc… ) and lastly to be efficient and fast.

We resume optimization in 4 levels:
  • Level 0: Use appropiate data structures and algorithms.

  • Level 1: Exploit hardware parallelism (multiproccesing, multithreading, etc..).

  • Level 2: Coding principles so that a compiler can generate efficient code.

  • Level 3: Exploit processor microarchitecture to maximize performance on the target machine.

Level 0 optimization is a must for every program, even to the point that we can consider some program to only work correctly when the appropiate data structures and algorithms are used. In K&R, computer science knowledge is somewhat assumed. We dont need to be experts of the subject though, it can be learned and/or researched while reading and doing the exercises. CLRS is an excellent reference book for that. The solutions use appropiate data structures and algorithms.

Level 1 optimization is not relevant for this repository.

Level 2 optimization involve code principles so that the compiler can generate efficient code. The solutions follow these 3 coding practices:

  • Minimize function calls (and any computation) in loops

    For example instead of:

    for (i = 0 ; i < strlen(s) ; ++i){
        s[i] = somefunc() + somefunc() + somefunc();
    }
    

    use auxiliary local variables to do this:

    int len = strlen(s);
    for (i = 0 ; i < len ; ++i){
        s[i] = 3*somefunc();
    }
    
  • Minimize pointer derefencing (including subscript operator) in loops

    For example instead of:

    *res = 0;
    for (i = 0 ; i < strlen(s) ; ++i){
        *res += s[i];
    }
    

    use auxiliary local variables to do this:

    int tmp = 0;
    for (i = 0 ; i < len ; ++i){
        tmp += s[i];
    }
    *res += tmp;
    
  • Group struct members of the same type together

    struct st {
        int a;
        int b;
        int c;
        char d;
        char e;
        double f;
        double g;
    };
    

    so that padding is minimized.

Until this point, readability of the solutions has not been compromised. However level 3 optimization involves techniques like: loop unrolling, vector instruction optimizations, usage of functional style when branch prediction is not possible, minimize cache misses with locality… that; 1. make the optimization not portable (only maximized for the target machine) and 2. make the code order of magnitude more complex and less readable. Therefore, only in very specific applications are used. The solutions do not follow level 3 optimization techniques.

How to contribute

One of the repository goals is to be welcoming to beginners and to people starting to learn C and GitHub. If it is your first time consider following a guide to make your first contribution.

This repository has only the main branch and you are very welcome to contribute to it; just follow the simple fork and pull request method.

Here is a list of topics that are perfect to start with:
  • Improve solutions code:

    • bugs

    • style

    • error handling

    • avoid unsafe standard functions (due to overflow)

    • check integer arithmetics overflow

    • check floating arithmetics underflow/overflow

    • check floating rounding errors

    • refactor

    • optimize using coding principles

    • etc…

Changes or improvements unrelated to this are also welcome and may be done with a pull request.

Once your pull request is merged, your contributions will be publicly visible on the GitHub repo and on the web page, wich is automatically updated on each commit.