Saturday, May 28, 2011

How Well Do You Know C?

Think you have mastered the ins and outs of the C language? Test your knowledge of the nuances of C by trying to determine what is printed by each of the following programs. All code is C99 and exhibits well-defined behavior. Answers and discussion follow.


Questions

1 - Comments

#include <stdio.h>

int main(void) {
    printf("%d", 1 // comment \
        + 2
        + 3);
    printf("%d", 1 //**/ 2
        + 3);
    printf("%d", 1 /**// 2
        + 3);
    puts("");
    return 0;
}

2 - Preprocessor

#include <stdio.h>

#define FOO1(x) #x
#define FOO2(x) FOO1(#x)
#define FOO3(x) FOO2(#x)

int main(void) {
    puts(FOO1(bar));
    puts(FOO2(bar));
    puts(FOO3(bar));
    return 0;
}

3 - Typedefs

/* Assume 4-byte ints */
#include <stdio.h>
typedef int foo;
typedef char bar;

int main(void) {
    foo a = sizeof(bar), bar = sizeof(foo), foo = sizeof(bar);
    printf("%d%d%d\n", a, foo, bar);
    return 0;
}

4 - Initialization

#include <stdio.h>

struct foo {
    int a;
    int b;
};

int main(void) {
    struct foo bar[5] = { 1, 2, 3, 4, 5, 6, [1] = { .b = 10 }, { .a = 11 } };
    for (int i = 0; i < sizeof(bar) / sizeof *bar; i++) {
        printf("{%d, %d}\n", bar[i].a, bar[i].b);
    }
}

5 - Scope and Linkage

The following two files are compiled together:

File 1:

int i = 10;
int j = 30;

File 2:

#include <stdio.h>

extern int i;
static int j = 40;

int main(void) {
    int i = 20;
    {
        extern int i;
        extern int j;
        i++;
        j++;
        printf("%d,%d\n", i, j);
    }
    i++;
    j++;
    printf("%d,%d\n", i, j);
    return 0;
}

Answers and Discussion

The answers to each question are provided below along with discussion of the significant points involved with each question. All references refer to the C99 Standard (9899:1999) unless otherwise noted.

1 - Comments


Answer:

443

The important things to realize here are:
  • A backslash that immediately precedes a newline is deleted, splicing the two lines together, before comments are processed (§5.1.1.2), so a line that ends in a backslash extends a C++-style comment onto the next line (§6.4.9p2-3).
  • The second / of the // that starts a comment cannot also serve another purpose, such as starting a C-style comment.
  • The / that ends a C-style comment cannot also be used as part of another token.

2 - Preprocessor


Answer:

bar
"bar"
"\"bar\""


The # preprocessor operator replaces the immediately following macro parameter with a string literal whose contents are the characters present in the argument (§6.10.3.2p2).
FOO1(bar) becomes "bar", which prints bar when passed to puts.

When the argument contains a string literal, a \ is inserted before each " character in the replacement, so:
FOO2(bar) becomes FOO1("bar"), which in turn becomes "\"bar\"", which is printed as "bar".

Likewise, a \ is inserted before each \ character that appears inside a string literal in the argument, so:
FOO3(bar) becomes FOO2("bar"), which becomes FOO1("\"bar\""), which becomes "\"\\\"bar\\\"\"", which prints "\"bar\"" when passed to puts.

3 - Typedefs


Answer:

144

Typedef names share the same name space as "ordinary identifiers" (§6.2.3).

The declaration:

foo a = sizeof(bar), bar = sizeof(foo), foo = sizeof(bar);

defines 3 variables of type foo, each initialized to the size of another object or type.

In the initialization of a, bar refers to the typedef name which has a size of 1 (sizeof(char) is defined as 1).
In the initialization of bar, foo refers to the typedef name which has a size of 4 (as given).
In the initialization of the variable foo, whose name now hides the typedef name, bar refers to the variable bar, whose scope began at its declaration and which hides the typedef of the same name. The size of the variable bar is 4.

4 - Initialization


Answer:

{1, 2}
{0, 10}
{11, 0}
{0, 0}
{0, 0}


struct foo bar[5] = { 1, 2, 3, 4, 5, 6, [1] = { .b = 10 }, { .a = 11 } };
is equivalent to:
struct foo bar[5] = { {1, 2}, {3, 4}, {5, 6}, [1] = { .b = 10 }, { .a = 11 } };
The first part:
struct foo bar[5] = { {1, 2}, {3, 4}, {5, 6} ...
is straightforward: it initializes the first 3 structs in the array.

The next part:
[1] = { ... }
is the C99 "designated initializer" syntax to initialize a specific element of the array, in this case the second element.

If the element has already been initialized earlier in the initializer, the later initialization overwrites the previous one (§6.7.8p19), so the effect of:
[1] = { .b = 10 }
is to overwrite the second element of the array with the value specified in the braces, which is another designated initializer, this time in the form used for structures. In this case, only one member of the structure has an initializer, so the other member is initialized to zero (§6.7.8p19).

Initializers for an array are applied in "increasing subscript order", and the subobject initialized after a designated initializer is the next one in the array (§6.7.8p17), which causes another overwrite, this time of element 2. Again, since only one member of the structure is specified in the initializer, the other member is initialized to zero.

Lastly, since only 3 of the 5 array elements are specified in the initializer, the remaining elements are implicitly initialized (§6.7.8p21).

The clang compiler (2.9 and trunk) gets this wrong, producing

{1, 2}
{3, 10}
{11, 6}
{0, 0}
{0, 0}


by overwriting only the explicitly specified member in the overriding initializer. GCC and Intel produce the correct result.

5 - Scope and Linkage


Answer:

11,41
21,42


int i = 10;
int j = 30;
The variables i and j have external linkage, which is the default for file scope objects (§6.2.2p5).
extern int i;
The extern declaration in File 2 associates the identifier i with the same object defined in File 1.
static int j = 40;
The static declaration gives j internal linkage, so it is not associated with the variable j defined in File 1, which is not accessible from this file.
int main(void) {
    int i = 20;    
This declaration hides the file scope i variable with a block scope version.
     {
        extern int i;  
The behavior of an extern declaration in block scope when an identifier of the same name is already in scope depends on the linkage of the visible declaration (§6.2.2p4). In this case, the visible declaration is the one in the enclosing main function. That identifier has no linkage, so the rules dictate that the newly declared identifier has external linkage, and it refers to the i defined in File 1.
        extern int j;
In this case, a declaration of the same identifier is also visible, but its linkage is internal, so the new declaration has internal linkage too and refers to the same object declared at file scope.
        i++;
This increments the i defined in File 1.
        j++;
This increments the file scope j variable.
        printf("%d,%d\n", i, j);
And this prints them.
    }
    i++;
At this point, the scope of the extern declaration in the block above has ended, and the variable incremented is the one defined inside the function. Some compilers extend the scope of a nested declaration to the end of the function or file, but that behavior is not standard-compliant.
    j++;
The only j object defined is incremented.
    printf("%d,%d\n", i, j);
And the same variables are printed.
