Translated by AI
The C Language You Didn't Know
This article is for Day 21 of TNCT23s 3J Advent Calendar 2025.
Are you all writing C?
C actually has quite a few little-known pieces of syntax.
In this article, I will introduce some of the more unusual ones.
Syntax sugar for []
This might be relatively well-known.
When accessing an array,
int a[] = {0, 1, 2};
int index = 1;
a[index];
you probably write it like this.
You likely also know that a[index] is equivalent to *(a + index).
So, what happens if we use this syntax in reverse?
index[a];
It looks clearly weird, as if we were trying to access the a-th element of index, but by the same rule this is equivalent to *(index + a). Since addition is commutative, that in turn is equal to *(a + index).
Surprisingly, index[a] results in the same value as a[index].
Comma operator
When you want to handle two variables in a for loop,
for (int i = 0, j = 10; i < 5; i++, j++) {}
did you know you can write it like this?
What exactly is this comma in i++, j++?
Actually, this comma is a type of operator just like + or *, and it can be used even outside of for loops.
The comma operator "evaluates the left operand and discards its value, then evaluates the right operand and yields that value."
int x = ((0, 1), 2); // (0, 1, 2) is also acceptable
If you write it like this, 2 will be assigned to x.
The i++, j++ in the for loop meant "evaluate (calculate) both i++ and j++, and return the result of j++." Since returning a value in the update expression of a for loop has no meaning, it is simply discarded.
If you abuse this, you can even do something like this.
int func(int x) {
    return x++, x *= 2, do_stuff(x), x;
}
Quite powerful, isn't it?
Statement Expression
Now, let's talk about GNU extensions.
Statement expressions are not part of the C standard. They are an extension that GCC implements because it is convenient, and Clang supports it as well.
int x = ({
    int tmp = 10;
    tmp * tmp;
});
By doing this, the statements written inside ({ }) are executed in order, and the value of the last expression statement becomes the value of the whole construct. In this example, 10 * 10 = 100 is assigned to x. It looks pretty intense.
You are quite free to do almost anything inside this.
int func(int a) {
    return a + 10;
}
int main() {
    int x = ({
        int a = 10;
        int b = 20;
        a * func(b);
    });
}
You can do things like this, or even...
int main() {
    char some_condition = 1;
    char x = ({
        if (some_condition) {
            return 0;
        }
        'a';
    });
}
use return. If you use return, it aborts the initialization of x and exits the function itself.
_Generic
int x = 0;
int a = _Generic(x, int: 1, char: 2, default: 3);
_Generic (a generic selection, added in C11) picks one of the listed expressions at compile time, depending on the type of its first argument.
In this case, since x is an int, 1 is assigned to a.
You might have thought, "But x is definitely only ever going to be an int."
The true power of this syntax is demonstrated when combined with #define macros.
#define TYPE_NAME(x) \
    _Generic( \
        (x), \
        int: "int", \
        char: "char", \
        float: "float", \
        double: "double", \
        default: "unknown" \
    )
int main() {
    int i = 0;
    const char* type_of_i = TYPE_NAME(i);
    char c = 'a';
    const char* type_of_c = TYPE_NAME(c);
}
In this case, the preprocessor expands every TYPE_NAME(something), so it is the same as writing:
_Generic(
    (something),
    int: "int",
    char: "char",
    float: "float",
    double: "double",
    default: "unknown"
)
In other words, type_of_i will contain "int", and type_of_c will contain "char"!
This means you can create functions (function-like macros) that branch based on the type.
Extra
int main() {
    int i = 0;
    i = i++;
}
Let's think about what happens in this kind of code.
First, set i to 0.
Next, overwrite i with the value of i plus 1, and then put the value before the overwrite into i (← ?).
What will happen to i?
...
...
...
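The correct answer is, "No one knows what will happen!"
This code has Undefined Behavior (UB) in C: i is modified twice, once by i++ and once by the assignment, and nothing determines the order of the two modifications.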
"Undefined behavior" is defined in the C specification.
3.5.3
1 undefined behavior
behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this document imposes no requirements
2 Note 1 to entry: Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message)
The point is that once a program invokes undefined behavior, the standard places no requirements on it at all: the compiler may or may not issue a warning, compilation may fail, execution may terminate, or anything else may happen.
A common joke is that when code with undefined behavior runs, the compiler is free to do anything, and it would not violate the specification even if demons flew out of your nose ("nasal demons"). That's terrifying.
The End
That's all.