C NOTES 2


Storage duration & Linkage

Storage duration

Every object has a property called storage duration, which limits the object's lifetime. There are four kinds of storage duration in C: automatic, static, thread (since C11), and allocated.

Linkage

Linkage refers to the ability of an identifier (variable or function) to be referred to in other scopes. If a variable or function with the same identifier is declared in several scopes, but cannot be referred to from all of them, then several instances of the variable are generated. The following linkages are recognized: no linkage, internal linkage, and external linkage.

If the same identifier appears with both internal and external linkage in the same translation unit, the behavior is undefined. This is possible when tentative definitions are used.


Example:

#include <stdio.h>
#include <stdlib.h>

/* static storage duration */
int A;

int main(void)
{
    printf("&A = %p\n", (void*)&A);

    /* automatic storage duration */
    int A = 1;   // hides global A
    printf("&A = %p\n", (void*)&A);

    /* allocated storage duration */
    int *ptr_1 = malloc(sizeof(int));   /* start allocated storage duration */
    printf("address of int in allocated memory = %p\n", (void*)ptr_1);
    free(ptr_1);                        /* stop allocated storage duration  */

}
        
Possible output:

&A = 0x600ae4
&A = 0x7fffc013de8c
address of int in allocated memory = 0x217bc30
        

type qualifier

Each individual type in the C type system has several qualified versions of that type, corresponding to one, two, or all three of the const, volatile, and, for pointers to object types, restrict qualifiers. This page describes the effects of the restrict qualifier.

restrict keyword

During each execution of a block in which a restricted pointer P is declared (typically each execution of a function body in which P is a function parameter), if some object that is accessible through P (directly or indirectly) is modified, by any means, then all accesses to that object (both reads and writes) in that block must occur through P (directly or indirectly), otherwise the behavior is undefined:

void f(int n, int * restrict p, int * restrict q)
        {
            while(n-- > 0)
                *p++ = *q++; // none of the objects modified through *p is the same
                             // as any of the objects read through *q
                             // compiler free to optimize, vectorize, page map, etc.
        }
        void g(void)
        {
            extern int d[100];
            f(50, d + 50, d); // OK
            f(50, d + 1, d); // Undefined behavior: d[1] is accessed through both p and q in f
        }
        

Why is the use of alloca() not considered good practice?

The answer is right there in the man page (at least on Linux):

RETURN VALUE The alloca() function returns a pointer to the beginning of the allocated space. If the allocation causes stack overflow, program behaviour is undefined.

Which isn't to say it should never be used. One of the OSS projects I work on uses it extensively, and as long as you're not abusing it (alloca'ing huge values), it's fine. Once you go past the "few hundred bytes" mark, it's time to use malloc and friends, instead. You may still get allocation failures, but at least you'll have some indication of the failure instead of just blowing out the stack.

What's the difference between a VLA and dynamic memory allocation via malloc?

char Buffer[MAX_BUF];

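Buffer is an array of MAX_BUF chars. If MAX_BUF is not an integer constant expression (in C, a const int variable is not one), Buffer is a variable-length array (VLA) with automatic storage.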

const int MAX_BUF = 1000;

char* Buffer = malloc(MAX_BUF);

Here Buffer is a pointer to a dynamically allocated block of MAX_BUF (1000) bytes.

Also, an array is not the same as a pointer; the C FAQ has a very good collection of entries detailing the reasons.

The major difference, in terms of usability and behaviour are:

Anonymous Structure

Similar to union, an unnamed member of a struct whose type is a struct without name is known as anonymous struct. Every member of an anonymous struct is considered to be a member of the enclosing struct or union. This applies recursively if the enclosing struct or union is also anonymous.

struct v {
           union { // anonymous union
              struct { int i, j; }; // anonymous structure
              struct { long k, l; } w;
           };
           int m;
        } v1;
        
        v1.i = 2;   // valid
        v1.k = 3;   // invalid: inner structure is not anonymous
        v1.w.k = 5; // valid
        

Similar to union, the behavior of the program is undefined if struct is defined without any named members (including those obtained via anonymous nested structs or unions).

Forward declaration

A declaration of the following form

struct name;

hides any previously declared meaning for the name name in the tag name space and declares name as a new struct name in current scope, which will be defined later. Until the definition appears, this struct name has incomplete type.

This allows structs that refer to each other:

struct y;
        struct x { struct y *p; /* ... */ };
        struct y { struct x *q; /* ... */ };
        

Note that a new struct name may also be introduced just by using a struct tag within another declaration, but if a previously declared struct with the same name exists in the tag name space, the tag would refer to that name

struct s* p = NULL; // tag naming an unknown struct declares it
        struct s { int a; }; // definition for the struct pointed to by p
        void g(void)
        {
            struct s; // forward declaration of a new, local struct s
                      // this hides global struct s until the end of this block
            struct s *p;  // pointer to local struct s
                          // without the forward declaration above,
                          // this would point at the file-scope s
            struct s { char* p; }; // definitions of the local struct s
        }
        

Incomplete types

An incomplete type is an object type that lacks sufficient information to determine the size of the objects of that type. An incomplete type may be completed at some point in the translation unit.

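The following types are incomplete: the type void, an array type of unknown size, and a structure or union type of unknown content. For example: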

extern char a[]; // the type of a is incomplete (this typically appears in a header)
        char a[10];      // the type of a is now complete (this typically appears in a source file)
        
struct node {
          struct node *next; // struct node is incomplete at this point
        }; // struct node is complete at this point
        

Storage Classes in C

Syntax:

storage_class var_data_type var_name;

The C language uses 4 storage classes, namely: auto, register, static, and extern.


Type Qualifiers

There are four type qualifiers: const, volatile, restrict (since C99), and _Atomic (since C11).

Example

The following demonstrates the use of volatile to disable optimizations:

#include <stdio.h>
        #include <time.h>
        
        int main(void)
        {
            clock_t t = clock();
            double d = 0.0;
            for (int n=0; n<10000; ++n)
               for (int m=0; m<10000; ++m)
                   d += d*n*m; // reads and writes to a non-volatile 
            printf("Modified a non-volatile variable 100m times. "
                   "Time used: %.2f seconds\n",
                   (double)(clock() - t)/CLOCKS_PER_SEC);
        
            t = clock();
            volatile double vd = 0.0;
            for (int n=0; n<10000; ++n)
               for (int m=0; m<10000; ++m)
                   vd += vd*n*m; // reads and writes to a volatile 
            printf("Modified a volatile variable 100m times. "
                   "Time used: %.2f seconds\n",
                   (double)(clock() - t)/CLOCKS_PER_SEC);
        }
        

Possible output:

Modified a non-volatile variable 100m times. Time used: 0.00 seconds
Modified a volatile variable 100m times. Time used: 0.79 seconds
        

Uses of volatile


Conditional inclusion

The preprocessor supports conditional compilation of parts of a source file. This behavior is controlled by #if, #else, #elif, #ifdef, #ifndef and #endif directives.

Syntax:

#if expression        
        #ifdef identifier        
        #ifndef identifier        
        #elif expression        
        #else        
        #endif
        

Example:

#define ABCD 2
        #include <stdio.h>
        
        int main(void)
        {
        
        #ifdef ABCD
            printf("1: yes\n");
        #else
            printf("1: no\n");
        #endif
        
        #ifndef ABCD
            printf("2: no1\n");
        #elif ABCD == 2
            printf("2: yes\n");
        #else
            printf("2: no2\n");
        #endif
        
        #if !defined(DCBA) && (ABCD < 2*4-3)
            printf("3: yes\n");
        #endif
        }
        

Output:

1: yes
2: yes
3: yes
        

#define macros

Example:

#include <stdio.h>
        
        //make function factory and use it
        #define FUNCTION(name, a) int fun_##name(int x) { return (a)*x;}
        
        FUNCTION(quadruple, 4)
        FUNCTION(double, 2)
        
        #undef FUNCTION
        #define FUNCTION 34
        #define OUTPUT(a) puts( #a )
        
        int main(void)
        {
            printf("quadruple(13): %d\n", fun_quadruple(13) );
            printf("double(21): %d\n", fun_double(21) );
            printf("%d\n", FUNCTION);
            OUTPUT(million);               //note the lack of quotes
        }
        

Output:

quadruple(13): 52
double(21): 42
34
million
        

# and ## operators

In function-like macros, a # operator before an identifier in the replacement-list runs the identifier through parameter replacement and encloses the result in quotes, effectively creating a string literal. In addition, the preprocessor adds backslashes to escape the quotes surrounding embedded string literals, if any, and doubles the backslashes within the string as necessary. All leading and trailing whitespace is removed, and any sequence of whitespace in the middle of the text (but not inside embedded string literals) is collapsed to a single space. This operation is called "stringification". If the result of stringification is not a valid string literal, the behavior is undefined.

When # appears before __VA_ARGS__, the entire expanded __VA_ARGS__ is enclosed in quotes:

#define showlist(...) puts(#__VA_ARGS__)
        showlist();            // expands to puts("")
        showlist(1, "x", int); // expands to puts("1, \"x\", int")
        

(since C99)

A ## operator between any two successive identifiers in the replacement-list runs parameter replacement on the two identifiers and then concatenates the result.


GCC -g vs -g3 GDB Flag: What is the Difference? Also is there a difference between -g and -ggdb?

-g

Produce debugging information in the operating system's native format (stabs, COFF, XCOFF, or DWARF 2). GDB can work with this debugging information. On most systems that use stabs format, -g enables use of extra debugging information that only GDB can use; this extra information makes debugging work better in GDB but probably makes other debuggers crash or refuse to read the program. If you want to control for certain whether to generate the extra information, use -gstabs+, -gstabs, -gxcoff+, -gxcoff, or -gvms (see below).

...

-ggdb

Produce debugging information for use by GDB. This means to use the most expressive format available (DWARF 2, stabs, or the native format if neither of those are supported), including GDB extensions if at all possible.

-glevel (likewise -ggdblevel, -gstabslevel, -gcofflevel, -gxcofflevel, -gvmslevel)

Request debugging information and also use level to specify how much information. The default level is 2. Level 0 produces no debug information at all. Thus, -g0 negates -g.

....

Level 3 includes extra information, such as all the macro definitions present in the program. Some debuggers support macro expansion when you use -g3.


Debug flags in gcc

-g and -ggdb are similar, with some slight differences:

-g produces debugging information in the OS's native format (stabs, COFF, XCOFF, or DWARF 2).

-ggdb produces debugging information specifically intended for gdb.

-ggdb3 produces extra debugging information, for example: including macro definitions.

-ggdb by itself without specifying the level defaults to -ggdb2 (i.e., gdb for level 2).

Weak Definitions

A function definition (or EXPORTed label in assembler) can also be marked as weak, as can a variable definition. In this case, a weak symbol definition is created in the object file.

A weak definition can be used to resolve any reference to that symbol in the same way as a normal definition. However, if another (non-weak) definition of that symbol exists in the build, the linker will use that definition instead of the weak definition, and not produce an error due to multiply-defined symbols.

Example usage:

A simple or dummy implementation of a function can be provided as a WEAK definition. This allows the software to be built (with defined behaviour) without providing a 'full' implementation of this function, but also allows a full implementation to be provided for some builds if required.

Targeted flattening instead of global inlining

Now for the trick! Both GCC and Clang support __attribute__((flatten)). Putting it on a function causes all of its callees to be inlined into it. It’s dead simple.

void do_thing(int input)
        {
            // this code is not always inlined at the call site
        }
        
        __attribute__((flatten)) void hot_code()
        {
            // the program spends >80% of its runtime in this function
            while (condition) {
                call_something();   // inlined!
                do_thing(y);        // inlined!
                other_thing();      // also inlined!
            }
        }
        
        void cool_code()
        {
            // the program spends <5% of its runtime in this function
            ...
            do_thing(a);            // not inlined!
            do_thing(b);            // not inlined!
            do_thing(c);            // guess!
        }
        

Note: Functions with __attribute__((noinline)) will not be inlined. The same goes for functions where the compiler can’t see the body.

In conclusion

__attribute__((flatten)) lets you opt in to the pros of aggressive inlining on a per-function basis, while protecting the rest of your program from the cons!

Initializing entire 2D array with one value

With the following declaration

int array[ROW][COLUMN]={0};

I get the array with all zeroes but with the following one

int array[ROW][COLUMN]={1};

I don’t get the array with all one value. The default value is still 0.

Why this behavior and how can I initialize with all 1?


You get this behavior because int array [ROW][COLUMN] = {1}; does not mean "set all items to one". Let me try to explain how this works step by step.

The explicit, overly clear way of initializing your array would be like this:

#define ROW 2
        #define COLUMN 2
        
        int array [ROW][COLUMN] =
        {
          {0, 0},
          {0, 0}
        };
        

However, C allows you to leave out some of the items in an array (or struct/union). You could for example write:

int array [ROW][COLUMN] =
        {
          {1, 2}
        };
        

This means, initialize the first elements to 1 and 2, and the rest of the elements "as if they had static storage duration".

There is a rule in C saying that all objects of static storage duration, that are not explicitly initialized by the programmer, must be set to zero.

So in the above example, the first row gets set to 1,2 and the next to 0,0 since we didn't give them any explicit values.

Next, there is a rule in C allowing lax brace style. The first example could as well be written as

int array [ROW][COLUMN] = {0, 0, 0, 0};

although of course this is poor style, it is harder to read and understand. But this rule is convenient, because it allows us to write

int array [ROW][COLUMN] = {0};

which means: "initialize the very first column in the first row to 0, and all other items as if they had static storage duration, ie set them to zero."

therefore, if you attempt

int array [ROW][COLUMN] = {1};

it means "initialize the very first column in the first row to 1 and set all other items to zero".

What is the reason for explicitly declaring L or UL for long values?

Indirect use of L or UL type details in macro or compile time constant expressions


#include <stdio.h> 
        #include <stdint.h>
        
        #define BMP280_DATA_UPDATE_FREQ   (1000)
        #define BMP280_DATA_UPDATE_FREQ_U (1000U)
        
        int main()  { 
        
            unsigned short prev_val1 = 0xFFFF;
            unsigned short cur_val1 = 1000;
        
            printf("%d\n", (cur_val1 - prev_val1));
        
            if((cur_val1 - prev_val1) > BMP280_DATA_UPDATE_FREQ)
                printf("Works with BMP280_DATA_UPDATE_FREQ\n");
        
            if((cur_val1 - prev_val1) > BMP280_DATA_UPDATE_FREQ_U)
                printf("Works with BMP280_DATA_UPDATE_FREQ_U\n");
        
            return 0;
        }
        

Output:

-64535
Works with BMP280_DATA_UPDATE_FREQ_U
        

In the above example the type of the macro BMP280_DATA_UPDATE_FREQ also affects the behaviour of the if expression. Adding the required type information to the macro can therefore force the operation to be done in the required type.

In the first if above, the LHS is calculated as int (both unsigned short operands are promoted to int) and the type of the macro value is int as well (C standard). The comparison becomes -64535 > 1000, which is false.

In the second case the LHS is still calculated as int but the RHS is unsigned int, so the comparison is done in unsigned int. The int on the LHS is converted to unsigned int: with a 32-bit unsigned int, -64535 becomes 2^32 - 64535 = 4294902761, and 4294902761 > 1000 is true. (The value 1001 would only result if the difference were reduced to 16 bits.)


When a suffix L or UL is not used, the compiler uses the first type that can contain the constant from a list (see the C99 standard, clause 6.4.4:5, for details). For a decimal constant, the list is int, long int, long long int.

As a consequence, most of the time it is not necessary to use the suffix. It does not change the meaning of the program. It does not change the meaning of your example initialization of x for most architectures, although it would if you had chosen a number that could not be represented as a long long. See also codebauer's answer for an example where the U part of the suffix is necessary.

There are a couple of circumstances when the programmer may want to set the type of the constant explicitly. One example is when using a variadic function:

printf("%lld", 1LL); // correct, because 1LL has type long long
printf("%lld", 1);   // undefined behavior, because 1 has type int
        

A common reason to use a suffix is ensuring that the result of a computation doesn't overflow. Two examples are:

long x = 10000L * 4096L;
unsigned long long y = 1ULL << 36;
        

In both examples, without suffixes, the constants would have type int and the computation would be made as int. In each example this incurs a risk of overflow. Using the suffixes means that the computation will be done in a larger type instead, which has sufficient range for the result.

As Lightness Races in Orbit puts it, the literal's suffix comes before the assignment. In the two examples above, simply declaring x as long and y as unsigned long long is not enough to prevent the overflow in the computation of the expressions assigned to them.

Another example is the comparison x < 12U where variable x has type int. Without the U suffix, the compiler types the constant 12 as an int, and the comparison is therefore a comparison of signed ints.

int x = -3;
printf("%d\n", x < 12); // prints 1 because it's true that -3 < 12
        

With the U suffix, the comparison becomes a comparison of unsigned ints. “Usual arithmetic conversions” mean that -3 is converted to a large unsigned int:

printf("%d\n", x < 12U); // prints 0 because (unsigned int)-3 is large

In fact, the type of a constant may even change the result of an arithmetic computation, again because of the way “usual arithmetic conversions” work.

Note that, for decimal constants, the list of types suggested by C99 does not contain unsigned long long. In C90, the list ended with the largest standardized unsigned integer type at the time (which was unsigned long). A consequence was that the meaning of some programs was changed by adding the standard type long long to C99: the same constant that was typed as unsigned long in C90 could now be typed as a signed long long instead. I believe this is the reason why in C99, it was decided not to have unsigned long long in the list of types for decimal constants.

How are negative signed values stored?


The C standard doesn't mandate any particular way of representing negative signed numbers.

In most implementations that you are likely to encounter, negative signed integers are stored in what is called two's complement. The other major way of storing negative signed numbers is called one's complement.

The two's complement of an N-bit number x is defined as 2^N - x. For example, the two's complement of 8-bit 1 is 2^8 - 1, or 1111 1111. The two's complement of 8-bit 8 is 2^8 - 8, which in binary is 1111 1000. This can also be calculated by flipping the bits of x and adding one. For example:


 1      = 0000 0001
        ~1      = 1111 1110 (1's complement)
        ~1 + 1  = 1111 1111 (2's complement)
        -1      = 1111 1111
        
         21     = 0001 0101
        ~21     = 1110 1010
        ~21 + 1 = 1110 1011
        -21     = 1110 1011
        

An easier method to get the negation of a number in two's complement is as follows:

Example:

Example 1    Example 2
        00101001    00101100
        11010111    11010100
        
Binary value                     0  1  10  11  100  ...  01111111  10000000  10000001  ...  11111110  11111111
Two's complement interpretation  0  1  2   3   4    ...  127       -128      -127      ...  -2        -1
Unsigned interpretation          0  1  2   3   4    ...  127       128       129       ...  254       255
        

Here is the process to convert a negative two's complement number back to decimal:

The one's complement of an N-bit number x is defined as x with all its bits flipped, basically.

 1      = 0000 0001
        -1      = 1111 1110
        
         21     = 0001 0101
        -21     = 1110 1010
        

Two's complement has several advantages over one's complement. For example, it doesn't have the concept of 'negative zero', which for good reason is confusing to many people. Addition, subtraction and multiplication also work the same on signed integers implemented with two's complement as they do on unsigned integers.

Signed magnitude.

This is the easiest to understand, because it works the same as we are used to when dealing with negative decimal values: The first position (bit) represents the sign (0 for positive, 1 for negative), and the other bits represent the number. Although it is easy for us to understand, it is hard for computers to work with, especially when doing arithmetic with negative numbers. In 8-bit signed magnitude, the value 8 is represented as 0 0001000 and -8 as 1 0001000.

What is the default data type of number in C?


When using a decimal constant without any suffixes, the type of the decimal constant is the first in which its value can be represented, in order (the current C standard, 6.4.4 Constants p5): int, long int, long long int.

Example:

In C,

unsigned int size = 1024*1024*1024*2;

results in a warning "integer overflow in expression...", while

unsigned int size = 2147483648;

results in no warning.

Is the type of the first expression int by default? Where is this mentioned in the C99 spec?
        

The type of the first expression is int, since the values 1024 and 2 can both be represented as int. The computation 1024*1024*1024*2 is therefore done in type (signed) int, and its result, 2147483648, is too big for a signed int whose INT_MAX is 2147483647, so it overflows.

Assuming INT_MAX equals 2147483647 and LONG_MAX is greater than 2147483647, the type of the second expression is long int, since this value cannot be represented as int, but can be as long int. If INT_MAX equals LONG_MAX equals 2147483647, then the type is long long int.

Is unsigned integer subtraction defined behavior?

The result of a subtraction generating a negative number in an unsigned type is well-defined:

[...] A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type. (ISO/IEC 9899:1999 (E) §6.2.5/9)

As you can see, (unsigned)0 - (unsigned)1 equals -1 modulo UINT_MAX+1, or in other words, UINT_MAX.

Example:

the way unsigned subtraction works for uint16_t -

0 - 1000 != 1000

0 - 1000 == -1000 mod (65535 + 1)== 64536

0 - 64535 == -64535 mod (65535 + 1) = 1001

#include <stdio.h>
        #include <stdint.h>
        
        int main() {
            uint16_t a = 0, b = 1000;
            uint16_t c = a - b;
        
            uint16_t a1 = 0, b1 = 64535;
            uint16_t c1 = a1 - b1;
        
            printf("%d %d\n", c, c1);
            return 0;
        }
        

Output:

64536 1001

Implicit type promotion rules


Promotion is the process by which values of integer types "smaller" than int/unsigned int are converted either to int or unsigned int. The rules are expressed somewhat strangely (mostly for the benefit of handling char adequately) but ensure that value and sign are conserved.

Few examples of effects of promotion rules:

Example 1) - Why does this give a strange, large integer number and not 255?

unsigned char x = 0;
        unsigned char y = 1;
        printf("%u\n", x - y);
        

Example 2) - Why does this give "-1 is larger than 0"?

unsigned int a = 1;
        signed int b = -2;
        if(a + b > 0)
          puts("-1 is larger than 0");
        

Example 3) - Why does changing the type in the above example to short fix the problem?

unsigned short a = 1;
        signed short b = -2;
        if(a + b > 0)
          puts("-1 is larger than 0"); // will not print
        

(These examples were intended for a 32 or 64 bit computer with 16 bit short.)



C was designed to implicitly and silently change the integer types of the operands used in expressions. There exist several cases where the language forces the compiler to either change the operands to a larger type, or to change their signedness.

The rationale behind this is to prevent accidental overflows during arithmetic, but also to allow operands with different signedness to co-exist in the same expression.

Unfortunately, the rules for implicit type promotion cause much more harm than good, to the point where they might be one of the biggest flaws in the C language. These rules are often not even known by the average C programmer and therefore causing all manner of very subtle bugs.

Typically you see scenarios where the programmer says "just cast to type x and it works" - but they don't know why. Or such bugs manifest themselves as rare, intermittent phenomenon striking from within seemingly simple and straight-forward code. Implicit promotion is particularly troublesome in code doing bit manipulations, since most bit-wise operators in C come with poorly-defined behavior when given a signed operand.

Integer types and conversion rank

The integer types in C are char, short, int, long, long long and enum. _Bool/bool is also treated as an integer type when it comes to type promotions.

All integers have a specified conversion rank. C11 6.3.1.1, emphasis mine on the most important parts:

Every integer type has an integer conversion rank defined as follows:

The types from stdint.h sort in here too, with the same rank as whatever type they happen to correspond to on the given system. For example, int32_t has the same rank as int on a 32 bit system.

Further, C11 6.3.1.1 specifies which types are regarded as the small integer types (not a formal term):

The following may be used in an expression wherever an int or unsigned int may be used:

— An object or expression with an integer type (other than int or unsigned int) whose integer conversion rank is less than or equal to the rank of int and unsigned int.

What this somewhat cryptic text means in practice, is that _Bool, char and short (and also int8_t, uint8_t etc) are the "small integer types". These are treated in special ways and subject to implicit promotion, as explained below.

The integer promotions

Whenever a small integer type is used in an expression, it is implicitly converted to int which is always signed. This is known as the integer promotions or the integer promotion rule.

Formally, the rule says (C11 6.3.1.1):

If an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions.

This means that all small integer types, no matter signedness, get implicitly converted to (signed) int when used in most expressions.

This text is often misunderstood as: "all small, signed integer types are converted to signed int and all small, unsigned integer types are converted to unsigned int". This is incorrect. The unsigned part here only means that if we have for example an unsigned short operand, and int happens to have the same size as short on the given system, then the unsigned short operand is converted to unsigned int. As in, nothing of note really happens. But in case short is a smaller type than int, it is always converted to (signed) int, regardless of whether the short was signed or unsigned!

The harsh reality caused by the integer promotions means that almost no operation in C can be carried out on small types like char or short. Operations are always carried out on int or larger types.

This might sound like nonsense, but luckily the compiler is allowed to optimize the code. For example, an expression containing two unsigned char operands would get the operands promoted to int and the operation carried out as int. But the compiler is allowed to optimize the expression to actually get carried out as an 8 bit operation, as would be expected. However, here comes the problem: the compiler is not allowed to optimize out the implicit change of signedness caused by the integer promotion. Because there is no way for the compiler to tell if the programmer is purposely relying on implicit promotion to happen, or if it is unintentional.

This is why example 1 in the question fails. Both unsigned char operands are promoted to type int, the operation is carried out on type int, and the result of x - y is of type int. Meaning that we get -1 instead of 255 which might have been expected. The compiler may generate machine code that executes the code with 8 bit instructions instead of int, but it may not optimize out the change of signedness. Meaning that we end up with a negative result, that in turn results in a weird number when printf("%u") is invoked. Example 1 could be fixed by casting the result of the operation back to type unsigned char.

With the exception of a few special cases like ++ and sizeof operators, the integer promotions apply to almost all operations in C, no matter if unary, binary (or ternary) operators are used.

The usual arithmetic conversions

The usual arithmetic conversions are the process by which operands of arithmetic operators are converted to a common type. The process begins by promoting the operands (to either int or unsigned int) if they are of a type smaller than int, and then chooses a target type by the rules given for integer types in 6.3.1.8/1.

Whenever a binary operation (an operation with 2 operands) is done in C, both operands of the operator have to be of the same type. Therefore, in case the operands are of different types, C enforces an implicit conversion of one operand to the type of the other operand. The rules for how this is done are named the usual arithmetic conversions (sometimes informally referred to as "balancing"). These are specified in C11 6.3.1.8:

(Think of this rule as a long, nested if-else if statement and it might be easier to read :) )

6.3.1.8 Usual arithmetic conversions

Many operators that expect operands of arithmetic type cause conversions and yield result types in a similar way. The purpose is to determine a common real type for the operands and result. For the specified operands, each operand is converted, without change of type domain, to a type whose corresponding real type is the common real type. Unless explicitly stated otherwise, the common real type is also the corresponding real type of the result, whose type domain is the type domain of the operands if they are the same, and complex otherwise. This pattern is called the usual arithmetic conversions:

Notable here is that the usual arithmetic conversions apply to both floating point and integer variables. In case of integers, we can also note that the integer promotions are invoked from within the usual arithmetic conversions. And after that, when both operands have at least the rank of int, the operators are balanced to the same type, with the same signedness.

This is the reason why a + b in example 2 gives a strange result. Both operands are integers and they are at least of rank int, so the integer promotions do not apply. The operands are not of the same type: a is unsigned int and b is signed int. Therefore the operand b is temporarily converted to type unsigned int. During this conversion it loses the sign information and ends up as a large value.

The reason why changing type to short in example 3 fixes the problem, is because short is a small integer type. Meaning that both operands are integer promoted to type int which is signed. After integer promotion, both operands have the same type (int), no further conversion is needed. And then the operation can be carried out on a signed type as expected.

Implicit type promotion rules - My explanation for examples in SO question about Implicit type promotion rules

unsigned char x = 0;
        unsigned char y = 1;
        printf("%u\n", x - y);
        

Ans:

By the integer promotion rules, both x and y are small integer types, so they are implicitly converted to int before the - operation. Because the result of x - y is passed directly to printf() and is not assigned to a variable of another type, it stays an int. Thus int 0 - int 1 = int -1, whose bit pattern is 0xFFFFFFFF when int is 32 bits, and that is why %u typically prints 4294967295.

unsigned int a = 1;
        signed int b = -2;
        if(a + b > 0)
            puts("-1 is larger than 0");
        

Ans:

Both a and b have types of at least the rank of int, so the integer promotions do not apply. By the usual arithmetic conversions, b is converted to unsigned int, so the expression a + b is evaluated as unsigned int. In the comparison, the constant 0 is of type int, so it is also converted to unsigned int. The left-hand side of the > operator is therefore a large value (1 + 0xFFFFFFFE = 0xFFFFFFFF with a 32-bit int), and the condition is true.

unsigned short a = 1;
        signed short b = -2;
        if(a + b > 0)
            puts("-1 is larger than 0"); // will not print
        

Ans:

Both a and b have types with a rank lower than that of int, so the integer promotions apply. Both are converted to int before the operation, so the expression a + b is evaluated as int. The left-hand side of the > operator is therefore -1 (1 + -2), and the condition is false.

Implicit type promotion rules and The usual arithmetic conversions - Examples


  1. If both operands have the same type, then no further conversion is needed.

Example:

int a;
        int b;
        
        no conversion
        
  2. Otherwise, if both operands have signed integer types or both have unsigned integer types, the operand with the type of lesser integer conversion rank is converted to the type of the operand with greater rank.

Example:

long a;
        long long b;
        
        a is converted to long long
        
  3. Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.

Example:

unsigned int a;
        int b;
        
        b is converted to unsigned int. This is the same situation as example 2 in the topic just above.
        
  4. Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, then the operand with unsigned integer type is converted to the type of the operand with signed integer type.

Example:

unsigned int a;      // range: 0 to 4294967295
        long b;              // range: -9223372036854775808 to 9223372036854775807
        
        If int is 32 bits and long is 64 bits, a is converted to long, because long can represent all of the values of unsigned int.
        
  5. Otherwise, both operands are converted to the unsigned integer type corresponding to the type of the operand with signed integer type.

Example:

Assume that long and long long are both 64 bits wide, which is typical on 64-bit Unix-like systems.
        
        unsigned long c;     // range: 0 to 18446744073709551615
        long long d;         // range: -9223372036854775808 to 9223372036854775807
        
        Both c and d are converted to unsigned long long, because not all values of unsigned long can be represented by long long when both types are 64 bits wide.
        
        

How is typecasting done bit level


Regarding what happens during a promotion / conversion on the bit level, let's first assume that the lower rank type is smaller than the higher rank type, and that signed types use 2's complement representation.

For a conversion from a 32 bit int to a 64 bit long, if the value is positive, 4 bytes containing all 0 bits are added on the left. If the value is negative, 4 bytes containing all 1 bits are added on the left. For example, the representation of value 5 changes from 0x00000005 to 0x0000000000000005. For the value -5, the representation changes from 0xfffffffb to 0xfffffffffffffffb.

Example for Integer Promotions and Usual Arithmetic Conversions

#include <stdio.h> 
        #include <stdint.h>
        
        #define BMP280_DATA_UPDATE_FREQ (1000)
        #define BMP280_DATA_UPDATE_FREQ__U (1000U)
        
        int main()  { 
        
            uint32_t prev_val = 0xFFFFFFFF;
            uint32_t cur_val = 1000;
        
            /* If an int can represent all values of the original type
             (as restricted by the width, for a bit-field), 
             the value is converted to an int; otherwise, 
             it is converted to an unsigned int. 
             These are called the integer promotions. */
        
            /* Here the expression below is evaluated as unsigned int (if int is 32 bits),
            because not all values of uint32_t can be represented by int. */
            uint32_t ress = cur_val - prev_val;
            printf("%u\n", ress);
            printf("%u\n", cur_val - prev_val);
            printf("%d\n", sizeof(unsigned int) == sizeof(uint32_t)); /* == yields an int */
        
            /* If an int can represent all values of the original type
            (as restricted by the width, for a bit-field), 
            the value is converted to an int; otherwise, 
            it is converted to an unsigned int. 
            These are called the integer promotions. 
        
            The unsigned part here only means that if we have for example an
            unsigned short operand, and int happens to have the same size 
            as short on the given system, then the unsigned short operand 
            is converted to unsigned int */
        
            /* Thus (cur_val - prev_val) evaluates to an unsigned int (as uint32_t 
            and int are of the same size, refer to the paragraph above) and the 'if' becomes true, even if 
            BMP280_DATA_UPDATE_FREQ doesn't have 'U' at the end. BMP280_DATA_UPDATE_FREQ will
            be converted to unsigned int based on the usual arithmetic conversions. */
            if((cur_val - prev_val) > BMP280_DATA_UPDATE_FREQ)
                printf("Works with BMP280_DATA_UPDATE_FREQ\n");
        
            uint16_t prev_val1 = 0xFFFF;
            uint16_t cur_val1 = 1000;
        
            int inter_res = cur_val1 - prev_val1;
            printf("%X %d\n", inter_res, inter_res);
            /* inter_res = FFFF03E9, this value is -64535 (obtained by 
            taking the 2's complement of FFFF03E9 and adding the sign) */
        
            /* -64535 cannot be represented in uint16_t, so it is reduced modulo 
            the number that is one greater than the largest value that can 
            be represented by the resulting type, i.e. UINT16_MAX + 1
            ==>  -64535 mod (65535 + 1) = 1001 */
            uint16_t ress1 = inter_res;
            uint16_t ress2 = cur_val1 - prev_val1;
            printf("%u %u\n", ress1, ress2);
        
            /* Here (cur_val1 - prev_val1) evaluates to an int (integer promotion) 
            and the 'if' becomes false. As BMP280_DATA_UPDATE_FREQ is also an int (see below), 
            no additional conversion is done. 
        
            (When using a decimal 
            constant without any suffixes, the type of the decimal constant 
            is the first in which its value can be represented, in order 
            (the current C standard, 6.4.4 Constants p5):
            int
            long int
            long long int) 
            */
            if((cur_val1 - prev_val1) > BMP280_DATA_UPDATE_FREQ)
                printf("Works\n");
        
            /* Here (cur_val1 - prev_val1) evaluates to an int (integer promotion) 
            and the 'if' becomes true, because BMP280_DATA_UPDATE_FREQ__U is unsigned, so
            (cur_val1 - prev_val1) is also converted to unsigned int by the usual arithmetic conversions. */
            if((cur_val1 - prev_val1) > BMP280_DATA_UPDATE_FREQ__U)
                printf("Works with BMP280_DATA_UPDATE_FREQ__U\n");
        
            return 0;
        }
        

Output:

1001
        1001
        1
        Works with BMP280_DATA_UPDATE_FREQ
        FFFF03E9 -64535
        1001 1001
        Works with BMP280_DATA_UPDATE_FREQ__U
        

Comparison between float and double

float f = 0.7;
        if( f == 0.7 )
            printf("equal");
        else
            printf("not equal");
        

Why is the output not equal ?



This happens because in your statement

if(f == 0.7)

the 0.7 is treated as a double.

You should never test for exact equality of floating-point values.

More elaborate explanation:

In the line float f = 0.7; the double value 0.7 is first converted to float. Then in if(f == 0.7) f is converted to double, because the 0.7 on the right-hand side is also a double, following the usual arithmetic conversion rules (refer above).

Note that 0.7 is not exactly representable either as a float or as a double. If it were represented exactly, there would be no loss of information when converting to float and then back to double, and you wouldn't have this problem.

All non-integer numbers that can be represented exactly have 5 as their last decimal digit. Unfortunately, the converse is not true: some numbers have 5 as their last decimal digit and cannot be represented exactly.

Why comparing double and float leads to unexpected result?

The important factors under consideration with float or double numbers are: Precision & Rounding

Precision:

The precision of a floating point number is how many digits it can represent without losing any information it contains.

Consider the fraction 1/3. The decimal representation of this number is 0.33333333333333… with 3's going out to infinity. An infinite-length number would require infinite memory to be stored with exact precision, but the float and double data types typically have only 4 or 8 bytes. Thus float and double numbers can only store a certain number of digits, and the rest are lost. There is therefore no exact way of representing, in a float or double, a number that requires more precision than the variable can hold.

Rounding:

There is a non-obvious difference between binary and decimal (base 10) numbers. Consider the fraction 1/10. In decimal, this can be easily represented as 0.1, and 0.1 can be thought of as an easily representable number. However, in binary, 0.1 is represented by the infinite sequence: 0.00011001100110011…

An example (in C++):

#include <iostream>
        #include <iomanip>
        int main()
        {
            using namespace std;
            cout << setprecision(17);
            double dValue = 0.1;
            cout << dValue << endl;
        }
        

This output is:

0.10000000000000001

And not

0.1.

This is because the double has to round the value to fit into its limited number of bits, which results in a number that is not exactly 0.1. Such a scenario is called a rounding error.

Whenever two close float and double values are compared, such rounding errors kick in and the comparison may yield unexpected results. This is the reason you should never compare floating-point numbers using ==.

The best you can do is to take their difference and check if it is less than an epsilon.

fabs(x - y) < epsilon

Example: Why comparing double and float leads to unexpected result?

#include <stdio.h> 
        
        int main() 
        { 
        float f = 0.5;
        double f1 = f;
        
        if( f == 0.5 ) // 0.5 is double
            printf("equal\n");
        else
            printf("not equal\n");
        
        if( f1 == 0.5 ) // 0.5 is double; equal only because 0.5 is exactly representable, would fail for e.g. 0.7
            printf("equal\n");
        else
            printf("not equal\n");
        }
        

Thread local data in C


Thread-local storage (TLS) is a mechanism by which variables are allocated such that there is one instance of the variable per extant thread. The runtime model GCC uses to implement this originates in the IA-64 processor-specific ABI, but has since been migrated to other processors as well.

At the user level, the extension is visible with a new storage class keyword: __thread. For example:

__thread int i;
        extern __thread struct state s;
        static __thread char *p;
        

The __thread specifier may be used alone, with the extern or static specifiers, but with no other storage class specifier. When used with extern or static, __thread must appear immediately after the other storage class specifier.

In C++, if an initializer is present for a thread-local variable, it must be a constant-expression, as defined in 5.19.2 of the ANSI/ISO C++ standard.

Example:

#include <stdlib.h>
        #include <stdio.h>
        #include <string.h>
        
        #include <pthread.h>
        
        
        void *test (void *arg) 
        {
            static __thread int val = 0;
            static __thread char *string = NULL;
        
            string = (char *) calloc (100, sizeof (char));
        
            strcpy (string, "hello");
        
            val++;
        
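            printf ("val(%p):%d\n", (void *) &val, val);           /* %p expects a void * argument */
            printf ("string(%p):%s\n", (void *) &string, string);  /* prints the address of the pointer variable */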
        
            pthread_exit (NULL);
        }
        
        
        int main (int argc, char *argv[])
        {
            int num_threads = 3, i;
            pthread_t tid[num_threads];
        
            for (i=0;i<num_threads;i++) {
                pthread_create (&tid[i], NULL, &test, NULL);
            }
        
            for (i=0;i<num_threads;i++) {
                pthread_join (tid[i], NULL);
            }
        
            return 0;
        }
        

Output with __thread after static storage class specifier:

val(0xf6d8cb38):1
        string(0xf6d8cb3c):hello
        val(0xf758db38):1
        string(0xf758db3c):hello
        val(0xf7d8eb38):1
        string(0xf7d8eb3c):hello

Output without __thread:

val(0xf7cd234c):1
        string(0xf7cd2348):hello
        val(0xf6cd034c):1
        string(0xf6cd0348):hello
        val(0xf74d134c):1
        string(0xf74d1348):hello

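Note that the example prints &string, the address of the pointer variable itself, not the address of the buffer returned by calloc. The buffer address (the value of string) is different on every call in any case, because each calloc call returns a separate allocation.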