Every object has a property called storage duration, which limits the object lifetime. There are four kinds of storage duration in C:
automatic
storage duration. The storage is allocated when the block in which the object
was declared is entered and deallocated when it is exited by any means (goto, return, reaching the
end). One exception is VLAs: their storage is allocated when the declaration is executed, not on
block entry, and deallocated when the declaration goes out of scope rather than when the block is
exited (since C99). If the block is entered recursively, a new allocation is performed for every
recursion level. All function parameters and non-static block-scope objects have this storage
duration, as well as compound literals used at block scope.
static
storage duration. The storage duration is the entire execution of the program,
and the value stored in the object is initialized only once, prior to the main function. All objects
declared static and all objects with either internal or external linkage that aren't declared
_Thread_local (since C11) have this storage duration.
thread
storage duration. The storage duration is the entire execution of the thread in
which it was created, and the value stored in the object is initialized when the thread is started.
Each thread has its own, distinct, object. If the thread that executes the expression that accesses
this object is not the thread that executed its initialization, the behavior is
implementation-defined. All objects declared _Thread_local have this storage duration.
(since C11)
allocated
storage duration. The storage is allocated and deallocated on request, using
dynamic memory allocation functions.
Linkage refers to the ability of an identifier (variable or function) to be referred to in other scopes. If a variable or function with the same identifier is declared in several scopes, but cannot be referred to from all of them, then several instances of the variable are generated. The following linkages are recognized:
no linkage. The identifier can be referred to only from the scope it is in. All function
parameters and all non-extern block-scope variables (including the ones declared static) have this
linkage.
internal linkage. The identifier can be referred to from all scopes in the current
translation unit. All static file-scope identifiers (both functions and variables) have this
linkage.
external linkage. The identifier can be referred to from any other translation units in
the entire program. All non-static functions, all extern variables (unless earlier declared static),
and all file-scope non-static variables have this linkage.
If the same identifier appears with both internal and external linkage in the same translation unit, the behavior is undefined. This is possible when tentative definitions are used.
#include <stdio.h>
#include <stdlib.h>
/* static storage duration */
int A;
int main(void)
{
printf("&A = %p\n", (void*)&A);
/* automatic storage duration */
int A = 1; // hides global A
printf("&A = %p\n", (void*)&A);
/* allocated storage duration */
int *ptr_1 = malloc(sizeof(int)); /* start allocated storage duration */
printf("address of int in allocated memory = %p\n", (void*)ptr_1);
free(ptr_1); /* stop allocated storage duration */
}
Possible output:
&A = 0x600ae4
&A = 0x7fffc013de8c
address of int in allocated memory = 0x217bc30
Each individual type in the C type system has several qualified versions of that type, corresponding to one, two, or all three of the const, volatile, and, for pointers to object types, restrict qualifiers. This page describes the effects of the restrict qualifier.
During each execution of a block in which a restricted pointer P is declared (typically each execution of a function body in which P is a function parameter), if some object that is accessible through P (directly or indirectly) is modified, by any means, then all accesses to that object (both reads and writes) in that block must occur through P (directly or indirectly), otherwise the behavior is undefined:
void f(int n, int * restrict p, int * restrict q)
{
while(n-- > 0)
*p++ = *q++; // none of the objects modified through *p is the same
// as any of the objects read through *q
// compiler free to optimize, vectorize, page map, etc.
}
void g(void)
{
extern int d[100];
f(50, d + 50, d); // OK
f(50, d + 1, d); // Undefined behavior: d[1] is accessed through both p and q in f
}
The answer is right there in the man page (at least on Linux):
RETURN VALUE The alloca() function returns a pointer to the beginning of the allocated space. If the allocation causes stack overflow, program behaviour is undefined.
Which isn't to say it should never be used. One of the OSS projects I work on uses it extensively, and as long as you're not abusing it (alloca'ing huge values), it's fine. Once you go past the "few hundred bytes" mark, it's time to use malloc and friends, instead. You may still get allocation failures, but at least you'll have some indication of the failure instead of just blowing out the stack.
char Buffer[MAX_BUF];
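Buffer is an array of MAX_BUF chars. If MAX_BUF is a const int variable, as in the declaration below, rather than an integer constant expression, the array is a variable-length array (VLA).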
const int MAX_BUF = 1000;
char* Buffer = malloc(MAX_BUF);
Buffer is a pointer to a dynamically allocated block of MAX_BUF bytes, which is 1000 here.
An array is not the same as a pointer, and the C-FAQ has a very good collection detailing the reasons.
The major differences, in terms of usability and behaviour, are:
Similar to union, an unnamed member of a struct whose type is a struct without name is known as anonymous struct. Every member of an anonymous struct is considered to be a member of the enclosing struct or union. This applies recursively if the enclosing struct or union is also anonymous.
struct v {
union { // anonymous union
struct { int i, j; }; // anonymous structure
struct { long k, l; } w;
};
int m;
} v1;
v1.i = 2; // valid
v1.k = 3; // invalid: inner structure is not anonymous
v1.w.k = 5; // valid
Similar to union, the behavior of the program is undefined if struct is defined without any named members (including those obtained via anonymous nested structs or unions).
A declaration of the following form
struct name;
hides any previously declared meaning for the name name in the tag name space and declares name as a new struct name in current scope, which will be defined later. Until the definition appears, this struct name has incomplete type.
This allows structs that refer to each other:
struct y;
struct x { struct y *p; /* ... */ };
struct y { struct x *q; /* ... */ };
Note that a new struct name may also be introduced just by using a struct tag within another declaration, but if a previously declared struct with the same name exists in the tag name space, the tag refers to that name:
struct s* p = NULL; // tag naming an unknown struct declares it
struct s { int a; }; // definition for the struct pointed to by p
void g(void)
{
struct s; // forward declaration of a new, local struct s
// this hides global struct s until the end of this block
struct s *p; // pointer to local struct s
// without the forward declaration above,
// this would point at the file-scope s
struct s { char* p; }; // definition of the local struct s
}
An incomplete type is an object type that lacks sufficient information to determine the size of the objects of that type. An incomplete type may be completed at some point in the translation unit.
The following types are incomplete:
extern char a[]; // the type of a is incomplete (this typically appears in a header)
char a[10]; // the type of a is now complete (this typically appears in a source file)
struct node {
struct node *next; // struct node is incomplete at this point
}; // struct node is complete at this point
Syntax:
storage_class var_data_type var_name;
C language uses 4 storage classes, namely:
auto
: This is the default storage class for all variables declared inside a function or a block, so the
keyword auto is rarely written explicitly in C programs. Auto variables can be named only within the
block or function in which they are declared (this is their scope), including any blocks nested inside
it. While the enclosing block is still executing, they can also be accessed from outside that scope
through a pointer to the memory location where the variable resides. If they are not explicitly
initialized, their initial value is indeterminate (often described as a garbage value).
extern
: The extern storage class says that the variable is defined elsewhere and not within the same block
where it is used. Its value may be assigned, and later changed, in a different block or file. An extern
variable is therefore a global variable that is defined in one place so that it can be used elsewhere,
and it can be accessed within any function or block that can see its declaration. An ordinary global
variable can be referred to in the same way by placing the extern keyword before a declaration of it in
any function or block. This signifies that no new variable is being created and that the existing global
variable is being accessed. The main purpose of extern variables is that they can be shared between
different files that are part of a large program.
static
: This storage class is used to declare static variables, which are widely used in C programs. Static
variables preserve their value even after execution leaves their scope, so a static local variable still
holds the value of its last use the next time its function is called. They are initialized only once and
exist until the program terminates; no new memory is allocated on each call because the same object is
reused. The scope of a static local variable is limited to the function in which it is defined. A static
variable declared at file scope can be accessed anywhere in the file (translation unit) in which it is
declared, but not from other files. If they are not explicitly initialized, static variables are
initialized to 0.
register
: This storage class declares register variables, which behave like auto variables. The only difference
is that the keyword is a hint asking the compiler to keep the variable in a processor register if a free
one is available, which can make access faster than for a variable stored in memory. If no register is
available, the variable is stored in memory, and the compiler is free to ignore the hint. Typically a few
variables that are accessed very frequently are declared with the register keyword in the hope of
improving the running time of the program. An important point is that the address of a register variable
cannot be taken, so no pointer to it can be obtained.
There are four type qualifiers:
__restrict (pointer type only)
const Type Qualifier
- Use the const type qualifier to qualify an object whose value
cannot be changed. Objects qualified by the const keyword cannot be modified. This means that an
object declared as const cannot serve as the operand in an operation that changes its value; for
example, the ++ and -- operators are not allowed on objects qualified with const . Using the const
qualifier on an object protects it from the side effects caused by operations that alter storage.
volatile
- Every access (both read and write) made through an lvalue expression of
volatile-qualified type is considered an observable side effect for the purpose of optimization and
is evaluated strictly according to the rules of the abstract machine (that is, all writes are
completed at some time before the next sequence point). This means that within a single thread of
execution, a volatile access cannot be optimized out or reordered relative to another visible side
effect that is separated by a sequence point from the volatile access.
A cast of a non-volatile value to a volatile type has no effect. To access a non-volatile object
using volatile semantics, its address must be cast to a pointer-to-volatile and then the access must
be made through that pointer.
Any attempt to read or write to an object whose type is volatile-qualified through a non-volatile
lvalue results in undefined behavior:
volatile int n = 1; // object of volatile-qualified type
int* p = (int*)&n;
int val = *p; // undefined behavior
A member of a volatile-qualified structure or union type acquires the qualification of the type it belongs to (both when accessed using the . operator or the -> operator):
struct s { int i; const int ci; } s;
// the type of s.i is int, the type of s.ci is const int
volatile struct s vs;
// the types of vs.i and vs.ci are volatile int and const volatile int
Demonstrates the use of volatile to disable optimizations:
#include <stdio.h>
#include <time.h>
int main(void)
{
clock_t t = clock();
double d = 0.0;
for (int n=0; n<10000; ++n)
for (int m=0; m<10000; ++m)
d += d*n*m; // reads and writes to a non-volatile
printf("Modified a non-volatile variable 100m times. "
"Time used: %.2f seconds\n",
(double)(clock() - t)/CLOCKS_PER_SEC);
t = clock();
volatile double vd = 0.0;
for (int n=0; n<10000; ++n)
for (int m=0; m<10000; ++m)
vd += vd*n*m; // reads and writes to a volatile
printf("Modified a volatile variable 100m times. "
"Time used: %.2f seconds\n",
(double)(clock() - t)/CLOCKS_PER_SEC);
}
Possible output:
Modified a non-volatile variable 100m times. Time used: 0.00 seconds
Modified a volatile variable 100m times. Time used: 0.79 seconds
volatile short *ttyport = (volatile short*)TTYPORT_ADDR;
for(int i = 0; i < N; ++i)
*ttyport = a[i]; // *ttyport is an lvalue of type volatile short
The preprocessor supports conditional compilation of parts of a source file. This behavior is controlled by #if, #else, #elif, #ifdef, #ifndef and #endif directives.
Syntax:
#if expression
#ifdef identifier
#ifndef identifier
#elif expression
#else
#endif
Example:
#define ABCD 2
#include <stdio.h>
int main(void)
{
#ifdef ABCD
printf("1: yes\n");
#else
printf("1: no\n");
#endif
#ifndef ABCD
printf("2: no1\n");
#elif ABCD == 2
printf("2: yes\n");
#else
printf("2: no2\n");
#endif
#if !defined(DCBA) && (ABCD < 2*4-3)
printf("3: yes\n");
#endif
}
Output:
1: yes
2: yes
3: yes
Example:
#include <stdio.h>
//make function factory and use it
#define FUNCTION(name, a) int fun_##name(int x) { return (a)*x;}
FUNCTION(quadruple, 4)
FUNCTION(double, 2)
#undef FUNCTION
#define FUNCTION 34
#define OUTPUT(a) puts( #a )
int main(void)
{
printf("quadruple(13): %d\n", fun_quadruple(13) );
printf("double(21): %d\n", fun_double(21) );
printf("%d\n", FUNCTION);
OUTPUT(million); //note the lack of quotes
}
Output:
quadruple(13): 52
double(21): 42
34
million
In function-like macros, a # operator before an identifier in the replacement-list runs the identifier through parameter replacement and encloses the result in quotes, effectively creating a string literal. In addition, the preprocessor adds backslashes to escape the quotes surrounding embedded string literals, if any, and doubles the backslashes within the string as necessary. All leading and trailing whitespace is removed, and any sequence of whitespace in the middle of the text (but not inside embedded string literals) is collapsed to a single space. This operation is called "stringification". If the result of stringification is not a valid string literal, the behavior is undefined.
When # appears before __VA_ARGS__, the entire expanded __VA_ARGS__ is enclosed in quotes:
#define showlist(...) puts(#__VA_ARGS__)
showlist(); // expands to puts("")
showlist(1, "x", int); // expands to puts("1, \"x\", int")
(since C99)
A ## operator between any two successive identifiers in the replacement-list runs parameter replacement on the two identifiers and then concatenates the result.
-g
Produce debugging information in the operating system's native format (stabs, COFF, XCOFF, or DWARF 2). GDB can work with this debugging information. On most systems that use stabs format, -g enables use of extra debugging information that only GDB can use; this extra information makes debugging work better in GDB but probably makes other debuggers crash or refuse to read the program. If you want to control for certain whether to generate the extra information, use -gstabs+, -gstabs, -gxcoff+, -gxcoff, or -gvms (see below).
...
-ggdb
Produce debugging information for use by GDB. This means to use the most expressive format available (DWARF 2, stabs, or the native format if neither of those are supported), including GDB extensions if at all possible.
-gvmslevel
Request debugging information and also use level to specify how much information. The default level is 2. Level 0 produces no debug information at all. Thus, -g0 negates -g.
....
Level 3 includes extra information, such as all the macro definitions present in the program. Some debuggers support macro expansion when you use -g3.
-g and -ggdb are similar, with some slight differences:
-g produces debugging information in the OS's native format (stabs, COFF, XCOFF, or DWARF 2).
-ggdb produces debugging information specifically intended for gdb.
-ggdb3 produces extra debugging information, for example: including macro definitions.
-ggdb by itself without specifying the level defaults to -ggdb2 (i.e., gdb for level 2).
A function definition (or EXPORTed label in assembler) can also be marked as weak, as can a variable definition. In this case, a weak symbol definition is created in the object file.
A weak definition can be used to resolve any reference to that symbol in the same way as a normal definition. However, if another (non-weak) definition of that symbol exists in the build, the linker will use that definition instead of the weak definition, and not produce an error due to multiply-defined symbols.
A simple or dummy implementation of a function can be provided as a WEAK definition. This allows the software to be built (with defined behaviour) without providing a 'full' implementation of this function, but also allows a full implementation to be provided for some builds if required.
Now for the trick! Both GCC and Clang support __attribute__((flatten)). Putting it on a function causes all of its callees to be inlined into it. It’s dead simple.
void do_thing(int input)
{
// this code is not always inlined at the call site
}
__attribute__((flatten)) void hot_code()
{
// the program spends >80% of its runtime in this function
while (condition) {
call_something(); // inlined!
do_thing(y); // inlined!
other_thing(); // also inlined!
}
}
void cool_code()
{
// the program spends <5% of its runtime in this function
...
do_thing(a); // not inlined!
do_thing(b); // not inlined!
do_thing(c); // guess!
}
Note: Functions with __attribute__((noinline)) will not be inlined. The same goes for functions where the compiler can’t see the body.
__attribute__((flatten)) lets you opt in to the pros of aggressive inlining on a per-function basis, while protecting the rest of your program from the cons!
With the following declaration
int array[ROW][COLUMN]={0};
I get the array with all zeroes but with the following one
int array[ROW][COLUMN]={1};
I don’t get the array with all one value. The default value is still 0.
Why this behavior and how can I initialize with all 1?
You get this behavior because int array [ROW][COLUMN] = {1}; does not mean "set all items to one". Let me try to explain how this works step by step.
The explicit, overly clear way of initializing your array would be like this:
#define ROW 2
#define COLUMN 2
int array [ROW][COLUMN] =
{
{0, 0},
{0, 0}
};
However, C allows you to leave out some of the items in an array (or struct/union). You could for example write:
int array [ROW][COLUMN] =
{
{1, 2}
};
This means, initialize the first elements to 1 and 2, and the rest of the elements "as if they had static storage duration".
There is a rule in C saying that all objects of static storage duration, that are not explicitly initialized by the programmer, must be set to zero.
So in the above example, the first row gets set to 1,2 and the next to 0,0 since we didn't give them any explicit values.
Next, there is a rule in C allowing lax brace style. The first example could as well be written as
int array [ROW][COLUMN] = {0, 0, 0, 0};
although of course this is poor style, it is harder to read and understand. But this rule is convenient, because it allows us to write
int array [ROW][COLUMN] = {0};
which means: "initialize the very first column in the first row to 0, and all other items as if they had static storage duration, ie set them to zero."
Therefore, if you attempt
int array [ROW][COLUMN] = {1};
it means "initialize the very first column in the first row to 1 and set all other items to zero".
#include <stdio.h>
#include <stdint.h>
#define BMP280_DATA_UPDATE_FREQ (1000)
#define BMP280_DATA_UPDATE_FREQ_U (1000U)
int main() {
unsigned short prev_val1 = 0xFFFF;
unsigned short cur_val1 = 1000;
printf("%d\n", (cur_val1 - prev_val1));
if((cur_val1 - prev_val1) > BMP280_DATA_UPDATE_FREQ)
printf("Works with BMP280_DATA_UPDATE_FREQ\n");
if((cur_val1 - prev_val1) > BMP280_DATA_UPDATE_FREQ_U)
printf("Works with BMP280_DATA_UPDATE_FREQ_U\n");
return 0;
}
Output:
-64535
Works with BMP280_DATA_UPDATE_FREQ_U
In the above example, the type of the macro BMP280_DATA_UPDATE_FREQ also affects the behaviour of the if expression, so adding the required type information to a macro can force the operation to be carried out in the required type.
In the first if above, the LHS is calculated as int and the type of the macro value is int as well (C standard). The comparison becomes -64535 > 1000, which is false.
But in the second case the LHS is calculated as int and the RHS is unsigned int, so the comparison is done in unsigned int. The int on the LHS is converted to unsigned int by adding UINT_MAX + 1 to it. With a 32-bit unsigned int, -64535 becomes 4294967296 - 64535 = 4294902761, and 4294902761 > 1000 is true.
When a suffix L or UL is not used, the compiler uses the first type that can contain the constant from a list (see details in the C99 standard, clause 6.4.4:5; for a decimal constant, the list is int, long int, long long int).
As a consequence, most of the time it is not necessary to use the suffix. It does not change the meaning of the program. It does not change the meaning of your example initialization of x for most architectures, although it would if you had chosen a number that could not be represented as a long long. See also codebauer's answer for an example where the U part of the suffix is necessary.
There are a couple of circumstances when the programmer may want to set the type of the constant explicitly. One example is when using a variadic function:
printf("%lld", 1LL); // correct, because 1LL has type long long
printf("%lld", 1); // undefined behavior, because 1 has type int
A common reason to use a suffix is ensuring that the result of a computation doesn't overflow. Two examples are:
long x = 10000L * 4096L;
unsigned long long y = 1ULL << 36;
In both examples, without suffixes, the constants would have type int and the computation would be made as int. In each example this incurs a risk of overflow. Using the suffixes means that the computation will be done in a larger type instead, which has sufficient range for the result.
As Lightness Races in Orbit puts it, the literal's suffix comes before the assignment. In the two examples above, simply declaring x as long and y as unsigned long long is not enough to prevent the overflow in the computation of the expressions assigned to them.
Another example is the comparison x < 12U where variable x has type int. Without the U suffix, the compiler types the constant 12 as an int, and the comparison is therefore a comparison of signed ints.
int x = -3;
printf("%d\n", x < 12); // prints 1 because it's true that -3 < 12
With the U suffix, the comparison becomes a comparison of unsigned ints. “Usual arithmetic conversions” mean that -3 is converted to a large unsigned int:
printf("%d\n", x < 12U); // prints 0 because (unsigned int)-3 is large
In fact, the type of a constant may even change the result of an arithmetic computation, again because of the way “usual arithmetic conversions” work.
Note that, for decimal constants, the list of types suggested by C99 does not contain unsigned long long. In C90, the list ended with the largest standardized unsigned integer type at the time (which was unsigned long). A consequence was that the meaning of some programs was changed by adding the standard type long long to C99: the same constant that was typed as unsigned long in C90 could now be typed as a signed long long instead. I believe this is the reason why in C99, it was decided not to have unsigned long long in the list of types for decimal constants.
Before C23, the C standard didn't mandate any particular way of representing negative signed numbers (C23 requires two's complement).
In most implementations that you are likely to encounter, negative signed integers are stored in what is called two's complement. The other major way of storing negative signed numbers is called one's complement.
The two's complement of an N-bit number x is defined as 2^N - x. For example, the two's complement of 8-bit 1 is 2^8 - 1, or 1111 1111. The two's complement of 8-bit 8 is 2^8 - 8, which in binary is 1111 1000. This can also be calculated by flipping the bits of x and adding one. For example:
1 = 0000 0001
~1 = 1111 1110 (1's complement)
~1 + 1 = 1111 1111 (2's complement)
-1 = 1111 1111
21 = 0001 0101
~21 = 1110 1010
~21 + 1 = 1110 1011
-21 = 1110 1011
An easier method to get the negation of a number in two's complement is as follows:
Example:
Example 1 Example 2
00101001 00101100
11010111 11010100
Binary value    Two's complement interpretation    Unsigned interpretation
00000000        0                                  0
00000001        1                                  1
00000010        2                                  2
00000011        3                                  3
00000100        4                                  4
...             ...                                ...
01111111        127                                127
10000000        -128                               128
10000001        -127                               129
...             ...                                ...
11111110        -2                                 254
11111111        -1                                 255
Here is the process to convert a negative two's complement number back to decimal:
The one's complement of an N-bit number x is defined as x with all its bits flipped, basically.
1 = 0000 0001
-1 = 1111 1110
21 = 0001 0101
-21 = 1110 1010
Two's complement has several advantages over one's complement. For example, it doesn't have the concept of 'negative zero', which for good reason is confusing to many people. In addition, addition, subtraction and multiplication work the same way for signed integers implemented with two's complement as they do for unsigned integers.
Signed magnitude is the easiest representation to understand, because it works the same way as we are used to when dealing with negative decimal values: the first position (bit) represents the sign (0 for positive, 1 for negative), and the other bits represent the magnitude of the number. Although it is easy for us to understand, it is hard for computers to work with, especially when doing arithmetic with negative numbers. In 8-bit signed magnitude, the value 8 is represented as 0 0001000 and -8 as 1 0001000.
When using a decimal constant without any suffixes the type of the decimal constant is the first that can be represented, in order (the current C standard, 6.4.4 Constants p5):
In C,
unsigned int size = 1024*1024*1024*2;
results in the warning "integer overflow in expression...", while
unsigned int size = 2147483648;
results in no warning. Why?
Is the right-hand value of the first expression treated as int by default? Where is this mentioned in the C99 spec?
The type of the first expression is int, since values 1024 and 2 can be represented as int. The computation
of those constants will be done in type int, and the result will overflow. This expression
1024*1024*1024*2
(in which 1024 and 2 are of type signed int) produces a result of type signed int, and this value is
too big for signed int.
Assuming INT_MAX equals 2147483647 and LONG_MAX is greater than 2147483647, the type of the second expression is long int, since this value cannot be represented as int, but can be as long int. If INT_MAX equals LONG_MAX equals 2147483647, then the type is long long int.
The result of a subtraction generating a negative number in an unsigned type is well-defined:
[...] A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type. (ISO/IEC 9899:1999 (E) §6.2.5/9)
As you can see, (unsigned)0 - (unsigned)1 equals -1 modulo UINT_MAX+1, or in other words, UINT_MAX.
Example:
The way unsigned subtraction works for uint16_t:
0 - 1000 != 1000
0 - 1000 == -1000 mod (65535 + 1) == 64536
0 - 64535 == -64535 mod (65535 + 1) == 1001
#include <stdio.h>
#include <stdint.h>
int main() {
uint16_t a = 0, b = 1000;
uint16_t c = a - b;
uint16_t a1 = 0, b1 = 64535;
uint16_t c1 = a1 - b1;
printf("%d %d\n", c, c1);
return 0;
}
Output:
64536 1001
Promotion is the process by which values of integer types "smaller" than int/unsigned int are converted to either int or unsigned int. The rules are expressed somewhat strangely (mostly for the benefit of handling char adequately) but ensure that value and sign are preserved.
A few examples of the effects of the promotion rules:
Example 1) - Why does this give a strange, large integer number and not 255?
unsigned char x = 0;
unsigned char y = 1;
printf("%u\n", x - y);
Example 2) - Why does this give "-1 is larger than 0"?
unsigned int a = 1;
signed int b = -2;
if(a + b > 0)
puts("-1 is larger than 0");
Example 3) - Why does changing the type in the above example to short fix the problem?
unsigned short a = 1;
signed short b = -2;
if(a + b > 0)
puts("-1 is larger than 0"); // will not print
(These examples were intended for a 32 or 64 bit computer with 16 bit short.)
C was designed to implicitly and silently change the integer types of the operands used in expressions. There exist several cases where the language forces the compiler to either change the operands to a larger type, or to change their signedness.
The rationale behind this is to prevent accidental overflows during arithmetic, but also to allow operands with different signedness to co-exist in the same expression.
Unfortunately, the rules for implicit type promotion cause much more harm than good, to the point where they might be one of the biggest flaws in the C language. These rules are often not even known by the average C programmer and therefore cause all manner of very subtle bugs.
Typically you see scenarios where the programmer says "just cast to type x and it works" - but they don't know why. Or such bugs manifest themselves as rare, intermittent phenomena striking from within seemingly simple and straightforward code. Implicit promotion is particularly troublesome in code doing bit manipulations, since most bit-wise operators in C come with poorly-defined behavior when given a signed operand.
The integer types in C are char, short, int, long, long long and enum. _Bool/bool is also treated as an integer type when it comes to type promotions.
All integers have a specified conversion rank. C11 6.3.1.1, emphasis mine on the most important parts:
Every integer type has an integer conversion rank defined as follows:
The types from stdint.h sort in here too, with the same rank as whatever type they happen to correspond to on the given system. For example, int32_t has the same rank as int on a 32 bit system.
Further, C11 6.3.1.1 specifies which types are regarded as the small integer types (not a formal term):
The following may be used in an expression wherever an int or unsigned int may be used:
— An object or expression with an integer type (other than int or unsigned int) whose integer conversion rank is less than or equal to the rank of int and unsigned int.
What this somewhat cryptic text means in practice, is that _Bool, char and short (and also int8_t, uint8_t etc) are the "small integer types". These are treated in special ways and subject to implicit promotion, as explained below.
Whenever a small integer type is used in an expression, it is implicitly converted to int which is always signed. This is known as the integer promotions or the integer promotion rule.
Formally, the rule says (C11 6.3.1.1):
If an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions.
This means that all small integer types, no matter signedness, get implicitly converted to (signed) int when used in most expressions.
This text is often misunderstood as: "all small, signed integer types are converted to signed int and all small, unsigned integer types are converted to unsigned int". This is incorrect. The unsigned part here only means that if we have for example an unsigned short operand, and int happens to have the same size as short on the given system, then the unsigned short operand is converted to unsigned int. As in, nothing of note really happens. But in case short is a smaller type than int, it is always converted to (signed) int, regardless of whether the short was signed or unsigned!
The harsh reality caused by the integer promotions means that almost no operation in C can be carried out on small types like char or short. Operations are always carried out on int or larger types.
This might sound like nonsense, but luckily the compiler is allowed to optimize the code. For example, an expression containing two unsigned char operands would get the operands promoted to int and the operation carried out as int. But the compiler is allowed to optimize the expression to actually get carried out as an 8 bit operation, as would be expected. However, here comes the problem: the compiler is not allowed to optimize out the implicit change of signedness caused by the integer promotion, because there is no way for the compiler to tell whether the programmer is purposely relying on implicit promotion or whether it is unintentional.
This is why example 1 in the question fails. Both unsigned char operands are promoted to type int, the operation is carried out on type int, and the result of x - y is of type int. Meaning that we get -1 instead of 255 which might have been expected. The compiler may generate machine code that executes the code with 8 bit instructions instead of int, but it may not optimize out the change of signedness. Meaning that we end up with a negative result, which in turn results in a weird number when printf("%u", ...) is invoked. Example 1 could be fixed by casting the result of the operation back to type unsigned char.
With the exception of a few special cases like ++ and sizeof operators, the integer promotions apply to almost all operations in C, no matter if unary, binary (or ternary) operators are used.
The usual arithmetic conversions are the rules by which operands of arithmetic operators are converted to a common type. They begin by promoting the operands (to either int or unsigned int) if they are of a type smaller than int, and then choose a target type by the following process (for integer types, 6.3.1.8/1), listed below as bullets.
Whenever a binary operation (an operation with 2 operands) is done in C, both operands of the operator have to be of the same type. Therefore, in case the operands are of different types, C enforces an implicit conversion of one operand to the type of the other operand. The rules for how this is done are named the usual arithmetic conversions (sometimes informally referred to as "balancing"). These are specified in C11 6.3.1.8:
(Think of this rule as a long, nested if-else if statement and it might be easier to read :) )
6.3.1.8 Usual arithmetic conversions
Many operators that expect operands of arithmetic type cause conversions and yield result types in a similar way. The purpose is to determine a common real type for the operands and result. For the specified operands, each operand is converted, without change of type domain, to a type whose corresponding real type is the common real type. Unless explicitly stated otherwise, the common real type is also the corresponding real type of the result, whose type domain is the type domain of the operands if they are the same, and complex otherwise. This pattern is called the usual arithmetic conversions:
Otherwise, the integer promotions are performed on both operands. Then the following rules are applied to the promoted operands:
Notable here is that the usual arithmetic conversions apply to both floating point and integer variables. In case of integers, we can also note that the integer promotions are invoked from within the usual arithmetic conversions. And after that, when both operands have at least the rank of int, the operands are balanced to the same type, with the same signedness.
This is the reason why a + b in example 2 gives a strange result. Both operands are integers and they are at least of rank int, so the integer promotions do not apply. The operands are not of the same type - a is unsigned int and b is signed int. Therefore the operand b is temporarily converted to type unsigned int. During this conversion it loses the sign information and ends up as a large value.
The reason why changing type to short in example 3 fixes the problem, is because short is a small integer type. Meaning that both operands are integer promoted to type int which is signed. After integer promotion, both operands have the same type (int), no further conversion is needed. And then the operation can be carried out on a signed type as expected.
unsigned char x = 0;
unsigned char y = 1;
printf("%u\n", x - y);
Ans:
Based on the integer promotion rules, both x and y are small integer types, so they are implicitly converted to int before the - operation. Because the x - y expression is passed directly to printf() and is not assigned to a variable of another type, the result of the expression stays int. Thus int 0 - int 1 = int -1, whose bit pattern is 0xFFFFFFFF with a 32-bit int, and that is why %u prints it as 4294967295 (0xFFFFFFFF).
unsigned int a = 1;
signed int b = -2;
if(a + b > 0)
puts("-1 is larger than 0");
Ans:
As both a and b have types with at least the rank of int, the integer promotions do not apply. Based on the usual arithmetic conversions, b is converted to unsigned int, so the expression a + b is evaluated as unsigned int. In the comparison, the value 0 is of type int, so 0 is also converted to unsigned int. This results in a large value (1 + 0xFFFFFFFE = 0xFFFFFFFF) on the LHS of the > operator.
unsigned short a = 1;
signed short b = -2;
if(a + b > 0)
puts("-1 is larger than 0"); // will not print
Ans:
As both a and b have types with a rank less than the rank of int, the integer promotions do apply. Both a and b are converted to int before the operation, so the expression a + b is evaluated as int. This results in a negative value (1 + -2 = -1) on the LHS of the > operator, and the comparison is false.
Example:
int a;
int b;
No conversion is needed; both operands already have the same type.
Example:
long a;
long long b;
a is converted to long long
Example:
unsigned int a;
int b;
b is converted to unsigned int. This is the same situation as the unsigned int / signed int example (example 2) discussed above.
Example:
unsigned int a; // range: 0 to 4294967295
long b; // range: -9223372036854775808 to 9223372036854775807
a will be converted to long, as long can represent all of the values of the type of the operand with unsigned integer type, if int is 32 bits and long is 64 bits.
Example:
Here both long and long long are assumed to be 64 bits wide, as is typical on 64-bit Linux.
unsigned long c; // range: 0 to 18446744073709551615
long long d; // range: -9223372036854775808 to 9223372036854775807
Both c and d are converted to unsigned long long, because not all values of unsigned long can be represented by long long when both types are 64 bits wide.
Regarding what happens during a promotion / conversion on the bit level, let's first assume that the lower rank type is smaller than the higher rank type, and that signed types use 2's complement representation.
For a conversion from a 32 bit int to a 64 bit long, if the value is positive, 4 bytes containing all 0 bits are added on the left. If the value is negative, 4 bytes containing all 1 bits are added on the left. For example, the representation of value 5 changes from 0x00000005 to 0x0000000000000005. For the value -5, the representation changes from 0xfffffffb to 0xfffffffffffffffb.
#include <stdio.h>
#include <stdint.h>
#define BMP280_DATA_UPDATE_FREQ (1000)
#define BMP280_DATA_UPDATE_FREQ__U (1000U)
int main() {
uint32_t prev_val = 0xFFFFFFFF;
uint32_t cur_val = 1000;
/* If an int can represent all values of the original type
(as restricted by the width, for a bit-field),
the value is converted to an int; otherwise,
it is converted to an unsigned int.
These are called the integer promotions. */
/* With a 32-bit int, uint32_t is unsigned int, which is not a small
integer type, so no integer promotion takes place and the expression
below is evaluated as unsigned int. */
uint32_t ress = cur_val - prev_val;
printf("%u\n", ress);
printf("%u\n", cur_val - prev_val);
printf("%d\n", sizeof(unsigned int) == sizeof(uint32_t)); /* result of == has type int */
/* If an int can represent all values of the original type
(as restricted by the width, for a bit-field),
the value is converted to an int; otherwise,
it is converted to an unsigned int.
These are called the integer promotions.
The unsigned part here only means that if we have for example an
unsigned short operand, and int happens to have the same size
as short on the given system, then the unsigned short operand
is converted to unsigned int */
/* Thus (cur_val - prev_val) evaluates to an unsigned int (uint32_t and
int are of the same size, refer to the paragraph above) and the 'if' is true even
though BMP280_DATA_UPDATE_FREQ doesn't have 'U' at the end. BMP280_DATA_UPDATE_FREQ
is converted to unsigned int by the usual arithmetic conversions. */
if((cur_val - prev_val) > BMP280_DATA_UPDATE_FREQ)
printf("Works with BMP280_DATA_UPDATE_FREQ\n");
uint16_t prev_val1 = 0xFFFF;
uint16_t cur_val1 = 1000;
int inter_res = cur_val1 - prev_val1;
printf("%X %d\n", inter_res, inter_res);
/* inter_res = FFFF03E9, which is -64535 (obtained by taking the
2's complement of FFFF03E9 and adding the sign) */
/* -64535 cannot be represented in uint16_t, so it is reduced modulo
one more than the largest value that can be represented by the
resulting type, i.e. UINT16_MAX + 1
==> -64535 mod (65535 + 1) = 1001 */
uint16_t ress1 = inter_res;
uint16_t ress2 = cur_val1 - prev_val1;
printf("%u %u\n", ress1, ress2);
/* Here (cur_val1 - prev_val1) evaluates to an int (integer promotion)
and the 'if' is false, as BMP280_DATA_UPDATE_FREQ is also int (refer to the next lines),
so no further conversion is done.
(When using a decimal
constant without any suffixes the type of the decimal constant
is the first that can be represented, in order
(the current C standard, 6.4.4 Constants p5):
int
long int
long long int)
*/
if((cur_val1 - prev_val1) > BMP280_DATA_UPDATE_FREQ)
printf("Works\n");
/* Here (cur_val1 - prev_val1) evaluates to an int (integer promotion)
and the 'if' is true, because BMP280_DATA_UPDATE_FREQ__U is unsigned, so
(cur_val1 - prev_val1) is also converted to unsigned int by the usual arithmetic conversions. */
if((cur_val1 - prev_val1) > BMP280_DATA_UPDATE_FREQ__U)
printf("Works with BMP280_DATA_UPDATE_FREQ__U\n");
return 0;
}
Output:
1001
1001
1
Works with BMP280_DATA_UPDATE_FREQ
FFFF03E9 -64535
1001 1001
Works with BMP280_DATA_UPDATE_FREQ__U
float f = 0.7;
if( f == 0.7 )
printf("equal");
else
printf("not equal");
Why is the output not equal ?
This happens because in your statement
if(f == 0.7)
the 0.7 is treated as a double.
As a rule, you should not test floating-point values for exact equality.
More elaborate explanation:
In the line float f = 0.7;
the double value 0.7 is converted to float first. Then in
if(f == 0.7)
f is converted to double, as the RHS 0.7 is also double, based on the usual
arithmetic conversion rules (refer above).
Note that 0.7 is not representable exactly either as a float or as a double. If it were represented exactly, then there would be no loss of information when converting to float and then back to double, and you wouldn't have this problem.
All non-integer numbers that can be represented exactly have 5 as their last decimal digit. Unfortunately, the converse is not true: some numbers have 5 as their last decimal digit and cannot be represented exactly.
The important factors under consideration with float or double numbers are: Precision & Rounding
The precision of a floating point number is how many digits it can represent without losing any information it contains.
Consider the fraction 1/3. The decimal representation of this number is 0.33333333333333… with 3′s going out to infinity. An infinite length number would require infinite memory to be depicted with exact precision, but float or double data types typically only have 4 or 8 bytes. Thus float and double numbers can only store a certain number of digits, and the rest are bound to get lost. There is therefore no exact way of representing, in a float or double, a number that requires more precision than the variable can hold.
There is a non-obvious difference between binary and decimal (base 10) numbers. Consider the fraction 1/10. In decimal, this can be easily represented as 0.1, and 0.1 can be thought of as an easily representable number. However, in binary, 0.1 is represented by the infinite sequence: 0.00011001100110011…
An example:
#include <iostream>
#include <iomanip>
int main()
{
using namespace std;
cout << setprecision(17);
double dValue = 0.1;
cout << dValue << endl;
}
The output is:
0.10000000000000001
And not
0.1.
This is because the double had to truncate the approximation due to its limited memory, which results in a number that is not exactly 0.1. Such a scenario is called a rounding error.
Whenever two close float or double numbers are compared, such rounding errors kick in and the comparison can yield incorrect results. This is the reason you should not compare float or double numbers using ==.
The best you can do is to take their difference and check whether its absolute value is less than a small epsilon (in C, use fabs from math.h, since abs only works on int):
fabs(x - y) < epsilon
#include <stdio.h>
int main()
{
float f = 0.5;
double f1 = f;
if( f == 0.5 ) // 0.5 is double
printf("equal\n");
else
printf("not equal\n");
if( f1 == 0.5 ) // 0.5 is double; equal only because 0.5 is exactly representable (fails for e.g. 0.7)
printf("equal\n");
else
printf("not equal\n");
}
Thread-local storage (TLS) is a mechanism by which variables are allocated such that there is one instance of the variable per extant thread. The runtime model GCC uses to implement this originates in the IA-64 processor-specific ABI, but has since been migrated to other processors as well.
At the user level, the extension is visible with a new storage class keyword: __thread. For example:
__thread int i;
extern __thread struct state s;
static __thread char *p;
The __thread specifier may be used alone, with the extern or static specifiers, but with no other storage class specifier. When used with extern or static, __thread must appear immediately after the other storage class specifier.
In C++, if an initializer is present for a thread-local variable, it must be a constant-expression, as defined in 5.19.2 of the ANSI/ISO C++ standard.
Example:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <pthread.h>
void *test (void *arg)
{
static __thread int val = 0;
static __thread char *string = NULL;
string = (char *) calloc (100, sizeof (char));
strcpy (string, "hello");
val++;
printf ("val(%p):%d\n", (void *) &val, val);
printf ("string(%p):%s\n", (void *) &string, string);
pthread_exit (NULL);
}
int main (int argc, char *argv[])
{
int num_threads = 3, i;
pthread_t tid[num_threads];
for (i=0;i<num_threads;i++) {
pthread_create (&tid[i], NULL, &test, NULL);
}
for (i=0;i<num_threads;i++) {
pthread_join (tid[i], NULL);
}
return 0;
}
Output with __thread after static storage class specifier:
val(0xf6d8cb38):1
string(0xf6d8cb3c):hello
val(0xf758db38):1
string(0xf758db3c):hello
val(0xf7d8eb38):1
string(0xf7d8eb3c):hello
Output without __thread:
val(0xf7cd234c):1
string(0xf7cd2348):hello
val(0xf6cd034c):1
string(0xf6cd0348):hello
val(0xf74d134c):1
string(0xf74d1348):hello
Note that the address printed for string is that of the pointer variable itself (&string), not of the buffer it points to. The buffer is allocated by calloc, which returns a distinct block of memory for each call.