C# Types and Literals

Type Concepts, C# Pre-Defined Types & Literals

C# is statically typed. This means types for all values must be known at compile time. Types determine the storage, behaviour and attributes of values, so every value has a type. Literals are simply constant values, and as such must also have a type. Some .NET types are so common that C# provides synonyms (aliases) for them.

PREREQUISITES — You should already…
  • have some programming experience, preferably in C/C++/D/Objective-C/Java/Pascal;
  • be conversant with the fundamentals of Object-Oriented Programming (OOP);
  • understand the role of types in a statically typed language;
  • understand expressions, operators and precedence in general.

CONVENTIONS

To keep code short, we may not always provide the full source for a syntactically complete file. Code snippets will always assume that the following preamble is at the top of the source file:

using System;
using System.Linq;
using System.Collections.Generic;
using static System.Console;

Executable statements will assume they are within some function body block, like Main. If on­ly the definition of a function is provided, you must ensure that you use it inside a class. All code is assumed to be in the same namespace unless explicitly specified.

In this document, wherever we refer to any type in general, it will be notated as: type. You can just replace that with any known type. When we use ref-type, it must be constrained to reference types, while val-type must be limited to value types.

We use ident as a placeholder for an identifier, in other words, the name of some language element. The following are also identifiers: property, function, variable.

For access control on members of a class, you can use:
private — only code in the class has permission to access it;
public — all code in the program has permission to access it;
protected — only code in the class and code in derived classes have permission;
internal — all code in the assembly has permission to access it; or
internal protected — like internal, but derived classes also have permission.

Fundamental Concepts

Programming languages aim to simplify the translation of human processes to machine code, whether that is native machine code that the processor can directly execute, or some intermediate language, as in .NET languages, which use CIL/IL (Common Intermediate Language). This translation is sometimes called ‘compilation’.

Overview of Major Concepts

Either way, one of the mechanisms a programming language employs to ease the burden on the programmer is the abstract concept of a type. If this concept can be applied to every value, the compiler can determine from it the size of memory required for the object, a task that assembler programmers must perform by hand and that is tedious and error-prone. Furthermore, the compiler can ensure that operations on a value are supported by its type.

Purpose of Types

Since the compiler knows the types of all values at compile time, it can generate operations optimised for the types involved. Addition of the different numeric types involves different machine instructions (or intermediate language instructions). If the types were not known, the program would first have to check the type dynamically (at runtime), then jump to the appropriate functionality, which is far less efficient.

Ultimately then, the purpose of abstracting the concept of a type is twofold: it tells the compiler how much memory a value needs, and which operations are valid on that value.

Ultimately the purpose of types is irrelevant: since types govern everything that occurs in a program, you cannot escape them, so you may as well learn to love them.

Moving/Copying Values

When a data value must move, the type of the value determines how many bytes must be copied. In computer terminology, move means copy. The number of bytes in the destination of the copy must match the source data size. That will be true if the source and destination types are the same. If they are not the same type, the compiler will sometimes automatically convert (or ‘cast’) the source type to the destination type, but only if the destination type is larger than the source type. Data movement thus involves type conversion rules, and data movement occurs in the following situations:

This is a fundamental and consistent concept. We mention it here because each of the above cases is often seen or treated as a separate issue.

Memory Space

A type may represent something simple, like a number. The type will then define the size of the number (how many bytes a value of such a type occupies in memory). Different types of numbers have different characteristics. For example, integer (or integral) values cannot represent decimals (fractions), but real or floating-point type values can. Integer values come in different sizes (number of bytes), determined by, of course, their types.

In C#, as a convenience, you can use the sizeof(type) operator to determine the number of bytes a type occupies in memory. This is seldom necessary, but does illustrate that types determine the space objects occupy. The sizeof(type) operator is limited to unmanaged (native / non-CTS) types and a number of specific .NET types.

Sizes of some integer types
public static void Main () {
   WriteLine("sizeof(byte) = {0} byte ", sizeof(byte));   //⇒ … = 1 byte
   WriteLine("sizeof(int)  = {0} bytes", sizeof(int));    //⇒ … = 4 bytes
   WriteLine("sizeof(long) = {0} bytes", sizeof(long));   //⇒ … = 8 bytes
   }

Memory management is removed from the programmer's responsibilities. Code depends on the .NET memory manager found in System.GC (Garbage Collector). Only advanced programs will ever directly interact with the Garbage Collector. In particular, programs will only use the ‘new type’ operator to allocate memory, and never explicitly release the memory.

TIP: Releasing Memory

A program cannot directly release memory it no longer needs. Since all non-trivial memory is allocated with new, and thus belongs to reference types, you can assign null (a keyword) to the variable holding the reference. Once no reachable reference to the object remains, the Garbage Collector can reclaim the memory (though not necessarily immediately). The .NET collector traces reachable references; it does not keep reference counts. Assigning null can help for larger objects held by long-lived variables, and is the best you can do in general. You can call GC.Collect, but there is no guarantee about when the memory is actually reclaimed.

Some reference types manage resources like files, printers, devices, network connections, database connections, etc. They all implement the IDisposable interface, which in turn means they all have a Dispose method. Types that follow the standard dispose pattern also release the resource from their finalizer (an overridden Finalize method) if your code did not call Dispose, but your code really should call it.

IMPORTANT: Releasing Resources

For all types that implement the IDisposable interface, it is your responsibility to call the Dispose method it represents, as soon as your code is done with the resource. If you do not, it will negatively affect performance, and other programs may be blocked from accessing the resource for as long as your program is hanging on to it. Some types may have a Close method, which will call Dispose (but it still implements IDisposable, which is your clue).
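As a minimal sketch of this advice, the using statement calls Dispose automatically when its block exits, even if an exception is thrown. The Resource class below is hypothetical and only stands in for a real resource type such as a file or a connection.

```csharp
using System;
using static System.Console;

// Hypothetical resource wrapper, for illustration only.
public class Resource : IDisposable {
   public bool Disposed { get; private set; }
   public void Dispose() {           //←release the resource here.
      Disposed = true;
      }
   }

public static class DisposeDemo {
   // Returns the resource after the `using` block has disposed of it.
   public static Resource UseAndRelease() {
      Resource r = new Resource();
      using (r) {                    //←`Dispose` is called on block exit,
         WriteLine("working, disposed = {0}", r.Disposed);
         }                           // even if an exception is thrown.
      return r;
      }

   public static void Main() {
      Resource r = UseAndRelease();
      WriteLine("after block, disposed = {0}", r.Disposed);
      }
   }
```

The first line of output reports False and the second True, which shows that the release happened at the closing brace without an explicit call.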

Characteristics

The range of values is not the same for all types. This is determined in part by memory space, but even for different integer types which occupy the same amount of memory, the range of val­ues will not necessarily be the same.

Based on the type, you may be able to perform arithmetic on it.

In the .NET Framework, and thus also in C#, types are divided firstly into two categories, which determine a type's characteristics: Reference Types and Value Types.

Integer/integral types have signed and unsigned variants. For example, int (System.Int32) is signed, which means it can represent negative and positive values within the range −2³¹ ⋯ 2³¹−1. On the other hand, uint (System.UInt32) is also a 4-byte integer type, but represents only non-negative values in the range 0 ⋯ 2³²−1.
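The ranges can be inspected directly, since every predefined numeric type exposes MinValue and MaxValue constants. The class name RangeDemo is only a placeholder for this sketch.

```csharp
using System;
using static System.Console;

public static class RangeDemo {
   public static void Main() {
      // Same size in memory, different ranges.
      WriteLine("int : {0} .. {1}", int.MinValue, int.MaxValue);
      WriteLine("uint: {0} .. {1}", uint.MinValue, uint.MaxValue);
      WriteLine("sizes: {0} and {1} bytes", sizeof(int), sizeof(uint));
      }
   }
```

This prints −2147483648 .. 2147483647 for int and 0 .. 4294967295 for uint, with 4 bytes reported for both.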

TIP: Default Numeric Types

The default integer type in C# is int. The default floating point type is double. These should be the two types you use most of the time. The others are all special types, which are used in special situations. As an additional point: you should never mix signed and unsigned integer types in expressions.

Behaviour

Some characteristics determine behaviour, but we use it here to explain a concept: all types will expose a selection of constructors, methods, operators, properties and so forth. These are col­lec­tive­ly called members, and if accessible, allow code to interact with an object. Loosely speak­ing, they provide an interface to the class, but we will never again use the term ‘interface’ in this context, simply to avoid confusion with the C# interface keyword, and its formal abstraction in the .NET Framework.

Some members are defined with static, and must be accessed via the class name in all code out­side the class. Other members are defined with const, and they too, must be accessed via the class name.

Operations

The low-level mechanics of arithmetic on different types of numbers (integers, unsigned values, floating point values, currency values), are all different. However, the programmer only has to use the arithmetic operators in a ‘natural way’: a + b * c, without regard to the low-level code, as long as the types of a, b and c are valid with respect to the operators.

The compiler will therefore, guided by the types of the operands, generate the machine code appropriate for the type. When you mix numeric types in arithmetic, the compiler converts the operand with the smaller range to the type of the operand with the larger range, and the result has that type.
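The effect of mixing numeric types can be observed by asking a result for its run-time type. This is only a small sketch; the method names are made up for the illustration.

```csharp
using System;
using static System.Console;

public static class PromotionDemo {
   // Type name of the result of adding an `int` and a `long`.
   public static string IntPlusLong() {
      int i = 2; long l = 3L;
      return (i + l).GetType().Name;     //←`i` is converted to `long`.
      }
   // Type name of the result of multiplying an `int` and a `double`.
   public static string IntTimesDouble() {
      int i = 2; double d = 1.5;
      return (i * d).GetType().Name;     //←`i` is converted to `double`.
      }
   public static void Main() {
      WriteLine(IntPlusLong());          //⇒ Int64
      WriteLine(IntTimesDouble());       //⇒ Double
      }
   }
```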

Most operations on values are performed with operators. This is crucial, and differs from many languages outside the C family. Even assignment is an operator, and not a special kind of ‘assignment statement’. Every operator has rules regarding what the types of its operands may be. Some operators only work with integral values, for example.

Depending on the operator, the result of the operation may not be the same type as any of its operands. For example, all the relational (comparison) operators produce values of type bool (System.Boolean), regardless of the types of the expressions you are comparing.

The new, is, as and cast operators, all work with a type name as at least one operand. But other than that, types are not part of expressions. They are, however, involved implicitly at all times, since every value has a type. Every variable has a type. Every result has a type. Every expression, thus, has a type.

Existing types are mostly used for declarations. More specifically, they are used as part of the syn­tax pattern for defining variables, constants, and methods. These all have names, which are formally called identifiers. And that is how you will ‘work’ with types for the most part, until you know enough C# to create your own types, known as ‘user-defined types’.

Type Inference

The var keyword can be used inside functions/methods as an alternative to an explicit type. Such variables must be explicitly initialised with an expression. The var keyword is effectively replaced with the type of the expression at compile time.

This effect is called type inference — the type of the variable is inferred from the type of the initialising expression.

This can also be used when defining variables in the parentheses of for and foreach statements.

It is also often used for variables storing the results of LINQ expressions.
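The following sketch shows var in each of the places mentioned: a plain local variable, the result of a LINQ expression, and a foreach loop variable. The inferred types are noted in the comments; the names are illustrative only.

```csharp
using System;
using System.Linq;
using System.Collections.Generic;
using static System.Console;

public static class VarDemo {
   public static int SumOfEvens() {
      var numbers = new List<int> {1, 2, 3, 4, 5, 6};  //←`List<int>`.
      var evens = numbers.Where(n => n % 2 == 0);      //←`IEnumerable<int>`.
      var total = 0;                                   //←`int`.
      foreach (var n in evens)                         //←`n` is `int`.
         total += n;
      return total;
      }
   public static void Main() {
      WriteLine(SumOfEvens());                         //⇒ 12
      }
   }
```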

Major Type Categories

As mentioned above, in the .NET Framework and thus in C#, types are divided firstly into two categories, which determine a type's characteristics:

Reference Types / Class Types

This is the most common category, with the most sub-types. These are more commonly referred to as ‘class types’, since they are created with the class keyword in C#. Reference types always have System.Object as their ultimate base class (but not necessarily their direct ancestor).

Values of reference types are always only accessed via references, which is an abstraction for ‘addresses’ or ‘pointers’. This is implicit and transparent. Variables and parameters of a ref­e­ren­ce type thus only ever contain a reference, which points to, or refers to, the actual memory re­pre­sen­ting the object. A copy of a reference does not copy the object. So there will be space for the variable (space enough to store a reference) and space for the object.

Value Types / Struct Types

This is a small category without many types, but they are crucial from a performance perspective, which is exactly why they exist. All numeric types, for example, are value types, created with struct in C#, and have System.ValueType as their immediate base class.

A variable of a value type stores the actual value. There is no further memory in play. Assigning, or passing as argument such a value, makes copies of the actual value.
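The difference between the two categories shows up as soon as a variable is copied. In this sketch, RefPoint and ValPoint are hypothetical types that differ only in the keyword used to create them.

```csharp
using System;
using static System.Console;

public class  RefPoint { public int X; }     //←reference type (`class`).
public struct ValPoint { public int X; }     //←value type (`struct`).

public static class CopyDemo {
   public static int CopyReference() {
      RefPoint a = new RefPoint { X = 1 };
      RefPoint b = a;                //←copies the reference only.
      b.X = 99;                      //←changes the one shared object.
      return a.X;
      }
   public static int CopyValue() {
      ValPoint a = new ValPoint { X = 1 };
      ValPoint b = a;                //←copies the whole value.
      b.X = 99;                      //←changes only the copy.
      return a.X;
      }
   public static void Main() {
      WriteLine("class : a.X = {0}", CopyReference());   //⇒ 99
      WriteLine("struct: a.X = {0}", CopyValue());       //⇒ 1
      }
   }
```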

Specialised Type Categories

Although fundamentally we have class types (reference types), and struct types (value types), there are a number of specialised types. Some of these are reference types, and some of them are value types. They are all created with different keywords though, which characterise their specialities.

Fundamental Types

Boolean values, integers, floating point values, currency (decimal) values, characters and strings are all rather ubiquitous and can be considered ‘fundamental’ — this is not a formal spe­ci­fi­ca­tion. The numeric types are all value types. So is char (System.Char), while string and object are reference types.

Enumerated Types

Enumerated types are created with the enum keyword. This is used to create abstract constant val­ues of the same type, which in memory representation, share the same space as integer types; specifically, and by default, int. They are convertible to/from integral types.
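A brief sketch of an enumerated type and its conversions to and from int follows. The Colour type is hypothetical; note that the conversions need an explicit cast in both directions.

```csharp
using System;
using static System.Console;

public enum Colour { Red, Green, Blue }      //←values 0, 1, 2 by default.

public static class EnumDemo {
   public static int ToNumber(Colour c) { return (int)c; }
   public static Colour FromNumber(int n) { return (Colour)n; }

   public static void Main() {
      WriteLine(ToNumber(Colour.Blue));      //⇒ 2
      WriteLine(FromNumber(1));              //⇒ Green
      }
   }
```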

Interface Types

Interface types are reference types created with the interface keyword. An interface represents a list of methods (including properties and indexers), that a class may choose to ‘im­ple­ment’. A class can be ‘cast to an interface’, which means that it can be used in a context where any inter­face it implements, is expected. The cast or type conversion is implicit. By convention, interface type names in the .NET Framework start with 2 capital letters, where the first is always I.

Delegate Types

To abstract functions as references that can be stored and moved around, the delegate keyword can be used to create delegate types. Delegates, together with lambdas (anonymous function ex­pres­sion syntax), allow for some very powerful programming techniques. If you have a pa­ra­me­ter of a de­le­gate type, for example, you can pass it different delegates at different times to alter some logic of the function. This is called the ‘plugin’ or ‘call-back’ technique. There are many other uses, but passing delegates is common. Function names are expressions, and implicitly con­vert­ible to a matching delegate type.
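The call-back technique can be sketched as follows. Transform, Apply and Twice are names invented for this illustration; the point is that Apply does not know which logic it will run until the caller supplies it.

```csharp
using System;
using static System.Console;

public delegate int Transform(int x);        //←a delegate type.

public static class DelegateDemo {
   public static int Twice(int x) { return x * 2; }

   // `Apply` lets the caller plug in the logic (‘call-back’).
   public static int Apply(Transform f, int value) {
      return f(value);
      }
   public static void Main() {
      WriteLine(Apply(Twice, 21));           //←function name converts. ⇒ 42
      WriteLine(Apply(x => x + 1, 41));      //←lambda expression.      ⇒ 42
      }
   }
```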

Generic Types

Generic types are similar to template types as found in C++, and serve the same purpose. They are more complicated in implementation, but easier to use in C# compared to C++. They act like algorithms that can be physically copied and adapted for different types. The types for which they must be specialised are passed as type arguments (the counterpart of ‘template arguments’ in C++). The name of a specialised generic type will therefore always end with angle brackets enclosing one or more types.
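As a small sketch, the hypothetical Pair&lt;T&gt; type below is written once and then used with two different type arguments, alongside the library type List&lt;T&gt;.

```csharp
using System;
using System.Collections.Generic;
using static System.Console;

// Hypothetical generic type: two values of the same type `T`.
public class Pair<T> {
   public T First;
   public T Second;
   public Pair(T first, T second) { First = first; Second = second; }
   public Pair<T> Swapped() { return new Pair<T>(Second, First); }
   }

public static class GenericDemo {
   public static void Main() {
      Pair<int> pi = new Pair<int>(1, 2).Swapped();
      Pair<string> ps = new Pair<string>("a", "b").Swapped();
      WriteLine("{0}, {1}", pi.First, pi.Second);       //⇒ 2, 1
      WriteLine("{0}, {1}", ps.First, ps.Second);       //⇒ b, a
      List<double> list = new List<double> {1.5, 2.5};  //←library generic.
      WriteLine(list.Count);                            //⇒ 2
      }
   }
```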

Using Types in Code

In C# code, type names appear in several places, but are most common in definitions (de­cla­ra­tive statements). But there are some operators which require at least one type as operand. Types are not values. They are more analogous to plans or blueprints — a specification of how to con­struct some­thing, and what features it will have.

In this simple example program we use a number of C# type aliases to define, convert, pass and return values of these various types.

TypeUse.cs: Type Locations
/*!@file  TypeUse.cs
*  @brief Examples of Locations where Types are Used.
*/
using System;
using System.Linq;
using System.Collections.Generic;
using static System.Console;

public class AppTypeUse {

   public static void Main() {       //←`Main` returns `void` type.
      int i = 123;                   //←`i` has type `int`.
      double d = 123.456;            //←`d` has type `double`.
      i = (int)d;                    //←`d`'s value ‘cast to’ `int`.
      i = F(123L);                   //←store `int` returned by `F()`.
      }

   static int F (long parm) {        //←`F` returns `int` & takes one
                                     // `long` as parameter.
      return (int)(parm * 0.125);    //←`double` result cast to `int`,
      }                              // before returning the result.

   }//class

As you can see, almost every non-structural line contains a type — either explicitly named, or implied by the value of an expression. Types are absolutely pervasive and at the core of un­der­stand­ing languages like C#. An expression is never ‘just a value’; instead, it is always ‘a value with a type’.

Expressions

Just to remind you: an expression is any arbitrary value, literal, constant or a combination of them interspersed with operators. If operators are present, they are evaluated in precedence order, taking their associativity into consideration. Regardless of complexity, once evaluated, only one value remains: the result of the expression. And that resulting value has a type. We simply shorten this story to: ‘every expression has a type’.

Object Allocation

The new operator requires a type. It allocates space, and calls a constructor for that type to in­i­tia­lise the newly allocated memory.

Syntax:

   int i = new int();                //←legal for value types; yields 0.
   int j = 0;                        //←same effect as above.
   string s = "ABC";                 //←`new` is implicit.
   int[] a = {11, 22, 33};           //←`new` is implicit (shorthand).
   a = new int[]{44, 55, 66};        //←no shorthand possible here.
   s = new string('-', 10);          //←explicit `new`.

The new operator must be called for all reference types, whether it is explicit, via a shorthand syn­tax, or via some function that ‘creates’ objects (sometimes called an ‘object factory’).

Local Variable Definitions

To define a variable inside a function, you have to choose (a) a name (identifier) that is not a keyword, and (b) most importantly, a type for the variable. Unlike C/C++, C# does not allow the static modifier on local variables: every local variable loses its value when the function returns. A value that must persist between calls belongs in a static field of the class.

Syntax: Basic Variable Definitions

  • type ident;
    Define an uninitialised local variable called ident. It must be assigned a value before it is used.

  • type ident = expr;
    Define a variable ident, and initialise it with an expression, which must have the same type, or be implicitly convertible to type.

  • type ident1 = expr1, ident2 = expr2, …;
    Define multiple variables of the same type. Any may optionally be initialised.

Examples of local variable definitions and scope
public static void Main() {
   int i = 123;
   /* compound statement block:
   */ {
      int j = 456, k, l;
      WriteLine("i={0}, j={1}", i, j);
      k = 77;  l = 88;
      WriteLine("k={0}, l={1}", k, l);
      }
   }

A static field, if not explicitly initialised, will be given the default value for its type, which for numeric types is zero. Local variables receive no default value and must be assigned before use.

In C#, variables defined at a lower (nested) level are not allowed to hide variables with the same names defined in a higher local scope. This differs from the C/C++ rules in this regard.

Fields/Data Member Definitions

The syntax for data members (fields), including readonly or const members, requires a type and looks similar to local variable definitions. Because members appear at class level, their de­fi­ni­tions may be prefixed with an access specifier. As a good coding convention, you should gen­er­al­ly use private for fields (i.e. only code in the class has access permission).

Syntax: Fields

  • access [static] [readonly] type ident;
    Data members that are not readonly or const, are generally given private access. A static field is shared by all objects (only one copy ever exists), and is not that common. A readonly field can only be written to (initialised) by a constructor.

  • access [static] [readonly] type ident = expr;
    Same rules as above apply, except for the ‘…= expr’ part, which is called a ‘field initialiser’.

  • access const type ident = expr;
    Define a symbolic constant. The ‘…= expr’ part is not optional, and must be a constant expression. It is accessed like a static field: ‘class.ident’.

The evaluation of a field initialiser expression is triggered by a constructor call (with new), and is executed before the body of the relevant constructor.

Class data members/field definition examples
class Foo {
   public const int CF = 12;             //← ‘symbolic constant’.
   public static int SF = 34;            //← ‘shared’ field; initialised.
   public readonly int RF = 56;          //← read-only instance field.
   private int IFI = 78;                 //← ‘field initialisation’.
   private int IFU;                      //← most common field syntax.

   public Foo () {                       //← constructor to initialise.
      RF = 55;                           //← only constructors can write
                                         //  to `readonly` fields.      
      IFU = 90;                          //← initialise instance field. 
      }

   }//class
// in some function:
   var obj = new Foo();                  //← allocate and initialise.
   Write("{0}\n", Foo.CF);               //← access symbolic `const`ant.
   Write("{0}\n", Foo.SF);               //← access `static` field.
   Write("{0}\n", obj.RF);               //← access `readonly` field.
   Write("{0}\n", obj.IFI);              //← ERROR. no access. ☆
   Write("{0}\n", obj.IFU);              //← ERROR. no access. ☆

If the last two statements appeared in a function inside the class, they would not have been in error. On the other hand, if the members had public access, the code, as is, would compile.

Properties and Indexers

Properties appear to users of the class as if they are variables, which may or may not be write­able. Indexers are specialised properties with no name, but which allow programmers to use sub­script­ing on ob­jects. Properties or indexers may be static (shared by all objects). In the syntax below, prop is really just an identifier.

Syntax: Properties & Indexers

  • access [static] type prop {
       get { return expr; }     // retrieves value.
       set { … }                // optional; gets value as automatic parameter.
    }
    This is the most common type of property. It acts like a member variable: ‘obj.prop = expr;’, as example. Or if static: ‘class.prop = expr’.

  • access [static] type prop {
       get;                     // automatically retrieves backing variable.
       set;                     // optional; automatically sets backing variable.
    }
    This is an automatic property: the compiler automatically creates a backing variable of type, and automatically creates code for set and get to access this variable.

  • access [static] type this[params] {
       get { return expr; }     // gets params as argument(s).
       set { … }                // gets ‘type value’ and params as parameters.
    }
    This is syntax for an indexer and allows you to subscript objects of this class. The param can be any type (normally some key or index). You may have more than one parameter, se­pa­rat­ed by commas.

In all cases, the get and set parts have the same access, but can be explicitly specified if a different access is required. It is common to give the set part private permissions when the access of the property itself is public (which means get remains public).

The contextual keyword: value is a parameter with the type of the property, which is au­to­ma­ti­cal­ly created and passed to the set parts.
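The three forms can be seen together in the sketch below. The Scores class and its members are hypothetical; Owner is an automatic property with a private set part, Count is a read-only property, and the indexer forwards to a private array.

```csharp
using System;
using static System.Console;

// Hypothetical class, for illustration only.
public class Scores {
   private int[] data_ = new int[3];

   public string Owner { get; private set; }     //←automatic property.

   public int Count {                            //←read-only property.
      get { return data_.Length; }
      }
   public int this[int index] {                  //←indexer.
      get { return data_[index]; }
      set { data_[index] = value; }              //←`value` is implicit.
      }
   public Scores(string owner) { Owner = owner; }
   }

public static class PropertyDemo {
   public static void Main() {
      Scores s = new Scores("Ann");
      s[0] = 70;                                 //←calls indexer `set`.
      WriteLine("{0}: {1} of {2}", s.Owner, s[0], s.Count);
      }
   }
```

Running it prints ‘Ann: 70 of 3’. An attempt to assign to s.Owner from Main would not compile, because the set part is private.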

Function Parameters

When we design a function to accept expressions as arguments, which we ‘pass’ to the function with the function call operator, we must define the function's parameters. They are, in every res­pect, variables local to that function, with the same lifetime and the same scope. The only dif­fe­ren­ce is the context and use — parameters are initialised by the caller.

In the example syntax below, the focus is on the parameters part. Parameters cannot be defined in isolation, however, and are always part of the syntax for a function definition:

access modifiers ret-type func-name (parameters) {}

A pass-by-reference parameter allows a function to modify the content of a variable passed. Nor­mal­ly, this is not possible, since the default behaviour is pass-by-value, which means a copy of the variable content is passed. The ref or out must appear before the argument when the func­tion is called.
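The difference between the default pass-by-value behaviour and the ref and out modifiers can be sketched like this. All function names here are invented for the example.

```csharp
using System;
using static System.Console;

public static class ParamDemo {
   public static void ByValue(int n) { n = n + 1; }      //←changes a copy.
   public static void ByRef(ref int n) { n = n + 1; }    //←changes caller's.
   public static void Split(double x, out int whole, out double frac) {
      whole = (int)x;                //←`out` parameters must be assigned
      frac = x - whole;              // before the function returns.
      }
   public static void Main() {
      int a = 10;
      ByValue(a);     WriteLine(a);          //⇒ 10
      ByRef(ref a);   WriteLine(a);          //⇒ 11
      int w; double f;                       //←need no initial value.
      Split(2.5, out w, out f);
      WriteLine("{0} and {1}", w, f);
      }
   }
```

A variable passed with ref must already hold a value, while one passed with out may be uninitialised, since the function is obliged to assign it.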

Function Returns

Functions must have a return type as part of the syntax, except for a handful of specialised func­tions (constructors, destructors and overloaded cast operators). A special abstract type called void is available if a function has nothing useful to return. In the example syntax below, func is an identifier that names the function.

Type Conversion / Type Cast

A common synonym for type conversion is ‘cast’ or ‘type cast’. That explains the name of the cast operator; it is a unary prefix operator, and one of the few that requires a type operand.

Implicit & Explicit Conversions

Type conversions can be implicit (automatic), or explicit (use of cast operator). Implicit casts are only performed when it is safe, like converting a smaller numerical type to a larger one, e.g.: converting an int to a long, or a byte to a double. This is sometimes called a promotion (going from a smaller to a larger type).

The reverse (called demotion) is allowed on all numeric types, but the result may not be the original value (as it may not fit in the smaller destination). This effect is called ‘truncation’ (which means ‟to shorten by cutting off”), and does not involve much intelligence. Converting a floating point value explicitly to an integral type, will simply truncate the decimal part (no round­ing takes place, in other words). Use Math.Ceiling, Math.Floor, or Math.Round, for alternative behaviours.
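A short sketch of promotion, truncation and the Math alternatives follows. The helper names Truncate and LowByte are made up for the illustration, and unchecked is written out so that the overflow behaviour does not depend on compiler settings.

```csharp
using System;
using static System.Console;

public static class TruncationDemo {
   public static int Truncate(double d) { return (int)d; }
   public static byte LowByte(long v) { return unchecked((byte)v); }

   public static void Main() {
      int i = 123;
      long l = i;                            //←promotion: implicit.
      WriteLine(l);                          //⇒ 123
      WriteLine(Truncate(2.75));             //⇒ 2   (fraction cut off)
      WriteLine(Truncate(-2.75));            //⇒ -2  (towards zero)
      WriteLine(Math.Floor(-2.75));          //⇒ -3
      WriteLine(Math.Ceiling(2.75));         //⇒ 3
      WriteLine(Math.Round(2.75));           //⇒ 3
      WriteLine(LowByte(300));               //⇒ 44  (300 − 256)
      }
   }
```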

Up Casts & Down Casts

For reference types, converting from a derived type to a base type is called an ‘up cast’ or ‘widening conversion’, and is implicit. For value types, this is true only when casting to object, in which case it is called ‘boxing’ and discussed below.

Converting a base class reference down the inheritance hierarchy (towards derived classes) is sometimes called a ‘down cast’ or ‘narrowing conversion’. This is not guaranteed to succeed, and must hence be performed explicitly. If it fails, the cast operator will throw an exception (InvalidCastException). To avoid an exception, you can use the as operator: it will simply return null on failure instead. The is operator can be used to check whether a cast will succeed, before attempting a down cast.
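The sketch below uses a small hypothetical hierarchy (Animal, Dog, Cat) to show an implicit up cast, an explicit down cast, and the is and as operators.

```csharp
using System;
using static System.Console;

public class Animal { }                      //←hypothetical base class.
public class Dog : Animal { }                //←hypothetical derived classes.
public class Cat : Animal { }

public static class CastDemo {
   public static bool IsDog(Animal a) { return a is Dog; }
   public static Dog AsDog(Animal a) { return a as Dog; }  //←`null` on failure.

   public static void Main() {
      Animal a = new Dog();                  //←up cast: implicit.
      Dog d = (Dog)a;                        //←down cast: explicit; succeeds.
      WriteLine(IsDog(a));                   //⇒ True
      WriteLine(AsDog(new Cat()) == null);   //⇒ True
      try {
         Cat c = (Cat)a;                     //←fails: `a` refers to a `Dog`.
         }
      catch (InvalidCastException) {
         WriteLine("invalid cast");
         }
      }
   }
```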

Custom Conversion Operators

Classes may overload operators, including cast operators. They have a slightly different syntax compared to other overloadable operators. The implicit and explicit keywords control whe­ther the conversion can be automatically applied by the compiler, or whether the conversion can only be performed explicitly by employing the cast operator.

Custom conversion operators example
class CTYPE {
   private double data_;
   public CTYPE (double data = 0.0) { data_ = data; }

   public static implicit operator double (CTYPE parm) {
      return parm.data_;
      }
   public static explicit operator CTYPE (double parm) {
      return new CTYPE(parm);        //←wrap the `double` in a new object.
      }
   }

   CTYPE obj = new CTYPE(12.34);     //←nothing exciting.
   double d = obj;                   //←`CTYPE` implicitly cast.
   obj = (CTYPE)d;                   //←`double` explicitly cast.

The example is trivial, but does illustrate the syntax. This should not be used too often, but programmers should be aware of the possibilities. If in doubt, rather use explicit.

Predefined Types

Unfortunately, swathes of documentation use either the term ‘intrinsic types’, or ‘built-in types’. Both terms are misleading. C# has no built-in types. All types are provided by the .NET Frame­work, where fundamental types and their behaviour are from the CTS (Common Type System). Only in un-managed (native code) sections, or if the whole C# program has been compiled with the ‘/unsafe’ option, do these terms have a semblance of credence, because then C# has to in­ter­face with native C/C++ built-in types.

Even ‘predefined’ may be misconstrued — it might be better to consider the names that follow as aliases, or synonyms, for existing .NET types. Ultimately, they are just optional and succinct alternatives, with no additional meaning or behaviour with respect to the .NET types for which they are shorthand.

Miscellaneous Types

The following aliases have little in common with other types, which is why they are grouped here.

Miscellaneous Predefined C# Types
C# Name   .NET Type        Description / Range
object    System.Object    All types inherit from it at some point.
bool      System.Boolean   true or false.

Although not types in the same sense we describe the others, you should be aware of two type patterns in addition to the above:

bool

Values of type bool can only store true or false. These two names are C# built-in symbolic constants, although some documentation will group them with literals. Unlike C/C++, C# provides no cast between bool and the numeric types. Use Convert.ToInt32, which gives 0 for false and 1 for true, or a conditional expression such as b ? 1 : 0.

The logical && (and), || (or) and ! (not) operators return true or false, and expect bool op­e­rands.

The comparison operators, on the other hand, accept any supported types for their operands, but always return bool, which is only reasonable. C#'s iteration and selection statements (barring switch and foreach), expect a bool result for the conditional expression.

Expressions resulting in bool, or requiring bool
   int i = 0;
   if (i == 0) {                     //←must be `true`.
      while (i < 10) {               //←loop while `true`.
         Write("{0} ", i);           //←do some ‘work’.
         ++i;                        //←change condition.
         }
      WriteLine();
      }
   for (int j = 0; j < 10; ++j)      //←more succinct.
      Write("{0} ", j);              //←do some ‘work’.
   WriteLine();

   bool b = i == 10 && true;         //←save result of `i == 10 && true`.
   Write("b = {0}, {1}", b, Convert.ToInt32(b)); //←`bool` and `int` values.

   for (i = 0; i < 10; ++i)          //←only print odd numbers.
      if (IsOdd(i))                  //←`IsOdd()` returns `bool`.
         Write("{0} ", i);
   WriteLine();

   public static bool IsOdd (int x) {
      return x % 2 != 0;             //←`bool` result; also works for x < 0.
      }

object

All members of object are inherited by all other types. This explains, for example, why expressions of any type have a ToString method. Types generally override it, since it is virtual. The object type is a reference type, and from the rules of object-oriented programming, you can assign a reference of any type to a variable, parameter, or return value of type object.

Boxing & Unboxing

Syntactically, and to conform to OOP ideals, you can copy a value type (val-type) into a location expecting an object. Since a value type variable holds the value itself, not a reference, the compiler must allocate a new object, copy the value into it, and store a reference to that object. This process is called boxing.

To retrieve the boxed value type, it must be unboxed. Syntax-wise, this means you must use the cast operator on the object expression or variable.

Boxing and unboxing examples
   object o;
   int i = 123;                          //← arbitrary value type (`int`).☆
   o = i;                                //← legal, but special = ‘boxing’.
   i = (int)o;                           //← cast to `int` = ‘unboxing’.
   object[] oa = new object[3];          //← an array of `object`s.
   oa[0] = 123;                          //← boxing. store ref.
   oa[1] = "ABC";                        //← store new `string` ref.
   oa[2] = null;                         //← `null` can go in any ref.
   WriteLine("oa[0] = {0}", oa[0]);      //← unboxing not necessary.
   WriteLine("oa[1] = {0}", oa[1]);      //← no unboxing.

Obviously, instead of int, any value type could have been used.
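The following sketch illustrates two consequences of boxing that are easy to overlook: the box holds a copy of the value, and unboxing to a different type fails at run-time. BoxingDemo and TryUnboxInt are hypothetical names used only here.

```csharp
using System;
using static System.Console;

public static class BoxingDemo {
   // Returns true when `o` holds a boxed `int`, and copies it to `value`.
   public static bool TryUnboxInt (object o, out int value) {
      if (o is int) { value = (int)o; return true; }
      value = 0;
      return false;
   }

   public static void Main () {
      int i = 123;
      object o = i;                          //←boxing: the value is copied.
      i = 456;                               //←does not affect the boxed copy.
      WriteLine(o);                          //⇒ `123`

      int n;
      WriteLine(TryUnboxInt(o, out n));      //⇒ `True`
      WriteLine(TryUnboxInt("ABC", out n));  //⇒ `False`

      try {
         long bad = (long)o;                 //←boxed `int` is not a `long`.
         WriteLine(bad);
      }
      catch (InvalidCastException) {
         WriteLine("cannot unbox an int as a long");
      }
   }
}
```

Testing with the is operator before the cast, as TryUnboxInt does, avoids the exception when the type of the boxed value is not known in advance.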

Character and String Types

Characters in the .NET Framework (and the Windows operating system) are 2 bytes in size, and use UTF-16 encoding. A char converts implicitly to ushort, int and the larger numeric types, but converting a numeric value to char requires an explicit cast. The elements in a string are of type char.

Character and String Types
C# Name   .NET Type       Description / Range
char      System.Char     Any character.
string    System.String   Reference type. Immutable.
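A short sketch of moving between char and numeric values, and of indexing a string to obtain a char. CharDemo and Shift are hypothetical names; the character codes shown are the standard Unicode values for the letters used.

```csharp
using System;
using static System.Console;

public static class CharDemo {
   // Shifts a character by `offset` code units; the `int` result of the
   // addition must be cast back to `char`.
   public static char Shift (char c, int offset) => (char)(c + offset);

   public static void Main () {
      char c = 'A';
      int code = c;                 //←`char` to `int`.
      WriteLine(code);              //⇒ `65`
      WriteLine(Shift(c, 2));       //⇒ `C`
      WriteLine((char)97);          //⇒ `a`
      WriteLine("ABC"[1]);          //⇒ `B`
   }
}
```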

Strings are immutable. Any string transformation consequently involves a copy of the original with the modifications applied. This makes them safe to pass to functions, even though you are passing a reference to the actual string data.
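The effect of immutability can be sketched as follows: neither a function call nor an append through a second variable changes the string that the first variable references. ImmutableDemo and Shout are hypothetical names.

```csharp
using System;
using static System.Console;

public static class ImmutableDemo {
   // Hypothetical helper: returns a modified copy; the argument is untouched.
   public static string Shout (string s) => s.ToUpper() + "!";

   public static void Main () {
      string original = "abc";
      string alias = original;      //←both reference the same data.
      string loud = Shout(original);
      WriteLine(original);          //⇒ `abc`
      WriteLine(loud);              //⇒ `ABC!`
      alias += "def";               //←creates a new string for `alias`.
      WriteLine(original);          //⇒ `abc`
      WriteLine(alias);             //⇒ `abcdef`
   }
}
```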

TIP: Efficient String Concatenation

If you encounter a situation where long strings are created regularly by means of appending, consider using the System.Text.StringBuilder class. It is very efficient at concatenating strings, and when done, you can simply use ToString to retrieve a ‘proper’ string for further use.

Building strings in lieu of concatenation
   string s1 = "";                   //←initialise with ‘empty string’.
   string s2 = new string('-', 10);  //←initialise with 10 dashes.
   s1 = s2;                          //←both now reference 10 dashes.
   s2 = s2.ToUpper() + "DEF";        //←new upper case & concat.
   char c = s2[1];                   //←subscript returns `char`.
   s2 += "XYZ";                      //←append (inefficiently).
   var sb                            //←workhorse to the rescue.
      = new System.Text.StringBuilder();
   sb.Append("ABC");                 //←append (efficiently).
   sb.AppendFormat("-{0}-", 123);    //←format & append. 
   s1 = sb.ToString();               //←newly concatenated str.

Strings and characters are easy to work with. The main caveat is performance when you append to long strings many times, since every append creates a new copy of the string.

Character Literals

The sequence: 'char' (a character between single quotes), has type char (System.Char), and is a ‘character literal’. The char can be a character on your keyboard, any other Unicode character that fits in a single UTF-16 code unit, or an escape sequence.

String Literals

The sequence: "‹chars›" is a string literal and has type string (System.String). It is possible to have no chars: "", making it an empty string. Like character literals, a literal string can contain escape sequences. A string literal can be prefixed with the ‘at’ character (@), also called the ‘verbatim character’, in which case this string is called a ‘verbatim string’, where escape sequences have no meaning (interpreted literally). The only special case inside a verbatim string is the double quote, which is written as two double quotes.
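A brief sketch contrasting regular and verbatim string literals. VerbatimDemo and its constants are hypothetical names, and the file path is an arbitrary example value.

```csharp
using System;
using static System.Console;

public static class VerbatimDemo {
   public const string Regular  = "C:\\Temp\\new.txt";  //←backslashes escaped.
   public const string Verbatim = @"C:\Temp\new.txt";   //←backslashes literal.

   public static void Main () {
      WriteLine(Regular == Verbatim);        //⇒ `True`
      WriteLine(@"He said ""Hi"".");         //⇒ `He said "Hi".`
      WriteLine("".Length);                  //⇒ `0`
      WriteLine(@"\n".Length);               //⇒ `2`
   }
}
```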

@ Prefix: Further Uses

The @ prefix not only works for ‘verbatim strings’, but also in front of any identifier. This means that any keyword can be used as an identifier, as long as it is prefixed with the @ sign. This is rarely good practice; it is mainly useful when you must refer to a name, defined in another language, that happens to be a C# keyword.
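For illustration only, this sketch uses two keywords as identifiers. AtPrefixDemo and Label are hypothetical names.

```csharp
using System;
using static System.Console;

public static class AtPrefixDemo {
   // `class` is a keyword; `@class` is an ordinary identifier named `class`.
   public static string Label (string @class) => "class=" + @class;

   public static void Main () {
      int @int = 42;                 //←legal, but hard to read.
      WriteLine(@int);               //⇒ `42`
      WriteLine(Label("header"));    //⇒ `class=header`
   }
}
```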

Since C# 6.0, the syntax supports string interpolation. This is triggered when a literal string is prefixed with the dollar sign ($). If the string then contains expressions enclosed in curly braces, the expressions are evaluated and converted to their string representation. This string representation replaces the whole curly-brace placeholder sequence inside the literal string.

String interpolation examples
using static System.Console;

int I = 123;
string S = "ABC";
string X = $"I={I}, S=\"{S}\".";
WriteLine(X);                        //⇒ `I=123, S="ABC".`
WriteLine($"I={I}, S=\"{S}\".");     //⇒ `I=123, S="ABC".`
X = String.Format(                   //←longer alternative.
   "I={0}, S=\"{1}\".", I, S);
WriteLine(X);                        //⇒ `I=123, S="ABC".`
WriteLine(                           //←`WriteLine` can format as well.
   "I={0}, S=\"{1}\".", I, S);       //⇒ `I=123, S="ABC".`

As you can see from the above examples, string interpolation can greatly simplify string formatting. We recommend you use it, as long as you remember that it requires C# 6 or later (Visual Studio 2015 onwards, in other words).

String Interpolation

From C# 6, literal strings can optionally be prefixed with a dollar sign ($). This enables string interpolation, which is a convenient alternative to String.Format. You are encouraged to use this syntax, since it is more concise, more comprehensible, and less error-prone.

Like String.Format, the syntax utilises matching curly braces to delimit a placeholder. But rather than specifying the index of the argument to format into the string, string interpolation allows any valid C# expression between the braces. Contrast the following equivalent initialisations:

   string s = $"PI * 2 = {Math.PI * 2.0}!";
   string t = String.Format("PI * 2 = {0}!", Math.PI * 2.0);

It is difficult to argue that the string interpolation version is not more readable, more concise, and less prone to errors.

The expression in the placeholder can be formatted with the same formatting that String.Format uses:

   string s = $"PI * 2 = {Math.PI * 2.0:F4}!";
   string t = String.Format("PI * 2 = {0:F4}!", Math.PI * 2.0);

It is a win-win: Easy syntax, leveraging existing knowledge, providing not only a better experience, but also better programs.

Escape Sequences

An escape sequence can be used as a character in literal characters, or literal strings. An escape sequence starts with a backslash (\), making the backslash an escape character in this context. To represent a backslash as an actual character, it must be prefixed with another backslash: '\\' (literal character), or inside a string literal: "⋯\\⋯".

To represent a single quote as a literal character, it must be escaped: '\'', and to represent a double quote character inside a literal string, it must also be escaped: "⋯\"⋯".

The special sequence: \0 represents the null character. Other special sequences are: \n (newline), \r (carriage return), \a (bell), \b (backspace), \f (form feed), \t (tab) and \v (vertical tab).

The numerical (hexadecimal) code for a character can be given with the sequence: \x, followed by 1 to 4 hex digits (upper or lower case). Because the number of digits is variable, a hex digit that happens to follow the sequence is easily absorbed into it by mistake. We suggest you rather use the other options to represent Unicode values: \u followed by exactly four hex digits (\u00A0), or \U followed by exactly eight hex digits (\U000000A0).
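The escape sequences described in this section can be tried out with a sketch like the following. EscapeDemo and Quoted are hypothetical names.

```csharp
using System;
using static System.Console;

public static class EscapeDemo {
   public const string Quoted = "She said \"Hi\".";   //←escaped double quotes.

   public static void Main () {
      WriteLine(Quoted);                 //⇒ `She said "Hi".`
      WriteLine("A\tB");                 //←tab between `A` and `B`.
      WriteLine('\'');                   //⇒ `'`
      WriteLine("\\".Length);            //⇒ `1`
      WriteLine((int)'\0');              //⇒ `0`
      WriteLine('\u0041');               //⇒ `A`
      WriteLine("\x41" == "\u0041");     //⇒ `True`
   }
}
```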

Character Sizes and Encoding

The .NET Framework uses UTF-16 encoding for characters and strings. This means that every char occupies 2 bytes in memory; characters outside the Basic Multilingual Plane (many emoji, for example) are stored as a pair of char values, called a surrogate pair. This is seldom an issue, just something to be aware of. The .NET Framework has many options for converting to and from other encodings (from the System.Text namespace).

We suggest for text storage and output, you convert to/from UTF-8. This is the default encoding for HTML and XML, for example. Writing to the console is automatically converted to your Console encoding, so you do not have to worry about that. The default text readers will also automatically convert from the input encoding to the internal UTF-16 encoding.
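A minimal sketch of converting between the internal UTF-16 representation and UTF-8 bytes, using the Encoding class from System.Text. EncodingDemo and Utf8Size are hypothetical names, and the sample text is arbitrary.

```csharp
using System;
using System.Text;
using static System.Console;

public static class EncodingDemo {
   // Number of bytes `s` occupies when encoded as UTF-8.
   public static int Utf8Size (string s) => Encoding.UTF8.GetByteCount(s);

   public static void Main () {
      string s = "caf\u00E9";                       //←`café`: 4 `char` values.
      WriteLine(s.Length);                          //⇒ `4`
      WriteLine(Utf8Size(s));                       //⇒ `5` (`é` needs 2 bytes)
      byte[] bytes = Encoding.UTF8.GetBytes(s);     //←UTF-16 to UTF-8.
      string back  = Encoding.UTF8.GetString(bytes);//←UTF-8 to UTF-16.
      WriteLine(back == s);                         //⇒ `True`
      WriteLine(Encoding.Unicode.GetByteCount(s));  //⇒ `8` (UTF-16)
   }
}
```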

Integral / Integer Types

These are also called ‘integer’ types, but many take this to mean the int type, so we prefer the term ‘integral’ unless the context is clear. (Nevertheless, we will never use ‘integer’ as a synonym for int in any material we provide.)

Integral/Integer Predefined C# Types
C# Name   .NET Type       Description / Range
sbyte     System.SByte    -2⁷⋯+2⁷-1.
byte      System.Byte     0⋯+2⁸-1.
short     System.Int16    -2¹⁵⋯+2¹⁵-1.
ushort    System.UInt16   0⋯+2¹⁶-1.
int       System.Int32    -2³¹⋯+2³¹-1. Default integer type.
uint      System.UInt32   0⋯+2³²-1.
long      System.Int64    -2⁶³⋯+2⁶³-1.
ulong     System.UInt64   0⋯+2⁶⁴-1.

All arithmetic operators are available for integral types. Operands smaller than int (sbyte, byte, short and ushort) are first promoted to int; when you then mix the types of the operands, the smaller type is converted to the largest type in the expression.

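The bitwise operators: & (and), | (or), ^ (xor), ~ (not), << (left shift) and >> (right shift) work with these integral types (&, | and ^ are also defined for bool, as logical operators). The modulus or ‘remainder’ operator (%) is most often used with integral operands, although C# also defines it for the floating point types and decimal.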
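The following sketch exercises arithmetic, bitwise and remainder operators on integral operands, and prints the type the compiler chose for some results. IntegralDemo and IsBitSet are hypothetical names.

```csharp
using System;
using static System.Console;

public static class IntegralDemo {
   // True when bit number `bit` (0 = least significant) is set in `value`.
   public static bool IsBitSet (int value, int bit) =>
      (value & (1 << bit)) != 0;

   public static void Main () {
      byte a = 200, b = 100;
      var sum = a + b;                    //←result type chosen by compiler.
      WriteLine(sum.GetType().Name);      //⇒ `Int32`
      WriteLine(sum);                     //⇒ `300`
      var big = 1L + 2;                   //←`int` operand becomes `long`.
      WriteLine(big.GetType().Name);      //⇒ `Int64`
      WriteLine(0x0F & 0x3C);             //⇒ `12`
      WriteLine(0x0F | 0x30);             //⇒ `63`
      WriteLine(1 << 4);                  //⇒ `16`
      WriteLine(IsBitSet(5, 2));          //⇒ `True`
      WriteLine(-7 % 3);                  //⇒ `-1`
   }
}
```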

Integer Literals

Integer literals are constant values assumed by default to be in base 10 (decimal), and have type int (System.Int32) when the value fits in an int. Suffixes can change the type, and it is possible to use notations for number bases other than 10: hexadecimal (base 16) and, from C# 7, binary (base 2). C# has no notation for octal (base 8).

Integer Literal Suffixes

Although the literal: 123 has by default type int, by adding the L suffix: 123L, it now has type long (System.Int64). You can use the cast operator as well: (long)123, but that is an expression containing an operator and an int literal. Also: 123U has type uint (System.UInt32), and 123UL, has type ulong (System.UInt64).
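To check the type of a literal, a small generic helper can report the type the compiler inferred. This is only a sketch; SuffixDemo and TypeOf are hypothetical names.

```csharp
using System;
using static System.Console;

public static class SuffixDemo {
   // Name of the .NET type the compiler inferred for the argument.
   public static string TypeOf<T> (T value) => typeof(T).Name;

   public static void Main () {
      WriteLine(TypeOf(123));        //⇒ `Int32`
      WriteLine(TypeOf(123L));       //⇒ `Int64`
      WriteLine(TypeOf(123U));       //⇒ `UInt32`
      WriteLine(TypeOf(123UL));      //⇒ `UInt64`
      WriteLine(TypeOf((long)123));  //⇒ `Int64`
   }
}
```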

Integer Literal Base Notation

To change the notation to hexadecimal, prefix a sequence of hex digits in the range 0…9,A…F with 0x (the x can be upper case): 0x123 is hexadecimal value 123, or 291 in decimal. You can use lower case hex digits a…f as well.

The base notation does not change the type, so the value 0x123 still has type int, while 0x123UL has type ulong, as an example. Unlike C, C++ and Java, C# has no octal notation: a leading 0 is simply ignored, so 0123 is the decimal value 123.

C# 7: Numeric Literal Enhancements

From C# 7, you can prefix 0b for base 2 (binary) literals: 0b01111011 is 123 in decimal. Also from C# 7, underscores can group digits in numeric literals: 0b0111_1011, or 1_234_567 are both legal literals (of type int).
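A short sketch of the hexadecimal and binary notations and of digit separators; it needs a C# 7 (or later) compiler. BaseDemo and its constants are hypothetical names. Convert.ToString is used to print a value in another base.

```csharp
using System;
using static System.Console;

public static class BaseDemo {
   public const int Hex = 0x123;         //←hexadecimal notation.
   public const int Bin = 0b0111_1011;   //←binary with digit separator.

   public static void Main () {
      WriteLine(Hex);                       //⇒ `291`
      WriteLine(Bin);                       //⇒ `123`
      WriteLine(0X7b == Bin);               //⇒ `True`
      WriteLine(1_234_567);                 //⇒ `1234567`
      WriteLine(0x123UL.GetType().Name);    //⇒ `UInt64`
      WriteLine(Convert.ToString(123, 2));  //⇒ `1111011`
      WriteLine(Convert.ToString(123, 16)); //⇒ `7b`
   }
}
```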

Real / Floating Point Types

Real/Floating Point Predefined C# Types
C# Name   .NET Type        Description / Range
float     System.Single    32-bits.
double    System.Double    64-bits. Default floating point type.
decimal   System.Decimal   Currency. Less range, but higher precision.

Floating Point Literals

A numeric literal which contains a decimal point or an exponent is called a ‘floating point literal’. Instead of the term ‘floating point’, some documentation may use the term ‘real’. The default type of a floating point literal, e.g. 123.456, is double (System.Double).

Floating Point Literal Suffixes

Floating point literals can be written in fixed-point notation, or exponential notation (sometimes called ‘scientific notation’): the value 123.456 is equivalent to 1.23456e2, and both have type double. You can legally also suffix a D (for ‘double’), but this is superfluous.

The F suffix will cause the type to be float (System.Single). You should use float only when forced to, for example by an API that requires it, since it offers far less precision than double. Please avoid it as much as you can.

The M suffix (think: Money) will change the literal's type to decimal (System.Decimal): 12.3456M. This is the appropriate type to use when you are working with currency values.

As noted before, from C# 7, you can use underscores to group digits. This also applies to floating point literals: 1_234.567_890.
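The literal forms above can be compared with a sketch like this one, which also hints at why decimal suits currency values. FloatLiteralDemo and TypeOf are hypothetical names; the digit separators need C# 7 or later.

```csharp
using System;
using static System.Console;

public static class FloatLiteralDemo {
   // Name of the .NET type the compiler inferred for the argument.
   public static string TypeOf<T> (T value) => typeof(T).Name;

   public static void Main () {
      WriteLine(TypeOf(123.456));              //⇒ `Double`
      WriteLine(TypeOf(1.23456e2));            //⇒ `Double`
      WriteLine(TypeOf(123.456F));             //⇒ `Single`
      WriteLine(TypeOf(12.3456M));             //⇒ `Decimal`
      WriteLine(0.1 + 0.2 == 0.3);             //⇒ `False` (binary rounding)
      WriteLine(0.1M + 0.2M == 0.3M);          //⇒ `True`
      WriteLine(1_234.567_890 == 1234.56789);  //⇒ `True`
   }
}
```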

Introspection / Reflection

If you were to inspect the methods of object, you would notice there is an instance method called GetType, which returns a value of type Type (System.Type). If you were to look at the properties and methods of Type, you may be surprised at the number of them.

Rummaging around, you may come across the related TypeInfo class, and notice that there is a relationship between many classes leading to the Reflection namespace. The term ‘reflection’ occurs in programming languages where it is possible to internally inspect any object (remember GetType is available on all types of objects).

This topic is generally quite advanced, so we present it here more as a concept — all roads lead to Type. You can easily experiment with Type. For example, it has a Name property. So, you can GetType any object, save the return value in a variable of type Type, and print out the type's Name!

The is operator relies on the same run-time type information to determine whether it should return true or false.
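The experiment suggested above might look like this sketch. TypeDemo and NameOf are hypothetical names; GetType, Name, FullName and IsValueType are members of object and System.Type.

```csharp
using System;
using static System.Console;

public static class TypeDemo {
   // Name of the run-time type of `o` (hypothetical helper).
   public static string NameOf (object o) => o.GetType().Name;

   public static void Main () {
      object o = 123;
      Type t = o.GetType();           //←save the `Type` in a variable.
      WriteLine(t.Name);              //⇒ `Int32`
      WriteLine(t.FullName);          //⇒ `System.Int32`
      WriteLine(t.IsValueType);       //⇒ `True`
      WriteLine(t == typeof(int));    //⇒ `True`
      WriteLine(o is int);            //⇒ `True`
      WriteLine(o is string);         //⇒ `False`
      WriteLine(NameOf("ABC"));       //⇒ `String`
   }
}
```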

Summary

This is not a tutorial, but more of a reference with a few examples of where types are used. Although a new C# programmer might not initially be able to deal with all of these topics, they will eventually become required knowledge, as the sophistication of your code and your experience increase.

The key lesson to remember, especially in the beginning, is what we reiterated several times: every value has a type. Types govern space, characteristics and behaviour. Be aware of types at all times, and know what types you are working with (in the results of expressions, literals, etc.).


2021-11-29: Fix some Wikipedia links. [brx]
2021-03-31: Update API links to .NET 5.0; and to Wikipedia mobile. [brx]
2020-03-17: Syntax elements; new tables; type inference. [brx]
2019-10-15: Fixed typos in some comments. [brx]
2019-04-24: Fixed string literal error. [brx]
2018-06-14: Fixes & clarifications. [brx]
2018-06-12: Small additions (mainly string interpolation). [brx]
2017-12-01: Editing. [jjc]
2017-11-26: Additional topics. Editing. [brx;jjc]
2017-11-25: Created. [brx]