Overview of the Carbon language (Part 1)

Bayram EKER
Jul 27, 2022

Let’s start with a simple look at the code structure:

import Console;

// Prints the Fibonacci numbers less than `limit`.
fn Fibonacci(limit: i64) {
  var (a: i64, b: i64) = (0, 1);
  while (a < limit) {
    Console.Print(a, " ");
    let next: i64 = a + b;
    a = b;
    b = next;
  }
  Console.Print("\n");
}

Carbon is a language that should feel familiar to C++ and C developers.
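
For readers coming from C++, the Fibonacci function above maps almost line for line. The sketch below is only an illustrative C++ rendering, not something taken from the Carbon documentation. It collects the output in a std::string instead of printing it, and the name FibonacciText is made up for this example.

```cpp
#include <cstdint>
#include <string>

// Builds the Fibonacci numbers less than `limit`, each followed by a space,
// mirroring the loop in the Carbon example. FibonacciText is a hypothetical
// name chosen for this sketch.
std::string FibonacciText(std::int64_t limit) {
  std::string out;
  std::int64_t a = 0;
  std::int64_t b = 1;
  while (a < limit) {
    out += std::to_string(a);
    out += ' ';
    const std::int64_t next = a + b;
    a = b;
    b = next;
  }
  return out;
}
```

Calling FibonacciText(10) walks through 0, 1, 1, 2, 3, 5, 8 and stops once a reaches 13.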

A feature I really like:

All source code is UTF-8 encoded text. Comments, identifiers, and strings are allowed to have non-ASCII characters.

var résultat: String = "Succès";

A comment line:

// Compute an approximation of π

Primitive types

  • Bool - a boolean type with two possible values: True and False.
  • Int and UInt - signed and unsigned 64-bit integer types.
      • Standard sizes are available, both signed and unsigned, including i8, i16, i32, i64, i128, and i256.
      • Overflow in either direction is an error.
  • Float64 - a floating point type with semantics based on IEEE-754.
      • Standard sizes are available, including f16, f32, and f128.
      • BFloat16 is also provided.
  • String - a byte sequence treated as containing UTF-8 encoded text.
  • StringView - a read-only reference to a byte sequence treated as containing UTF-8 encoded text.
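
The "overflow is an error" rule is worth a second look if you come from C++, where signed overflow is undefined behavior and unsigned overflow wraps around silently. The list above does not say how Carbon reports the error, so the sketch below only shows the idea from the C++ side: a hand-written checked addition that refuses to produce a wrapped result. CheckedAdd is a hypothetical helper, not a Carbon or standard-library function.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Returns a + b, or std::nullopt when the sum would not fit in 64 bits.
// CheckedAdd is a hypothetical helper written for this sketch.
std::optional<std::int64_t> CheckedAdd(std::int64_t a, std::int64_t b) {
  constexpr std::int64_t kMax = std::numeric_limits<std::int64_t>::max();
  constexpr std::int64_t kMin = std::numeric_limits<std::int64_t>::min();
  if (b > 0 && a > kMax - b) return std::nullopt;  // would overflow upward
  if (b < 0 && a < kMin - b) return std::nullopt;  // would overflow downward
  return a + b;
}
```

The checks are done before the addition, so the function itself never performs an overflowing operation.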

Tuples

A tuple is a fixed-size collection of values that can have different types, where each value is identified by its position in the tuple. An example use of tuples is to return multiple values from a function:

fn DoubleBoth(x: i32, y: i32) -> (i32, i32) {
  return (2 * x, 2 * y);
}

  • The return type is a tuple of two i32 types.
  • The expression uses tuple syntax to build a tuple of two i32 values.
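
For comparison, here is roughly how the same function could look in C++ with std::tuple. This is a sketch for orientation only. SumOfDoubles is a hypothetical caller added to show how the two values are unpacked by position.

```cpp
#include <cstdint>
#include <tuple>

// Returns both inputs doubled, packed into a two-element tuple.
std::tuple<std::int32_t, std::int32_t> DoubleBoth(std::int32_t x,
                                                  std::int32_t y) {
  return std::make_tuple(2 * x, 2 * y);
}

// Hypothetical caller: unpacks the tuple by position with structured bindings.
std::int32_t SumOfDoubles(std::int32_t x, std::int32_t y) {
  const auto [dx, dy] = DoubleBoth(x, y);
  return dx + dy;
}
```

The Carbon version needs no library type for this: the parenthesized list is both the type syntax and the value syntax.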

Struct types

Carbon also has structural types whose members are identified by name instead of position. These are called structural data classes, also known as struct types or structs.

Both struct types and values are written inside curly braces ({...}). In both cases, they have a comma-separated list of members that start with a period (.) followed by the field name.

  • In a struct type, the field name is followed by a colon (:) and the type, as in: {.name: String, .count: i32}.
  • In a struct value, called a structural data class literal or a struct literal, the field name is followed by an equal sign (=) and the value, as in {.key = "Joe", .count = 3}.
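
A C++ reader may find it helpful to see the closest equivalent. In C++ the type has to be declared and named before it can be used, whereas the Carbon struct type above is written inline. The sketch below models {.name: String, .count: i32}. NameCount and MakeJoe are hypothetical names introduced only for this example.

```cpp
#include <cstdint>
#include <string>

// A named C++ aggregate playing the role of `{.name: String, .count: i32}`.
struct NameCount {
  std::string name;
  std::int32_t count;
};

// Builds a value; members are listed in declaration order.
NameCount MakeJoe() {
  return NameCount{"Joe", 3};
}
```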

Pointer types

The type of pointers-to-values-of-type-T is written T*. Carbon pointers do not support pointer arithmetic; the only pointer operations are:

  • Dereference: given a pointer p, *p gives the value p points to as an l-value. p->m is syntactic sugar for (*p).m.
  • Address-of: given an l-value x, &x returns a pointer to x.
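
Both operations behave the way a C++ developer would expect, so a small C++ sketch can serve as a reference point. Point, DoubleInPlace, and SumCoords are hypothetical names used only here.

```cpp
#include <cstdint>

struct Point {
  std::int64_t x;
  std::int64_t y;
};

// Dereference: *p names the object p points to, so it can be read and written.
std::int64_t DoubleInPlace(std::int64_t* p) {
  *p = *p * 2;
  return *p;
}

// Member access: p->x is shorthand for (*p).x.
std::int64_t SumCoords(const Point* p) {
  return p->x + (*p).y;
}
```

A caller would write DoubleInPlace(&n) for some variable n, taking the address explicitly. What the C++ version cannot show is the restriction: in C++, p + 1 would compile, while the text above says Carbon pointers do not support that arithmetic.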

Arrays and slices

The type of an array holding 4 i32 values is written [i32; 4]. There is an implicit conversion from tuples to arrays of the same length as long as every component of the tuple may be implicitly converted to the destination element type. In cases where the size of the array may be deduced, it may be omitted, as in:

var i: i32 = 1;
// `[i32;]` equivalent to `[i32; 3]` here.
var a: [i32;] = (i, i, i);
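
C++17 offers a similar convenience through class template argument deduction for std::array. The sketch below is a C++ analogue of the snippet above, not Carbon code, and Triple is a hypothetical function name.

```cpp
#include <array>
#include <cstdint>

// Builds a three-element array; the size is deduced from the initializer.
std::array<std::int32_t, 3> Triple(std::int32_t i) {
  std::array a{i, i, i};  // deduced as std::array<std::int32_t, 3>
  return a;
}
```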

Expressions

Expressions describe some computed value. The simplest example would be a literal number like 42: an expression that computes the integer value 42.

Expressions are the portions of Carbon syntax that produce values. Because types in Carbon are values, this includes anywhere that a type is specified.

fn Foo(a: i32*) -> i32 {
  return *a;
}

Here, the parameter type i32*, the return type i32, and the operand *a of the return statement are all expressions.

Declarations, Definitions, and Scopes

Name-binding declarations:

There are two kinds of name-binding declarations:

  • constant declarations, introduced with let, and
  • variable declarations, introduced with var.

There are no forward declarations of these; all name-binding declarations are definitions.

Constant let declarations:

A let declaration matches an irrefutable pattern to a value. In this example, the name x is bound to the value 42 with type i64:

let x: i64 = 42;

Here x: i64 is the pattern, which is followed by an equal sign (=) and the value to match, 42. The names from binding patterns are introduced into the enclosing scope.

Variable var declarations:

A var declaration is similar, except with var bindings, so x here is an l-value with storage and an address, and so may be modified:

var x: i64 = 42;
x = 7;

Variables with a type that has an unformed state do not need to be initialized in the variable declaration, but do need to be assigned before they are used.

auto

If auto is used as the type in a var or let declaration, the type is the static type of the initializer expression, which is required.

Functions

A basic function definition may look like:

fn Add(a: i64, b: i64) -> i64 {
  return a + b;
}

This declares a function called Add which accepts two i64 parameters, the first called a and the second called b, and returns an i64 result. It returns the result of adding the two arguments.

C++ might declare the same thing:

std::int64_t Add(std::int64_t a, std::int64_t b) {
  return a + b;
}
// Or with trailing return type syntax:
auto Add(std::int64_t a, std::int64_t b) -> std::int64_t {
  return a + b;
}

Return clause:

The return clause of a function specifies the return type using one of three possible syntaxes:

  • -> followed by an expression, such as i64, directly states the return type. This expression will be evaluated at compile-time, so must be valid in that context.
      • For example, fn ToString(val: i64) -> String; has a return type of String.
  • -> followed by the auto keyword indicates that type inference should be used to determine the return type.
      • For example, fn Echo(val: i64) -> auto { return val; } will have a return type of i64 through type inference.
      • Declarations must have a known return type, so auto is not valid.
      • The function must have precisely one return statement. That return statement's expression will then be used for type inference.
  • Omission indicates that the return type is the empty tuple, ().
      • For example, fn Sleep(seconds: i64); is similar to fn Sleep(seconds: i64) -> ();.
      • () is similar to a void return type in C++.

return statements:

The return statement is essential to function control flow. It ends the flow of the function and returns execution to the caller.

When the return clause is omitted, the return statement has no expression argument, and function control flow implicitly ends after the last statement in the function's body as if return; were present.

When the return clause is provided, including when it is -> (), the return statement must have an expression that is convertible to the return type, and a return statement must be used to end control flow of the function.

Function declarations:

Functions may be declared separate from the definition by providing only a signature, with no body. This provides an API which may be called. For example:

// Declaration:
fn Add(a: i64, b: i64) -> i64;
// Definition:
fn Add(a: i64, b: i64) -> i64 {
  return a + b;
}

The corresponding definition may be provided later in the same file or, when the declaration is in an api file of a library, in the impl file of the same library. The signature of a function declaration must match the corresponding definition. This includes the return clause; even though an omitted return type has equivalent behavior to -> (), the presence or omission must match.

Function calls:

Function calls use a function’s identifier to pass multiple expression arguments corresponding to the function signature’s parameters. For example:

fn Add(a: i64, b: i64) -> i64 {
  return a + b;
}
fn Run() {
  Add(1, 2);
}

Here, Add(1, 2) is a function call expression. Add refers to the function definition's identifier. The parenthesized arguments 1 and 2 are passed to the a and b parameters of Add.

Parameters

The bindings in the parameter list default to let bindings, and so the parameter names are treated as r-values. This is appropriate for input parameters. This binding will be implemented using a pointer, unless it is legal to copy and copying is cheaper.

If the var keyword is added before the binding, then the arguments will be copied (or moved from a temporary) to new storage, and so can be mutated in the function body. The copy ensures that any mutations will not be visible to the caller.

Use a pointer parameter type to represent an input/output parameter, allowing a function to modify a variable of the caller’s. This makes the possibility of those modifications visible: by taking the address using & in the caller, and dereferencing using * in the callee.

Outputs of a function should prefer to be returned. Multiple values may be returned using a tuple or struct type.
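
The three parameter styles described above have rough C++ counterparts, which may make the distinction easier to picture. The sketch below is an analogy under that assumption, not a statement about how Carbon implements parameters. All three function names are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Read-only input, comparable to Carbon's default `let`-style parameter.
std::size_t CountSpaces(const std::string& text) {
  std::size_t count = 0;
  for (char c : text) {
    if (c == ' ') ++count;
  }
  return count;
}

// By-value parameter, comparable to a `var` parameter: the function mutates
// its own copy, so the caller's string is unchanged.
std::string Shout(std::string text) {
  text += "!";
  return text;
}

// Pointer parameter for in/out use: the caller writes &value, which makes
// the possible modification visible at the call site.
void AppendBang(std::string* text) {
  *text += "!";
}
```

Only AppendBang(&s) changes the caller's variable. Shout(s) returns a modified copy and leaves s alone.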

auto return type:

If auto is used in place of the return type, the return type of the function is inferred from the function body. It is set to the common type of the static types of the arguments to the return statements in the function. This is not allowed in a forward declaration.

// Return type is inferred to be `bool`, the type of `a > 0`.
fn Positive(a: i64) -> auto {
  return a > 0;
}
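
C++14 and later support the same kind of return type deduction, so the Positive example translates directly. The sketch below is a C++ rendering for comparison. The static_assert is an addition of mine to confirm the deduced type at compile time.

```cpp
#include <cstdint>
#include <type_traits>

// Return type is deduced from the return statement; `a > 0` has type bool.
auto Positive(std::int64_t a) {
  return a > 0;
}

// Compile-time check that the deduced return type really is bool.
static_assert(std::is_same_v<decltype(Positive(std::int64_t{1})), bool>);
```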

Blocks and statements

A block is a sequence of statements. A block defines a scope and, like other scopes, is enclosed in curly braces ({...}). Each statement is terminated by a semicolon or block. Expressions and var and let are valid statements.

The body of a function is defined by a block, and some control-flow statements have their own blocks of code. These are nested within the enclosing scope. For example, here is a function definition with a block of statements defining the body of the function, and a nested block as part of a while statement:

fn Foo() {
  Bar();
  while (Baz()) {
    Quux();
  }
}

Assignment statements

Assignment statements mutate the value of the l-value described on the left-hand side of the assignment.

  • Assignment: x = y;. x is assigned the value of y.
  • Increment and decrement: ++i;, --j;. i is set to i + 1, j is set to j - 1.
  • Compound assignment: x += y;, x -= y;, x *= y;, x /= y;, x &= y;, x |= y;, x ^= y;, x <<= y;, x >>= y;. x @= y; is equivalent to x = x @ y; for each operator @.

Unlike C++, these assignments are statements, not expressions, and don’t return a value.
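
To see what is being ruled out, it helps to look at what C++ allows. The C++ sketch below chains two assignments, which works only because b = 5 is itself an expression that yields a value. According to the text above, the chained form would not be valid Carbon. ChainedAssign is a hypothetical name.

```cpp
#include <cstdint>

// In C++, assignment is an expression, so it can be chained or nested.
std::int64_t ChainedAssign() {
  std::int64_t a = 0;
  std::int64_t b = 0;
  a = b = 5;  // b = 5 yields b, which is then assigned to a
  a += b;     // same effect as a = a + b
  ++a;        // same effect as a = a + 1
  return a;   // 5 + 5 + 1
}
```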

Control flow

Blocks of statements are generally executed sequentially. Control-flow statements give additional control over the flow of execution and which statements are executed.

Some control-flow statements include blocks. Those blocks will always be within curly braces {...}.

// Curly braces { ... } are required.
if (condition) {
  ExecutedWhenTrue();
} else {
  ExecutedWhenFalse();
}

if and else

if and else provide conditional execution of statements. Syntax is:

if ( boolean expression ) { statements }

[ else if ( boolean expression ) { statements } ] ...

[ else { statements } ]

Only one group of statements will execute:

  • When the first if's boolean expression evaluates to true, its associated statements will execute.
  • When earlier boolean expressions evaluate to false and an else if's boolean expression evaluates to true, its associated statements will execute.
      • ... else if ... is equivalent to ... else { if ... }, but without visible nesting of braces.
  • When all boolean expressions evaluate to false, the else's associated statements will execute.

When a boolean expression evaluates to true, no later boolean expressions will evaluate.

Note that else if may be repeated.

For example:

if (fruit.IsYellow()) {
  Print("Banana!");
} else if (fruit.IsOrange()) {
  Print("Orange!");
} else if (fruit.IsGreen()) {
  Print("Apple!");
} else {
  Print("Vegetable!");
}
fruit.Eat();

This code will:

  • Evaluate fruit.IsYellow():
      • When True, print Banana! and resume execution at fruit.Eat().
      • When False, evaluate fruit.IsOrange():
          • When True, print Orange! and resume execution at fruit.Eat().
          • When False, evaluate fruit.IsGreen():
              • When True, print Apple! and resume execution at fruit.Eat().
              • When False, print Vegetable! and resume execution at fruit.Eat().
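
The rule that no later condition is evaluated once one is true can be observed directly by counting evaluations. The sketch below does this in C++, whose if / else if chains follow the same rule. Classify and the checks counter are hypothetical, introduced only to make the evaluation order visible.

```cpp
#include <string>

// Classifies n and records in *checks how many conditions were evaluated.
std::string Classify(int n, int* checks) {
  *checks = 0;
  // Each call to check() counts one evaluated condition.
  auto check = [checks](bool condition) {
    ++*checks;
    return condition;
  };
  if (check(n < 0)) {
    return "negative";
  } else if (check(n == 0)) {
    return "zero";
  } else if (check(n < 10)) {
    return "small";
  } else {
    return "large";
  }
}
```

A negative input stops after one check, zero after two, and anything else needs all three before the chain settles on a branch.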

while

while statements loop for as long as the passed expression returns True. Syntax is:

while ( boolean expression ) { statements }

For example, this prints 0, 1, 2, then Done!:

var x: Int = 0;
while (x < 3) {
  Print(x);
  ++x;
}
Print("Done!");

for

for statements support range-based looping, typically over containers. Syntax is:

for ( var declaration in expression ) { statements }

For example, this prints all names in names:

for (var name: String in names) {
  Print(name);
}

This prints each String in the names List in iteration order.

break

The break statement immediately ends a while or for loop. Execution will resume at the end of the loop's scope. Syntax is:

break;

For example, this processes steps until a manual step is hit (if no manual step is hit, all steps are processed):

for (var step: Step in steps) {
  if (step.IsManual()) {
    Print("Reached manual step!");
    break;
  }
  step.Process();
}
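
The same loop shape exists in C++ as a range-based for, which makes the effect of break easy to check. The sketch below is a C++ analogue. Step is a hypothetical stand-in with just a name and a manual flag, and ProcessUntilManual returns the names it processed instead of calling a Process() method.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the article's Step type.
struct Step {
  std::string name;
  bool manual;
};

// Returns the names of the steps processed before the first manual step.
std::vector<std::string> ProcessUntilManual(const std::vector<Step>& steps) {
  std::vector<std::string> processed;
  for (const Step& step : steps) {
    if (step.manual) {
      break;  // leave the loop; later steps are not visited
    }
    processed.push_back(step.name);
  }
  return processed;
}
```

With steps a, b (manual), and c, only a is processed: the loop ends at b and never reaches c.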

continue

The continue statement immediately skips to the next iteration of a while or for loop. In a while, execution continues with the while expression. Syntax is:

continue;

For example, this prints all non-empty lines of a file, using continue to skip empty lines:

var f: File = OpenFile(path);
while (!f.EOF()) {
  var line: String = f.ReadLine();
  if (line.IsEmpty()) {
    continue;
  }
  Print(line);
}
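
Here is a comparable C++ sketch of the same pattern. It reads lines from an in-memory string rather than a file, since File, OpenFile, and ReadLine in the example above are not something I can map to a real API. NonEmptyLines is a hypothetical name.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Returns the non-empty lines of `text`, using continue to skip empty ones.
std::vector<std::string> NonEmptyLines(const std::string& text) {
  std::vector<std::string> kept;
  std::istringstream in(text);
  std::string line;
  while (std::getline(in, line)) {
    if (line.empty()) {
      continue;  // jump straight back to the loop condition
    }
    kept.push_back(line);
  }
  return kept;
}
```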

match

match is a control flow construct similar to switch in C and C++, and it mirrors similar constructs in other languages, such as Swift. The match keyword is followed by an expression in parentheses, whose value is matched against the case declarations, each of which contains a refutable pattern, in order. The refutable pattern may optionally be followed by an if expression, which may use the names from bindings in the pattern.

The code for the first matching case is executed. An optional default block may be placed after the case declarations; it will be executed if none of the case declarations match.

An example match is:

fn Bar() -> (i32, (f32, f32));

fn Foo() -> f32 {
  match (Bar()) {
    case (42, (x: f32, y: f32)) => {
      return x - y;
    }
    case (p: i32, (x: f32, _: f32)) if (p < 13) => {
      return p * x;
    }
    case (p: i32, _: auto) if (p > 3) => {
      return p * Pi;
    }
    default => {
      return Pi;
    }
  }
}
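
C++ has no direct counterpart to match, but the order in which the cases are tried can be imitated with structured bindings and a chain of if statements. The sketch below is only that, an imitation to make the evaluation order concrete. It takes the tuple as a parameter instead of calling Bar(), and kPi is an assumed constant, since the example above does not define Pi.

```cpp
#include <cstdint>
#include <tuple>
#include <utility>

// Assumed value; the article's example does not define Pi.
constexpr float kPi = 3.14159265f;

// Tries the same conditions as the match example, top to bottom.
float Foo(const std::tuple<std::int32_t, std::pair<float, float>>& value) {
  const auto& [p, xy] = value;  // destructure the outer tuple
  const auto& [x, y] = xy;      // destructure the inner pair
  if (p == 42) {
    return x - y;  // case (42, (x, y))
  }
  if (p < 13) {
    return static_cast<float>(p) * x;  // case with guard p < 13
  }
  if (p > 3) {
    return static_cast<float>(p) * kPi;  // case with guard p > 3
  }
  // Kept to mirror the default block; in this translation the two guards
  // above already cover every remaining value of p.
  return kPi;
}
```

For the input (42, (5, 2)) the first branch applies and the result is 3. For (2, (1.5, 0)) the second applies, and for (20, (0, 0)) the third.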
