Overview of the Carbon language (Part 1)
Let’s simply look at the code structure
import Console;

// Prints the Fibonacci numbers less than `limit`.
fn Fibonacci(limit: i64) {
  var (a: i64, b: i64) = (0, 1);
  while (a < limit) {
    Console.Print(a, " ");
    let next: i64 = a + b;
    a = b;
    b = next;
  }
  Console.Print("\n");
}
Carbon is a language that should feel familiar to C++ and C developers.
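To make the family resemblance concrete, here is a rough C++ counterpart of the Fibonacci function above. It is only a sketch for comparison: FibonacciBelow and CheckFibonacciBelow are hypothetical names, and the function collects the numbers into a vector instead of printing them so the result can be checked.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Collects the Fibonacci numbers less than `limit`, following the same loop
// as the Carbon version. (Hypothetical helper, not part of any library.)
std::vector<std::int64_t> FibonacciBelow(std::int64_t limit) {
  std::vector<std::int64_t> out;
  std::int64_t a = 0;
  std::int64_t b = 1;
  while (a < limit) {
    out.push_back(a);
    const std::int64_t next = a + b;  // plays the role of Carbon's `let next`
    a = b;
    b = next;
  }
  return out;
}

// Small usage check.
void CheckFibonacciBelow() {
  const std::vector<std::int64_t> expected = {0, 1, 1, 2, 3, 5, 8};
  assert(FibonacciBelow(10) == expected);
}
```

The loop body is line-for-line the same; the visible differences are the declaration syntax (name before type in Carbon) and the let/var keywords.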
A feature I really like:
All source code is UTF-8 encoded text. Comments, identifiers, and strings are allowed to have non-ASCII characters.
var résultat: String = "Succès";
Comment line:
// Compute an approximation of π
Primitive types
- Bool - a boolean type with two possible values: True and False.
- Int and UInt - signed and unsigned 64-bit integer types.
  - Standard sizes are available, both signed and unsigned, including i8, i16, i32, i64, i128, and i256.
  - Overflow in either direction is an error.
- Float64 - a floating point type with semantics based on IEEE-754.
  - Standard sizes are available, including f16, f32, and f128.
  - BFloat16 is also provided.
- String - a byte sequence treated as containing UTF-8 encoded text.
  - StringView - a read-only reference to a byte sequence treated as containing UTF-8 encoded text.
Tuples
A tuple is a fixed-size collection of values that can have different types, where each value is identified by its position in the tuple. An example use of tuples is to return multiple values from a function:
fn DoubleBoth(x: i32, y: i32) -> (i32, i32) {
  return (2 * x, 2 * y);
}
- The return type is a tuple of two i32 types.
- The expression uses tuple syntax to build a tuple of two i32 values.
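For comparison, the same idea can be sketched in C++ with std::tuple. This is an approximate counterpart, not a statement about how Carbon implements tuples; UseDoubleBoth is a hypothetical caller added for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <tuple>

// C++ counterpart of DoubleBoth: the return type is a two-element tuple,
// and std::make_tuple builds the returned value.
std::tuple<std::int32_t, std::int32_t> DoubleBoth(std::int32_t x,
                                                  std::int32_t y) {
  return std::make_tuple(2 * x, 2 * y);
}

// Hypothetical caller: tuple elements are identified by position.
void UseDoubleBoth() {
  std::int32_t a = 0;
  std::int32_t b = 0;
  std::tie(a, b) = DoubleBoth(3, 4);  // unpack both values by position
  assert(a == 6 && b == 8);
  assert(std::get<1>(DoubleBoth(1, 5)) == 10);  // access the second element
}
```

Carbon's (i32, i32) spelling serves both as the type and as the shape of the value, where C++ needs a library template.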
Struct types
Carbon also has structural types whose members are identified by name instead of position. These are called structural data classes, also known as struct types or structs.
Both struct types and values are written inside curly braces ({ ... }). In both cases, they have a comma-separated list of members that start with a period (.) followed by the field name.
- In a struct type, the field name is followed by a colon (:) and the type, as in: {.name: String, .count: i32}.
- In a struct value, called a structural data class literal or a struct literal, the field name is followed by an equal sign (=) and the value, as in {.key = "Joe", .count = 3}.
Pointer types
The type of pointers-to-values-of-type-T is written T*. Carbon pointers do not support pointer arithmetic; the only pointer operations are:
- Dereference: given a pointer p, *p gives the value p points to as an l-value. p->m is syntactic sugar for (*p).m.
- Address-of: given an l-value x, &x returns a pointer to x.
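Both operations use the same spelling as in C++, so a short C++ sketch may help readers map them over. Point and PointerOps are hypothetical names used only for this illustration; unlike Carbon, C++ would also allow arithmetic on these pointers.

```cpp
#include <cassert>

// Hypothetical type with two members, used to show `p->m`.
struct Point {
  int x;
  int y;
};

void PointerOps() {
  int value = 5;
  int* p = &value;  // address-of: `&value` yields a pointer to `value`
  *p = 7;           // dereference: `*p` is an l-value, so it can be assigned
  assert(value == 7);

  Point pt = {1, 2};
  Point* q = &pt;
  q->y = 10;  // `q->y` is sugar for `(*q).y`
  assert((*q).y == 10);
  assert(pt.x == 1);
}
```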
Arrays and slices
The type of an array holding 4 i32 values is written [i32; 4]. There is an implicit conversion from tuples to arrays of the same length as long as every component of the tuple may be implicitly converted to the destination element type. In cases where the size of the array may be deduced, it may be omitted, as in:
var i: i32 = 1;
// `[i32;]` equivalent to `[i32; 3]` here.
var a: [i32;] = (i, i, i);
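C++ has a similar size-deduction rule for built-in arrays, which gives a loose point of comparison. The sketch below is C++, not Carbon, and DeducedArraySize is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns the number of elements the compiler deduced for `a`.
std::size_t DeducedArraySize() {
  std::int32_t i = 1;
  std::int32_t a[] = {i, i, i};  // size omitted; deduced from the initializer
  static_assert(sizeof(a) / sizeof(a[0]) == 3, "three initializers, size 3");
  assert(a[0] == 1 && a[1] == 1 && a[2] == 1);
  return sizeof(a) / sizeof(a[0]);
}
```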
Expressions
Expressions describe some computed value. The simplest example would be a literal number like 42: an expression that computes the integer value 42.
Expressions are the portions of Carbon syntax that produce values. Because types in Carbon are values, this includes anywhere that a type is specified.
fn Foo(a: i32*) -> i32 {
  return *a;
}
Here, the parameter type i32*, the return type i32, and the operand *a of the return statement are all expressions.
Declarations, Definitions, and Scopes
Name-binding declarations:
There are two kinds of name-binding declarations:
- constant declarations, introduced with let, and
- variable declarations, introduced with var.
There are no forward declarations of these; all name-binding declarations are definitions.
Constant let declarations:
A let declaration matches an irrefutable pattern to a value. In this example, the name x is bound to the value 42 with type i64:
let x: i64 = 42;
Here x: i64 is the pattern, which is followed by an equal sign (=) and the value to match, 42. The names from binding patterns are introduced into the enclosing scope.
Variable var declarations:
A var declaration is similar, except that it creates var bindings, so x here is an l-value with storage and an address, and may be modified:
var x: i64 = 42;
x = 7;
Variables with a type that has an unformed state do not need to be initialized in the variable declaration, but do need to be assigned before they are used.
auto
If auto is used as the type in a var or let declaration, the type is the static type of the initializer expression, which is required.
Functions
A basic function definition may look like:
fn Add(a: i64, b: i64) -> i64 {
  return a + b;
}
This declares a function called Add which accepts two i64 parameters, the first called a and the second called b, and returns an i64 result. It returns the result of adding the two arguments.
C++ might declare the same thing:
std::int64_t Add(std::int64_t a, std::int64_t b) {
  return a + b;
}

// Or with trailing return type syntax:
auto Add(std::int64_t a, std::int64_t b) -> std::int64_t {
  return a + b;
}
Return clause:
The return clause of a function specifies the return type using one of three possible syntaxes:
- -> followed by an expression, such as i64, directly states the return type. This expression will be evaluated at compile-time, so must be valid in that context.
  - For example, fn ToString(val: i64) -> String; has a return type of String.
- -> followed by the auto keyword indicates that type inference should be used to determine the return type.
  - For example, fn Echo(val: i64) -> auto { return val; } will have a return type of i64 through type inference.
  - Declarations must have a known return type, so auto is not valid.
  - The function must have precisely one return statement. That return statement's expression will then be used for type inference.
- Omission indicates that the return type is the empty tuple, ().
  - For example, fn Sleep(seconds: i64); is similar to fn Sleep(seconds: i64) -> ();.
  - () is similar to a void return type in C++.
return statements:
The return statement is essential to function control flow. It ends the flow of the function and returns execution to the caller.
When the return clause is omitted, the return statement has no expression argument, and function control flow implicitly ends after the last statement in the function's body as if return; were present.
When the return clause is provided, including when it is -> (), the return statement must have an expression that is convertible to the return type, and a return statement must be used to end control flow of the function.
Function declarations:
Functions may be declared separate from the definition by providing only a signature, with no body. This provides an API which may be called. For example:
// Declaration:
fn Add(a: i64, b: i64) -> i64;

// Definition:
fn Add(a: i64, b: i64) -> i64 {
  return a + b;
}
The corresponding definition may be provided later in the same file or, when the declaration is in an api file of a library, in the impl file of the same library. The signature of a function declaration must match the corresponding definition. This includes the return clause; even though an omitted return type has equivalent behavior to -> (), the presence or omission must match.
Function calls:
Function calls use a function’s identifier to pass multiple expression arguments corresponding to the function signature’s parameters. For example:
fn Add(a: i64, b: i64) -> i64 {
  return a + b;
}

fn Run() {
  Add(1, 2);
}
Here, Add(1, 2) is a function call expression. Add refers to the function definition's identifier. The parenthesized arguments 1 and 2 are passed to the a and b parameters of Add.
Parameters
The bindings in the parameter list default to let bindings, and so the parameter names are treated as r-values. This is appropriate for input parameters. This binding will be implemented using a pointer, unless it is legal to copy and copying is cheaper.
If the var keyword is added before the binding, then the arguments will be copied (or moved from a temporary) to new storage, and so can be mutated in the function body. The copy ensures that any mutations will not be visible to the caller.
Use a pointer parameter type to represent an input/output parameter, allowing a function to modify a variable of the caller’s. This makes the possibility of those modifications visible: by taking the address using & in the caller, and dereferencing using * in the callee.
Outputs of a function should prefer to be returned. Multiple values may be returned using a tuple or struct type.
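The three parameter styles have loose C++ analogues, sketched below. The mapping is approximate and is my own reading rather than something the Carbon design states: a const reference stands in for the default let binding, a by-value parameter for a var binding, and a pointer for an input/output parameter. All function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Read-only input, roughly like a default `let` parameter binding.
std::size_t Length(const std::string& s) {
  return s.size();
}

// By-value parameter, roughly like a `var` binding: the callee mutates its
// own copy, so the caller's variable is untouched.
std::int64_t Incremented(std::int64_t n) {
  ++n;
  return n;
}

// Pointer parameter for input/output: the caller must write `&count`,
// which makes the possible modification visible at the call site.
void AddOne(std::int64_t* n) {
  *n += 1;
}

void ParameterModes() {
  std::int64_t count = 1;
  assert(Incremented(count) == 2);
  assert(count == 1);  // the copy was modified, not `count`
  AddOne(&count);
  assert(count == 2);  // modified through the pointer
  assert(Length("abc") == 3);
}
```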
auto return type:
If auto is used in place of the return type, the return type of the function is inferred from the function body. It is set to the common type of the static types of the arguments to the return statements in the function. This is not allowed in a forward declaration.
// Return type is inferred to be `bool`, the type of `a > 0`.
fn Positive(a: i64) -> auto {
  return a > 0;
}
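C++14 offers a comparable feature, return type deduction, which makes for a direct comparison. The sketch below is C++ rather than Carbon; the static_assert confirms that the deduced type is bool, and CheckPositive is a hypothetical usage helper.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// C++14 return type deduction: the return type is deduced from `a > 0`.
auto Positive(std::int64_t a) {
  return a > 0;
}

// The deduced return type is bool, matching the Carbon example.
static_assert(std::is_same<decltype(Positive(1)), bool>::value,
              "return type is deduced as bool");

void CheckPositive() {
  assert(Positive(3));
  assert(!Positive(0));
}
```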
Blocks and statements
A block is a sequence of statements. A block defines a scope and, like other scopes, is enclosed in curly braces ({ ... }). Each statement is terminated by a semicolon or block. Expressions, var declarations, and let declarations are valid statements.
The body of a function is defined by a block, and some control-flow statements have their own blocks of code. These are nested within the enclosing scope. For example, here is a function definition with a block of statements defining the body of the function, and a nested block as part of a while statement:
fn Foo() {
  Bar();
  while (Baz()) {
    Quux();
  }
}
Assignment statements
Assignment statements mutate the value of the l-value described on the left-hand side of the assignment.
- Assignment: x = y;. x is assigned the value of y.
- Increment and decrement: ++i;, --j;. i is set to i + 1, j is set to j - 1.
- Compound assignment: x += y;, x -= y;, x *= y;, x /= y;, x &= y;, x |= y;, x ^= y;, x <<= y;, x >>= y;. x @= y; is equivalent to x = x @ y; for each operator @.
Unlike C++, these assignments are statements, not expressions, and don’t return a value.
Control flow
Blocks of statements are generally executed sequentially. Control-flow statements give additional control over the flow of execution and which statements are executed.
Some control-flow statements include blocks. Those blocks will always be within curly braces { ... }.
// Curly braces { ... } are required.
if (condition) {
  ExecutedWhenTrue();
} else {
  ExecutedWhenFalse();
}
if and else
if and else provide conditional execution of statements. Syntax is:
if (boolean expression) { statements }
[ else if (boolean expression) { statements } ] ...
[ else { statements } ]
Only one group of statements will execute:
- When the first if's boolean expression evaluates to true, its associated statements will execute.
- When earlier boolean expressions evaluate to false and an else if's boolean expression evaluates to true, its associated statements will execute.
  - ... else if ... is equivalent to ... else { if ... }, but without visible nesting of braces.
- When all boolean expressions evaluate to false, the else's associated statements will execute.
When a boolean expression evaluates to true, no later boolean expressions will evaluate.
Note that else if may be repeated.
For example:
if (fruit.IsYellow()) {
  Print("Banana!");
} else if (fruit.IsOrange()) {
  Print("Orange!");
} else if (fruit.IsGreen()) {
  Print("Apple!");
} else {
  Print("Vegetable!");
}
fruit.Eat();
This code will:
- Evaluate fruit.IsYellow():
  - When True, print Banana! and resume execution at fruit.Eat().
  - When False, evaluate fruit.IsOrange():
    - When True, print Orange! and resume execution at fruit.Eat().
    - When False, evaluate fruit.IsGreen():
      - When True, print Apple! and resume execution at fruit.Eat().
      - When False, print Vegetable! and resume execution at fruit.Eat().
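The rule that no later condition is evaluated once one is true can be observed directly. The C++ sketch below mirrors the fruit example with a hypothetical Fruit type that counts how many conditions were checked; the type, its members, and Describe are assumptions made for illustration, not part of the original example.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for the `fruit` object in the example.
struct Fruit {
  explicit Fruit(std::string c) : color(std::move(c)), checks(0) {}

  // Records every evaluation so the short-circuit behavior is observable.
  bool Is(const std::string& c) {
    ++checks;
    return color == c;
  }

  std::string color;
  int checks;
};

// Same chain as the example, returning the text instead of printing it.
std::string Describe(Fruit& fruit) {
  if (fruit.Is("yellow")) {
    return "Banana!";
  } else if (fruit.Is("orange")) {
    return "Orange!";
  } else if (fruit.Is("green")) {
    return "Apple!";
  } else {
    return "Vegetable!";
  }
}

void CheckDescribe() {
  Fruit orange("orange");
  assert(Describe(orange) == "Orange!");
  assert(orange.checks == 2);  // the "green" condition was never evaluated

  Fruit rock("grey");
  assert(Describe(rock) == "Vegetable!");
  assert(rock.checks == 3);  // all three conditions were evaluated
}
```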
while
while statements loop for as long as the passed expression returns True. Syntax is:
while (boolean expression) { statements }
For example, this prints 0, 1, 2, then Done!:
var x: Int = 0;
while (x < 3) {
  Print(x);
  ++x;
}
Print("Done!");
for
for statements support range-based looping, typically over containers. Syntax is:
for (var declaration in expression) { statements }
For example, this prints all names in names:
for (var name: String in names) {
  Print(name);
}
This loop prints each String in the names List in iteration order.
break
The break statement immediately ends a while or for loop. Execution will resume at the end of the loop's scope. Syntax is:
break;
For example, this processes steps until a manual step is hit (if no manual step is hit, all steps are processed):
for (var step: Step in steps) {
  if (step.IsManual()) {
    Print("Reached manual step!");
    break;
  }
  step.Process();
}
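The same loop shape in C++ uses a range-based for with break. This is a sketch under assumptions: Step is reduced to a hypothetical struct with a single flag, and counting stands in for step.Process().

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, minimal stand-in for the Step type in the example.
struct Step {
  bool manual;
};

// Returns how many steps were processed before the first manual step.
std::size_t ProcessUntilManual(const std::vector<Step>& steps) {
  std::size_t processed = 0;
  for (const Step& step : steps) {
    if (step.manual) {
      break;  // leave the loop; later steps are not visited
    }
    ++processed;  // stands in for step.Process()
  }
  return processed;
}

void CheckProcessUntilManual() {
  const std::vector<Step> with_manual = {{false}, {false}, {true}, {false}};
  assert(ProcessUntilManual(with_manual) == 2);

  const std::vector<Step> all_automatic = {{false}, {false}};
  assert(ProcessUntilManual(all_automatic) == 2);  // no break: all processed
}
```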
continue
The continue statement immediately goes to the next iteration of a while or for loop. In a while, execution continues with the while expression. Syntax is:
continue;
For example, this prints all non-empty lines of a file, using continue to skip empty lines:
var f: File = OpenFile(path);
while (!f.EOF()) {
  var line: String = f.ReadLine();
  if (line.IsEmpty()) {
    continue;
  }
  Print(line);
}
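A runnable C++ version of the same pattern is sketched below. To keep it self-contained it reads from a string through std::istringstream instead of opening a file, and it collects the lines instead of printing them; NonEmptyLines is a hypothetical name.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Returns the non-empty lines of `text`, skipping empty ones with `continue`.
std::vector<std::string> NonEmptyLines(const std::string& text) {
  std::istringstream in(text);
  std::vector<std::string> out;
  std::string line;
  while (std::getline(in, line)) {
    if (line.empty()) {
      continue;  // jump straight back to the loop condition
    }
    out.push_back(line);
  }
  return out;
}

void CheckNonEmptyLines() {
  const std::vector<std::string> expected = {"a", "b"};
  assert(NonEmptyLines("a\n\nb\n") == expected);
}
```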
match
match is a control-flow construct similar to switch in C and C++, and it mirrors similar constructs in other languages, such as Swift. The match keyword is followed by an expression in parentheses, whose value is matched against the case declarations, each of which contains a refutable pattern, in order. The refutable pattern may optionally be followed by an if expression, which may use the names from bindings in the pattern.
The code for the first matching case is executed. An optional default block may be placed after the case declarations; it will be executed if none of the case declarations match.
An example match is:
fn Bar() -> (i32, (f32, f32));

fn Foo() -> f32 {
  match (Bar()) {
    case (42, (x: f32, y: f32)) => {
      return x - y;
    }
    case (p: i32, (x: f32, _: f32)) if (p < 13) => {
      return p * x;
    }
    case (p: i32, _: auto) if (p > 3) => {
      return p * Pi;
    }
    default => {
      return Pi;
    }
  }
}
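C++ has no direct equivalent of match, so the closest everyday translation is to unpack the tuple by hand and test the cases in order. The sketch below does that under a few assumptions: Foo takes the tuple as a parameter instead of calling Bar(), kPi is a hypothetical constant standing in for Pi, and BarResult is a hypothetical alias for the tuple type.

```cpp
#include <cassert>
#include <cstdint>
#include <tuple>

// Hypothetical alias for the type returned by Bar() in the Carbon example.
using BarResult = std::tuple<std::int32_t, std::tuple<float, float>>;

// Hypothetical constant standing in for `Pi`.
constexpr float kPi = 3.14159265f;

// Tests the cases in the same order as the Carbon `match`; the first
// condition that holds decides the result.
float Foo(const BarResult& value) {
  const std::int32_t p = std::get<0>(value);
  const float x = std::get<0>(std::get<1>(value));
  const float y = std::get<1>(std::get<1>(value));
  if (p == 42) {
    return x - y;  // case (42, (x, y))
  }
  if (p < 13) {
    return p * x;  // case (p, (x, _)) if (p < 13)
  }
  if (p > 3) {
    return p * kPi;  // case (p, _) if (p > 3)
  }
  return kPi;  // default
}

void CheckFoo() {
  assert(Foo(BarResult(42, std::make_tuple(5.0f, 3.0f))) == 2.0f);
  assert(Foo(BarResult(2, std::make_tuple(1.5f, 0.0f))) == 3.0f);
  const float third = Foo(BarResult(20, std::make_tuple(0.0f, 0.0f)));
  assert(third > 62.0f && third < 63.0f);  // 20 * kPi
}
```

The hand-written version loses what the pattern syntax gives for free: the destructuring, the bindings, and the guard are all spelled out in one place per case in Carbon.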