Introduction to Julia
1 Getting started
1.1 The Julia REPL
Opening the Pumas IDE automatically starts a Julia session and displays the interactive command-line REPL (read-eval-print loop) in the terminal panel. This mode has several useful commands.
- ctrl + c interrupts computation
- ? enters help mode
- ; enters shell mode
- ] enters package manager mode
- ctrl + l clears the screen
- ; after an expression prevents its value from showing in the REPL (has no effect in scripts)
1.2 Variables
1.2.1 Simple expressions
A value can be assigned to a new variable with the assignment operator, =.
x = 5 # x is bound to an Int64 value on this 64 bit machine
x, y = 5, 10 # possible to assign multiple values on the same line
1.2.2 Compound expressions
Compound expressions are defined using a begin block or the chain constructor (;). This allows a single expression to evaluate multiple subexpressions and assign the final result to a variable.
# begin block
y = begin # after: x = 10; y = 110
    x = 10
    x + 100
end
# ; chain
b = (a = 5; a + 30) # after: a = 5; b = 35
1.2.3 Naming restrictions
Variable names can use any combination of Unicode characters, with the exception of a few reserved keywords and system-defined variables (e.g., if, end, π).
x = 5
❌ = 5 # accessed with \:x:
χ = 5 # accessed with \chi
x + ❌ + χ
1.2.4 Namespace considerations
A list of global variables and their types can be obtained for the current session using the varinfo function.
varinfo()
Julia does not provide a method to remove variables, functions, and other objects from the current session’s namespace without restarting the session (Stop REPL, then Start REPL from the Command Palette). The current recommended alternative is to build a workflow around modules, an advanced topic that can be revisited later. In practice, this only becomes relevant in more advanced workflows; it is mentioned here so that the reader has a resource to return to if needed.
1.3 Operators
This section includes a brief overview of common operators.
1.3.1 Mathematical operators
Julia implements the standard mathematical operators.
1 + 1 # Addition
2 - 4.0 # Subtraction
5 * 7 # Multiplication; also 5(7)
20 / 2 # Division
3^2 # Exponentiation
(1 + 2)^3 * 2 + 1 # PEMDAS applies
3 % 2 # remainder after division (same as rem)
sqrt(4) # square root
exp(log(1)) # base e exponentiation, log-transformation
1.3.2 Comparison operators
Standard comparison operators are available.
1 > 0 # greater than
1 ≥ 1 # greater than or equal to
0 < 1 # less than
0 ≤ 0 # less than or equal to
1 == 1 # equality
1 != 0 # inequality
1 === 1.0 # identity; false because the types differ
Unlike most other languages, an arbitrary number of comparisons can be chained together.
1 < 2 <= 2 < 3 == 3 # true
1.3.3 Logical operators
The && and || operators are used for logical “and” and “or” operations. They also use short-circuit evaluation (i.e., they don’t necessarily evaluate their second argument). For example:
- In a && b, b is only evaluated if a is true.
- In a || b, b is only evaluated if a is false.
Short-circuit evaluation provides a compact alternative syntax for writing very short if statements.
- <cond> && <statement> instead of if <cond> <statement> end
- <cond> || <statement> instead of if !<cond> <statement> end
Julia provides bitwise versions of these operators (| and &), and while they can be used for logical operations, we advise that beginners stick with || and &&. For more detail, please refer to the documentation sections on bitwise operators and short-circuit evaluation.
Lastly, note that both operators associate to the right, but && has higher precedence than ||.
true && false || true && false # false
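The short-circuit rules can be observed directly with a function that records when it is called. The sketch below is illustrative only; noisy and clamp_nonneg are hypothetical helpers, not part of Julia or Pumas.

```julia
# Hypothetical helper: returns `value` and records that it was evaluated
calls = String[]
function noisy(label, value)
    push!(calls, label)
    return value
end

noisy("a", false) && noisy("b", true)   # "b" is never evaluated
noisy("c", true) || noisy("d", false)   # "d" is never evaluated

# Short-circuit form used as a compact guard clause
function clamp_nonneg(x)
    x < 0 && return zero(x)   # same as: if x < 0 return zero(x) end
    return x
end
```

After running this, calls contains only "a" and "c", which shows that the second operand was skipped in both expressions.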
2 Types
Julia’s type system is complex, but it is worth developing a basic understanding of that system before moving on to more advanced concepts.
2.1 Basic types
In programming, a literal is any notation for representing a value (e.g., number, string, boolean, character) in source code. In contrast, identifiers refer to a value in memory. Julia’s type system implements the following basic scalar literals, all of which are immutable:
1::Int # 64 bit integer on 64 bit Julia
1.0::Float64 # 64 bit float, defines NaN, -Inf, Inf
true::Bool # boolean, allows "true" and "false"
'c'::Char # character, allows Unicode
"s"::String # strings, allows Unicode, see also Strings below
Normally, type assertion is not needed for basic literal values. It was included above to highlight each type.
The type assertion operator (::) in the x::Type syntax asserts that the literal value x is of type Type. Type assertions for variables are made using similar syntax (x::Int = 10) and they can be used to catch bugs in your code (more on that later).
2.2 Abstract types
Julia relies on abstract types to organize its type system into a conceptual hierarchy. The technical details are presented here, but the basics can be understood with a simple example. Consider the hierarchy for numerical types shown in the tree diagram below.
flowchart TD
    Z[Any] --> A[Number]
    A[Number]:::abType --> B[Complex]
    A[Number] --> C[Real]
    C[Real]:::abType --> D[Irrational]
    C[Real] --> E[Rational]
    C[Real] --> F[Integer]
    F[Integer]:::abType --> G[Bool]
    F[Integer] --> H[Signed]
    H[Signed]:::abType --> I[BigInt]
    H[Signed] --> J[Int64\n...]
    F[Integer] --> L[Unsigned]
    L[Unsigned]:::abType --> M[UInt64\n...]
    C[Real] --> N[AbstractFloat]
    N[AbstractFloat]:::abType --> O[BigFloat]
    N[AbstractFloat] --> P[Float64\n...]
    classDef abType fill:#f96
There is a lot of information to unpack in this diagram, but for now, it is sufficient to understand the following concepts:
- All data types in Julia are a subtype of Any.
- Each shaded cell (e.g., Number, Real) corresponds to an abstract type.
- Abstract types act as organizational nodes for the types below them.
- Real and Complex numbers are both subtypes of the abstract type Number.
- Number, in turn, is both an abstract type and a supertype of Real and Complex.
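The hierarchy can also be explored programmatically. The following is a minimal sketch that walks from a type up to Any using supertype; the function name supertype_chain is a hypothetical helper introduced here for illustration.

```julia
# Hypothetical helper: collect a type and all of its supertypes up to Any
function supertype_chain(T::Type)
    chain = Type[T]
    while T !== Any
        T = supertype(T)
        push!(chain, T)
    end
    return chain
end

supertype_chain(Int64)     # Int64, Signed, Integer, Real, Number, Any
isabstracttype(Number)     # true: Number is an organizational node
isconcretetype(Float64)    # true: values can have this type
```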
2.3 Conversion
There is no automatic type conversion in Julia. The simplest and preferred way to convert a value x to type T is by writing T(x) using the appropriate constructor.
Int64('a') # character to integer
Int64(2.0) # float to integer
Int64("a") # error no conversion possible
Float64(1) # integer to float
Bool(1) # constructs to boolean true
Bool(0) # constructs to boolean false
Bool(2) # construction error
Char(89) # integer to char
string(true) # convert Bool to string (works with other types; note the lowercase s)
There are edge cases where additional steps are required. For example, some Float values, such as those with a fractional part, cannot be converted directly to Int using Int64(x).
Int64(1.3) # throws inexact conversion error
Instead, these values must be rounded using one of the following:
floor(Int64, 1.3)
ceil(Int64, 1.3)
round(Int64, 1.3)
Parsing a string to a number is a common task and is accomplished with parse(Type, str).
parse(Int64, "1") # parse "1" string as Int64
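When the input may not be a valid number, parse throws an error. A possible alternative, sketched below, is Base's tryparse, which returns nothing on failure. The helper parse_or is hypothetical and only illustrates the pattern.

```julia
# Hypothetical helper: parse an integer, falling back to a default on failure
function parse_or(default::Int, str::AbstractString)
    value = tryparse(Int, strip(str))   # returns nothing instead of throwing
    return value === nothing ? default : value
end

parse_or(0, " 42 ")      # 42
parse_or(0, "4.2")       # 0, because "4.2" is not a valid Int
parse(Float64, "4.2")    # 4.2
```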
2.4 Promotion
Many operations (arithmetic, assignment) are defined in a way that performs automatic type promotion (to a common type, assuming one exists) as a workaround for the lack of automatic conversion in Julia. While the user will not usually need to perform this promotion themselves, it can be achieved using promote.
promote(true, BigInt(1) // 3, 1.0) # tuple (see Tuples) of BigFloats, true promoted to 1.0
promote("a", 1) # error, promotion to a common type is not possible
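To see which common type a promotion would produce without constructing any values, Base's promote_type can be used. The values below are arbitrary examples chosen for this sketch.

```julia
promote_type(Int64, Float64)   # Float64
promote(1, 2.5)                # (1.0, 2.5)

typeof(1 + 2.5)                # Float64: arithmetic promotes its arguments
typeof(1 // 2 + 1)             # Rational: the integer is promoted to a rational
eltype([1, 2.5, 3 // 4])       # Float64: array literals promote their elements
```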
2.5 Special types
There are a few noteworthy “special” types to consider.
Union{} # subtype of all types, no object can have this type
Nothing # type indicating nothing (absence of a value), a subtype of Any
nothing # only instance of Nothing
Missing # type indicating missing value (a value exists but is unknown), a subtype of Any
missing # only instance of Missing
2.5.1 Missing values
Missing values are represented by the missing object and they are propagated automatically when passed to standard mathematical functions.
missing + 1 # missing
abs(missing) # missing
Missing values also propagate through most comparison operators.
missing > 1 # missing
missing == missing # missing; must use ismissing()
There are three notable exceptions to the propagation rule (===, isequal, isless). The identity operator (===) and isequal function always return a Bool value and can be used to test for missing values; however, ismissing is the preferred method.
missing === missing # true; === always returns a Bool
isequal(1, missing) # false; similar to == except for NaN, missing, -0.0 and 0.0
Under isless, missing values are considered greater than any other value. This also applies when sorting a collection that contains missing values.
isless(1, missing) # true
isless(missing, Inf) # false
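In practice, missing values usually appear inside collections. The sketch below uses made-up observations and a few Base functions (skipmissing, coalesce, count) that are not covered in the text above but are commonly used alongside ismissing.

```julia
obs = [1.2, missing, 3.4, missing, 0.8]   # illustrative data only

count(ismissing, obs)     # number of missing entries
sum(obs)                  # missing: propagates through the reduction
sum(skipmissing(obs))     # sum over the non-missing entries
coalesce.(obs, 0.0)       # replace each missing with 0.0
sort(obs)                 # missing values are placed last
```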
2.6 Type verification
There are several ways to verify a value’s type.
typeof("abc") # String returned, which is an AbstractString subtype
isa("abc", AbstractString) # true
isa(1, Float64) # false, integer is not a float
isa(1.0, Float64) # true
1.0 isa Number # an alternative syntax; true, Number is abstract type
supertype(Int64) # supertype of Int64
subtypes(Real) # subtypes of abstract type Real
Int <: Real # true, <: checks if type is subtype of other type
2.7 Composite Types
This is an advanced topic, but composite (“user-defined”) types are very common in Julia, so we have included a basic description here (see the documentation for more detail.)
Composite types are defined using the struct keyword, and are immutable by default. The mutable keyword can be added to the definition if needed. Structs are typically given a name and a set of fields to be populated.
mutable struct Patient
    age::Int64
    wt::Float64
    ht::Real
end
p = Patient(25, 80.5, 182)
p.age # access field
p.age = 6 # change field value
p.ht = "182" # error, wrong data type
p.sex = "Male" # error - no such field
fieldnames(Patient) # get names of type fields
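For contrast with the mutable Patient type, here is a minimal sketch of an immutable struct. The Dose type and its fields are hypothetical and exist only for this illustration.

```julia
# Hypothetical immutable type (struct without the mutable keyword)
struct Dose
    amount::Float64
    unit::String
end

d = Dose(100, "mg")   # 100 is converted to 100.0 by the default constructor

# Assigning to a field of an immutable struct throws an error
mutation_failed = try
    d.amount = 200.0
    false
catch
    true
end

# The usual alternative is to construct a new value
d2 = Dose(200.0, d.unit)
```

Immutable structs trade in-place updates for simpler reasoning about values, which is one reason they are the default.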
3 Strings
This section covers the basics of working with strings. Check out the documentation for a complete discussion.
3.1 Construction
The built-in concrete type for strings and string literals in Julia is String, and it supports the full range of Unicode characters via UTF-8 encoding. There is also the SubString type, which is used to avoid copying strings during certain operations. Both String and SubString are subtypes of AbstractString. Usually, when writing your own code, it is best to assume that the user will pass an arbitrary AbstractString.
String literals are defined using double or triple quotes. Triple-quoted strings have special properties in addition to allowing " within a string; see the documentation for details.
"Double quotes for simple strings."
"""Triple quotes when "quoting" is needed."""
'Not allowed' # error, single quotes are used for defining Chars.
Long strings can be broken up by adding a \ before the newline.
"This is a \
long string"
The \ is also used to escape special characters within a string (i.e., to make a special character part of the string literally).
println("string \t without \n escapes.") # string with tab (\t) and newline (\n)
println("string \\t with \\n escapes.") # special characters escaped with \
You can also create raw"" string literals, which treat most special characters, except double quotes, as literal values.
println(raw"string \t without \n escapes") # raw"" keeps \t and \n as literal text
It is possible to index into a string ("abc"[1]); however, since Julia encodes standard strings using UTF-8, indexing is based on bytes, not characters. Correct string indexing therefore requires an understanding of how UTF-8 encoding works. See the documentation for details.
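The difference between byte indices and characters is easiest to see with non-ASCII text. The sketch below uses a three-letter Greek string, in which each character occupies two bytes, together with a few Base functions for working with valid indices.

```julia
s = "αβγ"

length(s)               # 3 characters
ncodeunits(s)           # 6 bytes
s[1]                    # 'α', index 1 is the start of a character
isvalid(s, 2)           # false, index 2 falls inside 'α'
nextind(s, 1)           # 3, the next valid index after 1
collect(eachindex(s))   # all valid indices: 1, 3, 5
first(s, 2)             # "αβ", counts characters rather than bytes
```

Writing s[2] here would throw a StringIndexError, so iterating with eachindex or over the characters directly is usually safer than index arithmetic.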
3.2 Concatenation and interpolation
Strings can be concatenated using the string function or the * operator.
y = 2025
m = "01"
d = 23
t = "09:15:00"
string(y, "-", m, "-", d, " ", t) # entire expression in a single string call
string(y) * "-" * m * "-" * string(d) * " " * t # using *
Concatenation can be cumbersome when multiple string calls or * operators are needed. Interpolation using the $ operator offers a more readable alternative.
"$y-$m-$d $t" # example for previous expression
"1 + 2 = $(1 + 2)" # complex operations can be enclosed in parentheses
"\$1,000" # to get $ symbol, you must escape it
3.3 Common operations
Several functions are available to search for substrings; most are case-sensitive.
s = "Pharmacometrics"
findfirst("a", s) # if found returns range of indices, else nothing
findfirst('a', s) # returns an integer index where Char is located
findlast("m", s) # range for last result
findnext("r", s, 4) # range for next result at index ≥ 4
findprev("a", s, 6) # range for previous result at ≤ 6
occursin("Pharma", s) # true
occursin("pharma", s) # false
In the last example, the lowercase “p” in “pharma” causes the return value to be false. As a workaround, the search string can either be normalized to remove casing, or the search pattern can be altered to be case-insensitive.
occursin("pharma", lowercase(s)) # true, see also uppercase, titlecase, and others
occursin(r"pharma"i, s) # true, uses regex (see below) with a flag, i,
# to make the search case-insensitive
Strings can be repeated.
"TA"^3 # repeat string
repeat("TA", 3) # same result
Any iterator can be joined into a single string with the join function.
join([1, 2, 3, 4], ", ", " and ") # see ?join for syntax details
The chop function can be used to remove n characters from the head and/or tail of a string.
s = "Pharmacometrics"
chop(s) # removes the last character (s) from the end of the string
chop(s; head = 3, tail = 3) # removes the first and last 3 chars from string
There are times when having a string to represent an object is useful. The repr function can be used to create a string from any value using the show function.
zeros(Int64, 2, 2) # 2x2 matrix
repr(zeros(Int64, 2, 2)) # returns "[0 0; 0 0]"; easy to copy/paste elsewhere
3.4 Regular expressions
Regular expressions (regex) are a powerful tool to search for patterns in a string instead of specific values. While a full overview of regex is beyond the scope of this document, we have included a basic example below, and encourage the reader to review the documentation for more information.
r = r"A|B" # create new regexp
occursin(r, "CD") # false, no match found
m = match(r, "ACBD") # find first regexp match, see the documentation for details
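The object returned by match exposes the matched text and any capture groups. The pattern and input string below are invented for this sketch and are not taken from the tutorial.

```julia
# Illustrative pattern: a number followed by a unit
pattern = r"(\d+)\s*(mg|g)"
m = match(pattern, "Dose: 250 mg daily")

m.match               # the full matched substring, "250 mg"
m.captures            # the two capture groups, "250" and "mg"
parse(Int, m[1])      # capture groups can be indexed and then parsed

# match returns nothing when the pattern is not found
no_match = match(pattern, "no dose recorded")
```

Because match can return nothing, checking the result before accessing its fields avoids errors on inputs that do not match.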
4 Data structures
Each data structure (i.e., “collection”) discussed in this section is also a type.
Tuple isa Type # true
NamedTuple isa Type # true
Dict isa Type # true
Array isa Type # true
Broadly speaking, each collection can be described as (un)ordered, and (im)mutable. Ordered collections support indexing and mutable collections can be modified after being created. General guidance on how to choose an appropriate collection will be provided at the end of this section.
4.1 Tuples and NamedTuples
Tuples are immutable and ordered (indexed from 1). They can hold a mixture of value types.
4.1.1 Construction
Tuple literals are defined with commas and parentheses, or by using the tuple function.
tuple([1, 2, 3]) # a 1-element tuple containing a vector
([1, 2, 3],) # same tuple, trailing `,` required
() # empty tuple
('a', false)::Tuple{Char,Bool} # tuple type assertion
Tuples can also be constructed from an iterator using the Tuple type constructor (note the difference between Tuple and tuple).
Tuple([1, 2, 3]) # 3-element tuple after unpacking the array
NamedTuples are constructed using a similar syntax that will allow you to name each tuple element using a Symbol.
NamedTuple() # constructor creates an empty named tuple
(a = 1,) # a one element named tuple, trailing `,` required
(x = "a", y = 1) # a two element named tuple
Names can also be generated programmatically.
# convenience syntax to create a named tuple with a, b, c fields
(; a, b, c) # from variables
4.1.2 Common operations
Tuple and named tuple elements can be accessed via indexing, but neither can be modified.
x = (1, 2, 3)
x[1] # 1 (element)
x[1:2] # (1, 2) (tuple)
x[4] # bounds error
x[1] = 1 # error

# named tuple
y = (a = 1, b = 2, c = 3)
y[1] # 1 (element)
y.a # 1; dot syntax
y[1] = 1 # error - a tuple is not mutable
Elements from both types of tuple can be “unpacked”. If fewer variables are provided than there are elements, the remaining elements are discarded. Non-sequential elements can be accessed by including _.
a, b = x # a = 1, b = 2, last element is dropped
a, _, b = x # a = 1, b = 3, skips second element
The key/value pairs inside a named tuple can be accessed using the keys and values functions, respectively.
keys(y) # returns iterator of keys in y
values(y) # returns iterator of values in y
While both types of tuple are immutable, a modified copy of a named tuple can be created using the merge function, but this should be used sparingly. If routine modification is needed, a mutable data type should be used.
x = (; a = 1)
y = (; b = 2, c = 3)
merge(x, y, (; c = 4)) # merge y into x while also modifying c
4.2 Dictionaries
A dictionary is an unordered, mutable collection of key-value pairs.
4.2.1 Construction
Dictionaries are defined using the Dict constructor and a series of key/value pairs. Each key/value pair is built using the pair operator (=>), which is equivalent to calling the Pair constructor. Any data type can be used as a key, but each key must be unique. The paired value need not be of the same type.
# creation
Dict() # an empty dictionary
Dict("a" => 1, "b" => 2) # a filled dictionary
Dict{Float64,Int64}() # an empty dictionary mapping floats to integers
New entries can be added using the assignment operator.
d = Dict("a" => 1) # dictionary with single entry
d["b"] = 2 # add new entry for key "b" with value 2
4.2.2 Common operations
Dictionaries are mutable, but they do not support positional indexing; values are looked up by key.
d[1] # error, there is no key 1
d["b"] = 3 # value of "b" updated to 3
The keys and values functions also work on dictionaries.
keys(d), values(d) # returns tuple of iterators for keys and values in d
To check whether a specific key already exists, use haskey or get.
haskey(d, "b") # check if d contains key "b"
get(d, "c", "default") # return d["c"] or "default" if not haskey(d, "c")
Dictionary entries can be removed with the delete! function.
delete!(d, "b") # delete a key from a collection, see also: pop!
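A common use of get with a default value is to accumulate counts. The following sketch tallies the characters of an arbitrary example word; the variable names are illustrative.

```julia
counts = Dict{Char,Int}()           # typed, initially empty dictionary
for c in "pharmacometrics"
    counts[c] = get(counts, c, 0) + 1   # default of 0 for unseen keys
end

counts['a']              # 2
haskey(counts, 'z')      # false
get(counts, 'z', 0)      # 0, without adding 'z' to the dictionary
removed = pop!(counts, 'p')   # removes the key and returns its value
```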
4.3 Arrays
Arrays are mutable and ordered. They can contain objects of type Any, but most of the time they should contain objects of a more specific type (e.g., Float64 or String).
4.3.1 Construction
There are many ways to construct an array, starting with the Array, Vector, and Matrix type constructors below.
Array{T}(undef, dims...) # uninitialized dense Array
Vector{T}(undef, n) # one-dimensional dense array of length n
Matrix{T}(I, m, n) # m by n identity matrix; requires using LinearAlgebra for I
The dims... argument is common among functions that create arrays and it can accept either a single Tuple of dimension sizes, or a series of sizes passed as a variable number of arguments.
Array{Float64}(undef, (2, 2)) # uninitialized 2x2 matrix of type Float64
Array{Float64}(undef, 2, 2) # as above, equivalent syntax
There are several convenience functions that make it easy to quickly define and populate N-dimensional arrays. Many of these accept the dims... argument and allow for type specification with their first argument (T). In cases where type specification is allowed but omitted, the default is Float64.
zeros(Int8, 2, 3) # 2x3 matrix of Int8 zeros
trues(3) # BitVector with all true, see also falses(dims...)
fill("a", 3) # vector filled with "a"
randn(5, 2, 2) # 5x2x2 array w/ random standard normally distributed values
Array literals can also be constructed directly using square brackets where [A, B, C, ...] creates a one-dimensional array. If all arguments are of the same type, that becomes the element type (eltype) of the resulting array. Arrays can be typed with T[A, B, C, ...], or via promotion if possible. Heterogeneous arrays have eltype Any (e.g., Vector{Any}); this includes the literal [ ] when no arguments are given.
[1] # array literal (vector) with one element
[1, 2] # Array{Int64, 1} because all elements are Int64
Float64[1, 2] # [1.0,2.0]; typed array converts Int values to float
[1.0, 2] # Array{Float64, 1} via promotion to common type
[1, "a"] # Array{Any, 1} heterogeneous array, promotion not possible
4.3.2 Initialization
It can be difficult to know in advance what size array will be needed for a specific task. In such cases, it is good practice to initialize an empty, typed array with 0 elements that can be populated dynamically later on.
x = Array{Float64}(undef, 0)
x = Float64[] # shorthand for the array above
When possible, avoid initializing an array using [], as this creates an array of type Any, which retains that eltype even if later populated entirely with another type. The impact of this may not be felt immediately, but it can lead to performance issues or errors (e.g., the array is later passed to a function that expects type Float64).
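The effect of the initial element type can be seen by filling a typed and an untyped array with the same values. In this sketch, halve is a hypothetical function that only accepts Vector{Float64}.

```julia
typed = Float64[]
untyped = []

for v in (1, 2.5, 4)
    push!(typed, v)      # values are converted to Float64 on insertion
    push!(untyped, v)    # values are stored as-is
end

eltype(typed)     # Float64
eltype(untyped)   # Any, even though it holds only numbers

# Hypothetical function restricted to Float64 vectors
halve(x::Vector{Float64}) = x ./ 2

halve(typed)      # works
# halve(untyped) would throw a MethodError
```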
4.3.3 Indexing
Indexing into an n-dimensional array (A) uses the following general syntax:
A[i1, i2, ...]
Where each index (i) can be a scalar integer, an array of integers, a colon (:), a range (a:b:c), or an array of booleans.
Range objects are iterators that have a variety of uses throughout the Julia language. They are defined using a start:step:stop syntax, or the range function (see ?range for details). To convert a range into an array, wrap it in the collect function.
# ranges
x = range(0, stop = 1, length = 11) # an iterator having 11 equally spaced elements
x = 1:10 # iterable from 1 to 10; implied step = 1
x = 1:2:10 # iterable from 1 to 9; step = 2
collect(x) # converts an iterator to vector
In the example below, A is a 3x3 Matrix. Details regarding the syntax used to initialize A are covered in the section on concatenation.
# 1 4 7
# 2 5 8
# 3 6 9
A = [1:3 4:6 7:9] # 3x3 matrix
Indexing with the begin and end keywords provides the first and last elements of A, respectively.
A[begin] # first element
A[end] # last element
Scalar indices used with a matrix return a single element via column-wise linear indexing; cartesian indexing is also available.
A[5] # single element (5) using linear indexing
A[2, 2] # single element (5) using cartesian indexing
Multiple elements can be retrieved by passing an array of indices. The return value will have the same number of dimensions as the sum of the dimensions of the provided indices.
A[2, 5, 8] # error; read as three separate indices, which are out of bounds
A[[2, 5, 8]] # column vector of 3 elements; must be wrapped in [ ]
A colon (:) can be used to retrieve all elements of a dimension. In this example, A[1, :] will return all elements of the first row of A as a column vector. To keep the original “shape”, the first index must be wrapped in [ ] (e.g., A[[1], :]).
A[1, :] # column vector with 3 elements, use A[[1], :] for row vector
Range objects separated by a comma can also be used as indices.
A[2:3, 1:2] # 2x2 matrix
Lastly, boolean (logical) indexing can be used to select the elements for which a condition evaluates to true.
A[A .< 5] # vector of 4 elements, using logical indexing; the `.` operator is discussed below
4.3.4 Assignment
Array elements can be assigned based upon index, but the new value must have the same dimensions, else broadcasting is required.
Broadcasting is a method of applying an operator or function element-wise across a collection (e.g., arrays). The broadcast function has a convenient dot (.) syntax that improves readability.
[1, 2] + 10 # error, no method for adding a vector and a scalar
broadcast(+, 10, [1, 2]) # vector with 2 elements, element-wise addition
[1, 2] .+ 10 # as above using dot syntax
x = collect(reshape(1:8, 2, 4)) # 2x4 matrix, 2d array
x[:, 2:3] = [1 2] # error; size mismatch
x[:, 2:3] .= [1 2] # OK, broadcasting with .
x[:, 2:3] = repeat([1 2], 2) # OK, 2d array
x[:, 2:3] .= 3 # OK, need to use broadcast with .
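The dot syntax also applies to ordinary functions, not only operators. The sketch below defines a hypothetical scalar function bmi and applies it element-wise; the data values are made up for illustration.

```julia
# Hypothetical scalar function: body-mass index from weight (kg) and height (cm)
bmi(wt, ht) = wt / (ht / 100)^2

wts = [70.0, 80.5, 92.3]
hts = [175.0, 182.0, 168.0]

bmis = bmi.(wts, hts)            # dot call applies bmi to each pair of elements
same = @. wts / (hts / 100)^2    # @. adds a dot to every call and operator
grid = wts .* [1 2]              # 3-element vector times 1x2 row gives a 3x2 matrix
```

Writing the function for scalars and broadcasting at the call site keeps the definition simple while still supporting arrays.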
4.3.5 Common operations
4.3.5.1 Characterization
These functions provide an in-depth look at the size, shape, and type of an array.
x = zeros(2, 3) # 2x3 matrix of zeros

eltype(x) # the type of elements in x
length(x) # the number of elements in x; 6
ndims(x) # the number of dimensions of x; 2
size(x) # tuple containing the dimensions of x; (2, 3)
size(x, 1) # size of x along dimension `n`; (n=1; 2 elements)
axes(x) # tuple containing iterator for valid indices in x
axes(x, 1) # iterator for valid indices along dimension `n` of x
4.3.5.2 Reshaping
There are several functions for adding or removing the elements of an array.
In Julia, a “bang” (!) at the end of a function name indicates that the function “mutates” at least one of its arguments. Here, the ! is included as a naming convention, not an operator, and is used to distinguish between mutating and non-mutating functions (e.g., sort, sort!).
A = [2, 1, 4, 3, 5]
sort(A) # return a sorted copy of A
sort!(A) # sort A in-place
A = collect(1:9)
push!(A, 10) # add 10 to the end of A
a = pop!(A) # return 10 and remove it from A
splice!(A, 5) # remove and return the value at index 5, then shift remaining elements to the left
splice!(A, 2, 99) # remove and return the value at index 2, then replace it with 99
deleteat!(A, 4:6) # remove the provided indices and return the modified A
Arrays can also be reshaped without removing elements.
A = reshape(1:12, 3, 4) # a 3x4 matrix-like object filled column-wise with values from 1 to 12
vec(A) # cast an array to vector (single dimension); reuses memory
[1 2]' # 2x1 Adjoint matrix (reuses memory)
permutedims([1 2]) # 2x1 matrix (permuted dimensions, new memory)
4.3.5.3 Concatenation
The basics of concatenating one- and two-dimensional arrays are discussed below. Information on joining higher-dimension arrays can be found in the documentation.
Separating array arguments with a single semicolon (;) or newline instead of a comma will vertically concatenate them.
[1:2, 4:5] # vector of ranges with 2 elements
[1:2; 4:5] # vector of integers with 4 elements
[
    1:2
    4:5
] # as above
Similarly, separating arguments with a tab, space, or double semicolons (;;) will horizontally concatenate them.
[1:2 4:5] # 2x2 matrix using spaces
[1:2;; 4:5] # 2x2 matrix using double ;, space added for readability
Even though spaces, tabs, and ;; all mean concatenation in the second dimension, ;; cannot appear in the same expression as spaces or tabs unless it is serving as a line continuation character.
[1 2;; # 1x4 matrix, ;; acts as line continuation only
    3 4
]
Each of these symbols can be combined to concatenate both vertically and horizontally at the same time. When combining these operations, be aware that spaces and tabs have a higher precedence than any number of semicolons.
[[1 2]; 3 4; [5 6]] # 3x2 matrix
In addition, (;) has higher precedence than (;;), which means that in an expression using both (;) and (;;), vertical concatenation will occur first before horizontally concatenating the result.
[1:2; 3;; 4; 5:6] # 3x2 matrix
Lastly, there is a set of convenience functions for concatenation (see ?cat for more details).
cat(A...; dims) # concatenate input arrays along the dimension(s) given by dims
vcat(A...) # shorthand for cat(A..., dims=1), equivalent to [A; B; ...]
hcat(A...) # shorthand for cat(A..., dims=2), equivalent to [A B ...]
hvcat() # simultaneous vertical and horizontal concatenation, equivalent to [A B; C D]
4.4 Choosing a collection
Consider the following exercise where each data structure and the previously-defined composite type is populated with the same information. The varinfo function is used to compare memory usage.
age = 25
wt = 80.5
height::Real = 182

_tp = (age, wt, height) # tuple, 24 bytes
_ntp = (; age, wt, height) # named tuple, 24 bytes
_d = Dict(pairs(_ntp)) # dictionary, 480 bytes
_a = [age, wt, height] # array, 64 bytes
_p = Patient(age, wt, height) # Patient, 32 bytes

varinfo(Main, r"_")
- Dictionaries offer a lot of flexibility, but consume the most memory.
- Tuples are the most memory efficient but are also the least flexible. Named tuples offer a few more options without an additional memory cost but require more keystrokes.
- Arrays offer a balance between flexibility and memory efficiency, which makes them the workhorse data structure (they can hold almost anything and do it efficiently).
- Structs are only slightly less efficient than tuples (note, named tuples are just anonymous structs), but can be modified after creation. The trade-off is the added complexity of defining your own type.
5 Programming constructs
This section provides an overview of standard programming constructs implemented in Julia.
5.1 Control flow
The basic elements of control flow are conditional (if-else) and repeated (“loops”) evaluation.
5.1.1 Conditional evaluation
Julia code can be evaluated in a branching fashion based upon the value of a boolean expression inside an if-else block.
if false
    x = 3 # false, no assignment
else
    println("$(1+1)") # else statement is used
end
The same expression could be written using the ternary operator (condition ? true-action : false-action), which offers a more terse syntax.
false ? x = 3 : println("$(1+1)")
In practice, the boolean value will be determined by a comparison operator.
if "x" == "y" # false
    z = 1
elseif 1 > 2 # false
    z = 2
else
    a = 3 # value 3 assigned to a
end
Comparison operators can also be combined (&&) or modified (!) using boolean operators:
if true && !false # true with false negated to true
    z = 1 # value 1 is assigned to z
else
    a = 3
end
5.1.2 Repeated evaluation
There are two basic constructs for repeated evaluation, while loops and for loops. The former is useful when the total number of iterations needed to reach a stopping condition is unknown, while the latter is useful for iterating over the elements of a collection. Consider a simple while loop:
i = 1 # a counter
while true # a condition to evaluate, loop will continue until false or break condition
    global i += 1 # using increment operator (+=) to add 1 to i, global keyword discussed below
    i > 10 && break # command to break out of a loop (stop iteration) immediately
end
println("The value of i is: $i")
The global keyword in the example is related to the variable scope created by the while construct. Without the global keyword, an assignment inside the while loop would not update the variable i that was defined outside the loop when the code is run as a script (more on this topic later in the tutorial).
The break keyword ensures that iteration stops once a specific condition is met (i.e., once i > 10). The break statement could have been omitted if the stop condition was defined at the start of the loop.
i = 1
while i <= 10
    global i += 1
end
println("The value of i is: $i")
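One way to sidestep the global keyword altogether is to place the loop inside a function, where the counter is an ordinary local variable. This is a minimal sketch; the function name count_past is hypothetical.

```julia
# Hypothetical function: increment a counter until it exceeds `limit`
function count_past(limit)
    i = 1                # local to the function, so no global keyword is needed
    while i <= limit
        i += 1
    end
    return i
end

result = count_past(10)
println("The value of i is: $result")
```

Keeping loops inside functions also tends to perform better than looping over global variables.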
In contrast, for loops iterate over all of the elements in a collection and then stop.
for v in 1:10 # v in collection, can also use v = 1:10
    if 3 < v < 6
        continue # skip one iteration
    end
    println(v)
end
varinfo(v) # error, v is only defined in the inner scope of the loop
The continue keyword after the conditional skips one iteration of the loop (i.e., the loop will “skip” the println function and move to the next iteration).
Nested for loops allow for iteration over multiple ranges.
for i = 1:2
for j = 3:4
println((i, j))
= 0
i end
end
Note that multiple nested for
loops can be condensed to a single outer loop.
for i = 1:2, j = 3:4
println((i, j))
= 0
i end
However, there are slight differences in their output. In the first example, the first element of the second and fourth tuple is 0
while in the second those elements are 1
and 2
, respectively. This is because in the second example, both iterators (i
and j
) are set to their current values at the beginning of each iteration and any changes to iterators inside the code block do not affect subsequent iterations. In addition, the inclusion of a break
statement in the second example would stop the entire loop, not just the inner loop.
println("Loop 1")
for i = 1:2
for j = 3:4
> 3 && break
j println((i, j)) # iterations 1 and 3 will be printed
end
end
println("Loop 2")
for i = 1:2, j = 3:4
> 3 && break
j println((i, j)) # only the first iteration is printed
end
Lastly, multiple collections can be iterated over at the same time in a single for
loop using the zip
function.
for (i, j) in zip([1 2 3], [4 5 6 7])
    println((i, j))
end
The zip function creates an iterator that yields one tuple per step, taking one element from each of the containers passed to it. The containers are advanced in lockstep, and the loop stops once any of them runs out.
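A minimal sketch of this truncation behavior, using made-up variable names (xs, ys, zipped): collecting the zipped iterator shows that the result is as long as the shortest input.

```julia
xs = [1, 2, 3]
ys = [4, 5, 6, 7]

# Materialize the iterator to inspect it; the trailing 7 is never visited.
zipped = collect(zip(xs, ys))     # [(1, 4), (2, 5), (3, 6)]

# More than two collections can be zipped in the same way.
for (x, y, label) in zip(xs, ys, ["a", "b", "c"])
    println(label, ": ", x + y)
end
```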
5.1.3 Applications
5.1.3.0.1 Iteration
for loops are commonly used to iterate over a collection. There are two basic syntaxes: the first retrieves each element, the second retrieves each index.
for a in A
    # Do something with the element a
end
for i in eachindex(A)
    # Do something with i and/or A[i]
end
In contrast with for i = 1:length(A), iterating with eachindex provides an efficient way to iterate over any array type.
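The sketch below illustrates the index-based form with a hypothetical helper, double_all!, that is not part of the tutorial. Because it relies on eachindex rather than 1:length(A), the same function body works for a matrix and for a view into a vector.

```julia
# Hypothetical helper: doubles every element in place.
function double_all!(A)
    for i in eachindex(A)    # indices appropriate for the type of A
        A[i] *= 2
    end
    return A
end

M = [1 2; 3 4]
double_all!(M)                        # matrix: [2 4; 6 8]

v = view([10, 20, 30, 40], 2:3)       # view of the two middle elements
double_all!(v)                        # view now holds 40 and 60
```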
5.1.3.1 Comprehensions
Comprehensions also make use of the for
keyword and offer a concise syntax for creating arrays.
A = [f(x,y,...) for x=rx, y=ry, ...]
The code inside the [ ]
can be interpreted as “evaluate function (f
) for each value of x
and y
in ranges rx
and ry
”. In practice, rx
and ry
can be any iterable object, but they are usually ranges (e.g., 1:n
). The resulting array’s type will depend on the computed elements just like any other array literal. As before, the type can be controlled by prepending a type declaration to the comprehension.
A = Float64[2x + 1 for x = 1:10] # vector with 10 elements
Example usage:
a = [x * y for x = 1:2, y = 1:3] # 2x3 array of Int64
sum(a, dims = 2) # sum along the 2nd dimension (one total per row); similarly: mean, std,
# prod, minimum, maximum, any, all;
# using Statistics is required for statistical functions
count(>(0), a) # count number of times a predicate is true, similar: all, any
# note that we create an anonymous function with >(0) here
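To make the effect of the dims keyword concrete, here is a small sketch using illustrative variable names (col_sums, row_sums, n_large) that are assumptions of this example.

```julia
a = [x * y for x = 1:2, y = 1:3]   # 2x3 matrix: [1 2 3; 2 4 6]

col_sums = sum(a, dims = 1)        # collapses dimension 1: 1x3 matrix [3 6 9]
row_sums = sum(a, dims = 2)        # collapses dimension 2: 2x1 matrix holding 6 and 12

n_large = count(>(2), a)           # 3, since only 3, 4, and 6 exceed 2
```

Note that the reduced dimension is kept with length 1, so the results are still matrices rather than vectors.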
5.1.3.2 Generator expressions
Comprehensions can be written with ( )
instead of [ ]
, producing an object known as a generator
. Generators can be iterated on demand to produce a value without allocating memory for an array in advance.
g = (1 / n^2 for n = 1:1000) # simple generator
sum(g) # sum of a series without memory allocation, could
# have also summed directly sum(1/n^2 for n=1:1000)
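As a brief sketch of what "on demand" means here, the snippet below checks that the parenthesized form produces a Base.Generator object rather than an array, and that summing it gives a value close to the known limit of the series, pi^2 / 6. The variable name partial is an assumption of this example.

```julia
g = (1 / n^2 for n = 1:1000)

g isa Base.Generator               # true: no array of 1000 values was built
partial = sum(g)                   # values are produced one at a time while summing

abs(partial - pi^2 / 6) < 1e-2     # true: the partial sum is close to the limit
```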
When writing generator expressions with multiple dimensions inside an argument list, parentheses are needed to separate the generator from the subsequent arguments. This is because all comma-separated expressions after the for
are interpreted as ranges.
# map(tuple, 1/(i+j) for i=1:2, j=1:2, [1:4;]) # error, invalid iteration specification
map(tuple, (1 / (i + j) for i = 1:2, j = 1:2), [1:4;]) # vector with 4 tuple elements
Ranges in generators and comprehensions can depend on previous ranges by writing multiple for
keywords. In such cases, the result is always 1-d.
[(i, j) for i = 1:3 for j = 1:i] # vector with 6 tuple elements
Generated values can be filtered using the if
keyword.
[(i, j) for i = 1:3 for j = 1:i if i + j == 4] # vector with 2 tuple elements: (2, 2) and (3, 1)
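The following sketch puts dependent ranges and filtering side by side. The variable names (all_pairs, sum4_pairs, total) are illustrative assumptions. The last line shows that the same syntax also works as a generator passed directly to a function.

```julia
# Inner range depends on the outer variable; the result is a flat vector.
all_pairs = [(i, j) for i = 1:3 for j = 1:i]
length(all_pairs)                  # 6

# The if clause keeps only the pairs that satisfy the condition.
sum4_pairs = [(i, j) for i = 1:3 for j = 1:i if i + j == 4]
# [(2, 2), (3, 1)]

# Generator form: nothing is stored, the products are summed as they are produced.
total = sum(i * j for i = 1:3 for j = 1:i if i + j == 4)   # 2*2 + 3*1 = 7
```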
5.2 Functions
5.2.1 Construction
Functions can be defined using the function
keyword, or the shorter, inline “assignment” form. A function, f
, is defined using both approaches in the example below.
function f(x, y) # using function keyword
    return x + y
end
f(x, y) = x + y # using assignment form
In the first example, the return
keyword tells the enclosing function (f
) to exit “early” (i.e., any lines that come after return
would be ignored) and to return the value of the x + y
expression. If omitted, as in the second example, the function will return the value of the last expression to be evaluated.
Functions can also be defined without providing a “name” which creates an anonymous function.
function (x, y) # same as before, note that name 'f' is omitted
    return x + y
end
(x, y) -> x + y # similar terse syntax using the arrow (->) operator
# can omit ( ) for single argument
These anonymous functions are primarily used as arguments for other functions.
map(x -> x + 3, 1:10) # map elements of range to anonymous function
map(1:10) do x # same map using do-block syntax
    x + 3
end
In the second example above, the do
block creates an anonymous function that is then passed as the first argument to the function call (here, map
). This syntax is especially helpful when using more complex anonymous functions that span multiple lines.
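A short sketch of a multi-line do block, where the anonymous function contains a conditional that would be awkward to write with the arrow syntax. The variable name labels is an assumption of this example.

```julia
# The body of the do block is the anonymous function passed to map.
labels = map(1:5) do x
    if isodd(x)
        "odd"
    else
        "even"
    end
end
# ["odd", "even", "odd", "even", "odd"]

# Equivalent call with the arrow syntax, for comparison.
labels == map(x -> isodd(x) ? "odd" : "even", 1:5)   # true
```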
Lastly, functions can be stored in variables just like any other object.
y = f(3, 3) # f called using parentheses; assigns value of 3+3 to y
g = f # parentheses omitted to assign function f to g
g(3, 3) == y # true
5.2.2 Multiple returns
As stated earlier, multiple variables can be assigned at once by including a comma-separated list (optionally wrapped in ( )
) on the LHS (left-hand side) of an expression. The RHS must be an iterator at least as long as the number of variables (any extra elements of the iterator are ignored). The process of iterating over the RHS object and assigning each element to a variable is called destructuring.
a, b, c = 1:4 # a=1, b=2, c=3; 4 is dropped
This feature allows a function to return multiple values as a Tuple
or other iterable value.
function f(x, y)
    x + y, x * y # returns both the sum and product of x and y
end
f(2, 3) # returns a tuple, (5, 6)
a, b = f(2, 3) # tuple destructured to a=5, b=6
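As a further sketch, destructuring can be used both inside a function (to unpack the tuple returned by the built-in extrema) and at the call site. The function span_and_mean and the variable names are hypothetical and chosen only for illustration.

```julia
# Hypothetical helper returning two values as a tuple.
function span_and_mean(v)
    lo, hi = extrema(v)                   # extrema returns a (minimum, maximum) tuple
    return hi - lo, sum(v) / length(v)
end

width, avg = span_and_mean([2, 4, 9])     # width = 7, avg = 5.0

# Destructuring also gives a compact way to swap two variables.
p, q = 1, 2
p, q = q, p                               # p = 2, q = 1
```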
5.2.3 Arguments
5.2.3.1 Passing arguments
Functions accept positional or keyword arguments. The latter are sometimes referred to as kwargs and they are separated from positional arguments by a semicolon (;
). In the example below f
has two positional arguments (x
, y
) and one kwarg (z
).
f(x, y; z) = x + y * z
f(5, 10; z = 15) # kwargs names must be specified
# f(5, y=10; z=15) # error, positional args cannot include name
5.2.3.2 Optional arguments
Function definitions can include optional positional and keyword arguments. However, only the last positional argument(s) can have a default value.
f(x, y = 10; z = 15) = x + y + z # f now includes two default values (y, z)
f(5) # 30
# f(x=10, y; z=15) = x + y + z # error, only last positional argument(s) allowed
f(x, y = 10; z = 15, a) = x + y + z + a # valid
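The calls below sketch how the defaults are filled in for a function with the same signature as the last definition above. The name add4 and the result variables are assumptions of this example, used to avoid redefining f.

```julia
# One required positional, one optional positional,
# one optional keyword, and one required keyword argument.
add4(x, y = 10; z = 15, a) = x + y + z + a

r1 = add4(5; a = 1)             # y and z take their defaults: 5 + 10 + 15 + 1 = 31
r2 = add4(5, 0; a = 1)          # y overridden positionally:   5 + 0 + 15 + 1 = 21
r3 = add4(5, 0; a = 1, z = 0)   # keyword order does not matter: 6

# add4(5)                       # error: required keyword argument a is missing
```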
5.2.3.3 Typed arguments
As with other objects, the arguments passed to a function can be restricted to a specific type.
function f(x::Int, y::Int) # f will only accept integers
    x + y
end
However, in most cases, Julia will identify the type of data provided and compile a specialized version of the function that is suited for that type. There are a limited number of scenarios where argument type declarations are needed, and until there is a clear need for them, it’s best to avoid them.
Common reasons for declaring argument types include:
- Dispatch: Functions can have multiple methods (see below) which behave differently for a given set of argument types.
- Correctness: Some functions will only return a correct result for a certain argument type (common for Int and Float).
- Documentation: Type declarations can serve as a form of documentation for expected arguments in a complex function.
5.2.3.4 Pass by reference
In Julia, if a mutable object is passed as a function argument and modified inside the function, those changes will reflect outside the function, even if the modified argument is not explicitly returned.
function f(x, y)
    x[1] = 42 # modifies x outside
    y = 7y # y bound to a new value, no modification outside
    return y
end
a = [1, 2, 3]
b = 3

f(a, b) # returns 7 * 3 = 21
a # [42,2,3]; modified
b # 3; unmodified
In the example above, f assigns a value of 42 to the first element of the column vector a, which was passed to f as its first positional argument, x. The scalar 3, stored in b, is passed as the second positional argument, y. The function f assigns the product 7 * 3 to y and returns the value of y to the caller. The value of b is unchanged despite y being assigned a new value inside of f, because the assignment only rebinds the local name y (Int values are immutable and cannot be modified in place). In contrast, the array a is mutable and passed to f by reference. When the first element of x was assigned a new value, that change was reflected in a even though x was not included in the return statement.
The reason for this behavior is somewhat technical, but it can be understood intuitively if you think of an array as a container with a unique identifier. When Julia assigns the array [1,2,3] to variable a, it is NOT basing that assignment on the contents of the array. Instead, it is assigning the array’s unique identifier to a. This is the reason that changing the first element of x is reflected in a; x and a are still pointing to the same array. To further illustrate this point, if the expression x[1] = 42 was replaced with x = 42, the value of a would have remained unchanged outside of f. The act of assigning a new value to x “breaks” the reference to a’s identifier and binds x to a new object.
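The difference between modifying an element and assigning to the argument name can be sketched with two hypothetical functions, set_first! and rebind, which are not part of the tutorial. The identity operator === is used to check that two names refer to the same array.

```julia
# Hypothetical helpers for illustration only.
function set_first!(x)
    x[1] = 42          # mutates the array that x refers to
    return nothing
end

function rebind(x)
    x = 42             # rebinds the local name x; the caller's array is untouched
    return x
end

a = [1, 2, 3]
alias = a              # a second name for the same array

rebind(a)              # returns 42
a == [1, 2, 3]         # true: a is unchanged

set_first!(a)
alias === a            # true: still one and the same array
alias                  # [42, 2, 3]: the change is visible through both names
```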
If unaccounted for, this behavior can lead to downstream errors, and users should take care to avoid introducing bugs into their code. The following tips should be helpful:
- Avoid using function arguments on the LHS of any assignment operator inside a user-defined function unless that function is intended to modify the argument.
- Adhere to the convention of appending a ! to the names of mutating functions.
- Use copy or deepcopy inside the function body to create a copy of any argument that will receive an array or dictionary.
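One way to apply these tips, sketched with hypothetical function names, is to write the mutating version with a trailing ! and to define the non-mutating version on top of it by copying the argument first.

```julia
# Mutating version: the ! signals that the argument is modified.
function zero_first!(v)
    v[1] = 0
    return v
end

# Non-mutating version: works on a copy, so the caller's data is preserved.
zero_first(v) = zero_first!(copy(v))

data = [1, 2, 3]
safe = zero_first(data)    # safe is [0, 2, 3]; data is still [1, 2, 3]
zero_first!(data)          # data is now [0, 2, 3]
```

For a flat vector of numbers a shallow copy is sufficient. Nested containers may call for deepcopy instead.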
Note that “copied” data will be unaffected by modifications made within a user-defined function. However, copy
only creates a so-called “shallow copy” up to the first level of a mutable object. In contrast, deepcopy
creates a fully distinct copy, but is more computationally expensive than copy
. The difference between these two functions is highlighted in the example below. When in doubt, deepcopy
will ensure that an argument remains unchanged.
x = Array{Any}(undef, 2) # new undefined array
x[1] = [1, 2, 3] # assign element x1
x[2] = [4, 5, 6] # assign element x2
a = x # assign value of x to a
b = copy(x) # create a shallow copy of x and assign its value to b
c = deepcopy(x) # create a deep copy of x and assign its value to c
x[1] = 99 # update value of x1
x[2][1] = 99 # update value of x[2][1]
a # identical to x
b # only x[2][1] changed from the original x
c # contents of the original x
5.2.4 Methods
Each function can have multiple methods, each a different set of instructions selected by argument type. The choice of which method to use is called dispatch, and many object-oriented languages only use the first argument when choosing between methods. In contrast, Julia uses all of a function’s positional arguments to determine which method is appropriate. This process is referred to as multiple dispatch, and it is a large part of how Julia code is specialized for performance.
Consider the example below:
g(x) = println("$x is not an Integer!")
g(x::Int) = 3x
methods(g) # g has 2 methods
The first method accepts an x of any type (Any) and prints a String. The second requires x to be an Int and returns the product 3x.
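To illustrate that every positional argument takes part in method selection, the sketch below defines a hypothetical function, pair_kind, with four methods that differ in the types of one or both arguments.

```julia
# Hypothetical function with methods that depend on both argument types.
pair_kind(x, y) = "fallback"
pair_kind(x::Int, y::Int) = "Int, Int"
pair_kind(x::Int, y::AbstractFloat) = "Int, float"
pair_kind(x::AbstractString, y) = "string first"

pair_kind(1, 2)        # "Int, Int"
pair_kind(1, 2.0)      # "Int, float": the second argument changes the method
pair_kind("a", 2)      # "string first"
pair_kind(2.0, 1)      # "fallback": no more specific method applies

length(methods(pair_kind))   # 4
```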
It is important to note that multiple dispatch only applies to positional arguments. Keyword arguments are processed after method selection.
t(; x::Int = 2) = x
t(; x::Bool = true) = x
t() # true; the first method of t was overwritten by the second
6 Variable scoping
This is an intermediate level topic, but it is worth introducing here since new users often encounter errors related to scope when defining loops and functions. The topic is covered in detail in the documentation.
A variable’s scope is determined by a set of (scoping) rules that help determine whether the code that surrounds it can “see” that variable. As a result, two functions, f(x) = x
and g(x) = x
, can both have a positional argument, x
, without causing a naming conflict. In each case, the scope of x
is limited to the function in which it is defined; f
does not know that x
was also defined in g
.
In Julia, there are two main types of scope: global scope and local scope. When a Julia session is started, the default module
(i.e., coding “workspace”), called Main
is loaded which, in turn, introduces a new global scope. Certain programming constructs including comprehensions, generators, function
, while
, for
, and do
(among others) introduce a new local scope. It is worth noting that if
and begin
blocks do not introduce a new scope.
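A small sketch of this distinction, using a hypothetical function scope_demo: variables first assigned inside if and begin blocks remain visible afterwards, whereas a variable first assigned inside a for loop does not. The @isdefined macro is used to check visibility without raising an error.

```julia
# Hypothetical function for illustration only.
function scope_demo()
    if true
        inside_if = 1          # if does not introduce a scope
    end
    begin
        inside_begin = 2       # neither does begin
    end
    for k = 1:1
        inside_for = 3         # local to the loop body only
    end
    return inside_if + inside_begin, @isdefined(inside_for)
end

scope_demo()                   # (3, false)
```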
To further understand these concepts, consider the following example:
t # error, t is undefined
f() = global t = 1 # f assigns t=1 using global keyword
f() # t is defined globally after calling f
function f1(n) # f1 introduces new local scope
    x = 0 # x, local variable within scope of f1
    for i = 1:n # for introduces a new, "inner" local scope
        x = i # local x already exists, so i is assigned to the existing x
    end
    x # returned x will have same value as n
end
f1(10) # 10; inside the loop we use the outer local variable
function f2(n)
    x = 0
    for i = 1:n
        local x # local keyword creates new x inside inner local scope
        x = i
    end
    x
end
f2(10) # 0; x in outer local scope remains unchanged
function f3(n)
    for i = 1:n
        h = i # no local keyword, h not visible to outer local scope
    end
    h # undefined
end
f3(10) # error; h not defined in outer scope
function f4(n)
    local h # h defined using local keyword, assignment not required
    for i = 1:n
        h = i # h already exists in the outer local scope; assigned value i
    end
    h
end
f4(10) # 10; h is defined in outer scope
Note that for
, while
, try
and struct
use a so-called soft local scope. Simplifying a bit, if you are working interactively (e.g., REPL, notebook) and use them at the top level (global scope), an assignment inside them will overwrite an existing global variable:
julia> x = 5
5
julia> for i in 1:10
           x = i
       end
julia> x
10
However, the same code run in a non-interactive session prints a warning and does not overwrite the global variable:
~$ julia -e "x=5; for i in 1:10 x = i end; println(x)"
| Warning: Assignment to `x` in soft scope is ambiguous because a global variable
by the same name exists: `x` will be treated as a new local. Disambiguate by using
`local x` to suppress this warning or `global x` to assign to the existing global variable.
| @ none:1
5
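Two common ways to obtain the same result in interactive and non-interactive sessions are sketched below: marking the assignment with the global keyword, as the warning suggests, or keeping the state inside a function. The function name last_value is an assumption of this example.

```julia
# Option 1: state explicitly that the loop assigns to the global x.
x = 5
for i in 1:10
    global x = i
end
x                      # 10

# Option 2: keep the variable local to a function (hypothetical helper).
function last_value(n)
    y = 5
    for i in 1:n
        y = i          # y is an outer local variable, so it is updated
    end
    return y
end
last_value(10)         # 10
```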