# Julia Basics

Authors

Jose Storopoli

Kevin Bonham

Juan Oneto

You’ve already seen how to assign variables in Julia, which is not so dissimilar from R. Many other things will be familiar as well, and throughout these tutorials, we will try to point out where differences exist or may be confusing.

## 1 🧮 Numbers and Math

Julia was designed for technical and mathematical computing, and a great deal of effort has been put in to make math in code look like and work like math written on paper.

This means that a lot of simple operations work just like you would expect:

42 * 2
84
1.3e4 / 1000
13.0
5 % 2 # remainder
1
Note

The # character makes the remainder of the line into a comment, just like in R and Python.

(1 + 2)^3 * 2 + 1 # order of operations is PEMDAS: 3^3 * 2 + 1 => 27 * 2 + 1 => 54 + 1
55

Some mathematical operations use functions which, just like in R, are called using the function name, with the arguments to the function surrounded by parentheses:

sqrt(10)
3.1622776601683795

In R, the above looks like:

> sqrt(10)
[1] 3.162278

But many functions of this sort also have Unicode-based equivalents. For example, the following is identical to sqrt(10):

√10 # this is typed \sqrt<TAB>
3.1622776601683795
Note

In fact, all of the mathematical symbols above are actually Julia functions. For example, 3 + 4 is actually just shorthand for +(3, 4).

You’ll learn much more about Julia functions in a future tutorial.
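
As a small sketch of the note above (the variable names are only for illustration), operators can be called with ordinary function syntax and even assigned to a new name:

```julia
# Infix operators are ordinary functions, so prefix call syntax works too.
sum_infix = 3 + 4
sum_prefix = +(3, 4)
println(sum_infix == sum_prefix)   # true

# An operator can be bound to another name like any other function.
add = (+)
println(add(3, 4))                 # 7

# The Unicode square root symbol calls the same function as sqrt.
println(√(16) == sqrt(16))         # true
```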

## 2 ✔️ ❌ Boolean basics

Boolean values are lowercase in Julia (e.g., true and false rather than TRUE and FALSE), but you can do basic comparisons as you do in R:

1 < 3 # 1 is less than 3
true
5 * 2 == 11 # 5 * 2 is equal to 11
false

And you can negate a boolean expression with !:

!(5 * 2 == 11) # or, in this case, 5 * 2 != 11
true

There are also many functions that return boolean values that are often used for conditional evaluation (if / else statements).

isodd(3)
true
!isodd(3) # read "3 is not odd"
false
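
To illustrate how such functions are used for conditional evaluation, here is a minimal sketch. The helper name describe_parity is hypothetical and not part of the tutorial:

```julia
# `describe_parity` is a hypothetical helper used only for illustration.
function describe_parity(n::Integer)
    if isodd(n)          # isodd returns a Bool, which is what `if` expects
        return "odd"
    else
        return "even"
    end
end

println(describe_parity(3))  # odd
println(describe_parity(4))  # even
```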

Boolean expressions can also be combined using && for “AND” and || for “OR”. && returns true if both statements are true, while || returns true if either statement is true.

For example:

isodd(3) && isodd(4) # 3 is odd AND 4 is odd
false
iseven(3) || iseven(4) # 3 is even OR 4 is even
true
Caution

In Julia, Bool is a subtype of Integer, so Boolean values can be used in some mathematical operations as 0 (for false) and 1 (for true). For example:

julia> 1 + true
2

But the reverse is not true. That is, you cannot use 1 in an if/else statement. This is in contrast to R, where any number other than 0 is considered TRUE, and 0 is considered FALSE.

r$> ifelse(1, 10, 20)
[1] 10
r$> ifelse(2, 10, 20)
[1] 10
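
For comparison, the following sketch shows what happens on the Julia side when a non-Boolean value is used as a condition. The helper branch_on is hypothetical; it catches the error only so that the result can be printed:

```julia
# `branch_on` is a hypothetical helper used only for this illustration.
function branch_on(x)
    try
        if x
            return "took the true branch"
        else
            return "took the false branch"
        end
    catch err
        # A non-Bool condition raises an error instead of being coerced.
        return "error: $(typeof(err))"
    end
end

println(branch_on(true))     # took the true branch
println(branch_on(1))        # error: TypeError
println(branch_on(1 == 1))   # took the true branch
```

An explicit comparison such as x != 0 is one way to turn a number into a Bool when that is what you mean.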

r$> sqrt(a_vec)
[1] 1.000000 1.414214 1.732051 2.000000 2.236068

But in Julia, the square root of a vector is undefined:

a_vec = [1, 2, 3, 4, 5]
5-element Vector{Int64}:
1
2
3
4
5
sqrt(a_vec)
MethodError: MethodError(sqrt, ([1, 2, 3, 4, 5],), 0x000000000000831d)

In Julia, there are several different ways to accomplish this.

Tip

In R, it’s very important to make sure all operations are vectorized, since loops written in R are incredibly slow. This is not true in Julia - loops can sometimes be faster!

#### 4.3.1 🗺️ map

The map() function takes a function as its first argument, and a container as the second. It then applies the function to each item in the container, returning another container. This is analogous to the sapply() function in R, though it is much more flexible as we’ll see in future tutorials.

For example:

map(sqrt, a_vec)
5-element Vector{Float64}:
1.0
1.4142135623730951
1.7320508075688772
2.0
2.23606797749979
Caution

Note that the order of arguments is reversed relative to sapply(). In Julia, the function being applied comes first, and the container it applies to comes second.

Julia functions that are verbs can often be reasoned about if you put them into a sentence, with the arguments in the same order. E.g., map(sqrt, a_vec) is “Map sqrt to a_vec”, and contains("banana", "ana") is “banana contains ana?”

For more on map() and using it to apply functions, see the Functions tutorial.

#### 4.3.2 🤏 reduce, and mapreduce

It can also be useful to collapse a container into a single value using some operation. We can do this using the reduce() function (which works similarly to reduce() from the purrr package in R).

For example, suppose that you want to multiply all of the numbers in a vector together.

reduce(*, a_vec)
120

Keep in mind that you should only use associative operations, that is, operations where the grouping doesn’t matter (such as + or *). To be fast, reduce() may apply the operation on items in an order you don’t expect.
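
To make the caveat about evaluation order concrete, here is a small sketch. It uses foldl() and foldr(), two Base functions that this tutorial does not otherwise cover, which fix the grouping to the left or to the right. Subtraction is used because its result depends on that grouping:

```julia
a_vec = [1, 2, 3, 4, 5]

# Multiplication gives the same answer however the terms are grouped.
product = reduce(*, a_vec)
println(product)          # 120

# Subtraction does not, so the grouping has to be chosen explicitly.
left = foldl(-, a_vec)    # ((((1 - 2) - 3) - 4) - 5)
right = foldr(-, a_vec)   # (1 - (2 - (3 - (4 - 5))))
println(left)             # -13
println(right)            # 3
```
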
The mapreduce() function is like combining map() and reduce(). In other words, mapreduce(op1, op2, container) should be identical to reduce(op2, map(op1, container)), with the benefit that Julia doesn’t need to make the intermediate container (for reasons not worth going into, creating large vectors can be slow).

So, if we want to multiply all of the square roots of a_vec:

mapreduce(sqrt, *, a_vec)
10.954451150103324
reduce(*, map(sqrt, a_vec)) # just to prove it
10.954451150103324

#### 4.3.3 🤔 Comprehensions

Containers can also be created using “comprehensions.” If you are familiar with using for loops, comprehensions are like mini for loops, and even have a similar syntax in Julia. For example, the following is identical to map(sqrt, a_vec):

[sqrt(x) for x in a_vec]
5-element Vector{Float64}:
1.0
1.4142135623730951
1.7320508075688772
2.0
2.23606797749979

One exceptionally useful thing about comprehensions is that they can be combined with conditional evaluation, so that only things that match some boolean statement will be included. For example, the following only takes the square root of odd numbers:

[sqrt(x) for x in a_vec if isodd(x)]
3-element Vector{Float64}:
1.0
1.7320508075688772
2.23606797749979

We can also make dictionaries and other containers:

my_dict # for reference
Dict{Any, Any} with 4 entries:
"new key" => "😉"
"a key" => "a value"
'b' => 42
1 => 2
Dict(k => my_dict[k] for k in keys(my_dict) if k isa String)
Dict{String, String} with 2 entries:
"new key" => "😉"
"a key" => "a value"

## 5 ⚠ Interlude on types

You can do a lot in Julia without worrying too much about the types of the objects that you’re working with. But everything in Julia has a type, and it’s good to be aware of them, if only to recognize errors that might show up due to them.

In Julia, types exist in a hierarchy. Every object has a “concrete” type, and some number of “abstract” parent types.
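
As a quick sketch of the concrete/abstract distinction, the Base functions isconcretetype() and isabstracttype() and the subtype operator <: (none of which are introduced elsewhere in this tutorial) can be used to query the hierarchy directly:

```julia
println(isconcretetype(Int64))     # true:  values can actually have this type
println(isabstracttype(Integer))   # true:  only exists as a parent in the hierarchy
println(isconcretetype(Integer))   # false
println(Int64 <: Integer)          # true:  Int64 is a subtype of Integer
println(typeof(1) <: Real)         # true
println(Float64 <: Integer)        # false
```
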
For example, Int16, Int32, and Int64 are concrete types representing 16-bit, 32-bit, and 64-bit integers respectively. All of these types are subtypes of the abstract type Signed, which is itself a subtype of Integer (there are also “unsigned” integer types, like UInt64). A Float64 is a 64-bit floating point number. It’s not a subtype of Integer, but it shares the abstract type Real with all Integer types.

typeof(1)
Int64
typeof(1.0)
Float64
supertype(Int64)
Signed

Or view all of the supertypes:

using InteractiveUtils: supertypes
supertypes(Int64)
(Int64, Signed, Integer, Real, Number, Any)
1.0 isa Integer
false
1.0 isa Float64
true
1.0 isa Real
true
1 isa Float64
false
1 isa Real
true

Containers also have types, and in fact are generally “parameterized” based on the types they contain.

new_dict = Dict('a' => 1, 'b' => 2, 'c' => 3)
Dict{Char, Int64} with 3 entries:
'a' => 1
'c' => 3
'b' => 2
typeof(new_dict)
Dict{Char, Int64}

Notice the Char and Int64 inside the curly braces - those represent the types of the keys and values respectively.

Why do I bring this up now? Well, look what happens when I try to add new key / value pairs without paying attention to the types:

new_dict['d'] = 4.0
4.0
typeof(4.0)
Float64
typeof(new_dict['d'])
Int64
new_dict['e'] = 4.5
InexactError: InexactError(:Int64, Int64, 4.5)

When I added the value 4.0, even though it was a Float64, Julia was able to coerce it into an Int64. But 4.5 can’t be converted to an integer without losing information. We could explicitly round it, but Julia won’t do that for us.

new_dict['e'] = round(Int, 4.5)
4
new_dict["I'm a String, not a Char"] = 5
MethodError: MethodError(convert, (Char, "I'm a String, not a Char"), 0x0000000000008322)

So why was I able to add all kinds of different keys and values to my_dict up above? Take a look at its type signature:

typeof(my_dict)
Dict{Any, Any}

In Julia, all types are subtypes of Any.
Because I initially made the dictionary with a bunch of different types, Julia could not provide it with a specific parameterization, so it just did the broadest possible one.

Caution

Here are some other examples of type issues in containers. Don’t worry too much about the details, but try to pay attention to what types you’d expect, what actually happens, and the errors that are (or are not!) induced:

floatvec = [10, 11.0, 12]
3-element Vector{Float64}:
10.0
11.0
12.0
typeof(floatvec[1])
Float64
intvec = Int64[10, 11.0, 12]
3-element Vector{Int64}:
10
11
12
typeof(intvec[2])
Int64
anyvec = Any[10, 11.0, 12]
3-element Vector{Any}:
10
11.0
12
Int64[3, 3.5, 4]
InexactError: InexactError(:Int64, Int64, 3.5)
push!(intvec, 12.5)
InexactError: InexactError(:Int64, Int64, 12.5)
anum = 10
10
typeof(anum)
Int64
push!(floatvec, anum)
4-element Vector{Float64}:
10.0
11.0
12.0
10.0
typeof(floatvec[4])
Float64
push!(intvec, '1') # 49
4-element Vector{Int64}:
10
11
12
49
Caution

This one surprised me too! Character literals (like '1') are based on the Unicode standard, where each character has a numerical value (its code point), which can be converted to an integer.

push!(intvec, "1")
MethodError: MethodError(convert, (Int64, "1"), 0x0000000000008322)
push!(anyvec, "1")
4-element Vector{Any}:
10
11.0
12
"1"
push!(intvec, parse(Int64, "1"))
5-element Vector{Int64}:
10
11
12
49
1

## 6 Miscellany

Here are some additional bits that are useful to introduce at an early stage. You don’t need to keep these things in your head, but hopefully when you see them later, it will jog your memory.

### 6.1 Collect

Many “array-like” things in Julia aren’t actually arrays, but can be treated as such. This has a number of advantages.

For example, consider an array of odd numbers from 1 to 2,000,000. To put this into an array, you would need to store 1 million integer objects (assuming the typical 64-bit integer, that’s 8 MB of memory). Instead, you can store 3 integers in a “range”.
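
As a minimal sketch of what those three integers are, here is a much smaller range written with the start:step:stop syntax. The name small_range is just an illustrative choice:

```julia
small_range = 1:2:9   # hypothetical example: start at 1, step by 2, stop at 9

# The range only stores its start, step, and stop.
println(first(small_range))    # 1
println(step(small_range))     # 2
println(last(small_range))     # 9

# The elements are computed on demand rather than stored.
println(length(small_range))   # 5
println(collect(small_range))  # [1, 3, 5, 7, 9]
```
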
Tip

Writing 2,000,000 would be parsed as a tuple (2, 0, 0), rather than as the integer 2 million. Instead, we can use _ for visual separation, which the Julia parser ignores in integers. So 2_000_000 is identical to 2000000 or 2_00000_0.

my_range = 1:2:2_000_000 # range syntax for start : step : stop
1:2:1999999

This doesn’t actually materialize any of the numbers that are part of this range, but can still be indexed into, or used for indexing:

sizeof(my_range) # gives size in bytes
24
my_range[1000] # the thousandth number in the range
1999
my_range[[1000, 1200, 11]]
3-element Vector{Int64}:
1999
2399
21

And some algorithms can use fancy tricks to optimize calculations. For example, sum can use an optimization to calculate this almost instantly:

using BenchmarkTools
@benchmark sum($my_range) # about 5 nanoseconds
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
Range (min … max):  5.189 ns … 16.838 ns  ┊ GC (min … max): 0.00% … 0.00%
Time  (median):     5.204 ns              ┊ GC (median):    0.00%
Time  (mean ± σ):   5.292 ns ±  0.450 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

██▄▂▁                  ▄▂         ▃▃                     ▃ ▂
██████▆▅▄▆▅▃▃▁▃▁▁▁▁▁▁▁▁██▆▅▆▄▆▅▅▄▄██▁▁▁▃▁▁▁▁▁▃▅▁▁▁▁▃▁▁▁▁▁█ █
5.19 ns      Histogram: log(frequency) by time     5.96 ns <

Memory estimate: 0 bytes, allocs estimate: 0.

A range (and some other types) can work just like a vector because it is a subtype of AbstractArray. Many functions don’t care about the internal details; they just need to be able to index into the object, know its length, etc. Many other “iterators” work the same way.
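
The following sketch shows one function working on both a range and a concrete vector. The function total_length_and_sum and the variable names are hypothetical and only serve the illustration:

```julia
# `total_length_and_sum` is a hypothetical helper; it only relies on the
# AbstractVector interface, not on how the elements are stored.
function total_length_and_sum(xs::AbstractVector{<:Integer})
    return (length(xs), sum(xs))
end

odd_range = 1:2:9
odd_vector = collect(odd_range)

println(odd_range isa AbstractVector)       # true
println(odd_vector isa AbstractVector)      # true
println(total_length_and_sum(odd_range))    # (5, 25)
println(total_length_and_sum(odd_vector))   # (5, 25)
println(odd_range == odd_vector)            # true
```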

Nevertheless, sometimes you do actually need the concrete vector, in which case you can use the collect() function:

typeof(my_range)
StepRange{Int64, Int64}
range_as_vector = collect(my_range)
typeof(range_as_vector)
Vector{Int64} (alias for Array{Int64, 1})
sizeof(range_as_vector) # compare this to the 24 bytes used before
8000000
@benchmark sum($range_as_vector)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
Range (min … max):  230.111 μs … 594.297 μs  ┊ GC (min … max): 0.00% … 0.00%
Time  (median):     234.146 μs               ┊ GC (median):    0.00%
Time  (mean ± σ):   237.069 μs ±   7.594 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

▁▃▆▇██▇▆▆▅▄▃▂▂▂▂▁        ▁▂▃▃▄▄▅▄▄▄▃▂▂▁▁      ▁  ▁        ▃
▇██▆███████████████████▇▇▇█▇██████████████████▇██████████████ █
230 μs        Histogram: log(frequency) by time        253 μs <

Memory estimate: 0 bytes, allocs estimate: 0.

### 6.2 🪠 Pipes

Sometimes, it can be convenient to chain functions together in a single line. For simple expressions, this can be done in Julia using the “pipe” operator |>, which pipes the output from one expression into the input of the next. In other words, x |> y is equivalent to y(x).

The following are equivalent:

my_range |> collect |> sum # piped form
sum(collect(my_range)) # nested form, same result

But this really only works for single-argument functions. As we’ll see, the Chain.jl package can be used for more complex operations. With Chain.jl, the result from each line of a calculation is passed implicitly as the first argument in the next.

using Chain

@chain my_range begin
collect
sum
end
1000000000000

This is of course a trivial example; we’ll see much more complicated versions in future tutorials.
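
Returning to the plain |> operator: one way to pipe into a multi-argument function without an extra package is to wrap the call in an anonymous function. This is only a sketch in base Julia, with an illustrative vector, and is not necessarily how later tutorials approach it:

```julia
a_vec = [1, 4, 9]

# Each anonymous function takes the piped value as its single argument
# and forwards it to a call that needs more than one argument.
result = a_vec |> (v -> map(sqrt, v)) |> (v -> reduce(+, v))
println(result)   # 6.0
```

The parentheses around each anonymous function keep the pipeline reading left to right.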