In the previous installment of our introduction to functional programming we looked at reading values from nested data structures. In this final post we look at the flip side of working with nested data structures, updating them. If you have not read the previous post yet and are not familiar Elixir, you might want to read it now, as this post builds on that one.
The Challenge
If navigating through a nested data structure to retrieve a value seems challenging, updating a nested structure may appear hopeless, especially since Elixir and FP languages in general have immutable data. In Java and other languages that support mutable data structures we can change nested values in-place. This usually means changing a nested value is no harder than reading it. As long as we can navigate to the value we want to change we can do whatever we want with it.
With immutable data we can’t simply change one value in a data structure. If we want to update a value, we have to return a whole new data structure with the value changed. This is not as bad as it sounds. First of all, since we know the original data structure can’t be changed (immutable, remember?) our new structure can be composed of the original structure plus the part that has changed. Also, we don’t have to construct this ourselves; Elixir provides an API that makes this, well, not easy, but easier than it would be without it.
Let’s start with a fairly simple nested map. Java has no map literals, so we have to construct an empty map and use mutation right out of the gate.
Map> myMap = new HashMap>();
myMap.put("Joe", new HashMap());
myMap.get("Joe").put("age", 30);
myMap.get("Joe").put("weight", 170);
myMap.put("Bob", new HashMap());
myMap.get("Bob").put("age", 32);
myMap.get("Bob").put("weight", 172);
=> {Joe={weight=170, age=30}, Bob={weight=172, age=32}}
Elixir supports map literals, so initializing a map is simpler.
my_map = %{"Joe" => %{age: 30, weight: 170}, "Bob" => %{age: 32, weight: 172}}
=> %{"Bob" => %{age: 32, weight: 172}, "Joe" => %{age: 30, weight: 170}}
As an aside, notice that we use atoms (like symbols in Ruby) for the keys that don’t vary (age, weight) and strings for the keys that do (name).
Let’s say Joe has gained a little weight and we want update our map to reflect this. In Java this is straight-forward. In fact, it’s exactly what we did to initialize our map.
myMap.get("Joe").put("weight", 175);
=> {Joe={weight=175, age=30}, Bob={weight=172, age=32}}
For maps, Elixir is almost as simple as Java. For this we have the Kernel.put_in
function. It takes a list of keys and uses them to navigate a data structure, in the same way as get_in
which we used in the previous post.nget_in
and put_in
are part of a family of functions that share the same semantics for working with nested data structures.
For our nested map we have the following:
new_map = put_in(my_map, ["Joe", :weight], 175)
=> %{"Bob" => %{age: 32, weight: 172}, "Joe" => %{age: 30, weight: 175}}
We can use a short-cut syntax to get even closer to the Java style.
new_map = put_in my_map["Joe"][:weight], 175
=> %{"Bob" => %{age: 32, weight: 172}, "Joe" => %{age: 30, weight: 175}}
Now suppose we know that Joe’s weight increased by ten percent, but we don’t knownthe actual value. In Java we have
int joeWeight = (int) (myMap.get("Joe").get("weight") * 1.1);
myMap.get("Joe").put("weight", joeWeight);
=> {Joe={weight=187, age=30}, Bob={weight=172, age=32}}
In Elixir when we want to update to some function of the current value we use
new_map = update_in my_map["Joe"][:weight], fn w -> w * 1.1 end
=> %{"Bob" => %{age: 32, weight: 172}, "Joe" => %{age: 30, weight: 187}}
The biggest difference in the two approaches is that with update_in
we pass a function in (remember higher order functions from the first post?) instead of a value. The function takes the current value and returns the new one.
The function definition is the fn w -> w * 1.1 end
bit.
Because put_in
and update_in
follow the same semantics as get_in
we have a lot of flexibility for updating nested structures.
Suppose a year has passed and we need to update everyone’s age. In an imperative language we don’t have a lot of options. Using a non-functional approach in Java we would have to update every record separately, like this:
for (Map.Entry> entry : myMap.entrySet()) {
entry.getValue().put("age") = entry.getValue().get("age") + 1;
}
In Elixir we can pass a function saying “choose everything” as the first key in our list of keys passed to update_in
. We’ll call this function all
, meaning we want to work on all the keys at the top level. This is just a style choice, we could call it foo
nand update_in
would not care.
new_map = update_in my_map, [all, :age], fn age -> age + 1 end
=> %{"Bob" => %{age: 33, weight: 175}, "Joe" => %{age: 31, weight: 170}}
Before defining the all
function it’s helpful to understand how keys work in update_in
(or get_in
, put_in
, get_and_update_in
). As we traverse our nested data structure we can think of each level in our traversal as a substructure. Each key gets applied to the substructure for the current point in the traversal. The key determines which portion of the substructure is used for the next key.
Given our original map, if we apply the key list ["Joe", :age]
then we start with the map
%{"Bob" => %{age: 32, weight: 172}, "Joe" => %{age: 30, weight: 187}}
apply the key "Joe"
to get the next substructure
%{age: 30, weight: 187}
then apply the key :age
to get the final substructure, in this case a single value
30
With normal keys like strings or atoms this happens automatically. With function keys, it’s a bit more complicated as our function must handle some of the work of the traversal.
Every function used as a key to update_in
must have the same signature,
fun(:get_and_update, data, next)
The first argument is an atom, :get_and_update
, meaning our function only matches if someone calls it with this as the first argument. nThis is the action that will be passed by update_in
to our function. The reason update_in
passes the action is because we sometimes want to write multiple function clauses to handle different actions, such as :get
for the get_in
function. nYou may be wondering why the action is :get_and_update
instead of just :update
. The reason is that update_in
actually defers tonget_and_update_in
which expects this action.
The second argument, data
, is the substructure for the current point of the traversal. This is the only data our function has to care about – all the data higher up in the traversal has already been handled. The final argument, next
, is a function that we must call on the elements of data
that we want to traverse.
So all key functions for update_in
need to do the same thing, take a substructure and transform it by identifying the elements of interest and calling next
on them. Normally the key function would leave the other elements alone, but this doesn’t have to be the case – we will use this to our advantage later on.
The all
function is given below:
all = fn :get_and_update, data, next ->
data_list = Map.to_list(data)
new_list = Enum.map(data_list, fn {key, value} ->
{_, updated} = next.(value)
{key, updated}
end)
{nil, Enum.into(new_list, %{})}
end
This function is a bit complicated at first glance, so let’s break it down.
On line 1 starting with fn
we define the signature for an anonymous function that matches on three arguments. The first argument must be :get-and_update
, which is the action that update_in
will pass when this function is called.nThe other two arguments, data
and next
, are the current substructure and the function we need to call on the selected elements. We bind (assign) this to the variable all
so we can pass it as one of the keys to update_in
.
The substructure passed to this function will actually be our whole map, since all
is the first key in our list of keys passed to update_in
.nOn line 2 we convert the map to a list. We do this because we want to process every value in the map. Converting it to a list turns this
%{"Bob" => %{age: 32, weight: 172}, "Joe" => %{age: 30, weight: 170}}
into this
[{"Bob", %{age: 32, weight: 175}}, {"Joe", %{age: 30, weight: 170}}]
which is a list of two-element (key/value) tuples.
On lines 3-6 we transform this list into a new list by using map
to call the next
function on every value. This gives us a new list that has our original keys and transformed values.
[{"Bob", %{age: 33, weight: 175}}, {"Joe", %{age: 31, weight: 170}}]
On line 7 we construct a return value for our
all
function that consists of a two element tuple. The first element is nil
. It could be anything because update_in
is going to ignore it. If we passed our all
function to get_and_update_in
then we would want to return the original value, data
, instead of nil
. The second element of the tuple is a new map that we construct by calling Enum.into(new_list, {f915c993dcd57782d8bbdb9f9291c5d4b3b1a7c844f782feb26c4f553a22f754}{})
. Moving back and forth between maps and lists is routine in Elixir.
{nil, %{"Bob" => %{age: 33, weight: 172}, "Joe" => %{age: 31, weight: 170}}}
Again, our function key has one job, transform the current substructure by calling
next
on the elements we care about and leaving the other elements alone.
Let’s put everything we have learned in one final example. Suppose we have a company database represented by the following data structure:
company = %{total_revenue: 10_000_000,
sales: [%{salesperson: "Joe", bonus: 0, accounts: ["XYZ Inc.", "ABC Co.", "McDoogles"]},
%{salesperson: "Bob", bonus: 0, accounts: ["ACME", "Tarjet"]},
%{salesperson: "Jane", bonus: 0, accounts: ["Homely Depot", "Element 84", "Pear Computers"]},
%{salesperson: "Jill", bonus: 0, accounts: ["Four Guys", "Gas 'N Sip'"]}]}
We want to update our the records for our sales people to reflect their quarterly bonuses. The formula for the bonus is given by
begin{equation}
bonus = $1000 times number of accounts
end{equation}
We could calculate everyone’s bonus by calling get_in
for each employee to get their accounts and using the count of accounts to compute their bonus. Then we could call update_in
to set their bonus using the values we calculated. But that sounds like a lot of work. Besides, it means we would have to know the names of all our sales people. Who has time for that?
Instead, we can update all the bonuses in one call to update_in
by exploiting what we know about how function keys work. Normally, a function key is just used as a selector. It chooses which path(s) to follow and eventually when we reach the final element selected by the final key, update_in
applies the transformation function we gave it and things bubble back up. So normally we end up modifying the element selected by our last key and all the other element in our nested structure are left alone.
But remember the primary responsibility of the function key is to transform the substructure that is passed to it. Normally it does this by returning what ever bubbled up from subsequent keys, but it doesn’t have to do this. In fact, we are free to modify the data structure however we want. In this case, we can update the value for the :bonus
key with whatever bubbled up from the subsequent keys.
Our call to update_in
will look like this
update_in company, [:sales, update_bonus, :accounts], fn accounts ->
1000 * Enum.count(accounts)
end
The first key, :sales
, selects the sales substructure. The next key, update_bonus
, is a function key that will be responsible for returning a new substructure will all the bonuses filled in. The last key, :accounts
, selects the list of accounts for each salesperson.
The last argument to update_in
is the transformation function, which operates on the last element in the traversal, in this case the list of accounts for each salesperson. It simply computes a bonus by multiplying the number of accounts by 1000.
The atom keys and the transformation function should be easy enough to understand. The only tricky part is the update_bonus
function key, which is given here
update_bonus = fn :get_and_update, data, next ->
new_data = Enum.map data, fn salesperson ->
{_, updated_salesperson} = next.(salesperson)
bonus = updated_salesperson[:accounts]
put_in(salesperson, [:bonus], bonus)
end
{nil, new_data}
end
Hopefully this is starting to make sense to you, but let’s break it down to be sure. Line 1 is the function signature that we have examined earlier and the assignment to a variable,
update_bonus
, so we can pass it to update_in
. The data
passed to this function will be list of maps of sales people as selected by the :sales
key. Line 2 is going to transform all the elements in the list by mapping over them using Enum.map
.
The function applied to each of the elements takes a single argument, nsalesperson
, which will be a map. The body of the function is given in lines 3-5. It calls the next
function on the salesperson
map and gets back an updated salesperson map, with the bonus computed by the transformation function. If we were to just bubble this value back up we would have a mess since we would be replacing our list of accounts with the bonus. Instead, on line 4 we extract the computed bonus from the updated_map (it’s under :accounts
) and on line 5 we set the current map’s :bonus
key to the calculated bonus. This new map replaces our salesperson
map.
Finally, on line 7, we return our updated list of maps in a two element tuple, using nil
as the first value (remember, update_in
will ignore this value).
Putting it all together, we start with the first key, :sales
, which points this substructure, a list of maps:
[%{salesperson: "Joe", bonus: 0, accounts: ["XYZ Inc.", "ABC Co.", "McDoogles"]},
%{salesperson: "Bob", bonus: 0, accounts: ["ACME", "Tarjet"]},
%{salesperson: "Jane", bonus: 0, accounts: ["Homely Depot", "Element 84", "Pear Computers"]},
%{salesperson: "Jill", bonus: 0, accounts: ["Four Guys", "Gas 'N Sip'"]}]
This list is the structure that is passed to our
update_bonus
function as thendata
parameter.
update_bonus
then calls Enum.map
on this list. The function passed to Enum.map
gets a map (salesperson
) as its argument. It calls next
on this map, which moves on to the next key in our key list, :accounts
. The substructure pointed to by :accounts
is the list of accounts. So for our first salesperson map, it would be this
["XYZ Inc.", "ABC Co.", "McDoogles"]
Since
:accounts
is the last key in the list passed to update_in
, update_in
calls its transformation function on this account list, which looks like this
1000 * Enum.count(["XYZ Inc.", "ABC Co.", "McDoogles"])
=> 3000
Now the computed value begins to bubble back up. So at this point we have
{_, updated_salesperson} = {nil, %{salesperson: "Joe", bonus: 0, accounts: 3000}}
on line 3 of our update_bonus
function.
The computed bonus, 3000, is under the :accounts
key, since this is the last key in our key list. We simply extract it on line 4 with
bonus = updated_salesperson[:accounts]
=> 3000
Now that we have the bonus, we just update the original
salesperson
map the was passed to us on line 5
put_in(salesperson, [:bonus], bonus)
=> %{salesperson: "Joe", bonus: 3000, accounts: ["XYZ Inc.", "ABC Co.", "McDoogles"]}
update_bonus
does this for every salesperson map in the list it received, so finally, we have
update_in company, [:sales, update_bonus, :accounts], fn accounts ->
1000 * Enum.count(accounts)
end
=> {sales: [%{accounts: ["XYZ Inc.", "ABC Co.", "McDoogles"], bonus: 3000, salesperson: "Joe"},
%{accounts: ["ACME", "Tarjet"], bonus: 2000, salesperson: "Bob"},
%{accounts: ["Homely Depot", "Element 84", "Pear Computers"], bonus: 3000, salesperson: "Jane"},
%{accounts: ["Four Guys", "Gas 'N Sip'"], bonus: 2000, salesperson: "Jill"}],
total_revenue: 10000000}
Conclusion
In this series we have seen that familiar tasks such as looping over an array or working with nested data structures can be challenging for programmers moving from an imperative language to a functional one like Elixir; challenging, but not impossible. Once you start to think in terms of transforming data this opens up a lot of possibilities. Hopefully the examples we have worked through here have begun to help you understand how these kinds of problems can be solved with functional approaches. nAnd hopefully I have convinced you that functional approaches are not reserved for academic problems, but are applicable to the kinds of problems you encounter every day – and that you don’t have to be na genius to use them.