A functional style leads to simpler code
Functional programming (FP) has been picking up steam these last few years. It is now the de facto standard for front-end programming with frameworks such as React, which leverages a lot of its concepts (higher-order components, immutability, reducers, avoiding object-oriented programming constructs such as inheritance, etc.).
As a pragmatic programmer who's not an expert in FP, I'm taking more and more inspiration from FP to write code in my day-to-day languages (Python and JavaScript/TypeScript, mostly). While those languages are not pure FP languages (e.g., they don't have strict support for immutability or currying by default), they are sufficiently versatile and pragmatic that you can absolutely write in a functional style.
Object-oriented patterns to avoid
OOP (object-oriented programming) has its use cases. If you find yourself having to reinvent concepts from OOP with functions, you're probably doing it wrong and you should probably reuse battle-tested OOP patterns instead.
That being said, there are a number of OOP patterns and styles to avoid. Here are two of them:
- Avoid inheritance
- Avoid modifying instance state
Avoid inheritance
Avoiding inheritance is a pretty old idea, notably popularized by the Gang of Four book (Design Patterns: Elements of Reusable Object-Oriented Software, published in 1994), but for some reason inheritance is still very present in lots of frameworks.
The React documentation explains pretty well why composition makes more sense than inheritance for UI.
Multiple layers of inheritance lead to code that is very complicated to follow. This is even more the case when you use abstract classes where some methods must be overridden and others come with a default implementation. With more than one layer of inheritance, understanding the flow carries a high cognitive load.
Let's say you have this "abstract" class:
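import time


class AbstractToaster:
    def set_heat(self, level: int) -> None:
        # Subclasses must override this method
        raise NotImplementedError()

    def wait(self, seconds: float) -> None:
        time.sleep(seconds)

    def heat(self) -> None:
        self.set_heat(4)

    def toast(self) -> None:
        self.heat()
        self.wait(4)
        self.set_heat(0)

And this concrete implementation:

class Toaster(AbstractToaster):
    def set_heat(self, level: int) -> None:
        ...


toaster = Toaster()
toaster.toast()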
In toaster.py, it is nearly impossible to understand what's going on, and you'll have to open abstract_toaster.py to understand the flow. This is fine with one layer and two files. It becomes a nightmare when there are multiple levels, a larger number of more complicated methods, and multiple superclasses.
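For comparison, here is a minimal sketch of the same toasting cycle without any class hierarchy. This is only one possible shape, not the canonical one: the toast function and its set_heat, wait and seconds parameters are names made up for this illustration. The idea is simply that the behavior is passed in instead of being inherited, so the whole flow can be read in one place.

```python
import time
from typing import Callable


def toast(
    set_heat: Callable[[int], None],
    wait: Callable[[float], None] = time.sleep,
    seconds: float = 4,
) -> None:
    """Run one toasting cycle using the behaviors passed in as arguments."""
    set_heat(4)    # heat up
    wait(seconds)  # let the bread toast
    set_heat(0)    # switch the heat off
```

In a test you can pass a list's append method as set_heat and a no-op as wait, then check the recorded heat levels without waiting four seconds.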
Avoid modifying instance state
Let's take this extreme example, which creates a class to sum up a list:
class Adder:
    def __init__(self, values: list[int]):
        self.values = values
        self._current_value = None

    def get_sum(self) -> int:
        """Get the sum of a list."""
        self._sum = 0
        for value in self.values:
            self._current_value = value
            self._add()
        return self._sum

    def _add(self) -> None:
        """Add the current value to the running sum."""
        self._sum += self._current_value


result = Adder([1, 1]).get_sum()
assert result == 2
Granted - this is an extreme example. But you might see a similar structure with state stored on the instance, modified and queried in different places. This code will prove very complicated to test, because you need to take into account the order in which things are supposed to be called.
Avoiding global variables is well accepted. They make code difficult to reason about and to test. Note that the previous example is actually very similar to global state:
_CURRENT_SUM = None


def get_sum(values: list[int]) -> int:
    """Get the sum of a list."""
    global _CURRENT_SUM
    _CURRENT_SUM = 0
    for value in values:
        _CURRENT_SUM += value
    return _CURRENT_SUM


result = get_sum([1, 1])
assert result == 2
What are the issues with modifying instance state?
- It's more difficult to read: the state self._sum comes from outside the function, so that's one more thing your brain has to keep track of.
- It's more difficult to follow: there is no guarantee that self._sum is not modified somewhere else. In a real-world example with subclasses, this becomes a nightmare.
- It's more difficult to test: you have to set up the Adder instance before testing. In a real-world example, this might take a few lines.
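By contrast, here is a small sketch of the same computation where the running total lives in a local variable. Nothing outside the function can read or modify it, and a test only needs to call the function. (In real code you would of course reach for Python's built-in sum; the loop is kept to mirror the example above.)

```python
def get_sum(values: list[int]) -> int:
    """Get the sum of a list using only local state."""
    total = 0  # local variable: invisible to the rest of the program
    for value in values:
        total += value
    return total
```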
Take inspiration from FP constructs
You can take advantage of FP even in languages that are not considered pure FP languages. Even though you won't get strict guarantees, it will help you write more readable and testable code.
Limiting and isolating side effects
A function is said to have side effects if it causes observable changes in a system's state in addition to (or instead of) its return value.
def no_side_effect(name: str) -> str:
    return f"hello, {name}"


# This function prints to screen, which is a side effect
def with_side_effect(name: str) -> None:
    print(f"hello, {name}")
Another way to think about side effects is testing. If asserting the return value of the function is sufficient to validate its core behavior, then it does not have any side effect:
# To check the function's behavior, we only need to check its return value
def test_no_side_effect():
    assert no_side_effect("Louis") == "hello, Louis"


# In this test we have just checked the return value, but we still need to
# check that it has printed something on the screen. This test is incomplete.
def test_with_side_effect():
    assert with_side_effect("Louis") is None
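To give an idea of the extra work a side effect demands, here is one way the incomplete test could be finished, using contextlib.redirect_stdout from the standard library to capture what gets printed. The test name is made up for this sketch, and with_side_effect is repeated so the snippet runs on its own.

```python
import contextlib
import io


def with_side_effect(name: str) -> None:
    print(f"hello, {name}")


def test_with_side_effect_prints_greeting() -> None:
    buffer = io.StringIO()
    # Temporarily send everything written to stdout into the buffer
    with contextlib.redirect_stdout(buffer):
        result = with_side_effect("Louis")
    assert result is None
    # print() appends a newline to the message
    assert buffer.getvalue() == "hello, Louis\n"
```

Compare this with the one-line test for no_side_effect: the side effect alone accounts for most of the test code.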
Here are some examples of side effects:
- Reading/writing the database
- Reading/writing a file
- Printing something on the screen
- Requesting an HTTP service
- Getting feedback from the user
Evidently, you must have side effects at some point. So the key idea here is to cleanly separate code that has side effects from pure functional code.
If you follow the Domain-Driven Design mindset, this can be done by having all side effects in specialized layers, such as Repository, Gateway, etc. Then you can strive for side effect-free domain services.
Side effect-free functions are easier to test, easier to read and easier to debug, so they should be preferred whenever you can. Try to isolate the side effects as much as possible.
Let's say we have a list of image URLs and we want to filter out the images that are too small. Here's a version that does not cleanly isolate side effects:
import requests

MIN_SIZE = 1024 * 1024


def filter_out_small_images(image_urls: list[str]) -> list[str]:
    returned = []
    for url in image_urls:
        # this makes an HTTP request
        res = requests.get(url)
        assert res.status_code == 200
        if len(res.content) > MIN_SIZE:
            returned.append(url)
    return returned
Testing this function is going to be annoying, because we are making HTTP requests in the middle of it. Since we should avoid external dependencies in unit tests, we'll have to mock out the HTTP request, which will involve a bit of code. This function accepts a list, so it might make multiple HTTP calls that we'll have to mock - which is even more complicated.
Another thing we can observe is that this code is not as readable as it could be. The true business logic (filtering out small images) is hidden in the middle of HTTP request logic.
So the first step is to create a get_size function that has the side effect. We still need to wire this with the rest of the logic. Note that as an added benefit, get_size is more generic and can be reused for other URLs.
import requests


def get_size(url: str) -> int:
    """Get a URL's size."""
    res = requests.get(url)
    assert res.status_code == 200
    return len(res.content)
Embrace a declarative style
While imperative code tells the computer what to do, declarative code tells it what we want. An imperative style usually mixes implementation details with core business logic, while declarative usually expresses pure intent.
The best example of declarative code is SQL. When we write SQL, we don't tell the database how to return results, we just tell it what we want:
select brand, count(id) from products where price > 100 group by brand order by brand
Unfortunately we can't (yet) write code like this - but we can use a similar style, where we isolate the imperative code and write business logic using a higher level language.
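To give a feel for what "a similar style" can look like, here is a rough Python counterpart of the query above. It is only a sketch: the products list is made-up sample data standing in for the table, and count_expensive_by_brand is a name invented for this example.

```python
from collections import Counter

# Hypothetical sample data standing in for the products table
products = [
    {"id": 1, "brand": "Acme", "price": 150},
    {"id": 2, "brand": "Acme", "price": 50},
    {"id": 3, "brand": "Zenith", "price": 300},
    {"id": 4, "brand": "Acme", "price": 120},
]


def count_expensive_by_brand(
    rows: list[dict], min_price: int = 100
) -> list[tuple[str, int]]:
    """Count the rows priced above min_price per brand, ordered by brand."""
    # "where price > 100 group by brand" becomes a filtered Counter
    counts = Counter(row["brand"] for row in rows if row["price"] > min_price)
    # "order by brand" becomes a sort on the (brand, count) pairs
    return sorted(counts.items())
```

There is no index bookkeeping and no manual accumulation: each line maps to one clause of the query.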
If you look at filter_out_small_images, we use an imperative style: we create an empty list, we iterate over image URLs and we append to the list when the criterion is met. Luckily there are two FP constructs we can use to make this more declarative:
- map(func, iterable) returns an iterable of func(iterable[0]), func(iterable[1]), ...
- filter(func, iterable) returns an iterable of the items for which func(item) returns true.
import requests

MIN_SIZE = 1024 * 1024


def get_size(url: str) -> int:
    """Get a URL's size."""
    res = requests.get(url)
    assert res.status_code == 200
    return len(res.content)


def is_large(url: str) -> bool:
    return get_size(url) > MIN_SIZE


def filter_out_small_images(image_urls: list[str]) -> list[str]:
    # We use list() because filter() returns an iterable, not a list
    return list(filter(is_large, image_urls))
If your programming language supports list comprehensions, you can use one for such a simple example (personal taste!):
def filter_out_small_images(image_urls: list[str]) -> list[str]:
    return [url for url in image_urls if is_large(url)]
As you can see our code is getting even more cleanly separated, with small functions doing simple things that are easy to test. We still have to deal with the side effect though.
This code is also less imperative since we use filter, which guarantees that we are manipulating iterables and returning iterables. Once you are used to map, filter, reduce, etc. you immediately know what kind of guarantees you get. For instance with filter:
- The input must be an iterable.
- The output is an iterable.
- The output is at most the same length as the input iterable.
- The items in the output were all in the input.
All those guarantees conspire to make your code easier to reason about, just by using filter. Compare to this simplistic example:
def filter_out_small_images(image_urls: list[str]) -> list[str]:
    # It seems we're going to return a list
    returned = []
    for url in image_urls:
        if get_size(url) > MIN_SIZE:
            # We are appending the URL if it is large enough
            returned.append(url)
    return returned
Using the imperative style, you have to actually read the code to understand what's going on.
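The guarantees listed above can be checked on a toy example. The is_even and double helpers and the numbers list are made up for this sketch; the point is only that the properties hold whatever the function passed to filter or map does.

```python
def is_even(n: int) -> bool:
    return n % 2 == 0


def double(n: int) -> int:
    return n * 2


numbers = [1, 2, 3, 4, 5]

evens = list(filter(is_even, numbers))
doubled = list(map(double, numbers))

# filter: the output is never longer than the input...
assert len(evens) <= len(numbers)
# ...and every item in the output was already in the input
assert all(n in numbers for n in evens)

# map: exactly one output value per input value
assert len(doubled) == len(numbers)
```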
Use bags of data
It's better to have 100 functions operate on one data structure than 10 functions on 10 data structures -- Alan Perlis (quoted in Clojure for the Brave and True by Daniel Higginbotham)
As we have stated, we are still doing side effects in the middle of the business logic. It would be nice if we could separate them more cleanly, and have filter manipulate simple bags of data. For the sake of the example, if you are used to OOP, you might want to write things like this:
import requests

MIN_SIZE = 1024 * 1024


class ImageUrl:
    def __init__(self, url: str):
        self.url = url

    def get_size(self) -> int:
        res = requests.get(self.url)
        assert res.status_code == 200
        return len(res.content)


def is_large(url: ImageUrl) -> bool:
    return url.get_size() > MIN_SIZE


def filter_out_small_images(image_urls: list[ImageUrl]) -> list[ImageUrl]:
    return list(filter(is_large, image_urls))
This is even worse than what we had in the beginning:
- Our get_size logic is tied to the ImageUrl object and can't be reused.
- We are making an HTTP request inside an object (side effect).
For such a simple example, it might look fine, but we can do better. Look at the following example:
from dataclasses import dataclass


@dataclass
class UrlWithSize:
    url: str
    size: int


def get_url_with_size(url: str) -> UrlWithSize:
    size = get_size(url)
    return UrlWithSize(url, size)
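One possible way to continue from here, sketched under the assumption that all sizes are fetched up front with get_url_with_size: the filtering logic then only manipulates bags of data and no longer performs any HTTP request. UrlWithSize and MIN_SIZE are repeated so the snippet runs on its own; the signatures of is_large and filter_out_small_images are adapted for this sketch and may differ from where this series ends up.

```python
from dataclasses import dataclass

MIN_SIZE = 1024 * 1024


@dataclass
class UrlWithSize:
    url: str
    size: int


def is_large(image: UrlWithSize) -> bool:
    """Pure check: only looks at the data it was given."""
    return image.size > MIN_SIZE


def filter_out_small_images(images: list[UrlWithSize]) -> list[UrlWithSize]:
    """Pure filter over bags of data; no HTTP request happens here."""
    return [image for image in images if is_large(image)]
```

The side effects are then confined to a single place at the edge, for instance images = list(map(get_url_with_size, image_urls)), and everything after that line can be tested with plain UrlWithSize values and no mocking.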
To be continued...
Consider this a short introduction to FP! This shows that FP leads to simpler code that is easier to test. Happy coding!
- Simple functions that can be composed. Each function can be tested in isolation.
- Classes do have their use cases in certain languages. It's usually a good idea to keep inheritance to one level only.
- FP is stricter, e.g. map guarantees the same number of output values as input values.
- Functions require less ceremony than OOP.