Using the Facade Pattern to Wrap Third-Party Integrations
Posted by Aly Sivji in Deep Dives
The main idea behind Software Architecture Methodologies such as Clean Architecture and Hexagonal Architecture is to create loosely coupled components that can be organized into layers. This way of writing code leverages the separation of concerns design principle and makes our application easier to maintain, i.e. we can easily modify our code and test it using stubs.
There are many ways we can create systems with layered architecture; one of the more popular techniques is to leverage Structural Design Patterns to create explicit relationships between classes. This post explores how the Facade Pattern can be used to wrap third-party integrations to improve software design.
Note: this is a companion writeup to my PyTexas talk, Everyday Design Patterns: Facade Pattern.
Table of Contents
- What You Need to Follow Along
- Project Description
- Direct Integration Implementation
- Facade Pattern
- Facade Pattern Implementation
- Facade Pattern: Migrate to GitHub GraphQL API
- Conclusion
- Appendix A: Full-Featured Facade
- Appendix B: Testing with VCR.py
What You Need to Follow Along
Language
Code
- GitHub Repo
- contains a requirements.txt with pinned dependencies
Project Description
We will be creating a changelog generator.
When I cut a new release for software that I own, I include a CHANGELOG that describes all the changes made since the last release.
Below is an example changelog; it contains a list of changes with links to the relevant GitHub Pull Request:
Figure 1. Example Changelog
To simplify our example we'll make some assumptions:
- the master / main branch is protected and all changes need to be made through a Pull Request
- we squash all commits before merging into master; this means each commit in the master / main branch represents one change
The process to generate a changelog is fairly straightforward:
- get the date of the last release using the GitHub API
- get all the commit messages since that date from the GitHub API
- format the commit messages into a changelog
Direct Integration Implementation
In this section we will walk through our initial implementation of a changelog generator script; this script directly interacts with the GitHub API.
Changelog Script
Our command-line script looks as follows:
# changelog/a_direct_integration.py
import argparse

import requests


def generate_changelog(owner, repo, version):
    BASE_URL = f"https://api.github.com/repos/{owner}/{repo}"

    # get release date
    resp = requests.get(f"{BASE_URL}/releases/tags/{version}")
    if resp.status_code == 404:
        raise ValueError("Version does not exist")
    resp.raise_for_status()
    release_dt = resp.json()["published_at"]

    # get commit messages
    params = {"sha": "master", "since": release_dt}
    resp = requests.get(f"{BASE_URL}/commits", params=params)
    resp.raise_for_status()
    commit_messages = [item.get("commit", {}).get("message") for item in resp.json()]

    # format
    changelog = ["CHANGELOG", ""]
    for message in commit_messages[::-1]:
        changelog.append(f"- {message}")
    return changelog


def parse_args():
    description = "Generate changelog for repository"
    parser = argparse.ArgumentParser(description=description)
    parser.add_argument(
        "-r",
        "--repo",
        type=str,
        help="Full path to repository, (abc/xyz)",
        required=True,
    )
    parser.add_argument(
        "-v",
        "--version",
        type=str,
        help="Version to generate CHANGELOG from",
        required=True,
    )
    return vars(parser.parse_args())


if __name__ == "__main__":
    args = parse_args()
    try:
        owner, repo = args["repo"].split("/")
    except ValueError:
        raise ValueError("Invalid repo")
    version = args["version"]

    changelog = generate_changelog(owner, repo, version)
    print()
    print("\n".join(changelog))
We can run this script as follows:
$ python changelog/a_direct_integration.py -r busy-beaver-dev/busy-beaver -v 2.9.0
CHANGELOG
- Merge dictionaries using new operator in Python 3.9 (#336)
Notes
- used requests to interact with the GitHub API
- used argparse to capture and parse command-line arguments
Testing Script
To figure out what / how to test, we need to understand our current workflow.
Figure 2. Diagram of Changelog Script workflow: script interacts with the GitHub API.
The GitHub API is an external dependency that adds complexity to our testing process. It makes our tests slow as we have the additional overhead of making API requests across the internet. Also, what if GitHub goes down? The tests which depend on GitHub are going to fail. That doesn't make a lot of sense.
This is why we replace our dependency on the GitHub API with a stub that returns canned responses.
Figure 3. Diagram of Changelog Script workflow for tests: script interacts with the GitHub API Stub.
In Python, we can use the responses library to create and return canned responses for interactions made using the requests library. We can specify the JSON to return when a specified endpoint is hit with a known HTTP verb. Stubbing out external dependencies also makes our tests deterministic.
Our tests look as follows:
# tests/test_a_direct_integration.py
import responses

from changelog.a_direct_integration import generate_changelog


@responses.activate
def test_generate_changelog():
    # Arrange -- create canned responses
    responses.add(
        responses.GET,
        "https://api.github.com/repos/owner/repo/releases/tags/1.0.0",
        json={"published_at": "2020-01-26"},
    )
    responses.add(
        responses.GET,
        "https://api.github.com/repos/owner/repo/commits",
        json=[
            {"commit": {"message": "last commit"}},
            {"commit": {"message": "first commit"}},
        ],
    )

    # Act
    changelog = generate_changelog("owner", "repo", "1.0.0")

    # Assert
    assert changelog == ["CHANGELOG", "", "- first commit", "- last commit"]
To run our test:
$ pytest tests/test_a_direct_integration.py
================== test session starts ==================
platform darwin -- Python 3.9.0, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
rootdir: /Users/alysivji/siv-dev/siv-scripts/clean-architecture--facade-pattern
collected 1 item
tests/test_a_direct_integration.py . [100%]
=================== 1 passed in 0.12s ===================
Problem with Current Approach
The implementation above works, but it couples our code to something we do not control. If there is a change to our external dependency, we will have to update the generate_changelog function.
Possible changes include:
- having to modify our code if there is a GitHub API version upgrade
- rewriting our entire integration logic if we move our project to GitLab
Neither of these changes would affect our actual business logic, but we would still have to modify our code as it is tightly coupled with the integration.
This is where the Facade Pattern comes in. The Facade Pattern helps us separate parts of our code that change from parts of our code that stay the same.
Facade Pattern
Provides a unified interface to a set of interfaces in a subsystem. The Facade defines a higher-level interface that makes the subsystem easier to use.
- Head First Design Patterns
In this section, we will discuss the Facade Pattern.
Real World Example
(This example is from Head First Design Patterns)
Imagine we have a home theatre system with many different components: TV, cable box, receiver, BluRay player, and some lights. Each of the components of this home theatre system has its own remote we can use to interact with it.
Figure 4. Remote for each component of a home theatre system.
Or we can program a universal remote with a simple interface. This remote will be our Facade to our home theatre system.
Figure 5. Universal remote with a simple interface
We can interact with our interface:
- The "Watch TV" button can turn on your TV and cable box
- The "Watch a DVD" button can turn on your TV and BluRay player, and also dim your lights
If we need to access advanced features of any of our devices, we can use its supplied remote. But for most use cases, the universal remote does what we need.
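The universal remote translates naturally into code. Below is a minimal sketch, assuming hypothetical `Device` and `Lights` classes and a `UniversalRemote` Facade; none of these names come from Head First Design Patterns, and the dim level of 20 is an arbitrary value picked for illustration.

```python
class Device:
    """Hypothetical home theatre component that can be switched on."""

    def __init__(self, name):
        self.name = name
        self.is_on = False

    def turn_on(self):
        self.is_on = True


class Lights:
    """Hypothetical dimmable lights; brightness is a percentage."""

    def __init__(self):
        self.brightness = 100

    def dim(self, level):
        self.brightness = level


class UniversalRemote:
    """Facade: one simple interface in front of several devices."""

    def __init__(self, tv, cable_box, bluray, lights):
        self.tv = tv
        self.cable_box = cable_box
        self.bluray = bluray
        self.lights = lights

    def watch_tv(self):
        self.tv.turn_on()
        self.cable_box.turn_on()

    def watch_dvd(self):
        self.tv.turn_on()
        self.bluray.turn_on()
        self.lights.dim(20)  # arbitrary level chosen for this sketch


tv = Device("TV")
cable_box = Device("cable box")
bluray = Device("BluRay player")
lights = Lights()

remote = UniversalRemote(tv, cable_box, bluray, lights)
remote.watch_dvd()
print(tv.is_on, bluray.is_on, cable_box.is_on, lights.brightness)
```

The individual devices stay fully usable on their own; the Facade just bundles the common sequences behind two method calls.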
Class Diagram
We can visualize the Facade Pattern using the following diagram:
Figure 6. Facade Pattern Class Diagram
In the above diagram, we have multiple clients interacting with a complex subsystem through a Facade.
Use Cases
We can use the Facade Pattern to:
Wrap Third-Party Integrations
Third-party integrations (libraries, APIs, SDKs) are general-purpose tools designed to solve many different types of problems. Usually, we only require a small subset of the functionality a library provides.
We can use the Facade Pattern to "wrap" our integration and only expose the functionality we require.
If our clients require additional functionality from a third-party integration, we can expand the interface of our Facade for that use case. Our abstraction starts to leak if clients start bypassing the Facade.
Break Apart a Monolith
We can use the Facade Pattern to move from a monolith to microservices. Once we know the functionality we are migrating into a new service, we place this logic inside of a Facade. Then we rewrite our monolith's business logic to use the Facade.
Then when we are ready, we can replace method calls inside of the Facade with calls to another service via an API or by putting tasks on a queue.
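As a rough illustration of this two-step migration, here is a sketch that assumes a made-up "welcome email" feature. The names `NotificationsFacade`, `QueuedNotificationsFacade`, and `register_user` are hypothetical, and the standard library's `queue.Queue` stands in for a real message broker.

```python
import queue


def send_welcome_email_locally(user_id, outbox):
    """Logic that still lives inside the monolith (hypothetical)."""
    outbox.append(f"welcome:{user_id}")


class NotificationsFacade:
    """Step 1: the Facade wraps logic that is still in-process."""

    def __init__(self, outbox):
        self._outbox = outbox

    def send_welcome_email(self, user_id):
        send_welcome_email_locally(user_id, self._outbox)


class QueuedNotificationsFacade:
    """Step 2: same interface, but the work is handed to another service."""

    def __init__(self, task_queue):
        self._task_queue = task_queue

    def send_welcome_email(self, user_id):
        # a separate service would consume this task from the queue
        self._task_queue.put({"task": "send_welcome_email", "user_id": user_id})


def register_user(user_id, notifications):
    """Business logic: only knows the Facade's interface."""
    notifications.send_welcome_email(user_id)
    return {"id": user_id, "status": "registered"}


outbox = []
register_user(1, NotificationsFacade(outbox))

tasks = queue.Queue()
register_user(2, QueuedNotificationsFacade(tasks))
```

`register_user` is identical in both cases; only the object behind the Facade's interface changed.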
Benefits of the Facade Pattern
Using the Facade Pattern provides the following benefits:
Reduces interface of 3rd party integrations
Usually, we only require a small subset of functionality from third-party libraries. We can use the Facade Pattern to simplify a library's interface to only the subset we require.
This can also improve our code's readability. Instead of directly integrating dependencies using each library's API, we can write business logic in the language of our problem domain.
Weak Coupling
Our clients do not need to know about the underlying implementation of the integration. They only need to know the integration's interface: function names, what parameters it takes, what it sends back.
We can change the implementation of the integration and our clients wouldn't know as long as the interface stayed the same. Another way to say this is: we "program to interfaces, not to implementations".
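One way to make such an interface explicit in Python is `typing.Protocol` (Python 3.8+). The sketch below is an assumption-laden illustration: `SourceControlClient` and `InMemoryClient` are hypothetical names, and `generate_changelog` here takes the client as a parameter, which differs from the scripts in this post.

```python
from typing import List, Protocol


class SourceControlClient(Protocol):
    """Hypothetical interface our business logic depends on."""

    def get_release_date(self, owner: str, repo: str, version: str) -> str:
        ...

    def get_commit_messages(self, owner: str, repo: str, release_dt: str) -> List[str]:
        ...


class InMemoryClient:
    """Any class with matching methods satisfies the Protocol; no inheritance needed."""

    def __init__(self, release_dt: str, messages: List[str]):
        self._release_dt = release_dt
        self._messages = messages

    def get_release_date(self, owner: str, repo: str, version: str) -> str:
        return self._release_dt

    def get_commit_messages(self, owner: str, repo: str, release_dt: str) -> List[str]:
        return list(self._messages)


def generate_changelog(client: SourceControlClient, owner: str, repo: str, version: str) -> List[str]:
    release_dt = client.get_release_date(owner, repo, version)
    messages = client.get_commit_messages(owner, repo, release_dt)
    return ["CHANGELOG", ""] + [f"- {message}" for message in messages]


client = InMemoryClient("2020-01-26", ["first commit", "last commit"])
changelog = generate_changelog(client, "owner", "repo", "1.0.0")
```

A REST-backed, GraphQL-backed, or GitLab-backed client could be passed in the same way, provided it implements the two methods.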
Separation of concerns
We abstract parts of our code that change, from parts of our code that stay the same. This allows us to develop and test each component independently.
Test by replacing each component boundary with a value
Just like we stubbed out the GitHub API, we can stub out each boundary and unit test our component. There is a great talk by Gary Bernhardt that explores this topic in a lot more depth.
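To make "a value at the boundary" concrete, here is a small sketch assuming we pull the formatting step into its own function. `format_changelog` is a hypothetical helper, not part of the scripts in this post; the point is that it takes plain values in and returns plain values out, so testing it needs no stubs at all.

```python
def format_changelog(commit_messages):
    """Pure function: a list of strings in, a list of strings out."""
    changelog = ["CHANGELOG", ""]
    changelog.extend(f"- {message}" for message in commit_messages)
    return changelog


# the "boundary" is just a list, so a test only has to supply one
formatted = format_changelog(["first commit", "last commit"])
```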
Facade Pattern Implementation
We will refactor our previous script using the Facade Pattern. To do this we need to wrap all the logic associated with the GitHub API in a class.
Another way to say this is: we want to encapsulate the GitHub API into a higher-level abstraction that we can use in our business logic.
Changelog Script
Our updated command-line script looks as follows:
# changelog/b_facade.py
import argparse

import requests

BASE_URL = "https://api.github.com"


def generate_changelog(owner, repo, version):
    github = GitHubClient()
    release_dt = github.get_release_date(owner, repo, version)
    commit_messages = github.get_commit_messages(owner, repo, release_dt)

    changelog = ["CHANGELOG", ""]
    for message in commit_messages:
        changelog.append(f"- {message}")
    return changelog


class GitHubClient:
    """Facade around GitHub REST API"""

    def get_release_date(self, owner, repo, version):
        url = f"{BASE_URL}/repos/{owner}/{repo}/releases/tags/{version}"
        resp = requests.get(url)
        if resp.status_code == 404:
            raise ValueError("Version does not exist")
        resp.raise_for_status()
        return resp.json()["published_at"]

    def get_commit_messages(self, owner, repo, release_dt):
        url = f"{BASE_URL}/repos/{owner}/{repo}/commits"
        params = {"sha": "master", "since": release_dt}
        resp = requests.get(url, params=params)
        resp.raise_for_status()
        messages = [item.get("commit", {}).get("message") for item in resp.json()]
        return messages[::-1]


def parse_args():
    description = "Generate changelog for repository"
    parser = argparse.ArgumentParser(description=description)
    parser.add_argument(
        "-r",
        "--repo",
        type=str,
        help="Full path to repository, (abc/xyz)",
        required=True,
    )
    parser.add_argument(
        "-v",
        "--version",
        type=str,
        help="Version to generate CHANGELOG from",
        required=True,
    )
    return vars(parser.parse_args())


if __name__ == "__main__":
    args = parse_args()
    try:
        owner, repo = args["repo"].split("/")
    except ValueError:
        raise ValueError("Invalid repo")
    version = args["version"]

    changelog = generate_changelog(owner, repo, version)
    print()
    print("\n".join(changelog))
We can run this script as follows:
$ python changelog/b_facade.py -r busy-beaver-dev/busy-beaver -v 2.9.0
CHANGELOG
- Merge dictionaries using new operator in Python 3.9 (#336)
Notes
- this is a simple Facade that retrieves information from public GitHub repos
- using sessions can improve performance, see Appendix A
Testing Script
To test the above script, we need to use responses as we did before. We will also need to test the generate_changelog driver function, which interacts with the Facade to create a changelog.
This looks as follows:
# tests/test_b_facade.py
from unittest import mock

import responses

from changelog.b_facade import generate_changelog, GitHubClient


@responses.activate
def test_github_client_get_release_date():
    responses.add(
        responses.GET,
        "https://api.github.com/repos/owner/repo/releases/tags/1.0.0",
        json={"published_at": "2020-01-26"},
    )
    github = GitHubClient()

    release_dt = github.get_release_date("owner", "repo", "1.0.0")

    assert release_dt == "2020-01-26"


@responses.activate
def test_github_client_get_commit_messages():
    responses.add(
        responses.GET,
        "https://api.github.com/repos/owner/repo/commits",
        json=[
            {"commit": {"message": "last commit"}},
            {"commit": {"message": "first commit"}},
        ],
    )
    github = GitHubClient()

    messages = github.get_commit_messages("owner", "repo", "release_dt")

    assert messages == ["first commit", "last commit"]


class GitHubClientStub:
    def __init__(self, commit_messages=None):
        self.commit_messages = commit_messages
        self.mock = mock.Mock()

    def get_release_date(self, *args, **kwargs):
        self.mock(*args, **kwargs)

    def get_commit_messages(self, *args, **kwargs):
        self.mock(*args, **kwargs)
        return self.commit_messages


@mock.patch("changelog.b_facade.GitHubClient")
def test_generate_changelog(github_mock):
    commit_messages = ["first commit", "last commit"]
    github_mock.return_value = GitHubClientStub(commit_messages)

    messages = generate_changelog("owner", "repo", "1.0.0")

    assert messages == ["CHANGELOG", "", "- first commit", "- last commit"]
To run our test:
$ pytest tests/test_b_facade.py
================== test session starts ==================
platform darwin -- Python 3.9.0, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
rootdir: /Users/alysivji/siv-dev/siv-scripts/clean-architecture--facade-pattern
collected 3 items
tests/test_b_facade.py ... [100%]
=================== 3 passed in 0.11s ===================
Notes
- in addition to replacing our GitHub integration boundary with a stub, we also replaced our internal integration boundary with a value
- this way of development allows us to write robust tests; we can write loads of unit tests to make sure each component works as expected
Facade Pattern: Migrate to GitHub GraphQL API
Now that we wrapped the GitHub API, let's explore how to refactor the underlying implementation in the Facade without changing business logic.
Throughout this post, we have been interacting with GitHub using the REST API interface. In this section, we will be migrating our integration to use the GraphQL API.
There are many videos that describe what GraphQL is and how to use it, but that's beyond the scope of what we need to know. For our purposes, GraphQL is a query language that retrieves the exact data we ask for. Instead of having to parse through large JSON blobs, we can make requests to get the exact information we need.
Changelog Script
Our refactored integration looks as follows:
# changelog/c_graphql.py
import os
import argparse

from sgqlc.endpoint.requests import RequestsEndpoint

GITHUB_TOKEN = os.getenv("GITHUB_TOKEN", None)
BASE_URL = "https://api.github.com"


def generate_changelog(owner, repo, version):
    github = GitHubClient(GITHUB_TOKEN)
    release_dt = github.get_release_date(owner, repo, version)
    commit_messages = github.get_commit_messages(owner, repo, release_dt)

    changelog = ["CHANGELOG", ""]
    for message in commit_messages:
        changelog.append(f"- {message}")
    return changelog


class GitHubClient:
    """Facade around GitHub GraphQL API"""

    def __init__(self, oauth_token):
        # use the token passed in, not the module-level constant
        headers = {"Authorization": f"Bearer {oauth_token}"}
        self.endpoint = RequestsEndpoint("https://api.github.com/graphql", headers)

    def get_release_date(self, owner, repo, tag):
        query = """
            query findReleaseDt($owner: String!, $repo: String!, $tag: String!) {
              repository(owner: $owner, name: $repo) {
                release(tagName: $tag) {
                  publishedAt
                }
              }
            }
        """
        variables = {"owner": owner, "repo": repo, "tag": tag}
        data = self.endpoint(query, variables)
        try:
            return data["data"]["repository"]["release"]["publishedAt"]
        except TypeError:  # returns {"release": None} if tag does not exist
            raise ValueError("Version does not exist")

    def get_commit_messages(self, owner, repo, release_dt):
        query = """
            query commitsSinceDt($owner: String!, $repo: String!, $branch: String!, $since_dt: GitTimestamp) {
              repository(owner: $owner, name: $repo) {
                object(expression: $branch) {
                  ... on Commit {
                    history(since: $since_dt) {
                      nodes {
                        messageHeadline
                      }
                    }
                  }
                }
              }
            }
        """  # noqa
        variables = {
            "owner": owner,
            "repo": repo,
            "branch": "master",
            "since_dt": release_dt,
        }
        data = self.endpoint(query, variables)
        if "errors" in data:
            # surface every error message returned by the API
            error_messages = [error["message"] for error in data["errors"]]
            raise ValueError("; ".join(error_messages))
        commits = data["data"]["repository"]["object"]["history"]["nodes"]
        commit_messages = [commit["messageHeadline"] for commit in commits]
        return commit_messages[::-1]


def parse_args():
    description = "Generate changelog for repository"
    parser = argparse.ArgumentParser(description=description)
    parser.add_argument(
        "-r",
        "--repo",
        type=str,
        help="Full path to repository, (abc/xyz)",
        required=True,
    )
    parser.add_argument(
        "-v",
        "--version",
        type=str,
        help="Version to generate CHANGELOG from",
        required=True,
    )
    return vars(parser.parse_args())


if __name__ == "__main__":
    args = parse_args()
    try:
        owner, repo = args["repo"].split("/")
    except ValueError:
        raise ValueError("Invalid repo")
    version = args["version"]

    changelog = generate_changelog(owner, repo, version)
    print()
    print("\n".join(changelog))
We can run this script as follows:
$ python changelog/c_graphql.py -r busy-beaver-dev/busy-beaver -v 2.9.0
CHANGELOG
- Merge dictionaries using new operator in Python 3.9 (#336)
Notes
- used Simple GraphQL Client to interact with GitHub's GraphQL endpoint
- created a GitHub Personal Access Token as it is required to make GraphQL queries
Discussion
Notice that the only change we made was to our GitHub integration; our actual business logic stayed the same. This is exactly what we should expect because our business logic doesn't care if we use the GitHub REST API or the GitHub GraphQL API.
It treats the GitHub integration like a black box. As long as the integration's interface stays the same, our code will work as expected.
To complete this task, we will need to update our contract tests. I will leave this as an exercise for the reader. Appendix B walks through an API testing strategy that records requests and responses.
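As a starting point for that exercise, here is one possible sketch that avoids the network entirely. The GraphQL Facade calls `self.endpoint(query, variables)`, so any callable with that shape can stand in for the real endpoint. `FakeEndpoint` is a hypothetical test double, and the `GitHubClient` below is a trimmed copy that accepts the endpoint as a constructor argument, which is an assumption made so the example is self-contained.

```python
class FakeEndpoint:
    """Hypothetical stand-in for the GraphQL endpoint: returns a canned payload."""

    def __init__(self, payload):
        self.payload = payload
        self.calls = []

    def __call__(self, query, variables=None):
        self.calls.append(variables)
        return self.payload


class GitHubClient:
    """Trimmed copy of the GraphQL Facade with an injected endpoint (assumption)."""

    def __init__(self, endpoint):
        self.endpoint = endpoint

    def get_release_date(self, owner, repo, tag):
        query = """
            query findReleaseDt($owner: String!, $repo: String!, $tag: String!) {
              repository(owner: $owner, name: $repo) {
                release(tagName: $tag) {
                  publishedAt
                }
              }
            }
        """
        variables = {"owner": owner, "repo": repo, "tag": tag}
        data = self.endpoint(query, variables)
        try:
            return data["data"]["repository"]["release"]["publishedAt"]
        except TypeError:  # release is None when the tag does not exist
            raise ValueError("Version does not exist")


def test_get_release_date():
    payload = {"data": {"repository": {"release": {"publishedAt": "2020-01-26T19:04:10Z"}}}}
    endpoint = FakeEndpoint(payload)

    release_dt = GitHubClient(endpoint).get_release_date("owner", "repo", "1.0.0")

    assert release_dt == "2020-01-26T19:04:10Z"
    assert endpoint.calls == [{"owner": "owner", "repo": "repo", "tag": "1.0.0"}]


def test_get_release_date_missing_tag():
    endpoint = FakeEndpoint({"data": {"repository": {"release": None}}})
    try:
        GitHubClient(endpoint).get_release_date("owner", "repo", "0.0.0")
    except ValueError as exc:
        assert str(exc) == "Version does not exist"
    else:
        raise AssertionError("expected ValueError for a missing tag")
```

This only checks how the Facade handles payloads we wrote by hand; it does not verify that GitHub actually returns data in this shape, which is where recorded responses help.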
Conclusion
In this post, we wrapped a third-party integration using the Facade Pattern. This results in loosely coupled code that is easy to maintain and even easier to test.
In the Appendices below, we will build a full-featured Facade and show an easy way to test API integrations.
Additional Resources
- Freeman, Eric & Robson, Elizabeth. (2004). Head First Design Patterns: A Brain-Friendly Guide. 1st ed. Sebastopol, CA: O’Reilly Media
- “Gang of Four”. (1994). Design Patterns: Elements of Reusable Object-Oriented Software. 1st ed. Boston, MA: Addison-Wesley Professional
- Gary Bernhardt: Boundaries
- Martin, Robert. (2017). Clean Architecture. 1st ed. Upper Saddle River, NJ: Prentice Hall
Appendix A: Full-Featured Facade
Our running example created a simple Facade to demonstrate concepts without additional overhead. While the code does work, it's not something we would use in production.
To create a proper abstraction around the GitHub API we need the following:
- requests.Session to improve performance
- set HTTP headers (Content-Type, User-Agent, Accept, etc.) to be a good citizen of the web
- HTTP Basic Authentication using a GitHub Access Token with repo permissions
- will allow us to access private repos
Implementation
# changelog/d_full_featured_facade.py
import os
import argparse

import requests

GITHUB_TOKEN = os.getenv("GITHUB_TOKEN", None)
BASE_URL = "https://api.github.com"


def generate_changelog(owner, repo, version):
    github = GitHubClient(GITHUB_TOKEN)
    release_dt = github.get_release_date(owner, repo, version)
    commit_messages = github.get_commit_messages(owner, repo, release_dt)

    changelog = ["CHANGELOG", ""]
    for message in commit_messages:
        changelog.append(f"- {message}")
    return changelog


class GitHubClient:
    def __init__(self, oauth_token):
        headers = {
            "User-Agent": "Change Log",
            "Accept": "application/vnd.github.v3+json",
            "Authorization": f"token {oauth_token}",
            "Content-Type": "application/json",
        }
        session = requests.session()
        session.headers.update(headers)
        self.session = session

    def get_release_date(self, owner, repo, version):
        url = f"{BASE_URL}/repos/{owner}/{repo}/releases/tags/{version}"
        resp = self.session.get(url)
        if resp.status_code == 404:
            raise ValueError("Version does not exist")
        resp.raise_for_status()
        return resp.json()["published_at"]

    def get_commit_messages(self, owner, repo, release_dt):
        url = f"{BASE_URL}/repos/{owner}/{repo}/commits"
        params = {"sha": "master", "since": release_dt}
        resp = self.session.get(url, params=params)
        resp.raise_for_status()
        messages = [item.get("commit", {}).get("message") for item in resp.json()]
        return messages[::-1]


def parse_args():
    description = "Generate changelog for repository"
    parser = argparse.ArgumentParser(description=description)
    parser.add_argument(
        "-r",
        "--repo",
        type=str,
        help="Full path to repository, (abc/xyz)",
        required=True,
    )
    parser.add_argument(
        "-v",
        "--version",
        type=str,
        help="Version to generate CHANGELOG from",
        required=True,
    )
    return vars(parser.parse_args())


if __name__ == "__main__":
    args = parse_args()
    try:
        owner, repo = args["repo"].split("/")
    except ValueError:
        raise ValueError("Invalid repo")
    version = args["version"]

    changelog = generate_changelog(owner, repo, version)
    print()
    print("\n".join(changelog))
Appendix B: Testing with VCR.py
We used the responses library to stub out an external API in order to create deterministic tests. While this method does work, it requires us to manually construct each response payload.
An alternative approach to testing is to use VCR.py. VCR.py records requests and responses and saves them to disk as YAML files; these files are called cassettes. When we run our tests, VCR.py uses the cassettes to replay the recorded requests and responses.
This approach deserves its own post, but that's beyond the scope of this essay.
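To build intuition for record-and-replay, here is a toy sketch using only the standard library. It is not how VCR.py is implemented: the `Cassette` class, its `get` method, and the JSON file format are all assumptions made for illustration (VCR.py itself patches the HTTP layer and stores YAML).

```python
import json
import os
import tempfile


class Cassette:
    """Toy record/replay store: first call records, later calls replay from disk."""

    def __init__(self, path, fetch):
        self.path = path
        self.fetch = fetch  # callable that performs the real request
        self.recordings = {}
        if os.path.exists(path):
            with open(path) as f:
                self.recordings = json.load(f)

    def get(self, url):
        if url not in self.recordings:
            # record: hit the real dependency once and persist the response
            self.recordings[url] = self.fetch(url)
            with open(self.path, "w") as f:
                json.dump(self.recordings, f)
        # replay: serve the stored response
        return self.recordings[url]


calls = []


def live_fetch(url):
    """Pretend network call; counts how often it is used."""
    calls.append(url)
    return {"published_at": "2020-01-26"}


def no_network(url):
    raise RuntimeError("network should not be hit on replay")


with tempfile.TemporaryDirectory() as tmp_dir:
    cassette_path = os.path.join(tmp_dir, "cassette.json")
    url = "https://example.test/release"
    recorded = Cassette(cassette_path, live_fetch).get(url)
    replayed = Cassette(cassette_path, no_network).get(url)
```

The second `Cassette` never touches `no_network` because the response is already on disk, which is the property that keeps replayed tests fast and deterministic.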
Implementation
We need to replace the Authorization header, which contains our GitHub Access Token, with a dummy value to ensure secrets do not get saved in our cassettes. With pytest, we can add the following snippet to our conftest.py:
# conftest.py
import pytest


@pytest.fixture(scope="session")
def vcr_config():
    """Overwrite headers where key can be leaked"""
    return {
        "filter_headers": [("authorization", "DUMMY")],
    }
Our tests will look as follows:
# tests/test_vcrpy.py
import os
from unittest import mock

import pytest

from changelog.d_full_featured_facade import generate_changelog, GitHubClient


class GitHubClientStub:
    def __init__(self, commit_messages=None):
        self.commit_messages = commit_messages
        self.mock = mock.Mock()

    def get_release_date(self, *args, **kwargs):
        self.mock(*args, **kwargs)

    def get_commit_messages(self, *args, **kwargs):
        self.mock(*args, **kwargs)
        return self.commit_messages


@mock.patch("changelog.d_full_featured_facade.GitHubClient")
def test_generate_changelog(github_mock):
    commit_messages = ["first commit", "last commit"]
    github_mock.return_value = GitHubClientStub(commit_messages)

    messages = generate_changelog("owner", "repo", "1.0.0")

    assert messages == ["CHANGELOG", "", "- first commit", "- last commit"]


@pytest.mark.vcr(cassette_library_dir="tests/cassettes/rest")
def test_github_client_get_release_date():
    GITHUB_TOKEN = os.getenv("GITHUB_TOKEN", None)
    github = GitHubClient(GITHUB_TOKEN)

    release_dt = github.get_release_date("busy-beaver-dev", "busy-beaver", "1.3.2")

    assert release_dt == "2020-01-26T19:04:10Z"


@pytest.mark.vcr(cassette_library_dir="tests/cassettes/rest")
def test_github_client_get_commit_messages():
    GITHUB_TOKEN = os.getenv("GITHUB_TOKEN", None)
    github = GitHubClient(GITHUB_TOKEN)
    release_dt = "2020-01-25T19:04:10Z"

    messages = github.get_commit_messages("busy-beaver-dev", "busy-beaver", release_dt)

    assert "Update to Python 3.9 (#335)" in messages