Erlang: find cross-app calls using xref

Using xref magic to query compiled beam files and find cross-application function calls in Erlang

At work, we use the multi-app project pattern to organize our codebase. This lets us track everything in a single repository but still keep things isolated.

For isolation, we wanted each app to call only the public interfaces of other apps (the facade pattern). However, since everything in Erlang lives in a global namespace, nothing prevents code in one app from calling any exported function of another app.

The next best solution is to detect such calls and raise warnings during code review or CI.

Xref to the rescue:

Xref is a cross reference tool that can be used for finding dependencies between functions, modules, applications and releases.

Xref includes some predefined analysis patterns that perform some common tasks like searching for undefined functions, deprecated function calls, unused exported functions, etc.
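As a rough illustration of these predefined analyses, here is a minimal sketch that runs three of them through xref:analyze/2. The module name xref_predefined and the server name predef are made up for this example, and the library path is set from the code path on the assumption that calls into OTP modules should not be reported as undefined.

```erlang
-module(xref_predefined).
-export([run/1]).

%% Run a few of xref's predefined analyses over every .beam file under Dir.
%% The module name and the server name `predef` are arbitrary choices.
run(Dir) ->
    {ok, _Pid} = xref:start(predef),
    try
        %% Assumption: treat everything on the code path (OTP, deps) as
        %% library modules, skipping entries that are not directories.
        LibPath = [P || P <- code:get_path(), filelib:is_dir(P)],
        ok = xref:set_library_path(predef, LibPath),
        {ok, _Modules} = xref:add_directory(predef, Dir, [{recurse, true}]),
        %% Each entry is {Analysis, {ok, Answer}} or {Analysis, Error}.
        [{Analysis, xref:analyze(predef, Analysis)}
         || Analysis <- [undefined_function_calls,
                         deprecated_function_calls,
                         exports_not_used]]
    after
        %% Always stop the server so the name can be reused.
        xref:stop(predef)
    end.
```

Calling xref_predefined:run("_build/default/lib") from the project root should return one result per analysis. Note that xref needs the beam files to be compiled with debug_info for this kind of analysis.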

How it works: when the xref server is started and some modules, applications, or releases are added for analysis, it builds a Call Graph: a directed graph containing the calls between functions, modules, applications, or releases. It also creates an Inter Call Graph, which holds information about indirect calls (chains of calls). On top of these, it exposes a very powerful query language that can be used to extract any information we want from the graphs.

To demonstrate this, I created a sample multi-app repository: library_sample. There are some cross-app function calls in this code that we want to detect.

This repo represents the functionality of a physical library. It has four apps: library, library_api, library_catalog, and library_inventory. library_catalog holds metadata about the books in the library; library_inventory holds information about the availability of books, return dates, etc.; library_api has the HTTP handlers that call both of them; and library is the main app that brings it all together.

Let’s say library_api is allowed to call library_catalog and library_inventory functions, but catalog and inventory must not call each other directly.

First, we clone the repo and run rebar3 shell:

> git clone https://github.com/srijan/library_sample
Cloning into 'library_sample'...
remote: Enumerating objects: 29, done.
remote: Counting objects: 100% (29/29), done.
remote: Compressing objects: 100% (19/19), done.
remote: Total 29 (delta 3), reused 29 (delta 3), pack-reused 0
Unpacking objects: 100% (29/29), 910.62 KiB | 2.53 MiB/s, done.

> cd library_sample

> ./rebar3 shell
===> Verifying dependencies...
===> Analyzing applications...
===> Compiling library_inventory
===> Compiling library_catalog
===> Compiling library
===> Compiling library_api
Erlang/OTP 23 [erts-11.1.7] [source] [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:1] [hipe]

Eshell V11.1.7  (abort with ^G)
1>

Then, we start xref and add our build directory for analysis:

1> xref:start(s).
{ok,<0.185.0>}

2> xref:add_directory(s, "_build/default/lib", [{recurse, true}]).
{ok,[library_api,library_app,library_catalog,
     library_inventory,library_sample_app,library_sample_sup,
     library_sup]}

Now we can use xref:q/2 to query the constructed call graph:

3> xref:q(s, "E | library_inventory || library_catalog").
{ok,[]}

4> xref:q(s, "E | library_catalog || library_inventory").
{ok,[{{library_catalog,get_by_id,1},
      {library_inventory,get_available_copies,1}}]}

This means that there are no direct calls from library_inventory to library_catalog. However, there is a direct call from library_catalog:get_by_id/1 to library_inventory:get_available_copies/1.

The query E | library_catalog || library_inventory can be read as:

  • E = All Call Graph Edges
  • | = The subset of calls from any of the vertices. So | library_catalog creates a subset which contains calls from the library_catalog app.
  • || = The subset of calls to any of the vertices. So, || library_inventory further creates a subset of the previous subset which contains calls to the library_inventory app.
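Since the query is just a string built from two names, the same check can be repeated for several pairs. The following is only a sketch: the module xref_boundaries and its functions are hypothetical helpers, not part of xref, and they assume an xref server that has already been started and populated as shown earlier.

```erlang
-module(xref_boundaries).
-export([direct_query/2, violations/2]).

%% Build the query text "E | From || To" for two names given as atoms.
%% Assumes plain names that need no quoting inside the query.
direct_query(From, To) ->
    lists:flatten(io_lib:format("E | ~s || ~s", [From, To])).

%% Run the query for every {From, To} pair against an already populated
%% xref server and keep only the pairs that produced at least one call.
%% Crashes on purpose if a query returns an error.
violations(Server, Pairs) ->
    lists:filtermap(
      fun({From, To}) ->
              {ok, Calls} = xref:q(Server, direct_query(From, To)),
              case Calls of
                  [] -> false;
                  _  -> {true, {From, To, Calls}}
              end
      end,
      Pairs).
```

In the shell session above, xref_boundaries:violations(s, [{library_inventory, library_catalog}, {library_catalog, library_inventory}]) would be expected to report only the catalog-to-inventory pair, matching the two manual queries.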

To get both direct and indirect calls, closure E has to be used:

5> xref:q(s, "closure E | library_catalog || library_inventory").
{ok,[{{library_catalog,get_by_id,1},
      {library_inventory,get_all,0}},
     {{library_catalog,get_by_id,1},
      {library_inventory,get_available_copies,1}}]}

This tells us that there is also an indirect call from library_catalog:get_by_id/1 to library_inventory:get_all/0.
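To make the difference between direct and indirect calls concrete, here is a toy model built with OTP's digraph module rather than xref itself. It assumes that get_available_copies/1 is the function that calls get_all/0; the article's output does not show that intermediate edge, so treat it as a guess made for illustration. The module name closure_sketch is also made up.

```erlang
-module(closure_sketch).
-export([demo/0]).

%% Model three functions as vertices and calls as directed edges, then
%% compare direct callees with everything reachable (the closure).
demo() ->
    G = digraph:new(),
    GetById   = {library_catalog, get_by_id, 1},
    GetCopies = {library_inventory, get_available_copies, 1},
    GetAll    = {library_inventory, get_all, 0},
    lists:foreach(fun(V) -> digraph:add_vertex(G, V) end,
                  [GetById, GetCopies, GetAll]),
    %% Direct call reported by the "E" query.
    digraph:add_edge(G, GetById, GetCopies),
    %% Assumed intermediate call; not shown in the xref output.
    digraph:add_edge(G, GetCopies, GetAll),
    %% Direct callees, comparable to "E".
    Direct = lists:sort(digraph:out_neighbours(G, GetById)),
    %% Everything reachable in one or more steps, comparable to "closure E".
    Reachable = lists:sort(digraph_utils:reachable_neighbours([GetById], G)),
    true = digraph:delete(G),
    {Direct, Reachable}.
```

closure_sketch:demo() returns {[{library_inventory,get_available_copies,1}], [{library_inventory,get_all,0}, {library_inventory,get_available_copies,1}]}: one direct callee, and two functions reachable once chains of calls are followed.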

The query language is very powerful, and there are more interesting examples in the xref user’s guide.

But so far we have only run the queries manually in the Erlang shell. We want to be able to run them in continuous integration. Luckily, rebar3 lets us specify custom xref queries that run as part of ./rebar3 xref, and it raises an error if a query's result doesn’t match the expected value we define.

Here’s the xref section from my rebar.config:

{xref_queries, [
                {"closure E | library_catalog || library_inventory", []},
                {"closure E | library_inventory || library_catalog", []}
               ]}.

This performs the two queries I want and matches them against the target value of []. Sample output:

> ./rebar3 xref
===> Verifying dependencies...
===> Analyzing applications...
===> Compiling library_inventory
===> Compiling library_catalog
===> Compiling library
===> Compiling library_api
===> Running cross reference analysis...
===> Query closure E | library_catalog || library_inventory
 answer []
 did not match [{{library_catalog,get_by_id,1},{library_inventory,get_all,0}},
                {{library_catalog,get_by_id,1},
                 {library_inventory,get_available_copies,1}}]
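For completeness, custom queries can sit next to rebar3's built-in xref checks in the same rebar.config. The fragment below is a sketch: the two checks listed are an illustrative choice, not taken from the sample repository.

```erlang
%% rebar.config (fragment)

%% Predefined xref analyses to run; this selection is only an example.
{xref_checks, [undefined_function_calls,
               deprecated_function_calls]}.

%% Custom queries, each paired with the result it is expected to return.
{xref_queries, [
                {"closure E | library_catalog || library_inventory", []},
                {"closure E | library_inventory || library_catalog", []}
               ]}.
```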

So, now this is ready for automation.
