Programming

Mutability bites

We all know it bites, but in libraries... it rips your arms apart
May 23, 2018
--
User avatar
Adrian Perez
@blackxored

A few weeks ago, there was a post on Reddit, when the author asked a very interesting question, which I'm paraphrasing as "Are there any concrete examples where mutability has introduced a software bug?".

We all (hopefully) know mutable state is bad, unfortunately it's the standard set by decades of programming. However, as hyped as functional programming is at the current era, it's still not at large, and some principles are not well understood.

In this particular case, we might ask ourselves, what are we gaining by sacrificing performance and readability, are there any examples where things go bad because of it?

The Case for the Phantom Password

A few weeks ago, I was working on the finishing touches for an application. That led us to everything from monitoring, logging, and more cross-cutting concerns.

Being heavily invested in GraphQL and Apollo, we had been logging our queries and mutations ran by our serverless backend. Apollo GraphQL Server allows you to override a logFunction at construction time, that (after some massaging to make it provide concise useful output) satisfied our needs in this regard.

We started with this code, which reduced the output to only include operations, and assigns request ID for correlating requests across "distributed" systems:

const logFunction = obj => {
  if (obj && obj.key && obj.data) {
    console.log(`[${request.id}]`, obj);
  }
};

Simple enough. But... While taking a closer look at our logs, we noticed our CreateUser mutation parameters were being logged, and, as you might have guessed, this included the user's password in plaintext in some scenarios.

Armed with this knowledge, we implemented a basic filter in the simplest possible form, akin to Rails filter params but a simpler version of it. Goal was to prevent this information being displayed in the logs:

const logFunction = obj => {
  if (obj && obj.key && obj.data) {
    if (obj.data && obj.data.input && obj.data.input.password) {
      obj.data.input.password = '<REDACTED>';
    }
    console.log(`[${request.id}]`, obj);
  }
};

We set up to replace the actual password with <REDACTED> so it doesn't show up in our logs. But, can you see the problem with this code? While you can argue about style issues with the nested if or the object access pattern, there is a bigger problem. It mutates obj. More importantly, obj is something we do not control, being provided to us by Apollo.

So, what happens here? You might be thinking is not a big deal, as we were, by operating under the assumption that the object that is provided to this function, is only used within this function. Surely, Apollo would have provided this object to us just for the purpose of logging, and then discarding it. It makes sense, right?

Unfortunately, as the title might have clued you in, that is not the case. Our particular object is passed to our function first, but it's also the same object that eventually ends up in the code path of constructing the actual GraphQL operation. Which means, that if were were to mutate obj (like we did), we would end up changing the mutation parameters.

Even though we figured this one out pretty quickly, it would've been really bad to have every user that gets registered having their password set to <REDACTED>. Clueless, they would have been unable to login, and while this wouldn't happen in our particular system, you can even imagine a "Reset Password" functionality being affected by this, leaving everyone puzzled in the process.

Fix & Final Words

I'm a big proponent of immutable data in applications. Whether persistent immutable data structures fit every application is up for debate. But I hope the concepts of avoiding mutation, treating mutable data as immutable, and programming with pure functions, is something really sinks in, and we see more of.

This becomes specially important when writing libraries, even more so when you're handing objects over to client code. In this particular case, Apollo Server assumed we wouldn't be mutating the object that gets passed to logFunction, we assumed it was only used there, hence bug.

The solution in this case is simple: since we do not have control over library code, we only need to treat mutable data as immutable.

Example follows, where we use Ramda's lenses to return a new copy of our object with a particular set of updates. For simpler cases, you might even go with Object/Rest Spread.

const logFunction = obj => {
  if (obj && obj.key && obj.data) {
    const passwordPath = lensPath(['data', 'input', 'password']);
    const password = view(passwordPath, obj);

    const logObj = password 
      ? set(passwordPath, '<REDACTED>', obj) 
      : obj;
    
    console.log(`[${request.id}]`, logObj);
  }
};

Library authors: even if you don't use immutable data, take special care when handing over objects to client code, don't reuse them, prefer cloning them...

~ EOF ~

Craftmanship Journey
λ
Software Engineering Blog

Stay in Touch


© 2020 Adrian Perez