Stephen Newey


Developer shock at the utility of JSON Schema

22 February 2015

As a developer who’s spent years working with popular “scripting languages” I’ve developed a certain dislike of XML. Especially its formality. It’s all great when you’re an “enterprise” developer using “enterprise” tools to build “enterprise” solutions.

I’m led to believe Java and .NET take away all the complexity of turning things like SOAP into something useful. Not so in the worlds of PHP, Python, JavaScript et al. I await a barrage of links to libraries I’m not aware of that do just that.

And then JSON came along. Oh my god. This is just so easy. It reads like Python. My browser can understand it with a single command.

And then some people came along and decided that JSON things need schemas. And I was all like… “urgh! get your dirty XML-covered fingers off my JSON!”

Then I started building Kotaka, a deployment system that’s configured entirely in JSON. And I found myself writing code that looked like this:

# The key must be present...
if 'name' not in kip:
    raise ConfigurationError(
        "Name not specified for Instance Provider in %s"
        % kip['_filename']
    )
# ...and its value must be a string (unicode, as this is Python 2).
if not isinstance(kip['name'], unicode):
    raise ConfigurationError(
        "Name is not a string for Instance Provider in %s"
        % kip['_filename']
    )

Repeating those checks for every field quickly became tiresome. Perhaps there’s something to this idea of schemas after all?

I’d been writing example JSON files already, so with a little more work to define schemas for them I’ve been able to replace the if-soup with a single line.

from jsonschema import validate

...

validate(kip, KotakaInstanceProviderSchema)

And I’m left with readable files: an example of the format in the docstring, and a self-documenting schema that describes it.

"""
{
  "KotakaInstanceProvider": {
    "name": "LocalDocker",
    "provider": "docker",
    "options": {
      "url": "unix://var/run/docker.sock",
      "docker_file": null
    }
  }
}
"""

KotakaInstanceProviderSchema = {
    "$schema": "http://json-schema.org/draft-04/schema#",
    "title": "KotakaInstanceProvider",
    "description": "A configuration for an instance provider",
    "type": "object",
    "properties": {
        "name": {
            "description": "Name of provider configuration",
            "type": "string",
            "not": {
                "enum": ["docker", "ec2", "linode", ],
            },
        },
        "provider": {
            "description": "Kotaka instance provider",
            "enum": ["docker", "ec2", "linode", ],
        },
        "options": {
            "description": "Instance provider-specific configuration",
            "type": "object",
        }
    },
    "required": ["name", "provider", ],
}

if __name__ == '__main__':
    import json
    print(json.dumps(KotakaInstanceProviderSchema))

Any errors picked up by the validator are identified with useful language and references, so usability isn’t hurt and I don’t need to explain every possible error in English with lots of string interpolations.
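To give a feel for what the validator reports, here is a minimal sketch rather than Kotaka's actual code. It uses a trimmed copy of the schema above and jsonschema's `Draft4Validator.iter_errors` to collect every failure instead of stopping at the first. The `describe_errors` helper and the `broken` config are made up for illustration.

```python
from jsonschema import Draft4Validator

# Trimmed copy of the schema from this post (descriptions dropped).
schema = {
    "$schema": "http://json-schema.org/draft-04/schema#",
    "type": "object",
    "properties": {
        "name": {
            "type": "string",
            "not": {"enum": ["docker", "ec2", "linode"]},
        },
        "provider": {"enum": ["docker", "ec2", "linode"]},
        "options": {"type": "object"},
    },
    "required": ["name", "provider"],
}


def describe_errors(config, schema):
    """Return a sorted list of (location, keyword, message) tuples,
    one per validation failure. Hypothetical helper, not part of Kotaka."""
    validator = Draft4Validator(schema)
    report = []
    for error in validator.iter_errors(config):
        # error.path holds the keys leading to the offending value;
        # it is empty when the problem is with the top-level object.
        location = "/".join(str(part) for part in error.path) or "<root>"
        report.append((location, error.validator, error.message))
    return sorted(report)


# A deliberately broken config: "name" is not a string, "provider" is missing.
broken = {"name": 42, "options": {}}
for location, keyword, message in describe_errors(broken, schema):
    print("%s: %s (%s)" % (location, message, keyword))
```

Each entry names the failing keyword (`required`, `type`, `not` and so on) and where in the document it failed, which is most of what the hand-written messages were doing.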

The final change I’d like to make is adding a test that checks the example in the docstring against the schema below. That way I’ll have to keep my examples accurate.
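Such a test might look something like the sketch below. It rests on a few assumptions of mine: `EXAMPLE_DOC` stands in for the module's `__doc__`, the schema is a trimmed copy, and the object to validate is the one nested under the key matching the schema's `title`. The `example_from_doc` helper is hypothetical.

```python
import json
import unittest

from jsonschema import validate

# Stand-in for the module docstring (__doc__) that holds the example.
EXAMPLE_DOC = """
{
  "KotakaInstanceProvider": {
    "name": "LocalDocker",
    "provider": "docker",
    "options": {
      "url": "unix://var/run/docker.sock",
      "docker_file": null
    }
  }
}
"""

# Trimmed copy of the schema from this post (descriptions dropped).
SCHEMA = {
    "$schema": "http://json-schema.org/draft-04/schema#",
    "title": "KotakaInstanceProvider",
    "type": "object",
    "properties": {
        "name": {
            "type": "string",
            "not": {"enum": ["docker", "ec2", "linode"]},
        },
        "provider": {"enum": ["docker", "ec2", "linode"]},
        "options": {"type": "object"},
    },
    "required": ["name", "provider"],
}


def example_from_doc(doc, schema):
    """Parse the JSON example and return the object under the schema's title.

    Assumption: the example wraps the configuration in a single key named
    after the schema title, and only the wrapped object is validated.
    """
    return json.loads(doc)[schema["title"]]


class DocstringExampleTest(unittest.TestCase):
    def test_example_matches_schema(self):
        # validate() raises jsonschema.ValidationError on a mismatch,
        # which unittest reports as an error for this test.
        validate(example_from_doc(EXAMPLE_DOC, SCHEMA), SCHEMA)
```

Saved in a test module, this runs under `python -m unittest`, and the test breaks as soon as the example and the schema drift apart.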

Tags: development, json, kotaka