zavod Data Factory
This page documents the data processing framework used by OpenSanctions. It's called zavod to distinguish it from some of the other software components used by the OpenSanctions project.
zavod provides a runtime context and a set of helpers for running crawler scripts that capture data from any online source, convert it to the followthemoney data model, store the output and eventually produce the export files used by OpenSanctions' data consumers.
Getting started
- Installation of zavod on your machine
- Tutorial: how to add a crawler
- Metadata: how to write excellent dataset metadata
Further references
- Data inclusion criteria - what data will be included in OpenSanctions?
- What is an entity? - intro to the notion of data entities.
- FollowTheMoney data model documentation, and the OpenSanctions data dictionary (we use a subset of the schemata offered by FollowTheMoney).
- Statement-based data model - the data model used by zavod during processing.