Global.Church Developer Portal
For Data Managers

Data Quality and SHACL

Every piece of data that enters the Global.Church knowledge graph is validated against a set of rules before it is accepted. These rules are defined using SHACL -- the Shapes Constraint Language. This guide explains what SHACL does, what it checks, and how to understand validation errors.


What is SHACL?

SHACL (pronounced "shackle") is a W3C standard for defining rules that data must follow. Think of it as a quality checklist: before data enters the knowledge graph, SHACL checks that it meets the requirements.

Examples of rules SHACL can enforce:

  • "Every Organization must have a name."
  • "An Organization can have at most one website URL."
  • "The organization type must be a valid concept from the approved list."

These rules are called shapes because they define the expected "shape" of valid data.


What We Validate

The Global.Church SHACL shapes check three categories:

Required Fields

Certain properties must be present. For example, every Organization must have:

  • A name (rdfs:label) -- at least one
  • An organization type (gc:hasOrganizationType) -- at least one, must be a valid SKOS concept

Value Types

Properties must contain the right kind of data:

  • gc:website must be an xsd:anyURI value (not just a text string)
  • gc:email must be a string
  • gc:phone must be a string
  • Dates must be valid date/datetime values

Cardinality

Some properties can only appear once per entity:

  • An Organization can have at most one website (sh:maxCount 1)
  • An Organization can have at most one email (sh:maxCount 1)
  • But an Organization can have multiple organization types
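The three categories above can be approximated locally before you submit anything. The following is a minimal sketch, not the portal's validator: the `RULES` table and the `precheck` function are hypothetical names, and the record is assumed to be a plain dictionary mapping property names to lists of values. The SHACL shapes remain the authoritative rules.

```python
# Hypothetical local pre-check; the authoritative rules are the SHACL shapes.
# Each rule maps a property to (min_count, max_count); None means "no upper limit".
RULES = {
    "rdfs:label": (1, None),
    "gc:hasOrganizationType": (1, None),
    "gc:website": (0, 1),
    "gc:email": (0, 1),
}


def precheck(record):
    """Return a list of human-readable problems for one organization record.

    `record` maps property names to lists of values.
    """
    problems = []
    for prop, (min_count, max_count) in RULES.items():
        count = len(record.get(prop, []))
        if count < min_count:
            problems.append(f"{prop}: expected at least {min_count}, found {count}")
        if max_count is not None and count > max_count:
            problems.append(f"{prop}: expected at most {max_count}, found {count}")
    return problems


# Example record: no organization type and two websites.
org = {
    "rdfs:label": ["Example Community Church"],
    "gc:website": ["https://example.org", "https://example.com"],
}
for problem in precheck(org):
    print(problem)
```

A check like this only catches missing or repeated properties. It does not replace server-side validation, which also checks datatypes and concept links.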

Example Shape

Here is a simplified version of the Organization shape, showing what it checks:

    :OrganizationShape
        a sh:NodeShape ;
        sh:targetClass gc:Organization ;

        # Required: every org must have a name
        sh:property [
            sh:path rdfs:label ;
            sh:minCount 1 ;
            sh:description "Organization name (required)." ;
        ] ;

        # Required: at least one organization type
        sh:property [
            sh:path gc:hasOrganizationType ;
            sh:class skos:Concept ;
            sh:minCount 1 ;
            sh:message "Every Organization must have at least one gc:hasOrganizationType." ;
        ] ;

        # Optional but validated: website must be a URI
        sh:property [
            sh:path gc:website ;
            sh:datatype xsd:anyURI ;
            sh:maxCount 1 ;
        ] ;

        # Optional but validated: email must be a string
        sh:property [
            sh:path gc:email ;
            sh:datatype xsd:string ;
            sh:maxCount 1 ;
        ] .

Reading this shape: "Any entity typed as gc:Organization must have at least one label and at least one organization type. If it has a website, there can be only one and it must be an xsd:anyURI value. If it has an email, there can be only one and it must be a string."

Church Shape

Churches have their own SHACL shape that targets gc:Church directly (using sh:targetClass). This validates church-specific properties that don't apply to other organizations:

    :ChurchOrganizationShape
        a sh:NodeShape ;
        sh:targetClass gc:Church ;

        # Church-specific: multi-campus flag
        sh:property [
            sh:path gc:isMultiCampus ;
            sh:datatype xsd:boolean ;
            sh:maxCount 1 ;
            sh:severity sh:Info ;
            sh:message "Churches should indicate multi-campus status." ;
        ] ;

        # Church-specific: services schedule
        sh:property [
            sh:path gc:servicesInfo ;
            sh:datatype xsd:string ;
            sh:maxCount 1 ;
            sh:severity sh:Info ;
        ] .

Since gc:Church is a subclass of gc:Organization, church instances are validated by both the OrganizationShape (name and type required) and the ChurchOrganizationShape (church-specific properties). This works because sh:targetClass also selects instances of subclasses, so shapes compose automatically and you don't need to duplicate constraints.
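To make the composition concrete, here is a small sketch of how a class-based target can be resolved through the subclass hierarchy. It is an illustration only: the dictionaries and function names are hypothetical, and a real SHACL engine performs this lookup against the RDF graph rather than Python dictionaries.

```python
# Assumed class hierarchy and shape targets, taken from the two shapes above.
SUBCLASS_OF = {"gc:Church": "gc:Organization"}
SHAPE_TARGETS = {
    ":OrganizationShape": "gc:Organization",
    ":ChurchOrganizationShape": "gc:Church",
}


def superclasses(cls):
    """Return cls followed by its ancestors (single inheritance for brevity)."""
    chain = [cls]
    while chain[-1] in SUBCLASS_OF:
        chain.append(SUBCLASS_OF[chain[-1]])
    return chain


def applicable_shapes(cls):
    """Return the shapes whose target class is cls or one of its ancestors."""
    ancestors = set(superclasses(cls))
    return sorted(s for s, target in SHAPE_TARGETS.items() if target in ancestors)


print(applicable_shapes("gc:Church"))
# [':ChurchOrganizationShape', ':OrganizationShape']
print(applicable_shapes("gc:Organization"))
# [':OrganizationShape']
```

The lookup only goes upward: an entity typed only as gc:Organization is not checked against the church-specific shape.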


When Validation Runs

SHACL validation runs at a specific point in the data lifecycle:

  1. Data is submitted via the ingest API (POST /v0/ingest).
  2. SHACL validation runs against the submitted data.
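  3. If validation produces no results of Violation severity, the data is loaded into the organization's named graph in GraphDB. Warning and Info results do not block loading (see Severity Levels).
  4. If any constraint of Violation severity fails, the submission is rejected with a 422 Unprocessable Entity response and a list of violations.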

This means invalid data never enters the knowledge graph. Validation acts as a gate, not a post-hoc audit.
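On the client side, the gate can be handled with a small function that interprets the response. The sketch below assumes you already have the HTTP status code and the response body as text from whatever HTTP client you use. Only the 422 case is documented here, so treating any 2xx status as "accepted" is an assumption, and `handle_ingest_response` is a hypothetical helper name.

```python
import json


def handle_ingest_response(status, body):
    """Interpret an ingest response as (accepted, violations).

    Only the 422 case is documented; treating any 2xx status as
    "accepted" is an assumption of this sketch.
    """
    if status == 422:
        report = json.loads(body)
        return False, report.get("violations", [])
    if 200 <= status < 300:
        return True, []
    raise RuntimeError(f"Unexpected ingest response status: {status}")


# Sample 422 body in the format described under "Understanding Errors".
sample_body = json.dumps({
    "error": "SHACL validation failed",
    "violations": [{
        "focusNode": "https://data.global.church/org/my-church",
        "path": "http://www.w3.org/2000/01/rdf-schema#label",
        "message": "Organization name (required).",
        "severity": "Violation",
    }],
})
accepted, violations = handle_ingest_response(422, sample_body)
print(accepted, len(violations))  # False 1
```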


Understanding Errors

When validation fails, the API returns a report listing each violation. Here is an example:

    {
      "error": "SHACL validation failed",
      "violations": [
        {
          "focusNode": "https://data.global.church/org/my-church",
          "path": "http://www.w3.org/2000/01/rdf-schema#label",
          "message": "Organization name (required).",
          "severity": "Violation"
        },
        {
          "focusNode": "https://data.global.church/org/my-church",
          "path": "https://ontology.global.church/core#hasOrganizationType",
          "message": "Every Organization must have at least one gc:hasOrganizationType.",
          "severity": "Violation"
        }
      ]
    }

Each violation tells you:

  • focusNode -- which entity has the problem
  • path -- which property is missing or invalid
  • message -- a human-readable description of the rule that was broken
  • severity -- Violation (must fix), Warning (should fix), or Info (recommendation)
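When a submission contains many entities, it can help to group the report by entity before fixing anything. The following sketch parses the example response and prints one block per focusNode. The helper names (`local_name`, `summarize`) are hypothetical, and the code relies only on the four fields listed above.

```python
import json
from collections import defaultdict

# Sample report copied from the example response above.
REPORT_JSON = """
{
  "error": "SHACL validation failed",
  "violations": [
    {
      "focusNode": "https://data.global.church/org/my-church",
      "path": "http://www.w3.org/2000/01/rdf-schema#label",
      "message": "Organization name (required).",
      "severity": "Violation"
    },
    {
      "focusNode": "https://data.global.church/org/my-church",
      "path": "https://ontology.global.church/core#hasOrganizationType",
      "message": "Every Organization must have at least one gc:hasOrganizationType.",
      "severity": "Violation"
    }
  ]
}
"""


def local_name(iri):
    """Return the part of an IRI after the last '#' or '/'."""
    return iri.replace("#", "/").rstrip("/").rsplit("/", 1)[-1]


def summarize(report):
    """Return printable lines, grouped by the entity that has the problem."""
    grouped = defaultdict(list)
    for violation in report.get("violations", []):
        grouped[violation["focusNode"]].append(violation)
    lines = []
    for node, items in grouped.items():
        lines.append(node)
        for v in items:
            lines.append(f"  [{v['severity']}] {local_name(v['path'])}: {v['message']}")
    return lines


for line in summarize(json.loads(REPORT_JSON)):
    print(line)
```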

Severity Levels

Not all shape constraints are equally strict:

Severity    Meaning                                            Effect
Violation   Data does not meet a required constraint           Submission rejected
Warning     Data is missing a recommended property             Submission accepted with warnings
Info        Data is missing an optional but helpful property   Submission accepted, informational only

For example, having a location (gc:hasLocation) is marked as Info severity -- it is helpful but not required. Having a name (rdfs:label) is a Violation -- it is mandatory.
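The severity rules reduce to a simple decision: only Violation results block a submission. The sketch below shows that logic on a list of result dictionaries. The function names are hypothetical, and how Warning and Info results are reported for an accepted submission is not specified here, so the input format is an assumption modeled on the violation entries.

```python
def split_results(results):
    """Separate results that block a submission from advisory ones."""
    blocking = [r for r in results if r["severity"] == "Violation"]
    advisory = [r for r in results if r["severity"] in ("Warning", "Info")]
    return blocking, advisory


def is_rejected(results):
    """A submission is rejected when at least one result is a Violation."""
    blocking, _ = split_results(results)
    return bool(blocking)


results = [
    {"path": "gc:hasLocation", "severity": "Info"},
    {"path": "rdfs:label", "severity": "Violation"},
]
print(is_rejected(results))      # True: the missing label blocks the submission
print(is_rejected(results[:1]))  # False: an Info result alone does not
```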


Common Issues

Missing required name. Every Organization must have an rdfs:label. If your data uses a different property for the name (e.g., just gc:orgName without rdfs:label), add the label.

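Wrong data type for URLs. The gc:website property expects an xsd:anyURI value. A plain string such as "example.com" fails this check, and because sh:datatype tests the datatype of the literal, so does an untyped string that merely looks like a URL. Use a full URI such as "https://example.com" and, when submitting RDF, type it as xsd:anyURI ("https://example.com"^^xsd:anyURI).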
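If you generate Turtle yourself, a small helper can reject bare hostnames and emit the typed literal in one step. This is a sketch under stated assumptions: `website_literal` is a hypothetical helper, restricting values to http and https is a choice made for this example rather than a documented rule, and the xsd: prefix is assumed to be declared in your Turtle document.

```python
from urllib.parse import urlparse


def website_literal(url):
    """Format a URL as a Turtle literal typed xsd:anyURI.

    Raises ValueError when the value has no http(s) scheme or no host,
    for example "example.com".
    """
    parts = urlparse(url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"Not an absolute http(s) URL: {url!r}")
    # Escape backslashes and double quotes so the Turtle string stays valid.
    escaped = url.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"^^xsd:anyURI'


print(website_literal("https://example.com"))
# "https://example.com"^^xsd:anyURI
```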

Missing organization type. Every Organization needs at least one gc:hasOrganizationType linking to a concept from the OrganizationTypeScheme (Church, MissionAgency, Denomination, Network, etc.).

Invalid classification links. Properties like gc:hasBeliefClassification must point to valid SKOS concepts that exist in the graph. Check the vocabulary graphs for valid concept URIs.

