Specification

Version 0.1 (Draft)
Status: Proposal · Profile of Google Open Knowledge Format (OKF) v0.1
Model: Defined entirely in LinkML (lokf.yaml); all other artifacts are generated from it.

LOKF is a semantic, ontology-grounded profile of the Google Open Knowledge Format (OKF). It keeps everything that makes OKF pleasant — a directory of markdown files, each with a small YAML frontmatter block describing one concept, readable with cat and diffable in git — and adds the one thing OKF v0.1 deliberately leaves out: formal meaning. In LOKF, every field, type, and relationship is bound to an established web vocabulary (schema.org, W3C DCAT, W3C PROV-O), so that a bundle of markdown files is simultaneously human-readable prose and, once a generated JSON-LD @context is attached, valid JSON-LD that expands losslessly to RDF triples.

The whole format is specified as a single LinkML schema. From that one source we generate the JSON-LD context, a JSON Schema (validation), SHACL shapes (RDF-graph validation), and an OWL ontology (reasoning). Nothing in this document is hand-maintained twice.


OKF v0.1 is intentionally minimal: the only required field is type, links between concepts are untyped markdown links, and “what types exist” is left entirely to the producer. That minimalism is a feature for hand-authoring, but it leaves three things on the table:

  1. No shared meaning across producers. Two organizations both writing type: Metric have no guarantee the field means the same thing, and an agent consuming both cannot merge them into one graph.
  2. Untyped relationships. An OKF link asserts that concept A relates to concept B, but not how. “Depends on”, “is part of”, and “was derived from” all look identical.
  3. No path to a knowledge graph. OKF is a document bundle, not RDF. You cannot query it with SPARQL, reason over it with OWL, or validate it with SHACL.

LOKF closes those gaps without breaking OKF’s authoring model. A LOKF concept file is still an OKF concept file. The semantics ride along in the frontmatter and in a published @context, so humans keep writing markdown while machines get a graph.

Goals:

  1. Preserve 100% of OKF’s ergonomics: markdown + YAML frontmatter, one concept per file, permissive consumption.
  2. Bind every concept, field, and relationship to schema.org / DCAT / PROV so a bundle is expressible as JSON-LD and RDF with no separate serialization step.
  3. Provide a typed relationship vocabulary so links carry meaning.
  4. Define the format once in LinkML and generate every downstream artifact.
  5. Remain bidirectionally compatible with OKF (see §10).

Non-goals:

  • Replacing schema.org/JSON-LD for public web pages (that is a different layer).
  • Prescribing storage, serving, or query infrastructure.
  • Mandating a closed taxonomy: the type set is extensible, and unknown types are tolerated exactly as in OKF.

Terms inherited from OKF (Knowledge Bundle, Concept, Concept ID, Frontmatter, Body, Link, Citation) keep their OKF meaning. LOKF adds:

  • Concept IRI: the concept’s stable RDF identity. By convention it is the bundle base IRI joined with the Concept ID. It is the JSON-LD @id and the subject of every triple the concept produces.
  • Base IRI: declared once in the bundle-root index.md; the namespace that turns relative Concept IDs into absolute Concept IRIs.
  • Context: the JSON-LD @context (generated from the LinkML schema) that maps frontmatter keys to IRIs. Attaching it to a concept’s frontmatter yields JSON-LD.
  • Typed relation: a frontmatter key whose value is another concept and whose RDF predicate is fixed by this spec (e.g. derivedFrom → prov:wasDerivedFrom).
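
As a minimal sketch of the Concept IRI convention, the Python function below joins a base IRI and a Concept ID. The name `concept_iri` and the exact joining rule (one slash between the parts, absolute IRIs passed through unchanged) are assumptions made for illustration; the spec itself only says the two are joined.

```python
def concept_iri(base_iri: str, concept_id: str) -> str:
    """Join a bundle base IRI and a Concept ID into an absolute Concept IRI."""
    # Assumption: a value that already carries a scheme is an absolute IRI.
    if "://" in concept_id:
        return concept_id
    # Normalise so exactly one slash separates the two parts.
    return base_iri.rstrip("/") + "/" + concept_id.lstrip("/")


print(concept_iri("https://acme.example/knowledge/", "metrics/weekly-active-users"))
```

Plain concatenation is used here instead of `urllib.parse.urljoin` because `urljoin` would drop the last path segment of a base IRI that lacks a trailing slash.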

3. Design principle: one model, many artifacts

LinkML is the single source of truth. lokf.yaml defines the classes, slots, enumerations, and their mappings to external vocabularies. Every other artifact in the LOKF package is generated from it and MUST NOT be edited by hand:

| Artifact | Generated by | Purpose |
| --- | --- | --- |
| lokf.context.jsonld | gen-jsonld-context* | Turns concept frontmatter into JSON-LD / RDF. |
| lokf.schema.json | gen-json-schema | Validates concept frontmatter (JSON Schema). |
| lokf.shacl.ttl | gen-shacl | Validates the resulting RDF graph (SHACL). |
| lokf.owl.ttl | gen-owl | Class/property ontology for reasoning & alignment. |

* The published context additionally aliases OKF’s type field to the JSON-LD @type keyword and id to @id, so that authoring in plain OKF frontmatter is enough to produce correctly-typed Linked Data (see §7.3).

Because meaning lives in the model, adding a field or a type is a one-line change in lokf.yaml; the context, schema, shapes, and ontology all re-derive.


Identical to OKF §3. A bundle is a directory tree of markdown files; index.md and log.md remain reserved; distribution as a git repo is recommended. LOKF adds optional keys to the bundle-root index.md frontmatter (the one place OKF already permits frontmatter in an index); the two a semantic consumer relies on are base_iri and context:

---
lokf_version: "0.1"                            # LOKF version this bundle targets
okf_version: "0.1"                             # OKF version it remains compatible with
base_iri: https://acme.example/knowledge/      # resolves Concept IDs to Concept IRIs
context: https://w3id.org/lokf/context.jsonld  # the @context to attach to concepts
title: Acme Knowledge Bundle
description: Canonical, agent-readable knowledge for Acme's data org.
license: https://creativecommons.org/licenses/by/4.0/
publisher:
  type: Organization
  id: https://acme.example
  name: Acme Corp
---

A consumer that ignores these keys sees a perfectly ordinary OKF bundle. A semantic consumer uses base_iri + context to lift the whole bundle into RDF.


Every concept is a UTF-8 markdown file: a YAML frontmatter block followed by a markdown body, exactly as in OKF. LOKF specifies what the frontmatter keys mean by mapping each to an RDF property.

type is the only required field (as in OKF). All others are optional.
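
The file layout described above can be separated with the standard library alone. The sketch below is illustrative rather than a reference parser: `split_frontmatter` is a hypothetical helper, and it assumes the frontmatter is delimited by `---` lines, as in the examples in this document. Parsing the returned YAML text is left to a YAML library.

```python
def split_frontmatter(text: str) -> tuple[str, str]:
    """Return (frontmatter_yaml, markdown_body) for one concept file."""
    lines = text.splitlines()
    # A concept file is expected to open with a '---' delimiter line.
    if not lines or lines[0].strip() != "---":
        return "", text
    # The next '---' line closes the frontmatter block.
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return "\n".join(lines[1:i]), "\n".join(lines[i + 1:])
    # No closing delimiter: treat the whole file as body.
    return "", text


example = "---\ntype: Metric\ntitle: Weekly Active Users\n---\n# Schema\nOne row per user per week.\n"
frontmatter_text, body = split_frontmatter(example)
print(frontmatter_text)
print(body)
```

Returning an empty frontmatter string instead of raising keeps the helper permissive; whether a file without frontmatter is acceptable is then decided by the caller.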

| Field | OKF | RDF property (slot_uri) | Range | Notes |
| --- | --- | --- | --- | --- |
| type | | rdf:type (via @type) | class | Required. Names a LOKF class (§6). |
| id | | @id (subject) | IRI | Concept IRI. Defaults to base_iri + Concept ID. |
| title | | schema:name | string | close: dcterms:title, rdfs:label |
| description | | schema:description | string | close: dcterms:description |
| resource | | schema:url | IRI | The underlying asset. close: dcat:landingPage, prov:specializationOf |
| tags | | schema:keywords | string* | close: dcat:keyword |
| timestamp | | schema:dateModified | dateTime | exact: dcterms:modified |
| created | | schema:dateCreated | dateTime | exact: dcterms:created |
| version | | schema:version | string | |
| license | | schema:license | IRI | |
| author | | schema:author | Agent* | close: dcterms:creator, prov:wasAttributedTo |
| body | | schema:text | string | The markdown after the frontmatter. |
| citations | | schema:citation | Citation* | |

(* = multivalued.) Producers MAY add any other keys; consumers MUST preserve unknown keys and MUST NOT reject documents that carry them (OKF §4.1).

5.2 Typed relationships — LOKF’s core upgrade

Where OKF has one untyped link, LOKF provides a set of named relation fields, each pinned to an RDF predicate. Values are Concept IRIs (or Concept IDs resolved against base_iri). All are optional and multivalued.

| Field | RDF predicate (slot_uri) | Meaning |
| --- | --- | --- |
| isPartOf | dcterms:isPartOf | This concept is part of the target. |
| hasPart | schema:hasPart | The target is part of this concept. |
| references | dcterms:references | This concept refers to the target. |
| dependsOn | dcterms:requires | This concept depends on the target. |
| derivedFrom | prov:wasDerivedFrom | Provenance: derived from the target. |
| about | schema:about | Subject matter of this concept. |
| sameAs | schema:sameAs | Same entity as the target (close owl:sameAs). |
| relatedTo | dcterms:relation | Generic association. |
| definedBy | rdfs:isDefinedBy | A resource that formally defines this. |
| source | dcterms:source | Sourced/derived from the target. |
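
To illustrate how the named relation fields translate into triples, the sketch below copies the field-to-predicate table into a Python dictionary and walks a concept's already-parsed frontmatter. The helper name `relation_triples` and the use of prefixed names as plain strings are assumptions for illustration; a real consumer would obtain the same result by expanding the frontmatter against the generated context.

```python
# Field -> predicate, copied from the table above (prefixed names kept as strings).
RELATION_PREDICATES = {
    "isPartOf": "dcterms:isPartOf",
    "hasPart": "schema:hasPart",
    "references": "dcterms:references",
    "dependsOn": "dcterms:requires",
    "derivedFrom": "prov:wasDerivedFrom",
    "about": "schema:about",
    "sameAs": "schema:sameAs",
    "relatedTo": "dcterms:relation",
    "definedBy": "rdfs:isDefinedBy",
    "source": "dcterms:source",
}


def relation_triples(subject: str, frontmatter: dict) -> list[tuple[str, str, str]]:
    """Return (subject, predicate, target) for every typed relation present."""
    triples = []
    for field, predicate in RELATION_PREDICATES.items():
        targets = frontmatter.get(field, [])
        # Assumption: a bare string is accepted as a one-element list.
        if isinstance(targets, str):
            targets = [targets]
        triples.extend((subject, predicate, target) for target in targets)
    return triples


subject = "https://acme.example/knowledge/metrics/weekly-active-users"
frontmatter = {
    "type": "Metric",
    "derivedFrom": ["https://acme.example/knowledge/tables/user-events"],
    "dependsOn": ["https://acme.example/knowledge/glossary/active-user"],
}
for triple in relation_triples(subject, frontmatter):
    print(triple)
```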

For predicates outside this set, use the generic relations field — a list of reified Relation objects, each a predicate (drawn from the RelationType vocabulary, e.g. joinsWith, wasAttributedTo) plus a target:

relations:
  - predicate: joinsWith
    target: https://acme.example/knowledge/tables/customers
    relation_label: "join on customer_id"

Human-facing markdown links in the body (OKF §5) remain valid and encouraged; the typed fields are the machine-readable layer that carries the kind of link.

Unchanged from OKF §4.2. Standard markdown, structural headings preferred. The conventional headings # Schema, # Examples, and # Citations retain their OKF meaning. The body is mapped to schema:text in the RDF projection.


A concept’s type SHOULD name one of the following classes. Each maps to a public ontology term; consumers MUST tolerate unknown values by treating the concept as a generic lokf:Concept (OKF §4.1 / §9).

| type | Class IRI (@type) | Aligned to |
| --- | --- | --- |
| (abstract) | lokf:Concept | broad: schema:CreativeWork, prov:Entity |
| Dataset | schema:Dataset | exact: dcat:Dataset |
| Table | lokf:Table | is-a Dataset; close dcat:Dataset |
| Metric | lokf:Metric | close: schema:Observation, skos:Concept |
| Service | schema:WebAPI | close: schema:SoftwareApplication |
| Playbook | lokf:Playbook | exact: schema:HowTo |
| Policy | lokf:Policy | close: schema:DigitalDocument |
| GlossaryTerm | schema:DefinedTerm | exact: skos:Concept |
| Reference | lokf:Reference | close: schema:CreativeWork, schema:WebPage |
| Document | lokf:Document | close: schema:DigitalDocument |
| Person | schema:Person | exact: foaf:Person, prov:Person |
| Organization | schema:Organization | exact: foaf:Organization, prov:Organization |
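
The tolerance rule for unknown type values can be sketched as a dictionary lookup with a default. The mapping below is copied from the table above; the function name `class_iri` is hypothetical.

```python
# type value -> class IRI, copied from the table above.
TYPE_CLASSES = {
    "Dataset": "schema:Dataset",
    "Table": "lokf:Table",
    "Metric": "lokf:Metric",
    "Service": "schema:WebAPI",
    "Playbook": "lokf:Playbook",
    "Policy": "lokf:Policy",
    "GlossaryTerm": "schema:DefinedTerm",
    "Reference": "lokf:Reference",
    "Document": "lokf:Document",
    "Person": "schema:Person",
    "Organization": "schema:Organization",
}


def class_iri(type_value: str) -> str:
    """Map a frontmatter type to its class, falling back to the generic lokf:Concept."""
    return TYPE_CLASSES.get(type_value, "lokf:Concept")


print(class_iri("Metric"))     # a known LOKF class
print(class_iri("Dashboard"))  # an unknown type, tolerated as a generic concept
```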

Type-specific fields are available on the relevant classes:

  • Dataset / Table: fields (a list of Field: name, datatype [FieldType → XSD], description, unit, is_key, constraints); distribution (dcat:Distribution).
  • Metric: unit (schema:unitText), formula (lokf:formula), measures (lokf:measures, → a Concept).
  • Service: endpoint (schema:url), http_method, documentation.
  • GlossaryTerm: definition (skos:definition), abbreviation (schema:alternateName).

The complete, authoritative definitions — including value objects Field, Distribution, Relation, Citation, and Agent/Person/Organization — are in lokf.yaml.


This is the mechanism that makes LOKF both “markdown-friendly” and “RDF-native”.

7.1 The identity: frontmatter + context = JSON-LD

A JSON-LD document is just JSON plus an @context that maps its keys to IRIs. LOKF’s frontmatter keys are precisely the LinkML slots, and the generated lokf.context.jsonld maps each of them to its slot_uri. Therefore:

concept frontmatter (YAML) + lokf.context.jsonld = JSON-LD
                                                   ↓ expand
                                                   RDF triples

No new syntax, no parallel file. The author writes OKF; the context supplies the meaning.
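
In code, the identity amounts to adding one key. The sketch below assumes the frontmatter has already been parsed into a Python dict (for example by a YAML library) and that date values are kept as strings so they serialize to JSON; `to_jsonld` is a hypothetical helper, and the context URL is the one shown in the index.md example.

```python
import json

# Context URL taken from the index.md example in this document.
CONTEXT_URL = "https://w3id.org/lokf/context.jsonld"


def to_jsonld(frontmatter: dict, context: str = CONTEXT_URL) -> dict:
    """Return a JSON-LD document: an @context entry followed by the frontmatter keys."""
    # Build a new dict so the caller's frontmatter is left untouched.
    return {"@context": context, **frontmatter}


# Frontmatter as it might look after YAML parsing (hypothetical, abridged).
frontmatter = {
    "type": "Metric",
    "id": "https://acme.example/knowledge/metrics/weekly-active-users",
    "title": "Weekly Active Users",
    "tags": ["growth", "engagement"],
}

print(json.dumps(to_jsonld(frontmatter), indent=2))
```

Expanding the result into triples is then the job of any JSON-LD processor that can fetch the context.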

metrics/weekly-active-users.md (abridged frontmatter):

---
type: Metric
id: https://acme.example/knowledge/metrics/weekly-active-users
title: Weekly Active Users
unit: users
tags: [growth, engagement]
timestamp: 2026-06-30T12:00:00Z
author:
  - type: Person
    id: https://acme.example/people/jsmith
    name: Jordan Smith
measures: [ https://acme.example/knowledge/glossary/active-user ]
derivedFrom: [ https://acme.example/knowledge/tables/user-events ]
dependsOn: [ https://acme.example/knowledge/glossary/active-user ]
---

Attaching the context and expanding yields (Turtle, abridged):

@prefix lokf: <https://w3id.org/lokf/> .
@prefix schema: <http://schema.org/> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<…/metrics/weekly-active-users>
    a lokf:Metric ;
    schema:name "Weekly Active Users" ;
    schema:unitText "users" ;
    schema:keywords "growth", "engagement" ;
    schema:dateModified "2026-06-30T12:00:00Z"^^xsd:dateTime ;
    schema:author <…/people/jsmith> ;
    lokf:measures <…/glossary/active-user> ;
    prov:wasDerivedFrom <…/tables/user-events> ;
    dcterms:requires <…/glossary/active-user> .

<…/people/jsmith>
    a schema:Person ;
    schema:name "Jordan Smith" .

The type: Metric field became rdf:type lokf:Metric; the typed relations became prov:, dcterms:, and lokf: predicates pointing at other concepts’ IRIs.

The published context makes exactly two changes on top of the raw LinkML output, both standard JSON-LD keyword aliasing, so that unmodified OKF frontmatter behaves as Linked Data:

  • type → @type: OKF’s required field designates the RDF class.
  • id → @id: the concept’s IRI is the RDF subject.

Everything else (title, derivedFrom, tags, …) maps to its ontology property directly from the model.
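
The effect of the two aliases can be pictured as a key rename. The fragment below is hand-written for illustration and is not the generated lokf.context.jsonld; it covers only three properties, using the namespaces listed in this document. `expand_keys` is a hypothetical helper that renames keys only, whereas a real JSON-LD processor also wraps values and drops unmapped terms.

```python
# Illustrative context fragment (assumption): two keyword aliases plus three properties.
CONTEXT_FRAGMENT = {
    "type": "@type",
    "id": "@id",
    "title": "http://schema.org/name",
    "tags": "http://schema.org/keywords",
    "derivedFrom": "http://www.w3.org/ns/prov#wasDerivedFrom",
}


def expand_keys(frontmatter: dict, context: dict = CONTEXT_FRAGMENT) -> dict:
    """Rename each mapped key to its keyword or property IRI; leave other keys as-is."""
    return {context.get(key, key): value for key, value in frontmatter.items()}


print(expand_keys({
    "type": "Metric",
    "id": "https://acme.example/knowledge/metrics/weekly-active-users",
    "title": "Weekly Active Users",
}))
```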


A bundle is LOKF v0.1 conformant if:

  1. It is a conformant OKF v0.1 bundle (OKF §9): every non-reserved .md file has parseable YAML frontmatter with a non-empty type.
  2. Every type value that names a LOKF class (§6) is used consistently with that class’s mappings; unknown types are permitted and treated as lokf:Concept.
  3. The bundle-root index.md declares base_iri and context if the bundle is to be consumed as Linked Data. (A bundle without them is still LOKF-conformant, but is consumed as plain OKF.)
  4. Typed relation fields (§5.2), when present, use the predicates defined here.

As in OKF, consumers MUST be permissive: missing optional fields, unknown type values, unknown frontmatter keys, and broken cross-links MUST NOT cause rejection.
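
A permissive consumer check can be very small. The sketch below covers only rule 1 for a single concept whose frontmatter is already parsed into a dict; the name `check_concept` and the wording of the problem message are assumptions. Everything the spec says must not cause rejection is simply not inspected.

```python
def check_concept(frontmatter: dict) -> list[str]:
    """Return conformance problems for one concept's parsed frontmatter."""
    problems = []
    type_value = frontmatter.get("type")
    # Rule 1: `type` must be present and non-empty.
    if not isinstance(type_value, str) or not type_value.strip():
        problems.append("missing or empty 'type'")
    # Unknown keys, unknown type values, and missing optional fields
    # are deliberately not checked, so they can never cause rejection.
    return problems


print(check_concept({"type": "Dashboard", "x_team": "growth"}))  # unknown type and key
print(check_concept({"title": "No type here"}))
```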


Two independent, generated validators are available:

  • JSON Schema (lokf.schema.json) validates a concept’s frontmatter (or a whole bundle serialized against the KnowledgeBundle root) before RDF projection.
  • SHACL (lokf.shacl.ttl) validates the RDF graph after projection, catching cardinality, datatype, and range violations at the triple level.

The reference bundle in examples/ passes JSON Schema validation for all six concepts and for the assembled KnowledgeBundle.


LOKF ⟷ OKF is bidirectional.

  • Every LOKF bundle is a valid OKF bundle. The semantic layer lives in optional frontmatter keys and an external context; strip them and you have OKF.
  • Every OKF bundle is a valid LOKF bundle with default interpretation: each type maps to lokf:Concept (or a matching class if the string happens to match), and untyped markdown links are treated as dcterms:relation. Adopting LOKF is therefore incremental — add ids and typed relations only where they earn their keep.
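
The default interpretation of untyped links can be sketched with a regular expression. The code assumes OKF links are ordinary inline markdown links of the form `[label](target)`; the helper name `default_link_triples` is hypothetical, and targets are returned as written, without resolving them against base_iri.

```python
import re

# Inline markdown links; the lookbehind skips image syntax such as ![alt](src).
LINK_RE = re.compile(r"(?<!!)\[([^\]]*)\]\(([^)\s]+)\)")


def default_link_triples(subject: str, body: str) -> list[tuple[str, str, str]]:
    """Treat every untyped markdown link in the body as a dcterms:relation triple."""
    return [
        (subject, "dcterms:relation", target)
        for _label, target in LINK_RE.findall(body)
    ]


body = "Built from [user events](tables/user-events.md); see the ![chart](wau.png)."
for triple in default_link_triples("metrics/weekly-active-users", body):
    print(triple)
```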

Layering with the wider ecosystem (following OKF’s own framing):

| Format | Reader | Job | LOKF’s relation |
| --- | --- | --- | --- |
| schema.org / JSON-LD | Search/answer engines | Public-page understanding, rich results | LOKF reuses its vocabulary. |
| DCAT / PROV-O | Data-catalog & provenance tools | Dataset description, lineage | LOKF binds datasets & lineage to them. |
| OKF | Your own agents | Canonical internal knowledge bundle | LOKF is a semantic profile of it. |
| llms.txt | Web crawlers | Navigate public content | Orthogonal; unchanged. |

LOKF’s contribution is to make an internal OKF bundle queryable as a knowledge graph using the same vocabularies the public web already speaks.


LOKF versions are <major>.<minor>, tracking OKF’s scheme. A minor bump adds backward-compatible fields, types, relation predicates, or mappings; a major bump may rename required fields or change reserved filenames. Bundles declare their target with lokf_version in the root index.md. Because the format is defined in LinkML, a version is exactly a tagged lokf.yaml, and the context/schema/shapes/OWL for that version are reproducible by regeneration.
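
A consumer might compare version strings as follows. The rule used here, that two versions are compatible when their major numbers match, is an assumption drawn from the statement that minor bumps are backward-compatible; `parse_version` and `is_compatible` are hypothetical names.

```python
def parse_version(value: str) -> tuple[int, int]:
    """Parse '<major>.<minor>' into a pair of integers (raises ValueError if malformed)."""
    major, minor = value.strip().split(".")
    return int(major), int(minor)


def is_compatible(consumer_version: str, bundle_version: str) -> bool:
    """Assumed rule: versions are compatible when they share a major number."""
    return parse_version(consumer_version)[0] == parse_version(bundle_version)[0]


print(is_compatible("0.1", "0.3"))
print(is_compatible("0.1", "1.0"))
```

Quoting the value in YAML, as the index.md example does with `lokf_version: "0.1"`, keeps it a string rather than a float, which is what `parse_version` expects.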


| File | Contents |
| --- | --- |
| lokf.yaml | The LinkML schema, the single source of truth. |
| lokf.context.jsonld | Generated JSON-LD context (+ type/id aliases). Attach to concepts. |
| lokf.schema.json | Generated JSON Schema for frontmatter/bundle validation. |
| lokf.shacl.ttl | Generated SHACL shapes for RDF-graph validation. |
| lokf.owl.ttl | Generated OWL ontology for reasoning/alignment. |
| examples/acme-knowledge/ | A conformant reference bundle (6 concepts). |
| examples/*.nt | RDF triples produced from the example frontmatter. |
| README.md | How the pieces fit and how to regenerate them. |

| Prefix | Namespace |
| --- | --- |
| lokf | https://w3id.org/lokf/ |
| schema | http://schema.org/ |
| dcat | http://www.w3.org/ns/dcat# |
| dcterms | http://purl.org/dc/terms/ |
| prov | http://www.w3.org/ns/prov# |
| skos | http://www.w3.org/2004/02/skos/core# |
| foaf | http://xmlns.com/foaf/0.1/ |
| rdfs | http://www.w3.org/2000/01/rdf-schema# |
| owl | http://www.w3.org/2002/07/owl# |
| xsd | http://www.w3.org/2001/XMLSchema# |

LOKF v0.1 is a draft profile and is not affiliated with or endorsed by Google. “Open Knowledge Format” and “OKF” refer to the format published by Google Cloud; LOKF builds on it under its open terms.