Specification
Version 0.1 — Draft
Status: Proposal · Profile of Google Open Knowledge Format (OKF) v0.1
Model: Defined entirely in LinkML (lokf.yaml); all other artifacts are generated from it.
LOKF is a semantic, ontology-grounded profile of the Google Open Knowledge
Format (OKF). It keeps everything that makes OKF pleasant — a directory of
markdown files, each with a small YAML frontmatter block describing one concept,
readable with cat and diffable in git — and adds the one thing OKF v0.1
deliberately leaves out: formal meaning. In LOKF, every field, type, and
relationship is bound to an established web vocabulary (schema.org, W3C DCAT,
W3C PROV-O), so that a bundle of markdown files is simultaneously human-readable
prose and, once a generated JSON-LD @context is attached, valid JSON-LD that
expands losslessly to RDF triples.
The whole format is specified as a single LinkML schema. From that one source we generate the JSON-LD context, a JSON Schema (validation), SHACL shapes (RDF-graph validation), and an OWL ontology (reasoning). Nothing in this document is hand-maintained twice.
1. Motivation
OKF v0.1 is intentionally minimal: the only required field is type, links
between concepts are untyped markdown links, and “what types exist” is left
entirely to the producer. That minimalism is a feature for hand-authoring, but it
leaves three things on the table:
- No shared meaning across producers. Two organizations both writing type: Metric have no guarantee the field means the same thing, and an agent consuming both cannot merge them into one graph.
- Untyped relationships. An OKF link asserts that concept A relates to concept B, but not how. “Depends on”, “is part of”, and “was derived from” all look identical.
- No path to a knowledge graph. OKF is a document bundle, not RDF. You cannot query it with SPARQL, reason over it with OWL, or validate it with SHACL.
LOKF closes those gaps without breaking OKF’s authoring model. A LOKF concept
file is still an OKF concept file. The semantics ride along in the frontmatter and
in a published @context, so humans keep writing markdown while machines get a
graph.
Goals
- Preserve 100% of OKF’s ergonomics: markdown + YAML frontmatter, one concept per file, permissive consumption.
- Bind every concept, field, and relationship to schema.org / DCAT / PROV so a bundle is expressible as JSON-LD and RDF with no separate serialization step.
- Provide a typed relationship vocabulary so links carry meaning.
- Define the format once in LinkML and generate every downstream artifact.
- Remain bidirectionally compatible with OKF (see §10).
Non-goals
- Replacing schema.org/JSON-LD for public web pages (that is a different layer).
- Prescribing storage, serving, or query infrastructure.
- Mandating a closed taxonomy — the type set is extensible, and unknown types are tolerated exactly as in OKF.
2. Terminology
Terms inherited from OKF (Knowledge Bundle, Concept, Concept ID, Frontmatter, Body, Link, Citation) keep their OKF meaning. LOKF adds:
- Concept IRI: the concept’s stable RDF identity. By convention it is the bundle base IRI joined with the Concept ID. It is the JSON-LD @id and the subject of every triple the concept produces.
- Base IRI: declared once in the bundle-root index.md; the namespace that turns relative Concept IDs into absolute Concept IRIs.
- Context: the JSON-LD @context (generated from the LinkML schema) that maps frontmatter keys to IRIs. Attaching it to a concept’s frontmatter yields JSON-LD.
- Typed relation: a frontmatter key whose value is another concept and whose RDF predicate is fixed by this spec (e.g. derivedFrom → prov:wasDerivedFrom).
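The Concept IRI convention can be illustrated with a minimal Python sketch. The function name concept_iri is hypothetical, and the sketch assumes that a Concept ID is a bundle-relative path without the .md suffix; neither detail is fixed by the text above.

```python
from urllib.parse import urljoin


def concept_iri(base_iri: str, concept_id: str) -> str:
    """Join a bundle base IRI with a relative Concept ID.

    Assumption: the Concept ID is a bundle-relative path such as
    "metrics/weekly-active-users" (no ".md" suffix).
    """
    if not base_iri.endswith("/"):
        base_iri += "/"  # otherwise urljoin would replace the last path segment
    # Absolute IRIs pass through urljoin unchanged.
    return urljoin(base_iri, concept_id.lstrip("/"))


print(concept_iri("https://acme.example/knowledge/", "metrics/weekly-active-users"))
```

Normalizing the trailing slash up front keeps the join predictable regardless of how the base IRI was written in index.md.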
3. Design principle: one model, many artifacts
LinkML is the single source of truth. lokf.yaml defines the classes, slots,
enumerations, and their mappings to external vocabularies. Every other artifact in
the LOKF package is generated from it and MUST NOT be edited by hand:
| Artifact | Generated by | Purpose |
|---|---|---|
| lokf.context.jsonld | gen-jsonld-context* | Turns concept frontmatter into JSON-LD / RDF. |
| lokf.schema.json | gen-json-schema | Validates concept frontmatter (JSON Schema). |
| lokf.shacl.ttl | gen-shacl | Validates the resulting RDF graph (SHACL). |
| lokf.owl.ttl | gen-owl | Class/property ontology for reasoning & alignment. |
* The published context additionally aliases OKF’s type field to the JSON-LD
@type keyword and id to @id, so that authoring in plain OKF frontmatter is
enough to produce correctly-typed Linked Data (see §7.3).
Because meaning lives in the model, adding a field or a type is a one-line change
in lokf.yaml; the context, schema, shapes, and ontology all re-derive.
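As a rough illustration of regenerating the artifacts, the Python sketch below pairs each file with the generator named in the table. It assumes the LinkML generators accept the schema path as their only argument and write to standard output; it does not reproduce the extra type/id aliasing step applied to the published context. The helper names build_commands and regenerate are hypothetical.

```python
import subprocess
from pathlib import Path

# Artifact -> LinkML generator command, as listed in the table above.
GENERATORS = {
    "lokf.context.jsonld": "gen-jsonld-context",
    "lokf.schema.json": "gen-json-schema",
    "lokf.shacl.ttl": "gen-shacl",
    "lokf.owl.ttl": "gen-owl",
}


def build_commands(schema: str = "lokf.yaml") -> dict[str, list[str]]:
    """Return the command line to run for each generated artifact."""
    return {artifact: [tool, schema] for artifact, tool in GENERATORS.items()}


def regenerate(out_dir: Path, schema: str = "lokf.yaml") -> None:
    """Run every generator and write its stdout to the artifact file.

    Assumption: each generator prints its result to stdout.
    """
    for artifact, cmd in build_commands(schema).items():
        result = subprocess.run(cmd, check=True, capture_output=True, text=True)
        (out_dir / artifact).write_text(result.stdout, encoding="utf-8")
```

Calling regenerate(Path(".")) would require the LinkML command-line tools to be installed; build_commands alone has no such dependency.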
4. Bundle structure
Identical to OKF §3. A bundle is a directory tree of markdown files; index.md
and log.md remain reserved; distribution as a git repo is recommended. LOKF adds
two optional keys to the bundle-root index.md frontmatter (the one place OKF
already permits frontmatter in an index):

```yaml
---
lokf_version: "0.1"                              # LOKF version this bundle targets
okf_version: "0.1"                               # OKF version it remains compatible with
base_iri: https://acme.example/knowledge/        # resolves Concept IDs to Concept IRIs
context: https://w3id.org/lokf/context.jsonld    # the @context to attach to concepts
title: Acme Knowledge Bundle
description: Canonical, agent-readable knowledge for Acme's data org.
license: https://creativecommons.org/licenses/by/4.0/
publisher:
  type: Organization
  id: https://acme.example
  name: Acme Corp
---
```

A consumer that ignores these keys sees a perfectly ordinary OKF bundle. A
semantic consumer uses base_iri + context to lift the whole bundle into RDF.
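Before lifting a bundle, a consumer has to enumerate its concept files. The Python sketch below walks a bundle directory and skips the reserved files. It assumes that index.md and log.md are reserved at every directory level and that a Concept ID is the bundle-relative path without the .md suffix; both are assumptions made for illustration, and concept_ids is a hypothetical helper name.

```python
from pathlib import Path

# Reserved filenames that are not concept documents.
RESERVED = {"index.md", "log.md"}


def concept_ids(bundle_root: Path) -> list[str]:
    """List Concept IDs for every non-reserved markdown file in a bundle.

    Assumption: Concept ID = path relative to the bundle root, minus ".md".
    """
    ids = []
    for path in sorted(bundle_root.rglob("*.md")):
        if path.name in RESERVED:
            continue
        ids.append(path.relative_to(bundle_root).with_suffix("").as_posix())
    return ids
```

Each returned ID can then be joined with base_iri to obtain the Concept IRI.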
5. Concept documents
Every concept is a UTF-8 markdown file: a YAML frontmatter block followed by a markdown body, exactly as in OKF. LOKF specifies what the frontmatter keys mean by mapping each to an RDF property.
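The frontmatter/body split can be sketched in a few lines of Python using only the standard library. The function name split_frontmatter is hypothetical, and the returned frontmatter is raw YAML text; parsing it into a mapping would need a YAML library, which this sketch leaves out.

```python
def split_frontmatter(text: str) -> tuple[str, str]:
    """Split a concept file into (frontmatter YAML text, markdown body).

    Returns an empty frontmatter string when the file has no leading
    "---" block or the block is never closed.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            frontmatter = "\n".join(lines[1:i])
            body = "\n".join(lines[i + 1:])
            return frontmatter, body
    return "", text  # unterminated block: treat everything as body


doc = "---\ntype: Metric\ntitle: Weekly Active Users\n---\n# Schema\nCounts distinct users.\n"
frontmatter, body = split_frontmatter(doc)
print(frontmatter)
print(body)
```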
5.1 Core frontmatter fields
type is the only required field (as in OKF). All others are optional.
| Field | OKF | RDF property (slot_uri) | Range | Notes |
|---|---|---|---|---|
| type | ✅ | rdf:type (via @type) | class | Required. Names a LOKF class (§6). |
| id | | @id (subject) | IRI | Concept IRI. Defaults to base_iri + Concept ID. |
| title | ✅ | schema:name | string | close: dcterms:title, rdfs:label |
| description | ✅ | schema:description | string | close: dcterms:description |
| resource | ✅ | schema:url | IRI | The underlying asset. close: dcat:landingPage, prov:specializationOf |
| tags | ✅ | schema:keywords | string* | close: dcat:keyword |
| timestamp | ✅ | schema:dateModified | dateTime | exact: dcterms:modified |
| created | | schema:dateCreated | dateTime | exact: dcterms:created |
| version | | schema:version | string | |
| license | | schema:license | IRI | |
| author | | schema:author | Agent* | close: dcterms:creator, prov:wasAttributedTo |
| body | ✅ | schema:text | string | The markdown after the frontmatter. |
| citations | | schema:citation | Citation* | |
(* = multivalued.) Producers MAY add any other keys; consumers MUST preserve
unknown keys and MUST NOT reject documents that carry them (OKF §4.1).
5.2 Typed relationships — LOKF’s core upgrade
Where OKF has one untyped link, LOKF provides a set of named relation fields,
each pinned to an RDF predicate. Values are Concept IRIs (or Concept IDs resolved
against base_iri). All are optional and multivalued.
| Field | RDF predicate (slot_uri) | Meaning |
|---|---|---|
| isPartOf | dcterms:isPartOf | This concept is part of the target. |
| hasPart | schema:hasPart | The target is part of this concept. |
| references | dcterms:references | This concept refers to the target. |
| dependsOn | dcterms:requires | This concept depends on the target. |
| derivedFrom | prov:wasDerivedFrom | Provenance: derived from the target. |
| about | schema:about | Subject matter of this concept. |
| sameAs | schema:sameAs | Same entity as the target (close owl:sameAs). |
| relatedTo | dcterms:relation | Generic association. |
| definedBy | rdfs:isDefinedBy | A resource that formally defines this. |
| source | dcterms:source | Sourced/derived from the target. |
For predicates outside this set, use the generic relations field — a list of
reified Relation objects, each a predicate (drawn from the RelationType
vocabulary, e.g. joinsWith, wasAttributedTo) plus a target:

```yaml
relations:
  - predicate: joinsWith
    target: https://acme.example/knowledge/tables/customers
    relation_label: "join on customer_id"
```

Human-facing markdown links in the body (OKF §5) remain valid and encouraged; the typed fields are the machine-readable layer that carries the kind of link.
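To show how the named relation fields turn into triples, here is a minimal Python sketch. The field-to-predicate mapping is copied from the table in this section; the function name relation_triples, the tuple representation of a triple, and the tolerance for scalar values are illustrative assumptions. The reified relations list is not handled here.

```python
from urllib.parse import urljoin

# Typed relation field -> RDF predicate (as a CURIE), from the table above.
RELATION_PREDICATES = {
    "isPartOf": "dcterms:isPartOf",
    "hasPart": "schema:hasPart",
    "references": "dcterms:references",
    "dependsOn": "dcterms:requires",
    "derivedFrom": "prov:wasDerivedFrom",
    "about": "schema:about",
    "sameAs": "schema:sameAs",
    "relatedTo": "dcterms:relation",
    "definedBy": "rdfs:isDefinedBy",
    "source": "dcterms:source",
}


def relation_triples(subject: str, frontmatter: dict, base_iri: str) -> list[tuple[str, str, str]]:
    """Emit (subject, predicate, object) for every typed relation field."""
    triples = []
    for field, predicate in RELATION_PREDICATES.items():
        values = frontmatter.get(field, [])
        if isinstance(values, str):
            values = [values]  # tolerate a single value written as a scalar
        for target in values:
            # Relative Concept IDs resolve against base_iri; absolute IRIs pass through.
            triples.append((subject, predicate, urljoin(base_iri, target)))
    return triples


base = "https://acme.example/knowledge/"
fm = {"derivedFrom": ["tables/user-events"], "dependsOn": "glossary/active-user"}
for triple in relation_triples(base + "metrics/weekly-active-users", fm, base):
    print(triple)
```

Keys outside the mapping are simply ignored by this function, which keeps it compatible with the rule that unknown frontmatter keys must not cause rejection.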
5.3 Body
Unchanged from OKF §4.2. Standard markdown, structural headings preferred. The
conventional headings # Schema, # Examples, and # Citations retain their OKF
meaning. The body is mapped to schema:text in the RDF projection.
6. The type vocabulary
A concept’s type SHOULD name one of the following classes. Each maps to a public
ontology term; consumers MUST tolerate unknown values by treating the concept as a
generic lokf:Concept (OKF §4.1 / §9).
| type | Class IRI (@type) | Aligned to |
|---|---|---|
| (abstract) | lokf:Concept | broad: schema:CreativeWork, prov:Entity |
| Dataset | schema:Dataset | exact: dcat:Dataset |
| Table | lokf:Table | is-a Dataset; close dcat:Dataset |
| Metric | lokf:Metric | close: schema:Observation, skos:Concept |
| Service | schema:WebAPI | close: schema:SoftwareApplication |
| Playbook | lokf:Playbook | exact: schema:HowTo |
| Policy | lokf:Policy | close: schema:DigitalDocument |
| GlossaryTerm | schema:DefinedTerm | exact: skos:Concept |
| Reference | lokf:Reference | close: schema:CreativeWork, schema:WebPage |
| Document | lokf:Document | close: schema:DigitalDocument |
| Person | schema:Person | exact: foaf:Person, prov:Person |
| Organization | schema:Organization | exact: foaf:Organization, prov:Organization |
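The fallback rule for unknown types can be sketched as a dictionary lookup in Python. The mapping is copied from the table above; the function name class_iri is hypothetical, and the real resolution is performed by the generated context, not by code like this.

```python
# type value -> class IRI (as a CURIE), from the table above.
TYPE_CLASSES = {
    "Dataset": "schema:Dataset",
    "Table": "lokf:Table",
    "Metric": "lokf:Metric",
    "Service": "schema:WebAPI",
    "Playbook": "lokf:Playbook",
    "Policy": "lokf:Policy",
    "GlossaryTerm": "schema:DefinedTerm",
    "Reference": "lokf:Reference",
    "Document": "lokf:Document",
    "Person": "schema:Person",
    "Organization": "schema:Organization",
}


def class_iri(type_value: str) -> str:
    """Resolve a frontmatter type to its class, tolerating unknown values."""
    return TYPE_CLASSES.get(type_value, "lokf:Concept")


print(class_iri("Metric"))
print(class_iri("Dashboard"))  # not in the vocabulary, falls back to lokf:Concept
```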
Type-specific fields are available on the relevant classes:
- Dataset / Table: fields (a list of Field: name, datatype [FieldType → XSD], description, unit, is_key, constraints); distribution (dcat:Distribution).
- Metric: unit (schema:unitText), formula (lokf:formula), measures (lokf:measures, → a Concept).
- Service: endpoint (schema:url), http_method, documentation.
- GlossaryTerm: definition (skos:definition), abbreviation (schema:alternateName).
The complete, authoritative definitions — including value objects Field,
Distribution, Relation, Citation, and Agent/Person/Organization — are
in lokf.yaml.
7. From markdown to RDF
This is the mechanism that makes LOKF both “markdown-friendly” and “RDF-native”.
7.1 The identity: frontmatter + context = JSON-LD
A JSON-LD document is just JSON plus an @context that maps its keys to IRIs.
LOKF’s frontmatter keys are precisely the LinkML slots, and the generated
lokf.context.jsonld maps each of them to its slot_uri. Therefore:

```
concept frontmatter (YAML)  +  lokf.context.jsonld  =  JSON-LD
                                                          ↓ expand
                                                      RDF triples
```

No new syntax, no parallel file. The author writes OKF; the context supplies the meaning.
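The identity above amounts to adding one key to the parsed frontmatter. A minimal Python sketch, assuming the frontmatter has already been parsed into a dict and that the context is referenced by URL; to_jsonld is a hypothetical helper name:

```python
import json


def to_jsonld(frontmatter: dict, context: str) -> dict:
    """Attach an @context to parsed frontmatter; the keys are left untouched."""
    return {"@context": context, **frontmatter}


frontmatter = {
    "type": "Metric",
    "id": "https://acme.example/knowledge/metrics/weekly-active-users",
    "title": "Weekly Active Users",
}
doc = to_jsonld(frontmatter, "https://w3id.org/lokf/context.jsonld")
print(json.dumps(doc, indent=2))
```

Expanding the result into triples is the job of a JSON-LD processor, which this sketch does not include.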
7.2 Worked example
metrics/weekly-active-users.md (abridged frontmatter):

```yaml
---
type: Metric
id: https://acme.example/knowledge/metrics/weekly-active-users
title: Weekly Active Users
unit: users
tags: [growth, engagement]
timestamp: 2026-06-30T12:00:00Z
author:
  - type: Person
    id: https://acme.example/people/jsmith
    name: Jordan Smith
measures: [ https://acme.example/knowledge/glossary/active-user ]
derivedFrom: [ https://acme.example/knowledge/tables/user-events ]
dependsOn: [ https://acme.example/knowledge/glossary/active-user ]
---
```

Attaching the context and expanding yields (Turtle, abridged):

```turtle
@prefix lokf:    <https://w3id.org/lokf/> .
@prefix schema:  <http://schema.org/> .
@prefix prov:    <http://www.w3.org/ns/prov#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix xsd:     <http://www.w3.org/2001/XMLSchema#> .

<…/metrics/weekly-active-users> a lokf:Metric ;
    schema:name "Weekly Active Users" ;
    schema:unitText "users" ;
    schema:keywords "growth", "engagement" ;
    schema:dateModified "2026-06-30T12:00:00Z"^^xsd:dateTime ;
    schema:author <…/people/jsmith> ;
    lokf:measures <…/glossary/active-user> ;
    prov:wasDerivedFrom <…/tables/user-events> ;
    dcterms:requires <…/glossary/active-user> .

<…/people/jsmith> a schema:Person ;
    schema:name "Jordan Smith" .
```

The type: Metric field became rdf:type lokf:Metric; the typed relations became
prov:, dcterms:, and lokf: predicates pointing at other concepts’ IRIs.
7.3 The two aliases
The published context makes exactly two changes on top of the raw LinkML output, both standard JSON-LD keyword aliasing, so that unmodified OKF frontmatter behaves as Linked Data:
- type → @type: OKF’s required field designates the RDF class.
- id → @id: the concept’s IRI is the RDF subject.
Everything else (title, derivedFrom, tags, …) maps to its ontology property
directly from the model.
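The effect of the aliases can be illustrated with a toy key-renaming step in Python. The context fragment below is hand-written for illustration (the real lokf.context.jsonld is generated and larger), rename_keys is a hypothetical helper, and the function is only a stand-in for JSON-LD expansion: it does not expand values such as the class name, and it drops keys that the fragment does not map.

```python
# Hand-written illustrative fragment, not the generated context.
CONTEXT_FRAGMENT = {
    "type": "@type",  # alias 1: OKF's type -> JSON-LD @type
    "id": "@id",      # alias 2: id -> JSON-LD @id
    "title": "http://schema.org/name",
    "derivedFrom": "http://www.w3.org/ns/prov#wasDerivedFrom",
}


def rename_keys(frontmatter: dict, context: dict) -> dict:
    """Replace each key with its context mapping (toy stand-in for expansion).

    Keys absent from the context are dropped in this sketch.
    """
    return {context[k]: v for k, v in frontmatter.items() if k in context}


print(rename_keys(
    {"type": "Metric", "id": "https://acme.example/knowledge/metrics/wau", "title": "WAU"},
    CONTEXT_FRAGMENT,
))
```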
8. Conformance
A bundle is LOKF v0.1 conformant if:
- It is a conformant OKF v0.1 bundle (OKF §9): every non-reserved .md file has parseable YAML frontmatter with a non-empty type.
- Every type value that names a LOKF class (§6) is used consistently with that class’s mappings; unknown types are permitted and treated as lokf:Concept.
- The bundle-root index.md declares base_iri and context if the bundle is to be consumed as Linked Data. (A bundle without them is still LOKF-conformant, but is consumed as plain OKF.)
- Typed relation fields (§5.2), when present, use the predicates defined here.
As in OKF, consumers MUST be permissive: missing optional fields, unknown type
values, unknown frontmatter keys, and broken cross-links MUST NOT cause rejection.
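A permissive consumer check might look like the following Python sketch. It assumes the frontmatter has already been parsed into a dict; check_concept is a hypothetical name, and the only condition it reports is the one the conformance list makes mandatory for every concept, a non-empty type.

```python
def check_concept(frontmatter: dict) -> list[str]:
    """Return conformance errors for one concept's parsed frontmatter.

    Only a missing or empty type is an error. Unknown type values,
    unknown keys, and missing optional fields are all tolerated.
    """
    errors = []
    type_value = frontmatter.get("type")
    if not isinstance(type_value, str) or not type_value.strip():
        errors.append("frontmatter must carry a non-empty 'type'")
    return errors


print(check_concept({"type": "Dashboard", "color": "red"}))  # unknown type and key: no errors
print(check_concept({"title": "Untyped note"}))              # missing type: one error
```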
9. Validation
Two independent, generated validators are available:
- JSON Schema (lokf.schema.json) validates a concept’s frontmatter (or a whole bundle serialized against the KnowledgeBundle root) before RDF projection.
- SHACL (lokf.shacl.ttl) validates the RDF graph after projection, catching cardinality, datatype, and range violations at the triple level.
The reference bundle in examples/ passes JSON Schema validation for all six
concepts and for the assembled KnowledgeBundle.
10. Relationship to OKF and other formats
LOKF ⟷ OKF is bidirectional.
- Every LOKF bundle is a valid OKF bundle. The semantic layer lives in optional frontmatter keys and an external context; strip them and you have OKF.
- Every OKF bundle is a valid LOKF bundle with default interpretation: each type maps to lokf:Concept (or a matching class if the string happens to match), and untyped markdown links are treated as dcterms:relation. Adopting LOKF is therefore incremental: add ids and typed relations only where they earn their keep.
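The default interpretation of untyped links can be sketched in Python with a regular expression over the markdown body. This is an illustration under assumptions that the text above does not settle: link targets are taken to be bundle-root-relative, a trailing .md is stripped to obtain the Concept ID, and images are skipped. The name default_relations is hypothetical.

```python
import re
from urllib.parse import urljoin

# Inline markdown links [text](href); the lookbehind skips images ![alt](src).
LINK_RE = re.compile(r"(?<!!)\[[^\]]*\]\(([^)\s]+)\)")


def default_relations(subject: str, body: str, base_iri: str) -> list[tuple[str, str, str]]:
    """Treat every markdown link in the body as a dcterms:relation triple."""
    triples = []
    for href in LINK_RE.findall(body):
        target = urljoin(base_iri, href)
        if target.endswith(".md"):
            target = target[:-3]  # assumption: Concept IDs omit the file suffix
        triples.append((subject, "dcterms:relation", target))
    return triples


base = "https://acme.example/knowledge/"
body = "Joins [customers](tables/customers.md); see ![chart](img/wau.png)."
print(default_relations(base + "metrics/weekly-active-users", body, base))
```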
Layering with the wider ecosystem (following OKF’s own framing):
| Format | Reader | Job | LOKF’s relation |
|---|---|---|---|
| schema.org / JSON-LD | Search/answer engines | Public-page understanding, rich results | LOKF reuses its vocabulary. |
| DCAT / PROV-O | Data-catalog & provenance tools | Dataset description, lineage | LOKF binds datasets & lineage to them. |
| OKF | Your own agents | Canonical internal knowledge bundle | LOKF is a semantic profile of it. |
| llms.txt | Web crawlers | Navigate public content | Orthogonal; unchanged. |
LOKF’s contribution is to make an internal OKF bundle queryable as a knowledge graph using the same vocabularies the public web already speaks.
11. Versioning
LOKF versions are <major>.<minor>, tracking OKF’s scheme. A minor bump adds
backward-compatible fields, types, relation predicates, or mappings; a major bump
may rename required fields or change reserved filenames. Bundles declare their
target with lokf_version in the root index.md. Because the format is defined in
LinkML, a version is exactly a tagged lokf.yaml, and the context/schema/shapes/OWL
for that version are reproducible by regeneration.
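A consumer can compare lokf_version values with a few lines of Python. The sketch assumes the version is read as a string (an unquoted 0.1 in YAML would parse as a number), and it treats two versions with the same major number as mutually readable, which is one reading of the minor/major rule above. The names parse_version and same_major are hypothetical.

```python
def parse_version(value: str) -> tuple[int, int]:
    """Parse a "<major>.<minor>" version string into integers."""
    major, minor = value.strip().split(".")
    return int(major), int(minor)


def same_major(a: str, b: str) -> bool:
    """True when two versions differ at most by backward-compatible minor bumps."""
    return parse_version(a)[0] == parse_version(b)[0]


print(parse_version("0.1"))
print(same_major("0.1", "0.3"))
print(same_major("0.9", "1.0"))
```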
Appendix A — Package contents

```
lokf.yaml                 The LinkML schema: the single source of truth.
lokf.context.jsonld       Generated JSON-LD context (+ type/id aliases). Attach to concepts.
lokf.schema.json          Generated JSON Schema for frontmatter/bundle validation.
lokf.shacl.ttl            Generated SHACL shapes for RDF-graph validation.
lokf.owl.ttl              Generated OWL ontology for reasoning/alignment.
examples/acme-knowledge/  A conformant reference bundle (6 concepts).
examples/*.nt             RDF triples produced from the example frontmatter.
README.md                 How the pieces fit and how to regenerate them.
```

Appendix B — Prefixes
| Prefix | Namespace |
|---|---|
| lokf | https://w3id.org/lokf/ |
| schema | http://schema.org/ |
| dcat | http://www.w3.org/ns/dcat# |
| dcterms | http://purl.org/dc/terms/ |
| prov | http://www.w3.org/ns/prov# |
| skos | http://www.w3.org/2004/02/skos/core# |
| foaf | http://xmlns.com/foaf/0.1/ |
| rdfs | http://www.w3.org/2000/01/rdf-schema# |
| owl | http://www.w3.org/2002/07/owl# |
| xsd | http://www.w3.org/2001/XMLSchema# |
LOKF v0.1 is a draft profile and is not affiliated with or endorsed by Google. “Open Knowledge Format” and “OKF” refer to the format published by Google Cloud; LOKF builds on it under its open terms.