Integration guide: get value in 5 minutes, not 5 days
The shortest path from pip install to running the ontology pipeline against your data. If you read nothing else in this repository, read this. Deeper reference lives in install.md and the docs in this directory.
Contents
- 5-minute SQLite path
- 30-minute existing-database path
- What you actually got
- Adding LLMs / drift / federation later
- You probably don't need (yet)
- When to read what
5-minute SQLite path
Three commands, one bundled SQLite database, no API keys, no driver compiles.
# 1. Install (~50 MB, no compile)
pip install ontoforge
# 2. Run the full pipeline against the bundled retail demo DB
ontoforge --db db/demo.db --out output/demo
# 3. Open the report
open output/demo/reports/toolkit_report.html
That's it. You now have a working OWL ontology, SHACL shapes, JSON-LD context, mapping workbook, and a governance scorecard — all under output/demo/. Skim the HTML report and you've seen everything the pipeline produces.
To use your own SQLite file, swap in its path. Nothing else changes.
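If you drive the run from a script, a quick pre-flight check can confirm that the file you are swapping in really is a SQLite database. The sketch below is illustrative only: `is_sqlite_file`, `build_command`, and `run_pipeline` are hypothetical helper names, and the only toolkit details it relies on are the `ontoforge --db ... --out ...` flags shown above.

```python
import subprocess
from pathlib import Path

# Every SQLite 3 database file starts with this 16-byte header.
SQLITE_MAGIC = b"SQLite format 3\x00"


def is_sqlite_file(path):
    """Return True if the file exists and starts with the SQLite 3 header."""
    p = Path(path)
    if not p.is_file():
        return False
    with p.open("rb") as fh:
        return fh.read(16) == SQLITE_MAGIC


def build_command(db_path, out_dir):
    """Build the CLI invocation from this guide (flags copied from the guide)."""
    return ["ontoforge", "--db", str(db_path), "--out", str(out_dir)]


def run_pipeline(db_path, out_dir):
    """Hypothetical wrapper: validate the file, then shell out to the CLI."""
    if not is_sqlite_file(db_path):
        raise ValueError(f"{db_path} does not look like a SQLite 3 database")
    return subprocess.run(build_command(db_path, out_dir), check=True)
```

Calling `run_pipeline("path/to/your.db", "output/mine")` would then fail fast on a wrong path instead of partway through a run.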
30-minute existing-database path
For PostgreSQL, MySQL, SQL Server, Oracle, or DB2.
Step 1: install the right driver only
pip install -r requirements-core.txt
pip install psycopg2-binary # or mysql-connector-python / pyodbc / oracledb / ibm_db
Step 2: add two system tables
The toolkit needs a small annotation control plane in your DB. Two tables, one-time setup, no changes to your existing tables:
-- copy from db/schema.sql (the relevant CREATE TABLE statements)
CREATE TABLE ontology_metadata (
table_name TEXT NOT NULL,
column_name TEXT,
semantic_class TEXT, -- e.g. 'Customer', 'Order', 'NetworkFunction'
business_term TEXT, -- plain-language meaning of this column
sensitivity TEXT, -- Public | Internal | Confidential | Restricted
-- ...other optional columns; see db/schema.sql for the full list
PRIMARY KEY (table_name, column_name)
);
CREATE TABLE semantic_loss_log (
finding_id TEXT PRIMARY KEY,
severity TEXT, -- info | warn | error
description TEXT,
captured_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
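To rehearse this step before touching a production database, the two tables can be created in a throwaway SQLite database first. This is a minimal sketch: it includes only the columns shown in this guide (db/schema.sql remains the authoritative definition), `ensure_control_plane` is a hypothetical helper name, and the `IF NOT EXISTS` clause is an addition here so the function can be re-run safely.

```python
import sqlite3

# Only the columns shown in this guide; db/schema.sql has the full list.
CONTROL_PLANE_DDL = """
CREATE TABLE IF NOT EXISTS ontology_metadata (
    table_name     TEXT NOT NULL,
    column_name    TEXT,
    semantic_class TEXT,
    business_term  TEXT,
    sensitivity    TEXT,
    PRIMARY KEY (table_name, column_name)
);
CREATE TABLE IF NOT EXISTS semantic_loss_log (
    finding_id  TEXT PRIMARY KEY,
    severity    TEXT,
    description TEXT,
    captured_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
"""


def ensure_control_plane(conn):
    """Create both system tables if absent and return the ones now present."""
    conn.executescript(CONTROL_PLANE_DDL)
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' "
        "AND name IN ('ontology_metadata', 'semantic_loss_log') "
        "ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


conn = sqlite3.connect(":memory:")
print(ensure_control_plane(conn))  # prints ['ontology_metadata', 'semantic_loss_log']
```

Existing tables in the database are left untouched, which matches the "no changes to your existing tables" promise above.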
Step 3: annotate the columns that matter
You don't need to annotate every column. The minimum that produces a useful ontology is:
- One row per important table with a semantic_class (e.g. Customer, Order).
- One row per primary key column so it becomes the identifier.
- Sensitivity tier on any PII or restricted column.
Skip everything else on day 1. Re-run the pipeline and refine over time.
INSERT INTO ontology_metadata (table_name, column_name, semantic_class, business_term, sensitivity) VALUES
('orders', NULL, 'Order', 'A customer purchase', 'Internal'),
('orders', 'id', 'Order', 'Order identifier', 'Internal'),
('orders', 'customer_id', 'Customer', 'Foreign key to the placing customer', 'Internal'),
('customers', NULL, 'Customer', 'A person or organisation', 'Confidential'),
('customers', 'email', 'Customer', 'Primary contact address', 'Confidential');
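A small coverage check can show which tables still lack a table-level annotation. The sketch below loads the rows above into an in-memory SQLite database and reports the gaps. It assumes, based on the INSERT above, that a table-level row is one whose column_name is NULL. `load_annotations` and `tables_missing_class` are hypothetical helpers, and `invoices` is a made-up table name used only to show a gap. SQLite tolerates NULL in a primary-key column, while PostgreSQL and MySQL treat primary-key columns as NOT NULL, so check db/schema.sql for how table-level rows are keyed on those engines.

```python
import sqlite3

# Rows copied from the INSERT statement in this guide.
ANNOTATIONS = [
    ("orders", None, "Order", "A customer purchase", "Internal"),
    ("orders", "id", "Order", "Order identifier", "Internal"),
    ("orders", "customer_id", "Customer", "Foreign key to the placing customer", "Internal"),
    ("customers", None, "Customer", "A person or organisation", "Confidential"),
    ("customers", "email", "Customer", "Primary contact address", "Confidential"),
]


def load_annotations(conn, rows):
    """Insert annotation rows with bound parameters (no string formatting)."""
    conn.executemany(
        "INSERT INTO ontology_metadata "
        "(table_name, column_name, semantic_class, business_term, sensitivity) "
        "VALUES (?, ?, ?, ?, ?)",
        rows,
    )
    conn.commit()


def tables_missing_class(conn, tables):
    """Return the tables that have no table-level row with a semantic_class."""
    annotated = {
        name
        for (name,) in conn.execute(
            "SELECT table_name FROM ontology_metadata "
            "WHERE column_name IS NULL AND semantic_class IS NOT NULL"
        )
    }
    return [t for t in tables if t not in annotated]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ontology_metadata ("
    "table_name TEXT NOT NULL, column_name TEXT, semantic_class TEXT, "
    "business_term TEXT, sensitivity TEXT, "
    "PRIMARY KEY (table_name, column_name))"
)
load_annotations(conn, ANNOTATIONS)
print(tables_missing_class(conn, ["orders", "customers", "invoices"]))  # prints ['invoices']
```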
Step 4: run the pipeline
python3 toolkit.py --db "postgresql://user:pass@host/mydb" --out output/
open output/reports/toolkit_report.html
If you'd rather start from a wizard with no SQL writing, see install.md §2.
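Passwords containing characters such as @, / or : will break a URL-style connection string unless they are percent-encoded. The sketch below builds a string in the same shape as the example above. `build_db_url` is a hypothetical helper, and whether toolkit.py decodes percent-encoded credentials is an assumption here; install.md lists the connection-string formats the toolkit actually accepts.

```python
from urllib.parse import quote


def build_db_url(user, password, host, database, scheme="postgresql", port=None):
    """Assemble scheme://user:password@host[:port]/database with encoded parts."""
    # safe='' forces encoding of '/', ':' and '@' inside credentials.
    auth = f"{quote(user, safe='')}:{quote(password, safe='')}"
    netloc = f"{host}:{port}" if port else host
    return f"{scheme}://{auth}@{netloc}/{quote(database, safe='')}"


print(build_db_url("user", "pass", "host", "mydb"))
# prints postgresql://user:pass@host/mydb
print(build_db_url("user", "p@ss/w:rd", "host", "mydb", port=5432))
# prints postgresql://user:p%40ss%2Fw%3Ard@host:5432/mydb
```

Reading the password from an environment variable rather than typing it on the command line also keeps it out of shell history.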
What you actually got
After output/ is populated, three files matter on day 1:
| File | Why you care |
|---|---|
| output/reports/toolkit_report.html | One-page visual run summary; open this first |
| output/ontology/enterprise.ttl | The OWL 2 ontology you can hand to any RDF tool, reasoner, or graph store |
| output/jsonld/enterprise-context.json | The JSON-LD context used by every downstream agent payload |
Everything else (SHACL shapes, SKOS vocab, mapping workbook, MCP tool definitions) is useful, but you don't need it to evaluate the toolkit's output.
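A short script can confirm that the three day-1 files were produced and peek at the terms the JSON-LD context defines. This is a sketch, not part of the toolkit: `missing_artifacts` and `context_terms` are hypothetical helpers, the relative paths are copied from the table above, and it assumes the context file follows the usual JSON-LD layout of a top-level "@context" object.

```python
import json
from pathlib import Path

# Relative paths copied from the table in this guide.
DAY_ONE_FILES = [
    "reports/toolkit_report.html",
    "ontology/enterprise.ttl",
    "jsonld/enterprise-context.json",
]


def missing_artifacts(out_dir):
    """Return the day-1 files that are absent under out_dir."""
    root = Path(out_dir)
    return [rel for rel in DAY_ONE_FILES if not (root / rel).is_file()]


def context_terms(out_dir):
    """Return the sorted term names defined in the JSON-LD context, if any."""
    path = Path(out_dir) / "jsonld" / "enterprise-context.json"
    with path.open(encoding="utf-8") as fh:
        doc = json.load(fh)
    # Assumed layout: {"@context": {"Term": "iri-or-definition", ...}}
    ctx = doc.get("@context", {}) if isinstance(doc, dict) else {}
    return sorted(ctx) if isinstance(ctx, dict) else []
```

For example, `missing_artifacts("output")` returning an empty list means all three files are in place.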
Adding LLMs / drift / federation later
These are opt-in. Skip them on day 1 — the pipeline runs fine without any of them.
| When you want to… | Install | Read |
|---|---|---|
| Wire LLM calls through the SHACL gates and OWL grounding | pip install -r requirements-runtime.txt | docs/runtime.md · docs/sdk.md |
| Monitor production data drift against the ontology | pip install -r requirements-drift.txt | examples/infodrift/README.md |
| Use the browser-based wizard | pip install -r requirements-advanced.txt | install.md §2 |
| Publish to Fuseki / Stardog / Neptune / GraphDB | pip install -r requirements-advanced.txt (Neptune only) | docs/advanced.md |
| Federate with another organisation's ontology | core only | docs/advanced.md |
| Generate compliance evidence bundles | core only | docs/advanced.md |
You probably don't need (yet)
These are advanced features. Useful when you have a specific need — distracting otherwise. Each lives behind a --phase flag and is documented inside docs/advanced.md.
- --phase reasoner: OWL 2 consistency checking via ROBOT (needs Java + a 100 MB jar)
- --phase modular: splits the ontology into importable modules with cycle detection
- --phase discover: NLP entity discovery from log corpora (requires spaCy + an English model)
- --phase tmf630: TMF Open API Task + Bulk operations
- --phase evolve: autonomous ontology evolution proposals
- --phase federate: cross-enterprise federation with signed manifests
- --phase comply: regulatory evidence bundles (EU AI Act, Basel IV, HIPAA, Ofcom)
- --phase embed / --phase retrieve: ontology-bounded vector retrieval
- --phase monitor: production drift monitoring (drift_monitor / infodrift)
Run python3 toolkit.py --help to see every flag, but ignore most of them on first contact.
When to read what
| Document | When |
|---|---|
| This file | First contact, integration recipe, "what's the minimum?" |
| install.md | You hit a database driver issue or want every connection-string format |
| features.md | You want to see the full capability map and decide where to read next |
| docs/artifacts.md | You want to know what a specific generated file is for |
| docs/runtime.md | You're wiring LLM calls through the toolkit's gates |
| docs/advanced.md | You're enabling drift monitoring, federation, compliance, vector retrieval, etc. |
| docs/sdk.md | You're writing application code that calls RuntimeClient, Grounder, etc. |
| examples/ | You want to see a runnable end-to-end demo |