binary classification · no code · GDPR ready

The score your model actually gets in production.

Most tools show you the optimistic number. We show you both the optimistic score and the honest one, and our model learns from every engagement so the gap keeps closing.

v3.1.0 live · 174 tests passing · no vendor lock-in

The problem

Three barriers block every analytical question.

Code, compliance, and explainability slow down binary classification work before the modelling even starts.

Code barrier

  • Requires Python/R
  • Building from scratch takes 3–4 months
  • Domain experts locked out
  • No data advantage without years of engagements

Compliance barrier

  • Cloud vendors need DPAs
  • Sensitive data can’t leave org
  • GDPR/HIPAA/Basel III friction

Explainability barrier

  • Tools return a number, no audit trail
  • EU AI Act requires transparency
  • Regulators want why, not just what

What makes us different

A model that learns from every engagement.

The model trains on anonymised data from all engagements. Every update is human-reviewed; the model never retrains autonomously.

What is a meta-model?

A meta-model is a model about models. Instead of picking one algorithm and hoping it’s right, Data Owl trains multiple model families on the same data and compares them under identical, rigorous conditions.

The advantage is in the database: a continuously growing collection of anonymised data from real client engagements, on which the model keeps improving. Data Owl delivers this as a service: clients send data, receive recommendations.

model_core.yaml · read-only
model_families: [naive_bayes, logistic, random_forest, xgboost, lightgbm]
learns_from_clients: true   # anonymised, aggregated
data_shared_between_clients: false   # never
self_updating: false   # human review on every update
evaluation: "nested_cv"
output: "actionable_recommendations"
audit_trail: true
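
Comparing model families "under identical conditions" can be sketched as follows. This is an illustration only, not Data Owl's pipeline: the two toy families (`fit_majority`, `fit_centroid`), the `compare` helper, and the synthetic data are all assumptions standing in for the five families listed in the config. The point it shows is that every family is fitted and scored on exactly the same seeded folds.

```python
import random
from statistics import mean


def fit_majority(train):
    # Baseline family: always predict the most common training label.
    ones = sum(y for _, y in train)
    label = 1 if 2 * ones >= len(train) else 0
    return lambda x: label


def fit_centroid(train):
    # Nearest-centroid family: predict the class whose mean point is closer.
    def centroid(cls):
        pts = [x for x, y in train if y == cls]
        return [mean(col) for col in zip(*pts)]

    def dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    c0, c1 = centroid(0), centroid(1)
    return lambda x: 1 if dist(x, c1) < dist(x, c0) else 0


def compare(families, data, n_folds=5, seed=0):
    # Shuffle once with a fixed seed so every family sees exactly the same folds.
    rows = data[:]
    random.Random(seed).shuffle(rows)
    scores = {name: [] for name in families}
    for i in range(n_folds):
        test = rows[i::n_folds]
        train = [r for j, r in enumerate(rows) if j % n_folds != i]
        for name, fit in families.items():
            model = fit(train)
            scores[name].append(mean(1.0 if model(x) == y else 0.0 for x, y in test))
    return {name: mean(s) for name, s in scores.items()}


rng = random.Random(1)
# Synthetic, well-separated data: class 1 is shifted by +2 on both features.
data = [((rng.gauss(2 * y, 0.5), rng.gauss(2 * y, 0.5)), y) for y in [0, 1] * 50]
results = compare({"majority": fit_majority, "nearest_centroid": fit_centroid}, data)
print(results)
```

Because the folds are fixed before any family is fitted, a difference in score reflects the family, not a luckier split.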

Why faster

  • Five families trained in one session
  • No glue code — fully automated
  • Model pre-informed by prior engagements
  • Days from data handoff to output

Why safer

  • Every run serialised to YAML
  • Additive-only audit log
  • Human-reviewed quality control
  • Outputs labelled indicative, not decisions
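
One way an additive-only audit log could be built is as a hash chain, where each entry commits to the one before it. The sketch below is an assumption about the general technique, not a description of Data Owl's implementation: the `AuditLog` class is hypothetical, and entries are serialised as JSON here only because the Python standard library has no YAML writer.

```python
import hashlib
import json


class AuditLog:
    """Append-only log: entries can be added and read, never edited or removed."""

    def __init__(self):
        self._entries = []  # the public API offers append and read, nothing else

    def append(self, event: dict) -> str:
        # Chain each entry to the previous one so later tampering is detectable.
        prev_hash = self._entries[-1]["hash"] if self._entries else "0" * 64
        payload = json.dumps(event, sort_keys=True)
        entry_hash = hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()
        self._entries.append({"event": payload, "prev": prev_hash, "hash": entry_hash})
        return entry_hash

    def entries(self) -> tuple:
        # Return copies so callers cannot modify the stored entries.
        return tuple(dict(e) for e in self._entries)

    def verify(self) -> bool:
        # Recompute the chain from the start; one altered entry breaks it.
        prev_hash = "0" * 64
        for e in self._entries:
            expected = hashlib.sha256((prev_hash + e["event"]).encode("utf-8")).hexdigest()
            if e["prev"] != prev_hash or e["hash"] != expected:
                return False
            prev_hash = e["hash"]
        return True


log = AuditLog()
log.append({"run": 1, "seed": 42, "family": "logistic"})
log.append({"run": 2, "seed": 42, "family": "xgboost"})
print(log.verify())
```

A reviewer can call `verify()` at any time; if an earlier entry has been edited, the recomputed hashes no longer match and the check fails.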

How it works

Five steps. Raw data to actionable recommendations.

A controlled path from data readiness to reproducible output.

01 · Consultancy · billable

Data cleaning: on-site audit, anonymisation, leakage detection

02 · Upload

Cleaned CSV loaded, screened

03 · Evaluation

Target column, split strategy

04 · Train & tune · automated

Five families, nested CV

05 · Deliver · output

Decision Letters, SHAP, audit trail

Security & compliance

Built for regulated industries — not retrofitted.

Controls are designed around data minimisation, reproducibility, human oversight, and honest scoring.

Data never shared

Your raw data is used only within your engagement and is never exposed to other clients. Only fully anonymised signals contribute to improving the model.

PII removed upfront

Identifiers removed, IDs hashed, sensitive attributes generalised before modelling.
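
The three operations named above (remove identifiers, hash IDs, generalise sensitive attributes) might look roughly like this on a single row. Everything here is assumed for illustration: the column names, the 10-year age bands, the `minimise` helper, and the choice of a keyed HMAC-SHA256 hash are not taken from Data Owl's pipeline.

```python
import hashlib
import hmac

# Hypothetical column names, for illustration only.
DIRECT_IDENTIFIERS = {"name", "email"}
ID_COLUMNS = {"customer_id"}


def hash_id(value: str, secret: bytes) -> str:
    # Keyed hash (HMAC-SHA256): without the key, IDs cannot be matched by
    # simply hashing guessed values.
    return hmac.new(secret, value.encode("utf-8"), hashlib.sha256).hexdigest()


def generalise_age(age: int, width: int = 10) -> str:
    # Replace an exact age with a band, e.g. 37 becomes "30-39".
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def minimise(row: dict, secret: bytes) -> dict:
    out = {}
    for key, value in row.items():
        if key in DIRECT_IDENTIFIERS:
            continue  # drop direct identifiers entirely
        if key in ID_COLUMNS:
            out[key] = hash_id(str(value), secret)
        elif key == "age":
            out["age_band"] = generalise_age(int(value))
        else:
            out[key] = value
    return out


row = {"customer_id": "C-1001", "name": "Ada Example", "email": "ada@example.com",
       "age": 37, "churned": 1}
clean = minimise(row, secret=b"engagement-specific-key")
print(clean)
```

Using a key that is specific to one engagement means the same customer ID hashes differently elsewhere, so hashed IDs cannot be joined across projects.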

Full audit trail

Dataset fingerprint, package versions, seed, and config are logged per run, so results can be reproduced offline.
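
A per-run record of this kind could be assembled as in the sketch below. The field names, the `run_manifest` helper, and the package list are assumptions made for the example; it uses a SHA-256 digest of the raw file bytes as the dataset fingerprint, which is one common choice and not necessarily the one used here.

```python
import hashlib
import json
import platform
import random
from importlib import metadata


def dataset_fingerprint(data: bytes) -> str:
    # SHA-256 of the raw bytes: identical input files give an identical fingerprint.
    return hashlib.sha256(data).hexdigest()


def package_versions(names):
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return versions


def run_manifest(data: bytes, config: dict, seed: int, packages) -> dict:
    return {
        "dataset_sha256": dataset_fingerprint(data),
        "python": platform.python_version(),
        "packages": package_versions(packages),
        "seed": seed,
        "config": config,
    }


csv_bytes = b"age_band,churned\n30-39,1\n40-49,0\n"
config = {"evaluation": "nested_cv", "audit_trail": True}
manifest = run_manifest(csv_bytes, config, seed=42, packages=["scikit-learn", "xgboost"])

random.seed(manifest["seed"])  # seed the run from the logged value
print(json.dumps(manifest, indent=2, sort_keys=True))
```

Re-running with the same file, seed, and package versions should reproduce the run; a changed fingerprint signals that the input data is no longer the same.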

Human quality control

We review every model update; there is no autonomous retraining.

Regulation-ready

GDPR, HIPAA, Basel III, EU AI Act addressed by design.

Honest scoring

Nested CV corrects for selection bias. Both scores, the optimistic one and the nested-CV one, are shown side by side.
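
The difference between the two scores can be illustrated with a small standard-library sketch. It is not the production evaluation: a toy k-nearest-neighbours classifier with several values of `k` stands in for the competing model families, the labels are coin flips with no real signal, and the names `flat_cv` and `nested_cv` are hypothetical. Flat CV lets the same folds pick the winner and report its score; nested CV picks the winner on inner folds and scores it on an outer fold that played no part in the choice.

```python
import random
from statistics import mean


def knn_predict(train, x, k):
    # Majority vote among the k training points nearest to x (k is odd, so no ties).
    nearest = sorted(train, key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x)))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > k else 0


def accuracy(train, test, k):
    return mean(1.0 if knn_predict(train, x, k) == y else 0.0 for x, y in test)


def folds(data, n):
    # Split into n interleaved folds and yield (train, test) pairs.
    for i in range(n):
        test = data[i::n]
        train = [row for j, row in enumerate(data) if j % n != i]
        yield train, test


def cv_score(data, k, n):
    return mean(accuracy(train, test, k) for train, test in folds(data, n))


def flat_cv(data, candidates, n=5):
    # Optimistic: the same folds both choose the winner and report its score.
    return max(cv_score(data, k, n) for k in candidates)


def nested_cv(data, candidates, n_outer=5, n_inner=4):
    # Honest: choose the winner on inner folds, then score it on held-out outer data.
    outer_scores = []
    for outer_train, outer_test in folds(data, n_outer):
        best_k = max(candidates, key=lambda k: cv_score(outer_train, k, n_inner))
        outer_scores.append(accuracy(outer_train, outer_test, best_k))
    return mean(outer_scores)


rng = random.Random(0)
# Synthetic data: two random features, labels are pure coin flips (no signal).
data = [((rng.random(), rng.random()), rng.randint(0, 1)) for _ in range(100)]
candidates = [1, 3, 5, 7, 9, 11, 15]

optimistic = flat_cv(data, candidates)
honest = nested_cv(data, candidates)
print(f"flat CV (optimistic): {optimistic:.3f}")
print(f"nested CV (honest):   {honest:.3f}")
```

On signal-free data like this, any flat-CV score above chance comes purely from picking the best of several candidates, which is the selection bias the nested procedure is designed to remove.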

Capability comparison

Where competitors stop — and we continue.

A focused view of service capabilities for regulated binary classification work.

Capability DataRobot Azure AutoML H2O Data Owl
No-code training
On-site data cleaning
Nested-CV honest score
Model improves per engagement
Human quality control
Per-row Decision Letters Enterprise only
Full reproducibility (YAML + log)

Based on public documentation, 2026.

FAQ

Seven answers before the first call.

Direct answers on data sharing, protection, regulation, model control, build time, and deliverables.

Is our data shared with other clients?

No. Your data is used exclusively within your engagement. It contributes to improving the model in fully anonymised form only — never exposed to, accessible by, or traceable to any other client.

How do you make sure our data is protected?

Before any modelling takes place, direct identifiers are removed, customer IDs are hashed, and sensitive attributes are generalised. The model only ever processes anonymised data. We retain the minimum necessary for the analytical question.

How do you handle GDPR?

We apply data minimisation at source, in line with GDPR Article 5(1)(c): identifiers are removed, IDs hashed, and sensitive attributes generalised before modelling. We do not transfer data to third parties.

What other EU regulations do you address?

The EU AI Act requires transparency and human oversight for AI-assisted decisions. Our outputs include per-row SHAP explanations and Decision Letters that document the reasoning behind every recommendation, suitable for regulatory review. Beyond EU law, we also address HIPAA (a US health-data law) and Basel III model risk requirements.

Who controls the model? Can it learn on its own?

We do. The model does not self-update or retrain autonomously. Every update is reviewed and approved by our team before deployment — a deliberate design choice to guarantee consistent, predictable quality.

How long would it take to build this ourselves?

A programmer building a comparable pipeline from scratch would need 3–4 months: setting up model families, writing evaluation logic, running tests, iterating on bugs. And they still wouldn’t have the accumulated data advantage that comes from years of real client engagements.

What do we receive at the end?

Actionable recommendations — not a model to maintain. Per-row Decision Letters, SHAP-based driver explanations, and a full audit trail (YAML config + run log) ready for internal review or regulatory submission.
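
To give a feel for what a per-row driver explanation contains, the sketch below computes exact Shapley values, the quantity that SHAP approximates, by brute force for a three-feature toy score. It does not use the `shap` library, and the scoring function, feature names, and baseline row are invented for the example.

```python
from itertools import permutations
from math import factorial


def shapley_values(predict, row: dict, baseline: dict) -> dict:
    # Exact Shapley values: average each feature's marginal contribution over
    # every order in which features switch from the baseline to the row's value.
    features = list(row)
    contrib = {f: 0.0 for f in features}
    for order in permutations(features):
        current = dict(baseline)
        prev = predict(current)
        for f in order:
            current[f] = row[f]
            new = predict(current)
            contrib[f] += new - prev
            prev = new
    n_orders = factorial(len(features))
    return {f: total / n_orders for f, total in contrib.items()}


def toy_score(x: dict) -> float:
    # Hypothetical churn score with one interaction term, for illustration only.
    return (0.1
            + 0.3 * x["late_payments"]
            + 0.2 * x["support_calls"] * x["late_payments"]
            - 0.05 * x["tenure_years"])


baseline = {"late_payments": 0, "support_calls": 0, "tenure_years": 0}
row = {"late_payments": 1, "support_calls": 2, "tenure_years": 4}
phi = shapley_values(toy_score, row, baseline)

for feature, value in sorted(phi.items(), key=lambda kv: -abs(kv[1])):
    print(f"{feature:>15}: {value:+.3f}")
```

The contributions add up to the difference between this row's score and the baseline score, which is what allows a letter to attribute a recommendation to named drivers.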

Services

From raw data to actionable recommendations.

01 · Data cleaning consultation

02 · Data upload & screening

03 · Evaluation plan

04 · Train & tune

05 · Deliver recommendations

Contact

Request a call.

we respond within 1 business day · data not stored by third parties
