Code barrier
- Requires Python/R
- Building from scratch takes 3–4 months
- Domain experts locked out
- No data advantage without years of engagements
binary classification · no code · GDPR ready
Most tools show you only the optimistic score. We show you both the optimistic score and the honest nested-CV score, and our model learns from every engagement so the gap keeps closing.
v3.1.0 live · 174 tests passing · no vendor lock-in
The problem
Code, compliance, and explainability slow down binary classification work before the modelling even starts.
What makes us different
The model trains on anonymised data from all engagements. Every update is human-reviewed, never autonomous.
model_families: [naive_bayes, logistic, random_forest, xgboost, lightgbm]
learns_from_clients: true            # anonymised, aggregated
data_shared_between_clients: false   # never
self_updating: false                 # human review on every update
evaluation: "nested_cv"
output: "actionable_recommendations"
audit_trail: true
How it works
A controlled path from data readiness to reproducible output.
Data cleaning: on-site audit, anonymisation, leakage detection (billable)
Cleaned CSV loaded, screened
Target column, split strategy
Five families, nested CV (automated)
Decision Letters, SHAP, audit trail (output)
Security & compliance
Controls are designed around data minimisation, reproducibility, human oversight, and honest scoring.
Your data trains our model for your project only and is never exposed to other clients.
Identifiers removed, IDs hashed, sensitive attributes generalised before modelling.
Dataset fingerprint, package versions, seed and config logged per run, reproducible offline.
We review every model update, no autonomous retraining.
GDPR, HIPAA, Basel III, EU AI Act addressed by design.
Nested CV corrects for selection bias, both scores shown side-by-side.
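The nested-CV idea above can be illustrated with a small, standard-library-only sketch. It is not the service's pipeline: a 1-D k-nearest-neighbour classifier stands in for the five model families, and the synthetic data, fold counts, and every function name are assumptions made for illustration. The optimistic score is the best cross-validated score found while tuning. The honest score comes from outer folds that the tuning step never saw.

```python
import random
from statistics import mean


def make_data(n, seed):
    """Synthetic binary data (an assumption): one noisy feature centred on the label."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        y = rng.randint(0, 1)
        rows.append((rng.gauss(float(y), 1.0), y))
    return rows


def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return 1 if sum(y for _, y in nearest) * 2 > k else 0


def accuracy(train, test, k):
    return mean(1.0 if knn_predict(train, x, k) == y else 0.0 for x, y in test)


def folds(rows, n):
    """Split rows into n interleaved folds."""
    return [rows[i::n] for i in range(n)]


def cv_score(rows, k, n):
    """Mean held-out accuracy over n folds for a fixed k."""
    parts = folds(rows, n)
    scores = []
    for i, test in enumerate(parts):
        train = [r for j, part in enumerate(parts) if j != i for r in part]
        scores.append(accuracy(train, test, k))
    return mean(scores)


def select_k(rows, candidates, n):
    """Return (best_k, best_score) according to plain cross-validation."""
    return max(((k, cv_score(rows, k, n)) for k in candidates), key=lambda t: t[1])


def nested_cv(rows, candidates, outer=5, inner=4):
    """Tune on each outer-training split only, then score on the untouched outer fold."""
    parts = folds(rows, outer)
    scores = []
    for i, test in enumerate(parts):
        train = [r for j, part in enumerate(parts) if j != i for r in part]
        best_k, _ = select_k(train, candidates, inner)
        scores.append(accuracy(train, test, best_k))
    return mean(scores)


rows = make_data(100, seed=0)
candidates = [1, 3, 5, 7, 9, 15]
_, optimistic = select_k(rows, candidates, 5)
honest = nested_cv(rows, candidates)
print(f"optimistic (tuned on all data): {optimistic:.3f}")
print(f"honest (nested CV):             {honest:.3f}")
```

Reporting the two numbers together shows how much of the headline score comes from picking the best of several candidates on the same data.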
Capability comparison
A focused view of service capabilities for regulated binary classification work.
| Capability | DataRobot | Azure AutoML | H2O | Data Owl |
|---|---|---|---|---|
| No-code training | ✓ | ✓ | ✕ | ✓ |
| On-site data cleaning | ✕ | ✕ | ✕ | ✓ |
| Nested-CV honest score | ✕ | ✕ | ✕ | ✓ |
| Model improves per engagement | ✕ | ✕ | ✕ | ✓ |
| Human quality control | ✕ | ✕ | ✕ | ✓ |
| Per-row Decision Letters | Enterprise only | ✕ | ✕ | ✓ |
| Full reproducibility (YAML + log) | ✕ | ✕ | ✕ | ✓ |
Based on public documentation, 2026.
FAQ
Direct answers on data sharing, protection, regulation, model control, build time, and deliverables.
No. Your data is used exclusively within your engagement. It contributes to improving the model in fully anonymised form only — never exposed to, accessible by, or traceable to any other client.
Before any modelling takes place, direct identifiers are removed, customer IDs are hashed, and sensitive attributes are generalised. The model only ever processes anonymised data. We retain the minimum necessary for the analytical question.
Data minimisation at source: identifiers removed, IDs hashed, sensitive attributes generalised before modelling. In line with GDPR Article 5(1)(c). We do not transfer data to third parties.
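As a rough illustration of the three steps named above (identifiers removed, IDs hashed, sensitive attributes generalised), here is a minimal Python sketch. The column names, the keyed-hash choice, and the ten-year age bands are all hypothetical. The source does not describe the actual schema or method.

```python
import hashlib
import hmac

# Hypothetical schema: these column names are assumptions, not the service's.
DIRECT_IDENTIFIERS = {"name", "email"}
ID_COLUMN = "customer_id"


def hash_id(value: str, secret: bytes) -> str:
    """Keyed SHA-256 (HMAC): the same ID always maps to the same token for a given key."""
    return hmac.new(secret, value.encode("utf-8"), hashlib.sha256).hexdigest()


def generalise_age(age: int, width: int = 10) -> str:
    """Replace an exact age with a band such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def anonymise_row(row: dict, secret: bytes) -> dict:
    """Drop direct identifiers, hash the ID, generalise the sensitive attribute."""
    out = {k: v for k, v in row.items() if k not in DIRECT_IDENTIFIERS}
    out[ID_COLUMN] = hash_id(str(row[ID_COLUMN]), secret)
    out["age"] = generalise_age(int(row["age"]))
    return out


row = {"customer_id": "C-1001", "name": "Ada", "email": "ada@example.com",
       "age": 34, "churned": 1}
print(anonymise_row(row, secret=b"per-project-key"))
```

A keyed hash keeps rows joinable within one project while making the token hard to reverse by simple lookup without the key. Using a separate key per project is an assumption of this sketch.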
The EU AI Act requires transparency and human oversight for AI-assisted decisions. Our outputs include per-row SHAP explanations and Decision Letters that document the reasoning behind every recommendation — suitable for regulatory review. We also address HIPAA and Basel III model risk requirements.
We do. The model does not self-update or retrain autonomously. Every update is reviewed and approved by our team before deployment — a deliberate design choice to guarantee consistent, predictable quality.
A programmer building a comparable pipeline from scratch would need 3–4 months: setting up model families, writing evaluation logic, running tests, iterating on bugs. And they still wouldn’t have the accumulated data advantage that comes from years of real client engagements.
Actionable recommendations — not a model to maintain. Per-row Decision Letters, SHAP-based driver explanations, and a full audit trail (YAML config + run log) ready for internal review or regulatory submission.
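One way to picture the run log behind such an audit trail is a small per-run manifest that records a dataset fingerprint, package versions, the seed, and the config. The sketch below is an assumption about what such a record could look like, not the service's actual format. The field names and the example package list are hypothetical.

```python
import hashlib
import json
import platform
from importlib import metadata


def fingerprint(data: bytes) -> str:
    """SHA-256 of the raw dataset bytes; any change to the file changes the digest."""
    return hashlib.sha256(data).hexdigest()


def package_versions(names):
    """Installed version per package, or None when the package is not installed."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def build_manifest(data: bytes, config: dict, seed: int, packages) -> dict:
    """Collect everything needed to rerun the same analysis offline."""
    return {
        "dataset_sha256": fingerprint(data),
        "python": platform.python_version(),
        "packages": package_versions(packages),
        "seed": seed,
        "config": config,
    }


manifest = build_manifest(
    data=b"id,feature,target\n1,0.5,1\n",   # stand-in for the cleaned CSV bytes
    config={"evaluation": "nested_cv"},
    seed=42,
    packages=["xgboost", "lightgbm"],
)
print(json.dumps(manifest, indent=2, sort_keys=True))
```

Storing the manifest next to the outputs lets a reviewer confirm that a rerun used the same data, seed, and library versions.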