Upload a spreadsheet, or connect a database or API. KatCore labels every column, scores its quality, and answers your questions in natural language, all in your browser. No SQL, no setup.

| Column | Type | Auto description |
|---|---|---|
| email | PII | Customer email address; unmasked, flagged for exposure |
| signup_date | Date | Account creation date, parsed as ISO-8601 |
| mrr | Currency | Monthly recurring revenue, USD |
| region | Region | Sales territory, 6 distinct values |

SaaS subscriptions grew 14.2% in Q3, driven by the Enterprise tier (+22% seats). Source: sales_report.csv
Bring data in, understand and trust it, keep it fresh, and walk away with a report. Four steps, one workspace, zero pipelines to build.
Drag and drop your files with real per-file progress — or pull straight from a public URL, a REST API, or a SQL database. Parsing, cleaning, and storage happen for you.
Spreadsheets, PDFs, databases, or live APIs — drop them in and KatCore handles parsing, cleaning, and storage. No pipelines to build.
On ingest, every column is labeled and described automatically. Then the Readiness Audit scores your data across six weighted dimensions — fully explainable, no black box — and hands you a prioritized fix-list where every fix shows the points it recovers. The checklist literally is your score, decomposed.
KatCore reads every column, scores how trustworthy your data is, and answers your questions in natural language — so you can act, not audit.
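As a rough illustration of how a weighted score can decompose into a fix-list, here is a minimal sketch. The weights, function names, and "score after fix" values are assumptions made for the example; the page states that the six dimensions are weighted but does not publish the weights or the formula.

```python
# Hypothetical weights (must sum to 1.0). KatCore's real weights are not stated.
WEIGHTS = {
    "completeness": 0.20,
    "validity": 0.20,
    "uniqueness": 0.15,
    "pii": 0.20,
    "consistency": 0.15,
    "semantic": 0.10,
}


def readiness_score(scores):
    """Weighted sum of per-dimension scores, each on a 0-100 scale."""
    return sum(WEIGHTS[dim] * scores[dim] for dim in WEIGHTS)


def points_recovered(scores, dimension, score_after_fix):
    """Points a single fix adds to the overall score."""
    return WEIGHTS[dimension] * (score_after_fix - scores[dimension])


def fix_list(scores, fixes):
    """Rank fixes by points recovered, largest first.

    `fixes` maps a dimension to the score it would reach after the fix.
    """
    ranked = [(dim, points_recovered(scores, dim, after)) for dim, after in fixes.items()]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Illustrative per-dimension scores.
sample = {
    "completeness": 92, "validity": 78, "uniqueness": 95,
    "pii": 60, "consistency": 84, "semantic": 100,
}
# Illustrative "after" scores for two candidate fixes.
ranked_fixes = fix_list(sample, {"validity": 90, "pii": 100})
```

Because the score is a plain weighted sum, the points on the fix-list add up exactly to the gain in the overall score, which is what makes the checklist a decomposition of the score.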
EMEA had the highest churn at 6.8%, nearly double the 3.5% global average — concentrated in the SMB segment. Source: customers_q3.csv
Point KatCore at a URL or API and it re-ingests on a timezone-aware cron schedule. Smart polling caches the source's ETag and Last-Modified headers, so an unchanged source is a no-op and data is never duplicated.
Point KatCore at a URL or API and it refreshes on your schedule — and skips the pull entirely when nothing changed.
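The skip-when-unchanged behavior described above matches standard HTTP conditional requests. The sketch below shows that general technique using only the Python standard library. `CachedValidators`, `conditional_headers`, and `fetch_if_changed` are hypothetical names for illustration and are not KatCore's API.

```python
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class CachedValidators:
    """Validators remembered from the last successful fetch (hypothetical structure)."""
    etag: Optional[str] = None
    last_modified: Optional[str] = None


def conditional_headers(cache: CachedValidators) -> dict:
    """Build request headers that ask the server to reply only if the source changed."""
    headers = {}
    if cache.etag:
        headers["If-None-Match"] = cache.etag
    if cache.last_modified:
        headers["If-Modified-Since"] = cache.last_modified
    return headers


def fetch_if_changed(url: str, cache: CachedValidators) -> Tuple[Optional[bytes], CachedValidators]:
    """Return (body, new_cache), or (None, cache) when the server answers 304 Not Modified."""
    request = urllib.request.Request(url, headers=conditional_headers(cache))
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            body = response.read()
            new_cache = CachedValidators(
                etag=response.headers.get("ETag"),
                last_modified=response.headers.get("Last-Modified"),
            )
            return body, new_cache
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # Unchanged source: nothing to ingest, keep the old validators.
            return None, cache
        raise
```

A scheduler would call `fetch_if_changed` on each tick and run ingestion only when the returned body is not `None`.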
The audit produces a real Jupyter notebook inside KatCore — an AI-Readiness scorecard, a natural-language narrative of findings with evidence, and a single DuckDB SQL block that applies every fix. Edit cells in place, re-run the audit to watch the score climb, or download the notebook to open anywhere.
Score, findings, evidence, and the exact SQL to fix your data — as a live notebook, not a dead PDF. Bring your own notebooks too.
Completeness 92 · Validity 78 · Uniqueness 95 · PII 60 · Consistency 84 · Semantic 100
Findings. Column email holds 312 unmasked addresses (critical). mrr shows 47 outliers beyond the IQR fence [12, 840]. signup_date has 22 unparseable values at rows 14, 89, 203…
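For readers unfamiliar with the IQR fence mentioned in the findings, here is a minimal sketch of the common Tukey rule. The 1.5 multiplier, the quantile method, and the sample values are assumptions; the page does not say how KatCore computes its fence.

```python
import statistics


def iqr_fence(values, k=1.5):
    """Return (low, high) bounds: Q1 - k*IQR and Q3 + k*IQR.

    k=1.5 is the conventional Tukey multiplier, assumed here.
    """
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def outliers(values, k=1.5):
    """Values that fall outside the fence."""
    low, high = iqr_fence(values, k)
    return [v for v in values if v < low or v > high]


# Illustrative monthly revenue values, not real data.
mrr_sample = [100, 120, 130, 150, 160, 180, 200, 950]
flagged = outliers(mrr_sample)
```

The rule is deterministic: the same column always yields the same fence, so a flagged row can be rechecked by hand.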
Kat doesn't stop at a single file. It pulls in every dataset your question touches, registers them together, and writes one query that joins them — CTEs, window functions, and all — then answers in natural language with every source cited.
EMEA has the steepest return rate at 9.4% — $84.2K refunded against $897K in sales, more than double the 4.1% company average. APAC follows at 6.7%. Sources: sales_2025.csv · returns_2025.csv
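A cross-file answer like the one above could come from a query shaped roughly like this sketch. It uses Python's built-in `sqlite3` as a stand-in engine (an assumption chosen to keep the example dependency-free, and it needs an SQLite build with window-function support, 3.25 or later). The table layouts and the individual row values are invented so that the totals reproduce the quoted rates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Two tables standing in for sales_2025.csv and returns_2025.csv (hypothetical schemas).
conn.executescript("""
CREATE TABLE sales (region TEXT, amount REAL);
CREATE TABLE returns (region TEXT, refunded REAL);
INSERT INTO sales VALUES ('EMEA', 500000), ('EMEA', 397000), ('APAC', 600000);
INSERT INTO returns VALUES ('EMEA', 84200), ('APAC', 40200);
""")

# One query: aggregate each source in a CTE, join them, rank with a window function.
QUERY = """
WITH s AS (SELECT region, SUM(amount) AS sales FROM sales GROUP BY region),
     r AS (SELECT region, SUM(refunded) AS refunded FROM returns GROUP BY region)
SELECT s.region,
       ROUND(100.0 * r.refunded / s.sales, 1) AS return_rate_pct,
       RANK() OVER (ORDER BY r.refunded / s.sales DESC) AS rate_rank
FROM s
JOIN r ON r.region = s.region
ORDER BY rate_rank;
"""

rows = conn.execute(QUERY).fetchall()
for region, rate, rank in rows:
    print(f"{rank}. {region}: {rate}% return rate")
```

Aggregating each file before the join avoids multiplying rows when a region appears many times in both sources.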
KatCore puts large language models and embeddings to work where they earn it — understanding your data, answering your questions, and cleaning the mess. The numbers you rely on stay deterministic.
On ingest, an LLM reads and describes every column, and embeddings map what your data means — so search and answers are grounded in your real schema, not guesswork.
LLM + embeddings
Ask Kat a question. It reads your intent, finds the right files by meaning, writes and runs the query, then explains the result, with the source file cited.
Intent · retrieval · synthesis
AI reconciles "USA" vs "U.S.A.", flags values that don't belong, parses messy dates, and detects PII, then writes the exact fix. You preview before anything changes.
LLM + embeddings + NER
Kat never invents a number. Every answer is computed from your actual data and traces back to the rows and the source file it came from, so you can verify it, not just trust it.
Computed · traceable
Scores, statistics, and duplicate detection stay deterministic and fully explainable, with no model guesswork in the numbers you trust.
Every column is labeled and described on ingest. KatCore understands your data before you even ask.
Ask a question, get a written answer with the numbers and the source file cited. No SQL required.
A 0–100 readiness score with a point-by-point fix-list. Know exactly what's wrong and how to fix it.
Every workspace is logically separated and encrypted at rest. Your data stays yours.
Start free and upload your first dataset in minutes. Upgrade when your team grows.
Need higher limits or on-prem deployment? Talk to us.