Data tools
- Schema checks, row caps, and structured reject paths so bad inputs fail loudly instead of polluting downstream numbers.
- Pytest-backed pipelines and pinned assumptions where they matter; exports match what ran on the server.
- Optional LLM steps only see metrics and aggregates—never row-level free text—so scope stays reviewable.
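A minimal sketch of the first habit (schema check, row cap, structured reject path), assuming rows arrive as dictionaries. The schema, the cap value, and the names `REQUIRED_COLUMNS`, `MAX_ROWS`, and `validate_rows` are hypothetical and are not taken from any of the repos.

```python
from dataclasses import dataclass, field

# Hypothetical schema and cap, chosen only for illustration.
REQUIRED_COLUMNS = {"order_id": int, "amount": float}
MAX_ROWS = 100_000


@dataclass
class ValidationResult:
    accepted: list = field(default_factory=list)
    rejects: list = field(default_factory=list)  # each reject: {"row": index, "reason": text}


def validate_rows(rows, max_rows=MAX_ROWS):
    """Coerce each row to the schema; record failures instead of dropping them silently."""
    if len(rows) > max_rows:
        # Fail loudly on oversized input rather than truncating it.
        raise ValueError(f"{len(rows)} rows exceeds cap of {max_rows}")
    result = ValidationResult()
    for index, row in enumerate(rows):
        try:
            clean = {name: cast(row[name]) for name, cast in REQUIRED_COLUMNS.items()}
        except KeyError as exc:
            result.rejects.append({"row": index, "reason": f"missing column {exc.args[0]}"})
        except (TypeError, ValueError):
            result.rejects.append({"row": index, "reason": "bad value"})
        else:
            result.accepted.append(clean)
    return result
```

Keeping rejects as a list of row index plus reason means they can be shown in the UI or exported next to the cleaned output, so a reviewer can see what was excluded and why.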
The portfolio centers on batch- and API-shaped ML systems; this page shows the same habits in interactive surfaces: tight validation, visible limits, traceable outputs. Repos contain tests and deploy notes; Services is the short pointer for contracted work.
GitHub profile: opens everything public on my account. On this page, each project card links to its own repo and live demo in its footer.
Apps in this track
Data cleaning toolkit
Upstream prerequisite, not a side utility: downstream models and dashboards ingest the same reviewed tables. Multi-format inputs become auditable CSV/Parquet/JSON outputs plus an HTML step log and before/after views, capped near 100K rows. Rules cover bad formats, duplicates, skewed categories, and optional outliers; bundled samples are included for dry runs.
JSON flattening stays one level by design. Pairs with the EDA report generator on this page (profile there, fix here). The deployed instance mirrors the repository's limits and validation logic.
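One-level flattening can be sketched as follows. This is an illustration under assumptions, not the toolkit's code: the function name `flatten_one_level` and the dot separator are hypothetical.

```python
def flatten_one_level(record, sep="."):
    """Lift keys of directly nested dicts into the parent; leave deeper nesting untouched."""
    flat = {}
    for key, value in record.items():
        if isinstance(value, dict):
            # Only the first level of nesting is expanded; child values are copied as-is.
            for child_key, child_value in value.items():
                flat[f"{key}{sep}{child_key}"] = child_value
        else:
            flat[key] = value
    return flat
```

Stopping at one level keeps column names predictable; anything nested deeper stays in a single cell for the user to handle explicitly.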
EDA report generator
Read-only dossier you can archive or forward, with no cell edits: capped sampling with a sheet picker, full column intelligence, correlations, histograms, and warnings, rendered to HTML from memory, with optional PDF when WeasyPrint is available. Single-column junk files fail fast.
Natural companion to the cleaning toolkit (inspect here, mutate there). PDF failures surface in the UI while HTML export still succeeds. docs/DEPLOY_VPS.md covers self-host parity with the repo.
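The HTML-first, PDF-optional behaviour can be sketched like this. The function name `export_report` and its return shape are hypothetical; the only library call used is WeasyPrint's `HTML(string=...).write_pdf()`, which returns bytes when no target is given.

```python
def export_report(html):
    """Return (html_bytes, pdf_bytes_or_None, warning_or_None)."""
    # HTML is encoded first, so it is available whatever happens to the PDF step.
    html_bytes = html.encode("utf-8")
    try:
        from weasyprint import HTML  # optional dependency, imported lazily

        pdf_bytes = HTML(string=html).write_pdf()
    except Exception as exc:  # package missing, system libraries missing, or render error
        return html_bytes, None, f"PDF unavailable: {exc}"
    return html_bytes, pdf_bytes, None
```

The caller can show the warning string in the UI and still offer the HTML download.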
AI-assisted data analysis
Exploration plus written explanation, not KPI tracking: one business CSV yields a profile, a fixed chart pack, rule-based quality hints, and an optional OpenAI narrative built only from aggregates (never raw rows). Everything is bundled as one HTML story for readers who need context, not a metric wall.
UTF-8 CSV input with config-driven caps; a run with an empty narrative still finishes and explains the missing API key. Suited to marketing, sales, and customer-behaviour tables.
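A rough sketch of the aggregates-only boundary, using only the standard library. The function name `build_llm_payload` and the chosen statistics are assumptions for illustration; the point is that the object handed to a narrative step carries per-column summaries and never the rows or their free-text fields.

```python
import csv
import io
import statistics


def build_llm_payload(csv_text, numeric_columns):
    """Summarise the named numeric columns; text columns and whole rows are never copied."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    payload = {"row_count": len(rows), "columns": {}}
    for name in numeric_columns:
        values = []
        for row in rows:
            try:
                values.append(float(row[name]))
            except (KeyError, TypeError, ValueError):
                continue  # unparseable or absent cells are counted as missing below
        payload["columns"][name] = {
            "missing": len(rows) - len(values),
            "mean": statistics.fmean(values) if values else None,
            "min": min(values) if values else None,
            "max": max(values) if values else None,
        }
    return payload
```

Because the payload is a small plain dictionary, it can be logged or written next to the report, which keeps the scope of the LLM step reviewable.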
KPI dashboard app
Metric tracking and stability, not narrative exploration: preset mappings and hard row gates drive KPI cards, trends, breakdowns, optional deltas, and a rule-first “what changed?” summary. The optional OpenAI step reads only pre-aggregated KPI objects, never raw CSV rows.
Export a timestamped snapshot ZIP for handoffs. Narrow MVP: no warehouse connectors or enterprise RBAC; the UI honours the same guardrails documented in config.
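A timestamped snapshot archive could be assembled in memory along these lines. The file names inside the ZIP, the `kpi_snapshot_` prefix, and the function name `build_snapshot_zip` are hypothetical and not taken from the app.

```python
import io
import json
import zipfile
from datetime import datetime, timezone


def build_snapshot_zip(kpis, now=None):
    """Return (filename, zip_bytes) for a KPI snapshot; `now` is expected to be in UTC."""
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # Sorted keys keep the JSON stable between runs with the same KPIs.
        archive.writestr("kpis.json", json.dumps(kpis, indent=2, sort_keys=True))
        archive.writestr("manifest.json", json.dumps({"created_utc": stamp}))
    return f"kpi_snapshot_{stamp}.zip", buffer.getvalue()
```

Putting the same timestamp in the file name and in a manifest inside the archive lets a recipient confirm when the snapshot was taken even if the file is renamed.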
Forecasting app
Univariate demand curves with honest baselines, not neural nets: Holt-Winters on series resampled by day, week, or month, used only when seasonality is supportable. A Plotly chart shows the forecast band with MAPE/MAE against a naive baseline; warnings and fallback reasons stay on-screen for reviewers.
A sidebar-triggered run exports a forecast CSV plus summary JSON. A bundled sample series and a CLI smoke test live in the repo; promo and holiday regressors stay out of scope, with an explicit naive fallback when the fit is weak.
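The comparison against a naive baseline can be sketched in plain Python. This is an assumption-laden illustration: the last-value naive forecast, the "model must beat naive on holdout MAE" rule, and the names `mae`, `mape`, and `choose_forecast` are hypothetical, and the model forecast is passed in rather than fitted here.

```python
def mae(actual, predicted):
    """Mean absolute error over paired values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error; zero actuals are skipped to avoid division by zero."""
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        return None
    return 100 * sum(abs((a - p) / a) for a, p in pairs) / len(pairs)


def choose_forecast(train, holdout, model_forecast):
    """Keep the model only if it beats the last-value naive forecast on holdout MAE."""
    naive = [train[-1]] * len(holdout)
    model_mae = mae(holdout, model_forecast)
    naive_mae = mae(holdout, naive)
    if model_mae < naive_mae:
        return {"used": "model", "mae": model_mae,
                "mape": mape(holdout, model_forecast), "reason": None}
    # Fallback carries a human-readable reason so it can stay visible to reviewers.
    return {"used": "naive", "mae": naive_mae, "mape": mape(holdout, naive),
            "reason": f"model MAE {model_mae:.2f} did not beat naive MAE {naive_mae:.2f}"}
```

Returning the reason alongside the metrics is what allows a fallback to be shown on-screen instead of happening silently.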
Each card links to source and a public instance; Services covers engagement scope.