Experience

Christian Vega.

AI / Machine Learning Engineer — LLM · RAG · Production Data & ML Systems

AI/ML engineer who solo-architected and runs three live production systems: a RAG procurement-intelligence product over a ~1.24M-contract corpus, an ML pricing engine with calibrated 80% prediction intervals, and a full business operating system live on AWS. This work builds on six years across regulated healthcare and payer data (Optum, Humana). A player-coach who owns architecture, model evaluation, and production ops end-to-end, and works hands-on across the full stack.

Puerto Rico (U.S.) — authorized to work in the U.S., remote-ready · Bilingual EN/ES

Professional experience

Senior Data Analyst — Data & AI Engineering

Optum (UnitedHealth Group) Mar 2025 – Present

Python · SQL · Snowflake · Azure Data Factory · LLM integration

  • Architected and shipped two in-house, LLM-powered automation products, now in production across enterprise healthcare data: an automated Business Requirements Document (BRD) generator and a data-dictionary context generator. Both replace documentation that business analysts previously wrote by hand.
  • Owned each product end-to-end: scope, architecture, LLM and prompt design, output evaluation, and production rollout under HIPAA-governed data constraints.
  • Led migration of legacy systems onto a modern, automated, ML/LLM-driven data platform, owning architecture and delivery.
  • Set technical direction and coordinated cross-functional stakeholders on AI and automation initiatives, from design through production.

Founder & AI/Data Engineer (Independent)

Vizlogic — vizlogic.tech (concurrent) Jan 2024 – Present

Python · FastAPI · PostgreSQL/pgvector (HNSW) · LightGBM · Nuxt 3 · TypeScript · Docker · AWS (Lightsail / ECS Fargate) · Cloudflare

  • Solo-architected, shipped, and now operate three live production systems end-to-end (data ingestion, ML, application, and production ops), using AI coding agents as an accelerated implementation layer under my own architecture, model-evaluation, and production-ops control.
  • Nidopr (live at nidopr.app) — bilingual real-estate marketplace: ~64,000 listings ingested nightly from 10+ sources via resumable crawlers; a LightGBM valuation model with calibrated 80% prediction intervals (~79.6% empirical coverage) and a trust gate that cut model bias from +44% to +5% in back-testing.
  • Nidopr production ops — 34 unattended nightly jobs and a 40-check monitoring layer alerting to phone, Discord, and email; an autonomous Facebook poster publishing 7 data-selected listings/day with OCR photo-safety screening.
  • LicitaPR Inteligencia — RAG-backed procurement-intelligence product unifying six government data systems: ~1.24M contracts ingested (~90% of the universe) with Splink entity resolution (422K → ~345K entities); hybrid retrieval over pgvector/HNSW embeddings; kept green by a 62/62-check QA harness.
  • FT-OS — complete small-business operating system (visual builder, POS, kitchen display, inventory, admin) live on AWS in daily production use, held green through a 148-check QA battery and git push-to-deploy.

Data Analyst

TrueNorth Corporation — client: PRASA (PR Aqueduct & Sewer Authority) Jan 2023 – Feb 2025

MSSQL Server · SSIS · ETL · Python · Power BI · SSRS · REST APIs

  • Built an automated ETL pipeline (SSIS + Python) consolidating data across multiple isolated silos — collapsing a manual compilation that previously took a business analyst ~2 months of full-time work, plus developer and multi-department support, into a single repeatable run.
  • Designed and shipped end-to-end data pipelines and BI dashboards for PRASA, moving operational reporting from manual handling to automated, near-real-time delivery.
  • Engineered REST API integrations to pull and process external sources into the warehouse, improving dashboard data coverage and accuracy.
  • Automated reporting (MSSQL + Python + SSRS), cutting report-generation time ~30%.
  • Led data-governance initiatives across PRASA analytics — integrity, security, and compliance.

Quality Analyst

Next Level Solutions Jun 2021 – Dec 2022

SQL · Postman · Snowflake · SSIS · SSMS · Azure DevOps

  • Built automated test frameworks (SQL + API validation via Postman) that cut manual testing time ~30%.
  • Tested and validated Snowflake data-model queries (with SSIS) to confirm that reporting structures were accurate.
  • Planned deployments and test cases with cross-functional teams (~20% faster releases) and streamlined API testing to reduce unresolved bugs ~15%.

Compliance Leader

Humana (Medicare Advantage) Apr 2019 – May 2021

SQL · Power Automate · SSIS · IBM Cognos · Tableau

  • Led compliance initiatives — automated audit workflows and data validation (SQL, Power Automate) — reaching a 95% compliance rate.
  • Conducted data-integrity audits (SSIS, IBM Cognos) that reduced regulatory reporting errors ~10%, and drove ~25% faster claims/inquiry resolution via automated Tableau dashboards.
  • Enforced HIPAA and Medicare Parts C & D adherence through data-governance policies and encryption.

Skills

Machine Learning & AI

Machine Learning (LightGBM / gradient boosting) · Conformal prediction / calibrated prediction intervals · Model evaluation & back-testing (Diebold-Mariano) · Retrieval-Augmented Generation (RAG) · Vector search (pgvector / HNSW) · Embeddings & hybrid retrieval · LLM integration (Azure OpenAI, Snowflake Cortex, Ollama) · Agentic / multi-agent orchestration · Entity resolution (Splink) · OCR (Tesseract / EasyOCR) · Model deployment & monitoring (MLOps / LLMOps)

Data Engineering & Pipelines

Python · SQL (advanced) · ETL / ELT · Data pipelines (production-grade) · Azure Data Factory · SSIS · dbt · REST API integration · Resumable / idempotent crawlers · Web scraping (Playwright / Patchright) · Data modeling · Data quality

Data Platforms & Warehousing

Snowflake · Data warehousing · PostgreSQL (pgvector, FTS) · MSSQL Server · MySQL · Google BigQuery · AWS Data Lake · SQLite

Cloud, MLOps & Production Ops

AWS (Lightsail; ECS Fargate + EFS) · Azure (Data Factory, DevOps) · Docker · CI/CD (git push-to-deploy) · systemd · Cloudflare (Tunnel, Pages, DNS) · Tailscale · Production monitoring / observability · Incident root-cause analysis (RCA) · Restore-verified backups (restic)

BI, Governance & Compliance

Power BI · Tableau · Looker · SSRS · ECharts · HIPAA · Medicare Parts C & D · Data governance · Metadata management · Data security · Automated testing & validation

Languages & Domain

Python · SQL · TypeScript / JavaScript (Nuxt 3 / Vue) · PowerShell · Bash · Healthcare / payer data (Optum, Humana) · Regulated / high-stakes data · Bilingual EN/ES

Certifications

  • IBM Data Science Specialization
  • Python Developer — Zero to Mastery

Education

  • Associate's Degree in Biotechnology — 2016 (a hypothesis-driven scientific foundation carried into data and ML work)

Have data? Let’s make it think.

Open to senior / lead data & AI roles, and to Vizlogic consulting engagements.