ML ENGINEER · BOGOTÁ, COLOMBIA

Shipping production LLM, RAG, and agentic systems on AWS.

Machine Learning Engineer with hands-on experience shipping production LLM, RAG, and agentic systems on AWS for document-intensive enterprise pipelines. MSc in Biomedical Engineering with research on multimodal data collection, computer vision, and deep learning.

Comfortable working end-to-end across ML backend, serverless infrastructure (AWS Lambda, Step Functions, SAM), and full-stack delivery. Focused on turning frontier AI methods — LLM evaluation, retrieval, and human-in-the-loop workflows — into reliable, observable, cost-aware production systems.

STACK

Skills & tools

Programming

Python · TypeScript · JavaScript · R · SQL · Bash

ML & AI

PyTorch · TensorFlow · scikit-learn · LLM fine-tuning & evaluation · RAG · LangGraph · LlamaIndex · DSPy · Computer Vision · NLP · Signal & medical-image processing

MLOps & Cloud

AWS Lambda · Step Functions · SAM · Bedrock · SageMaker · Textract · S3 · DynamoDB · IAM · MLflow · Docker · GitLab CI

Data & Backend

PostgreSQL · pgvector · Alembic · Django · Pandas · NumPy · Pydantic

Frontend

React · TypeScript · React Query · Zustand · Vitest

Languages

English (Advanced) · Spanish (Native)

WORK

Experience

  1. Jul 2025 — Present

    Machine Learning Engineer

    Provectus · San Francisco, CA · Remote

    Delivering LLM-powered features end-to-end across the stack — Python ML backend, AWS serverless infrastructure, and React/TypeScript frontend — for enterprise document-processing platforms.

    • Built production RAG-powered conversational assistants over structured extracted data using LangGraph, AWS Bedrock, and PostgreSQL with pgvector.
    • Designed scalable serverless extraction pipelines on AWS Step Functions, Lambda, Textract, and S3, with Alembic-versioned PostgreSQL and full SAM-based infrastructure-as-code.
    • Built model evaluation and optimization workflows with MLflow 3.4, DSPy-based LLM-as-judge, embedding-model benchmarking, and human-in-the-loop feedback loops.
    • Owned end-to-end delivery across ML backend, serverless infra, and React/TypeScript frontend for document-intensive enterprise pipelines.
    Python · AWS Lambda · Step Functions · SAM · Bedrock · SageMaker · Textract · S3 · DynamoDB · PostgreSQL (pgvector, Alembic) · MLflow · DSPy · LangGraph · Pydantic · React/TypeScript
  2. Jul 2023 — Jun 2025

    Growth & Software Development Engineer

    JustPaid.ai (YC-W23) · San Francisco, CA · Remote

    Part-time agentic AI and full-stack development. Designed and shipped agentic workflows that turn unstructured PDFs into validated, structured records inside the product.

    • Designed and implemented agentic AI workflows for automated information extraction from unstructured documents (PDFs), including NLP-based text extraction, structured parsing into Pydantic models, and self-validation loops with agent-based quality supervision.
    • Built end-to-end data pipelines integrating ML models with PostgreSQL databases — preprocessing, classification, and structured storage of extracted entities (customers, contracts, line items).
    Python · LlamaIndex · LangGraph · Django · PostgreSQL

    Reference: Daniel Kivatinos · daniel@kivatinos.com

  3. Aug 2023 — Jun 2025

    Teaching Assistant

    Universidad de los Andes · Bogotá, Colombia

    Led laboratory sessions on machine learning fundamentals: optimization techniques, linear and logistic regression, analytical OLS solutions, hyperparameter tuning, and neural network architectures.

    • Managed ~70 students per semester; authored lab resources and assessed student work.
    • Average student rating: 4.90 / 5.00.
    Python · R · scikit-learn · TensorFlow · PyTorch

    Reference: Luis Felipe Giraldo Trujillo · lf.giraldo404@uniandes.edu.co

BUILT

Featured projects

Research · Multimodal ML

MSc Thesis: Colombian Sign Language Analysis

Recognize, classify, and biomechanically characterize Colombian Sign Language (LSC), and differentiate deaf signers from interpreters using a multi-perspective sensor stack.

Collected multimodal data from deaf signers and LSC interpreters via egocentric vision, conventional cameras, IMU, and EMG in static and conversational settings. Trained Random Forest, KNN, and Gradient Boosting on raw statistical, temporal, and spectral features, and fine-tuned a Video Vision Transformer (ViViT) on raw video. Reached 95% accuracy on deaf-vs-interpreter classification and 40% accuracy on 50-sign recognition.

Python · PyTorch · Hugging Face · ViViT · IMU · EMG

Healthcare · Clinical ML

ML for Urological Disease Diagnosis

Identify and explain diagnostic disagreement among urologists at Fundación Santa Fe de Bogotá, and use the model to drive consensus.

Built decision-tree models to flag discrepancies in urologic disease diagnostics across clinicians and extracted feature importances to explain disagreement. Inter-clinician agreement improved from 50% to 75% after model-informed standardization sessions, with feature importance feeding directly into an improved diagnostic workflow.

Python · scikit-learn · Pandas · NumPy

Web · Open source

Personal Portfolio

Build a fast, accessible single-page portfolio that reflects the work, not the template.

Next.js 15 App Router with Tailwind v4, shadcn/ui primitives, Framer Motion reveals, and a strict dark palette. Deployed on Vercel.

Next.js · React · TypeScript · Tailwind

WRITING

Publications

  1. Gomez S, et al. “Machine Learning Analysis of Colombian Sign Language: Recognition, Classification, and Biomechanical Characterization.”

    Journal article

    Under review
  2. Gomez S, et al. “AI-Based Platform for Automated Uroflowmetry Curve Morphology Classification.”

    ICS-EUS 2025

    Conference abstract
  3. Gomez S, et al. “Improving Uroflowmetry Interpretation: Effects of Standardization Sessions on Interobserver Agreement and AI Model Consistency.”

    ICS-EUS 2025

    Conference abstract

STUDY

Education

  1. Aug 2023 — Dec 2025

    MSc in Biomedical Engineering

    Universidad de los Andes · Bogotá, Colombia

    Thesis: Machine Learning Analysis of Colombian Sign Language: Recognition, Classification, and Biomechanical Characterization.

    Coursework: Machine Learning for Engineering · Reinforcement Learning · Analysis & Processing of Medical Images

    GPA 4.78 / 5.00
  2. Jan 2019 — Dec 2022

    BSc in Biomedical Engineering — Minor in Neuroscience

    Universidad de los Andes · Bogotá, Colombia

    Coursework: Data Structures & Algorithms · Scientific Programming · Signal Processing · Neuroscience · Neuroanatomy

    GPA 4.08 / 5.00

CREDENTIALS

Certificates

AWS Certified Machine Learning Engineer — Associate

Amazon Web Services

Jan 2026

AWS Cloud Practitioner Essentials

Amazon Web Services

Jul 2025

Rapid Application Development with Large Language Models (LLMs)

NVIDIA

May 2025

Efficient Large Language Model (LLM) Customization

NVIDIA

May 2025

Building LLM Applications with Prompt Engineering

NVIDIA

Apr 2025
