Senior AI Engineer (Evals/Observability Concentration)

Risepoint via Built In

Remote · US · Senior · CLT · Posted yesterday

Estimated Salary

R$ 12.870,00 - R$ 19.305,00

Technologies

Python, Java, C#, Go, Rust, AWS, Azure, GCP, AI

Job Score

0 of 100

Excellent

Job Description

Risepoint is an education technology company that provides world-class support and trusted expertise to more than 100 universities and colleges.

We primarily work with regional universities, helping them develop and grow their high-ROI, workforce-focused online degree programs in critical areas such as nursing, teaching, business, and public service.

Risepoint is dedicated to increasing access to affordable education so that more students, especially working adults, can improve their careers and meet employer and community needs.

The Impact You Will Make

Risepoint is developing an AI-powered Student Journey Platform and is seeking a Senior AI Engineer with deep expertise in Retrieval-Augmented Generation (RAG), multi-agent architectures, and LLM evaluation frameworks.

This role focuses on designing, implementing, and operationalizing AI systems with a strong emphasis on structured evaluation (including LLM-as-Judge), measurable quality, and production-grade reliability.

The ideal candidate has experience integrating LLMs with enterprise data sources, building testable and observable AI workflows, and improving system performance through rigorous evaluation and iteration.

This role contributes directly to a platform that is central to the organization’s long-term strategy.

How You Will Bring Our Mission to Life

What You Will Do

• Build and maintain evaluation frameworks (LLM-as-Judge, rubric-based scoring, regression test suites) to measure output quality, reliability, and drift, and debug production-level issues as they are detected.
• Architect and implement multi-agent workflows with clear coordination, tool usage, and failure handling patterns.
• Build structured observability into AI systems (tracing, prompt/version tracking, evaluation logging, cost and latency monitoring).
• Define and enforce quality gates for AI features using automated evals prior to production release.
• Optimize inference performance (latency, token usage, caching, batching, routing across models).
• Collaborate with product and engineering teams to translate business requirements into testable AI system designs.
• Contribute to code reviews, architectural discussions, and internal standards for AI development.
• Design and implement Retrieval-Augmented Generation (RAG) systems and Model Context Protocol (MCP) servers using structured and unstructured enterprise data.
• Develop and manage fine-tuning workflows (SFT, preference optimization, or related techniques), including dataset preparation, versioning, and validation.

What Success Looks Like

• RAG pipelines return grounded, source-attributed responses with minimal hallucination.
• Evals are automated, reproducible, and integrated into CI/CD or release workflows.
• Multi-agent workflows are observable, testable, and maintainable as complexity increases.

How Impact Will be Measured

• AI systems demonstrate measurable improvements in quality using defined evaluation benchmarks.
• Fine-tuned models and/or programmatic solutions show validated performance gains over baseline foundation models.
• AI systems meet defined SLAs for latency, reliability, and cost.

What You’ll Bring to the Team

Experience That Matters Most

• 3-5 years of full-stack engineering experience with strong fundamentals in object-oriented programming, applicable design patterns, and AI-focused system design.
• Professional experience in Python, C#, Java, or a similar language used in production systems.
• Experience with LLM evaluation and observability tooling (e.g., Langfuse, LangSmith, OpenTelemetry-based tracing, custom evaluation harnesses).
• Experience implementing guardrails, policy enforcement, and safety layers in AI-driven systems while leveraging LLM-as-Judge for validation and continuous improvement.

Experience That’s Great to Have

• Familiarity with performance optimization techniques for LLM-based systems (latency, caching, routing, batching).
• Experience building production-grade RAG systems (retrieval pipelines, chunking strategies, embeddings, reranking, context construction).
• Experience contributing to internal AI standards, reusable frameworks, or platform-level tooling.
• Experience deploying AI systems in cloud environments (AWS, Azure, GCP).
• Experience in Databricks (model serving endpoints, MLflow).

Risepoint is an equal-opportunity employer and supports a diverse and inclusive workforce.

Information

Level: Senior
Contract: CLT
Location: US
Remote: Yes
Currency: BRL
Published: Yesterday
Source: Built In

AI Job Analysis

Salary estimate, technology match, and requirements analysis generated with artificial intelligence.
