Lead Machine Learning Engineer, LLM Infrastructure

Salesforce, Inc. via Workday

Remote | San Francisco, California, US | Senior | CLT | 6 days ago

Estimated Salary

R$ 15.015,00 - R$ 22.523,00

Technologies

Python, Go, Rust, AWS, GCP, Docker, Kubernetes, REST, Machine Learning, AI

Job Description

To get the best candidate experience, please consider applying for a maximum of 3 roles within 12 months to ensure you are not duplicating efforts.

Job Category: Software Engineering

Job Details

About Salesforce

Salesforce is the #1 AI CRM, where humans with agents drive customer success together.

Here, ambition meets action.

Tech meets trust.

And innovation isn’t a buzzword — it’s a way of life.

The world of work as we know it is changing and we're looking for Trailblazers who are passionate about bettering business and the world through AI, driving innovation, and keeping Salesforce's core values at the heart of it all.

Ready to level-up your career at the company leading workforce transformation in the agentic era? You’re in the right place! Agentforce is the future of AI, and you are the future of Salesforce.

About the Role

We are seeking a Lead ML Engineer, LLM Post-Training Infrastructure to join the Salesforce AI Research Incubation Team.

In this role, you will own the infrastructure and engineering systems that support LLM post-training, large-scale evaluation, and model deployment.

You will build scalable, reliable pipelines for training orchestration, rollout generation, reward and feedback pipelines, experiment management, and model iteration, helping translate research ideas into production-grade systems.

This is an engineering-first role focused on ML infrastructure, distributed systems, and training/evaluation workflows rather than developing new model architectures or algorithms.

You will work closely with research scientists, agent engineers, and platform teams to operationalize post-training and feedback-driven learning methods into robust, reusable systems.

This is a lead-level individual contributor role with deep ownership of model-facing infrastructure and strong cross-functional influence.

Key Responsibilities:
● Design, build, and maintain infrastructure for LLM post-training, evaluation, and deployment.
● Own scalable pipelines for training orchestration, rollout generation, reward and feedback processing, checkpointing, and experiment management.
● Build reliable systems for feedback-driven model improvement, including human or AI feedback loops, large-scale offline evaluation, and regression detection.
● Partner closely with research scientists to turn new post-training methods into reusable engineering workflows.
● Collaborate with agent engineers and platform teams to integrate training and evaluation systems with production model and agent stacks.
● Optimize distributed training and inference workloads for reliability, throughput, cost efficiency, and observability.
● Drive best practices for reproducibility, versioning, monitoring, deployment, and operational excellence across ML systems.

Required Qualifications:
● 5+ years of experience in software engineering, ML systems, or distributed infrastructure.
● Strong proficiency in Python and experience building production systems or large-scale ML pipelines.
● Hands-on experience building infrastructure for model training, post-training, evaluation, or serving.
● Experience designing reliable, scalable systems for distributed and GPU-based workloads.
● Strong debugging skills across systems, pipelines, and model-facing failures.
● Experience building infrastructure for LLM post-training, including RLHF, preference optimization, reward modeling, or related feedback-driven training workflows.
● Experience working cross-functionally with research scientists and engineers.
● Familiarity with cloud platforms (AWS, GCP) and containerized environments (Docker, Kubernetes).

Preferred Qualifications:
● Experience with rollout systems, large-scale evaluation loops, or training data/feedback pipelines.
● Familiarity with distributed training frameworks and modern ML infrastructure stacks.
● Experience supporting agent-based learning, simulation environments, or iterative model improvement systems.
● Prior experience working closely with AI research or incubation teams.

Why Join Us?
● Own the systems that turn research models into production AI capabilities.
● Work at the intersection of AI research and large-scale engineering systems.
● Shape how models are trained, deployed, evaluated, and evolved.
● Competitive compensation, benefits, and strong long-term growth opportunities.

Unleash Your Potential

When you join Salesforce, you’ll be limitless in all areas of your life.

Our benefits and resources support you to find balance and be your best, and our AI agents accelerate your impact so you can do your best.

Together, we’ll bring the power of Agentforce to organizations of all sizes and deliver amazing experiences that customers love.

Apply today to not only shape the future — but to redefine what’s possible — for yourself, for AI, and the world.

Accommodations

If you need a reasonable accommodation during the application or the recruiting process, please submit a request via this Accommodations Request Form.

Please note that Salesforce uses artificial intelligence (AI) tools to help our recruiters assess and evaluate candidates’ resumes and qualifications throughout the recruiting process.

Humans will always make any candidate selection and hiring decisions.

Please see our Candidate Privacy Statement for more information about how we use your personal data and your rights, including with regard to use of AI tools and opt out options.

Posting Statement

Salesforce is an equal opportunity employer and maintains a policy of non-discrimination with all employees and applicants for employment.

What does that mean exactly? It means that at Salesforce, we believe in equality for all.

And we believe we can lead the path to equality in part by creating a workplace that’s inclusive, and free from discrimination.

Know your rights: workplace discrimination is illegal.

Any employee or potential employee will be assessed on the basis of merit, competence and qualifications – without regard to race, religion, color, national origin, sex, sexual orientation, gender expression or identity, transgender status, age, disability, veteran or marital status, political viewpoint, or other classifications protected by law.

This policy applies to current and prospective employees, no matter where they are in their Salesforce employment journey.

It also applies to recruiting, hiring, job assignment, compensation, promotion, benefits, training, assessment of job performance, discipline, termination, and everything in between.

Recruiting, hiring, and promotion decisions at Salesforce are fair and based on merit.

The same goes for compensation, benefits, promotions, transfers, reduction in workforce, recall, training, and education.

In the United States, compensation offered will be determined by factors such as location, job level, job-related knowledge, skills, and experience.

Certain roles may be eligible for incentive compensation, equity, and benefits.

Salesforce offers a variety of benefits to help you live well including: time off programs, medical, dental, vision, mental health support, paid parental leave, life and disability insurance, 401(k), and an employee stock purchasing program.

More details about company benefits can be found at the following link: https://www.salesforcebenefits.com.

Pursuant to the San Francisco Fair Chance Ordinance and the Los Angeles Fair Chance Initiative for Hiring, Salesforce will consider for employment qualified applicants with arrest and conviction records.

At Salesforce, we believe in equitable compensation practices that reflect the dynamic nature of labor markets across various regions.

The typical base salary range for this position is $172,500 - $260,100 annually.

In select cities within the San Francisco and New York City metropolitan area, the base salary range for this role is $207,800 - $285,800 annually.

The range represents base salary only, and does not include company bonus, incentive for sales roles, equity or benefits, as applicable.

We're Salesforce, the Customer Company, inspiring the future of business with AI + Data + CRM.

Leading with our core values, we help companies across every industry blaze new trails and connect with customers in a whole new way.

And, we empower you to be a Trailblazer, too — driving your performance and career growth, charting new paths, and improving the state of the world.

If you believe in business as the greatest platform for change and in companies doing well and doing good – you've come to the right place.
