
LLM - Full Stack Python + JS - Remote (Strong Exp in LLM Coding Tools)

MillenniumSoft Inc via LinkedIn
Remote | San Diego, California, US | Mid-level | CLT | Posted today

Estimated Salary

R$ 7.722,00 - R$ 11.583,00


Job Description

  • Role: LLM - Full Stack Python + JS
  • Geo: LATAM, USA, Europe, West Africa
  • YoE: 6+ years
  • Engagement Type: Full-time, 40h/week
  • Project duration: 3 months
  • Total number of positions: 5
  • Start Date: Immediate (next week)
  • Vetting: Two rounds of interviews (a 90 min technical round on Flocareer and a 15 min cultural and offer discussion)
  • Skill: Python, JavaScript / Node.js, TypeScript
  • Availability: 40 hours per week with 4 hours of overlap with PST


Role Overview: We’re a coding-focused team at Turing that serves as a research partner for a Frontier AI Lab.


Our role is to build coding tasks, evaluations, datasets, and tooling that help train and improve large language models (LLMs).


You’ll write and debug production-quality code, design rigorous evaluations, and build reproducible workflows that generate clean, high-signal data for model training.


Attention to detail matters deeply here—small mistakes can cascade into misleading results, so precision and thoroughness are essential.


You’ll also collaborate closely with engineers, researchers, and quality owners to align on standards, review work, and continuously raise the quality bar.


If you enjoy solving unusual technical problems, investigating subtle model failures, and working in developer-like environments where correctness, reproducibility, and collaboration matter, this role will keep you very entertained.


What does your day-to-day look like:

  • Write, review, and debug code across multiple languages
  • Design tasks and evaluation scenarios for coding, reasoning, and debugging
  • Investigate LLM outputs and identify hallucinations, regressions, and failure modes
  • Build reproducible dev environments using Docker + automation tools
  • Develop scripts, pipelines, and tools for data generation, scoring, and validation
  • Produce structured annotations, judgments, and high-quality datasets
  • Run systematic evaluations that help improve model reliability and reasoning

Required Skills:

  • Experience using LLM coding tools (Cursor, Copilot, CodeWhisperer)
  • Strong hands-on coding experience (professional or research-based) in one or more of: Python, JavaScript / Node.js, TypeScript (additional languages like Go, Java, C++, C#, Rust, SQL, R, Dart, etc. are a plus)
  • Solid experience with Linux + Bash, scripting, and automation
  • Strong with Docker, reproducible environments, and dev containers
  • Advanced Git skills (branching, diffs, patches, conflict resolution)
  • Solid understanding of testing and QA (unit, integration, negative, edge-case focused)
  • Ability to reliably overlap with 8am-12pm PT

Nice-to-Haves:

  • Experience using LLM coding tools (Cursor, Copilot, CodeWhisperer)
  • Experience with dataset creation, annotation, evaluation, or ML pipelines
  • Familiarity with benchmarks like SWE-bench or Terminal-Bench
  • Background in QA automation, DevOps, ML systems, or data engineering

Who Thrives Here:

  • Engineers who enjoy breaking things and understanding why
  • People who like designing tasks, running experiments, and debugging
  • Detail-oriented folks who can spot subtle issues in code or model behavior
  • Engineers who like building clean, reusable workflows rather than one-off hacks

Preferred Background:

  • Bachelor’s degree in a technical field with 6+ years’ experience
  • Master’s degree in a technical field with 4+ years’ experience
  • PhD in a technical field with 2+ years’ experience

Offer Details:

  • Commitments Required: 8 hours per day with overlap of 4 hours with PST


  • Engagement type: Contractor assignment (no medical/paid leave)
  • Duration of contract: 3 months (expected start date is next week)
  • Location: West Africa, LATAM, North America, South America


Evaluation Process (approximately 75 mins): Two rounds of interviews (a 90 min technical round and a 15 min cultural and offer discussion)





Information

Level: Mid-level
Contract: CLT
Location: San Diego, California, US
Remote: Yes
Currency: BRL
Published: Today
Source: LinkedIn
