Data Engineer (Python + ML Experience Required) - Los Altos, CA (Hybrid/Remote)

Agile Global Solutions, Inc. via Monster
Remote | Los Altos, California, US | Senior | CLT | 15 days ago

Estimated Salary

R$ 12,870.00 - R$ 19,305.00

Job Description

Position: Data Engineer – Autonomous Vehicle AI Research Infrastructure
Location: Los Altos, CA
Duration: Contract

Job Description: At COMPANY, we’re on a mission to improve the quality of human life.


We’re developing new tools and capabilities to amplify the human experience.


To lead this transformative shift in mobility, we’ve built a world-class team in Energy & Materials, Human-Centered AI, Human Interactive Driving, Large Behavioral Models, and Robotics.


Within the Human Interactive Driving division, the Extreme Performance Intelligent Control department is working to develop scalable, human-like driving intelligence by learning from expert human drivers.


This project focuses on creating a configurable, data-driven world model that serves as a foundation for intelligent, multi-agent reasoning in dynamic driving environments.


By tightly integrating advances in perception, world modeling, and model-based reinforcement learning, we aim to overcome the limitations of more compartmentalized, rule-based approaches.


The end goal is to enable robust, adaptable, and interpretable driving policies that generalize across tasks, sensor modalities, and public road scenarios—delivering transformative improvements for ADAS, autonomous systems, and simulation-driven software development.


As a Data Engineer, you will be a key enabler of this mission—owning the systems that collect, organize, clean, and deliver the volumes of sensor and simulation data that fuel our world models, perception systems, and reinforcement learning algorithms.


You will collaborate closely with research scientists and machine learning engineers to ensure our pipelines are reliable, scalable, and performant—powering breakthroughs in intelligent driving across simulation and real-world deployments.


Responsibilities
● Design, implement, and maintain robust data pipelines for ingesting, cleaning, and transforming large-scale autonomous vehicle datasets (camera, LiDAR, radar, GPS, simulation logs).
● Develop scalable storage and retrieval systems using AWS services (S3, EC2, SageMaker, Athena, etc.).
● Ensure data quality and consistency through automated validation, deduplication, and schema enforcement.
● Collaborate with ML researchers and engineers to provide efficient access to training data, labels, and metadata.
● Optimize data preprocessing and batching pipelines to support large-scale training and evaluation workflows.
● Build tools to manage and audit dataset versions, experiment tracking, and feature reproducibility.
● Implement and maintain CI/CD workflows for data and pipeline updates, ensuring minimal downtime and reproducible outputs.
● Monitor data pipeline performance and respond to bottlenecks or outages proactively.


Qualifications
● B.S. or M.S. in Computer Science, Data Engineering, or a related field.
● 3+ years of experience building production-grade data infrastructure or ML data pipelines.
● Strong proficiency with Python and SQL, and experience with data workflow orchestration tools (e.g., Airflow, Prefect, Luigi).
● Deep experience with AWS services, especially S3 (data storage), EC2 (compute), and SageMaker (model training).
● Familiarity with distributed computing frameworks like Spark, Dask, or Ray.
● Understanding of best practices for dataset documentation, standardization, and reproducibility in research.


Bonus Qualifications
● Experience with autonomous vehicle datasets or robotics sensor data.
● Familiarity with ML training pipelines and model evaluation workflows.
● Prior experience collaborating with researchers or applied ML teams in high-throughput environments.


Best Regards,
T Chandra Sekhar - Technical Sr. Recruiter
Agile Global Solutions, Inc. "Empowering Enterprises"
193 Blue Ravine Road, Suite 160, Folsom, CA 95630
Direct - 916-413-7282
[email protected] | www.agileglobal.com

Remote

Skills: Agile Programming Methodologies, Amazon Elastic Compute Cloud (EC2), Amazon Simple Storage Service (S3), Amazon Web Services (AWS), Apache Spark, Artificial Intelligence (AI), Auditing, Automotive Automation, Best Practices, Computer Science, Continuous Deployment/Delivery, Continuous Integration, Data Management, Data Quality, Data Sets, Distributed Computing, Documentation Standards, GPS (Global Positioning System), Human Interaction, Light Detection and Ranging (LiDAR)/Laser Detection and Ranging (LADAR), Machine Learning, Metadata, Performance Analysis, Power Amplifier, Python Programming/Scripting Language, Quality Management, Reinforcement Learning, Robotics, Scalable System Development, Scientific Research, Simulation, Software Development, Software Engineering, Software Simulation, Technical Recruiting, User Interface/Experience (UI/UX), Workflow Analysis

About the Company: Agile Global Solutions, Inc.

Information

Level: Senior
Contract: CLT
Location: Los Altos, California, US
Remote: Yes
Currency: BRL
Published: 15 days ago
Source: Monster
