Job Description

Aggregated text for quick reading. Always check the original source before submitting your application.

This position is listed on behalf of a partner company, which manages all applications and next steps. Our partner is looking for a Site Reliability Engineer Specialist based in Brazil.

This role is a senior technical leadership opportunity focused on defining and elevating reliability practices across a complex, distributed cloud-native platform. You will be responsible for shaping observability, incident response, and SRE standards across large-scale systems running in Kubernetes (GKE) and supported by a modern microservices ecosystem. The environment includes critical components such as messaging, databases, API gateways, and logging pipelines, requiring deep systems thinking and strong operational discipline. This is a highly influential individual contributor position, where you will set the benchmark for SRE excellence, drive SLO adoption, and reduce operational toil at scale. You will also play a key role in major incident management and postmortem culture. The role offers strong cross-team visibility and the opportunity to shape how reliability engineering is practiced across the entire platform.

Accountabilities

  • Define and own the technical strategy for observability across the platform, including metrics, logs, and distributed tracing using tools such as OpenTelemetry and Dash0.
  • Establish and evolve SLIs, SLOs, and error budgets, ensuring they drive engineering and product decision-making.
  • Lead major incident response efforts as incident commander, ensuring structured resolution and blameless postmortems with actionable outcomes.
  • Improve on-call practices by reducing alert noise, minimizing toil, and building a sustainable operational model.
  • Influence and support architectural decisions across distributed systems including GKE, Kong, RabbitMQ, PostgreSQL, MongoDB Atlas, Redis, and MinIO.
  • Mentor SRE and platform engineers, raising the overall maturity of reliability engineering practices across teams.
  • Drive adoption of observability and reliability best practices across Java and Node.js services in production.

Requirements

  • 8+ years of experience in SRE, infrastructure, or platform engineering, with senior or specialist-level exposure to large-scale production environments.
  • Strong hands-on experience with Kubernetes (preferably GKE), including debugging and operating production workloads.
  • Deep expertise in observability systems (OpenTelemetry, Prometheus, centralized logging such as Elasticsearch, Logstash, Fluent Bit).
  • Experience defining and operationalizing SLIs, SLOs, and error budgets in real-world environments.
  • Strong background in incident management, including leading high-severity incidents and postmortem processes.
  • Experience operating distributed stateful systems such as PostgreSQL, MongoDB Atlas, Redis, RabbitMQ, or object storage (S3/MinIO).
  • Production experience with Java services (JVM tuning, performance troubleshooting) and familiarity with Node.js environments.
  • Proven ability to influence engineering teams and mentor senior engineers without formal authority.
  • Strong communication skills in English and Portuguese, with experience working in distributed, cross-functional teams.

Nice To Have

  • Experience with iPaaS or multi-tenant distributed platforms.
  • Knowledge of Kong API Gateway, Apache Camel, or similar integration technologies.
  • Experience with GitOps tools such as FluxCD or GitLab CI.
  • Exposure to Chaos Engineering or Production Readiness Review frameworks.
  • CNCF or cloud certifications (CKA, CKS, GCP Professional certifications).
  • Contributions to open-source observability or Kubernetes ecosystems.

Benefits

  • Health and dental care coverage.
  • Monthly flexible benefits via Caju card (R$ 1,400, covering food, mobility, home office, wellness, and education).
  • Life insurance.
  • Childcare assistance.
  • Equity (RSUs).
  • Gympass partnership for wellness and fitness.
  • English classes at a subsidized group rate.
  • Collaborative and flexible remote-first work environment.
  • Strong engineering culture focused on learning, autonomy, and impact.

How Jobgether Works

We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team.

We appreciate your interest and wish you the best!

Why Apply Through Jobgether?

Data Privacy Notice: By submitting your application, you acknowledge that Jobgether will process your personal data to evaluate your candidacy and share relevant information with the hiring employer. This processing is based on legitimate interest and pre-contractual measures under applicable data protection laws (including GDPR). You may exercise your rights (access, rectification, erasure, objection) at any time.

We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, assessing responses, and identifying potential inconsistencies or verification signals in application materials. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are made by humans. If you would like more information about how your data is processed, please contact us.
