Job description

At Digibee, we aren’t just building technology; we are unlocking the innovation potential of the world’s largest companies by making the complex simple.

In an integration market valued at approximately $250 billion, our cloud-native, low-code iPaaS platform empowers every developer to build and monitor end-to-end workflows, eliminating technical debt and accelerating digital transformation.

Why join us?

Founded in Brazil in 2017, we are now a global team distributed across the Americas, driven by a culture of flexibility and autonomy.

Following a $60 million Series B funding round, we are in full global expansion, combining the agility of a startup with the stability of a company backed by major global players.

Here, you don’t just witness growth; you are the engine behind it.

If you seek real impact and want to redefine how the world connects, you belong here.

Let’s simplify the world, one integration at a time.

About the role:

We are looking for a Site Reliability Engineer Specialist to be the technical anchor for observability and incident response across the Digibee platform. This is a senior individual contributor role with significant cross-team influence over how we instrument, monitor, alert on, and recover from issues in a complex distributed system — primarily Java with some Node.js services running on GKE, fronted by Kong, and backed by RabbitMQ, PostgreSQL, MongoDB Atlas, Redis, MinIO, and an Elasticsearch/Logstash/Fluent Bit logging pipeline.

You will set the bar for reliability engineering at Digibee — defining our SLO culture, evolving our Dash0/OpenTelemetry-based observability framework, leading major incident response, and mentoring engineers across SRE, platform, and product teams.

Responsibilities

On a typical day, you will…

  • Own the technical direction of our observability stack (Dash0, OpenTelemetry, Elasticsearch/Logstash/Fluent Bit) — defining instrumentation standards for Java and Node.js services and driving adoption of tracing, metrics, and structured logging.
  • Establish meaningful SLIs, SLOs, and error budgets, and partner with engineering and product teams to use them to drive real engineering decisions.
  • Lead major incident response as a senior incident commander, and run blameless postmortems with technical depth and real follow-through.
  • Evolve our on-call program so it is humane and sustainable — driving down toil and alert noise as a first-class engineering priority.
  • Influence architecture decisions across the platform, going deep where it matters: GKE, Kong, RabbitMQ, PostgreSQL, MongoDB Atlas, Redis, and MinIO.
  • Mentor SREs and platform engineers, raise the technical bar through design and incident reviews, and grow the SRE discipline at Digibee.


Requirements and Qualifications

What you'll need to bring…

  • 8+ years in SRE, infrastructure, or platform engineering, with meaningful time at Specialist or Principal level operating large-scale production systems — this is a mandatory requirement.
  • Deep production experience with Kubernetes (preferably GKE), including real fluency debugging things under pressure.
  • Strong observability background with OpenTelemetry, Prometheus, distributed tracing, and centralized logging (Elasticsearch, Logstash, Fluent Bit, or similar). Experience with Dash0 is a strong plus.
  • Hands-on experience operating stateful services in production: at least two of PostgreSQL, MongoDB Atlas, Redis, RabbitMQ, or object storage (MinIO/S3).
  • Production experience instrumenting and troubleshooting Java services (JVM tuning, GC, thread dumps); familiarity with Node.js runtime characteristics is a plus.
  • Proven track record leading incident response and SLO programs that actually changed engineering behavior — not dashboards nobody looks at.
  • Demonstrated ability to mentor senior engineers and influence technical direction across teams without formal authority.
  • Strong communication skills in both English and Portuguese (written and verbal), with proven ability to collaborate across cross-functional, remote-first teams.


Additional Information

It's a plus if you have…

  • Experience operating an iPaaS or similarly multi-tenant runtime where customer workloads are first-class.
  • Experience with Kong API Gateway and Apache Camel at scale.
  • Experience with FluxCD, GitLab CI, and GitOps workflows.
  • Background contributing to OpenTelemetry, Prometheus, or related open source projects.
  • Familiarity with Chaos Engineering and Production Readiness Review programs.
  • CNCF Kubernetes certifications (CKA, CKS) or GCP Professional Cloud Engineer / Architect certifications.
