
[Remote] Google Cloud Platform Data Engineer (Healthcare) with Python/PySpark — Full Time

Jobs via Dice, via Jobright
Remote · US · Mid-level · CLT · Posted yesterday

Estimated Salary

R$ 7.128,00 - R$ 10.692,00


Job Description

Note: This is a remote position open to candidates in the USA.


Dice is the leading career destination for tech experts at every stage of their careers.


Our client, BURGEON IT SERVICES LLC, is seeking a Google Cloud Platform Data Engineer with a healthcare background.


The role involves architecting and building a Google Cloud Platform-based data lake and data warehouse ecosystem, requiring expertise in data ingestion, transformation, and governance.


Responsibilities

  • Architect and design an enterprise-grade Google Cloud Platform-based data lakehouse leveraging BigQuery, GCS, Dataproc, Dataflow, Pub/Sub, Cloud Composer, and BigQuery Omni
  • Define data ingestion, hydration, curation, processing, and enrichment strategies for large-scale structured, semi-structured, and unstructured datasets
  • Create data domain models, canonical models, and consumption-ready datasets for analytics, AI/ML, and operational data products
  • Design federated data layers and self-service data products for downstream consumers
  • Architect batch, near-real-time, and streaming ingestion pipelines using Cloud Dataflow, Pub/Sub, and Dataproc
  • Set up data ingestion for clinical (EHR/EMR, LIS, RIS/PACS) datasets, including HL7, FHIR, CCD, and DICOM formats
  • Build ingestion pipelines for non-clinical systems (ERP, HR, payroll, supply chain, finance)
  • Architect ingestion from medical devices, IoT, remote patient monitoring, and wearables leveraging IoMT patterns
  • Manage on-prem-to-cloud migration pipelines, hybrid cloud data movement, VPN/Interconnect connectivity, and data transfer strategies
  • Build transformation frameworks using BigQuery SQL, Dataflow, Dataproc, or dbt
  • Define curation patterns, including bronze/silver/gold layers, canonical healthcare entities, and data marts
  • Implement data enrichment using external social determinants, device signals, clinical event logs, or operational datasets
  • Enable metadata-driven pipelines for scalable transformations
  • Establish and operationalize a data governance framework encompassing data stewardship, ownership, classification, and lifecycle policies
  • Implement data lineage, data cataloging, and metadata management using tools such as Dataplex, Data Catalog, Collibra, or Informatica
  • Set up data quality frameworks for validation, profiling, anomaly detection, and SLA monitoring
  • Ensure HIPAA compliance, PHI protection, IAM/RBAC, VPC Service Controls, DLP, encryption, retention, and auditing
  • Work with cloud infrastructure teams to architect VPC networks, subnetting, ingress/egress, firewall policies, VPN/IPsec, Interconnect, and hybrid connectivity
  • Define storage layers, partitioning/clustering design, cost optimization, performance tuning, and capacity planning for BigQuery
  • Understand containerized processing (Cloud Run, GKE) for data services
  • Work closely with clinical, operational, research, and IT stakeholders to define data use cases, schemas, and consumption models
  • Partner with enterprise architects, security teams, and platform engineering teams on cross-functional initiatives
  • Guide data engineers and provide architectural oversight on pipeline implementation
  • Be actively hands-on in building pipelines, writing transformations, building POCs, and validating architectural patterns
  • Mentor data engineers on best practices, coding standards, and cloud-native development

Skills
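The responsibilities above call for metadata-driven transformation pipelines and bronze/silver/gold curation layers. As a rough illustration of the idea (not this employer's actual code; the feed config, field names, and function are all hypothetical), a bronze-to-silver step can be driven by per-feed metadata so that onboarding a new feed means a config change rather than new code:

```python
# Hypothetical sketch of a metadata-driven bronze -> silver step:
# a per-feed config (in practice stored in a catalog or control table)
# drives renames, casts, and null handling.

from datetime import date

# Metadata for one feed: source column -> (target column, caster)
FEED_CONFIG = {
    "pat_id": ("patient_id", str),
    "adm_dt": ("admission_date", date.fromisoformat),
    "los": ("length_of_stay_days", int),
}

def bronze_to_silver(record: dict) -> dict:
    """Apply the feed's metadata to one raw (bronze) record."""
    silver = {}
    for src, (dst, cast) in FEED_CONFIG.items():
        raw = record.get(src)
        silver[dst] = cast(raw) if raw is not None else None
    return silver

raw = {"pat_id": 12345, "adm_dt": "2024-01-15", "los": "3"}
print(bronze_to_silver(raw))
# {'patient_id': '12345', 'admission_date': datetime.date(2024, 1, 15),
#  'length_of_stay_days': 3}
```

In a real lakehouse the same pattern would typically be expressed as a PySpark or Dataflow job, with the config table governing many feeds.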
  • 10+ years in data architecture, engineering, or data platform roles
  • Strong expertise in the Google Cloud Platform data stack (BigQuery, Dataflow, Composer, GCS, Pub/Sub, Dataproc, Dataplex)
  • Hands-on experience with data ingestion, pipeline orchestration, and transformations
  • Deep understanding of clinical data standards: HL7 v2.x, FHIR, CCD/C-CDA, and DICOM (for scans and imaging)
  • Familiarity with LIS/RIS/PACS data structures
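As context for the HL7 v2.x requirement: v2 messages are pipe-delimited, with segments separated by carriage returns and fields by `|`. The toy parser below only illustrates that structure (the sample message is invented); a production pipeline should use a dedicated HL7 library such as python-hl7 rather than hand-rolled splitting:

```python
# Minimal illustration of HL7 v2.x structure. Segments (MSH, PID, ...)
# are separated by carriage returns; fields within a segment by '|'.

def parse_hl7(message: str) -> dict:
    """Index the segments of a pipe-delimited HL7 v2 message by segment ID."""
    segments = {}
    for line in message.strip().split("\r"):
        fields = line.split("|")
        segments[fields[0]] = fields
    return segments

msg = (
    "MSH|^~\\&|LAB|HOSP|EHR|HOSP|202401151200||ADT^A01|MSG001|P|2.5\r"
    "PID|1||123456^^^HOSP^MR||DOE^JANE||19800101|F"
)
seg = parse_hl7(msg)
print(seg["PID"][3])  # 123456^^^HOSP^MR  (PID-3: patient identifier list)
```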
  • Experience with device and IoT data ingestion (wearables, remote patient monitoring, clinical devices)
  • Experience with ERP datasets (Workday, Oracle, Lawson, PeopleSoft)
  • Strong SQL and data modeling skills (3NF, star/snowflake, canonical and logical models)
  • Experience with metadata management, lineage, and governance frameworks
  • Solid understanding of HIPAA, PHI/PII handling, DLP, IAM, and VPC security
  • Solid understanding of cloud networking, hybrid connectivity, VPC design, firewalling, DNS, service accounts, IAM, and security models
  • Cloud-native data movement services
  • Experience with data governance frameworks
  • Experience with data quality frameworks for validation, profiling, anomaly detection, and SLA monitoring

Company Overview

Welcome to Jobs via Dice, the go-to destination for discovering the tech jobs you want.
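The posting repeatedly asks for data quality frameworks covering validation, profiling, and anomaly detection. A minimal, framework-free sketch of one such check is below (column names and the threshold are hypothetical); in practice this niche is usually filled by a tool such as Great Expectations or Dataplex data quality rules:

```python
# Hypothetical data quality check: profile the null rate per column in a
# batch of rows and flag columns that breach a configured threshold.

def null_rates(rows: list[dict]) -> dict[str, float]:
    """Fraction of None values per column across a batch of rows."""
    columns = {k for row in rows for k in row}
    return {
        col: sum(1 for row in rows if row.get(col) is None) / len(rows)
        for col in columns
    }

def failing_columns(rows, max_null_rate=0.1):
    """Columns whose null rate exceeds the allowed threshold."""
    return sorted(c for c, r in null_rates(rows).items() if r > max_null_rate)

batch = [
    {"patient_id": "a1", "unit": "ICU"},
    {"patient_id": "a2", "unit": None},
    {"patient_id": "a3", "unit": None},
]
print(failing_columns(batch, max_null_rate=0.25))  # ['unit']
```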



Its website is https://www.dice.com.




Information

Level: Mid-level
Contract: CLT
Location: US
Remote: Yes
Currency: BRL
Published: Yesterday
Source: Jobright
