Overview
The Databricks Data Engineer will be part of the Data Services team and help transform the delivery of data-driven insights at scale. In this role, they will design and engineer robust data pipelines using technologies like Databricks, Azure Data Factory, Apache Spark, and Delta Lake. The role is hands-on: building healthcare data solutions, processing massive healthcare datasets, optimizing performance, and ensuring our data is accurate, secure, and accessible when it matters most.
Essential Functions
Translate business requirements into technical specifications and document solution designs, data flows, and architecture
Design, develop, and maintain ETL/ELT pipelines using Azure Data Factory, Databricks, and Apache Spark
Implement Delta Lake architecture for reliable data storage and processing
Build and optimize data workflows using Databricks Workflows and Jobs
Develop scalable data models following medallion architecture (bronze, silver, gold layers)
Implement Unity Catalog for data governance, access control, and metadata management
Create and maintain Databricks notebooks for data transformation and analysis
Optimize Spark jobs for performance and cost efficiency
Implement data quality checks and validation frameworks
Collaborate with BI developers, data analysts, and data scientists
Design and implement data orchestration workflows using Azure Data Factory to coordinate complex ETL/ELT processes
Develop and maintain CI/CD pipelines for data workflows
Monitor data pipeline performance and troubleshoot issues
Document data processes, architectures, and best practices
Ensure compliance with data security and privacy regulations
Provide support for new and existing solutions
About You
Knowledge/Skills/Abilities/Expectations:
Excellent problem-solving and analytical skills
Strong oral and written communication abilities
Self-motivated, with the ability to adapt quickly to new technologies
Team player who can also work independently
Detail-oriented with strong organizational skills
Ability to manage multiple priorities and meet deadlines
Experience communicating technical concepts to non-technical stakeholders
Technical Skills:
Databricks Platform:
Expert-level knowledge of Databricks Workspace, clusters, and notebooks
Delta Lake implementation and optimization
Unity Catalog for data governance and cataloging
Databricks SQL (formerly SQL Analytics)
Databricks Workflows and job orchestration
Delta Live Tables (DLT) for pipeline orchestration and data quality
Programming & Development:
Advanced Python programming (PySpark, pandas, NumPy)
Advanced SQL (query optimization, performance tuning)
Scala programming (preferred)
Git version control and collaborative development
Cloud Technologies:
Azure Databricks
Cloud storage services (ADLS Gen2, Azure Blob Storage)
Azure Data Factory for pipeline orchestration and integration
Experience designing and managing Azure Data Factory pipelines, triggers, and linked services
Infrastructure as Code (Terraform)
Business Intelligence & Analytics:
Experience with BI tools (Power BI, SSRS)
Data warehousing and data modeling concepts
SQL Server, including SSIS (Integration Services)
MLflow for ML lifecycle management (plus)
Preferred Additional Skills
Experience with complex data modeling including dimensional modeling, star/snowflake schemas
Experience with medallion architecture (bronze/silver/gold layers)
Data quality and validation framework implementation
CI/CD pipeline development for data workflows (Azure DevOps)
Performance tuning and cost optimization
DataOps and DevOps practices
Education/Experience:
Bachelor's degree in Computer Science, Information Technology, or a related field
5+ years of progressive experience in data engineering, analytics, or software development
3+ years of hands-on experience with Databricks platform
Strong experience with Apache Spark and PySpark
Healthcare IT or healthcare data experience preferred
Licenses/Certification:
Databricks Certified Data Engineer Associate (strongly preferred)
Databricks Certified Data Engineer Professional
Databricks Lakehouse Fundamentals
Azure Data Engineer Associate (DP-203)
Apache Spark certifications
We Offer
Comprehensive Benefits Package: Health Insurance, 401(k) Plan, Tuition Reimbursement, PTO
Opportunity to participate in a Fleet Program
Competitive Salaries
Mileage Reimbursement
Professional growth and development opportunities
Legalese
This is a safety-sensitive position
Employee must meet minimum requirements to be eligible for benefits
Where applicable, employee must meet state-specific requirements
We are proud to be an EEO employer
We maintain a drug-free workplace
ReqID: 2026-134012
Category: Corporate
Position Type: Full-Time
Company: Gentiva