Job Requirements
Technical Skills
Ability to define, architect, and implement modern data architecture patterns (Medallion, data mesh, data product approach).
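As a hedged illustration of the Medallion pattern named above (bronze → silver → gold layers), here is a minimal pure-Python sketch; the layer functions, field names, and toy records are assumptions for demonstration only, not part of the role:

```python
# Toy Medallion layering: bronze (raw as-landed), silver (cleaned and
# standardized), gold (business-level aggregate). Records are hypothetical.
bronze = [
    {"order_id": "1", "amount": "10.5", "country": "vn"},
    {"order_id": "2", "amount": "bad", "country": "VN"},   # malformed row
    {"order_id": "3", "amount": "4.0", "country": "us"},
]

def to_silver(rows):
    """Clean and standardize: drop unparseable rows, normalize country codes."""
    out = []
    for r in rows:
        try:
            out.append({"order_id": r["order_id"],
                        "amount": float(r["amount"]),
                        "country": r["country"].upper()})
        except ValueError:
            continue  # a real pipeline would quarantine, not silently drop
    return out

def to_gold(rows):
    """Aggregate to a business metric: revenue per country."""
    totals = {}
    for r in rows:
        totals[r["country"]] = totals.get(r["country"], 0.0) + r["amount"]
    return totals

silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)  # {'VN': 10.5, 'US': 4.0}
```

The same shape scales up in Spark or dbt: each layer is a materialized table, and bad records are routed to a quarantine table rather than discarded.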
Expert-level Databricks skills (Spark SQL, PySpark, Spark DataFrames) and deep knowledge of open table formats (Delta Lake, Apache Iceberg).
Extensive experience designing and implementing highly scalable streaming and batch data ingestion frameworks (Kafka, Auto Loader, APIs, SFTP) across common data/file formats (CSV, JSON, YAML).
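One design concern such ingestion frameworks share is idempotent incremental loading. A hedged stdlib-only sketch of a watermark-based incremental batch ingest follows; the offset field, file layout, and function names are hypothetical:

```python
import json
import tempfile
from pathlib import Path

def load_watermark(path: Path) -> int:
    """Read the last successfully ingested offset; 0 on a first run."""
    if path.exists():
        return json.loads(path.read_text())["offset"]
    return 0

def ingest_batch(records, watermark_path: Path):
    """Idempotent incremental ingest: only records past the stored offset
    are processed, and the watermark advances only after success, so a
    replayed batch is a safe no-op."""
    offset = load_watermark(watermark_path)
    new = [r for r in records if r["offset"] > offset]
    # ... write `new` to the bronze layer here ...
    if new:
        watermark_path.write_text(
            json.dumps({"offset": max(r["offset"] for r in new)}))
    return new

wm = Path(tempfile.mkdtemp()) / "watermark.json"
batch = [{"offset": 1}, {"offset": 2}]
first = ingest_batch(batch, wm)    # both records ingested
second = ingest_batch(batch, wm)   # replay: nothing new
```

Databricks Auto Loader and Kafka consumer groups implement the same idea with checkpoints and committed offsets respectively.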
Deep expertise in columnar storage formats (Parquet, ORC) and in advanced performance tuning and optimization strategies (Z-Ordering, clustering).
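To illustrate why columnar formats matter for analytical scans, here is a toy row-store versus column-store comparison in plain Python; the data is hypothetical and real formats like Parquet add compression and predicate pushdown on top:

```python
# Row layout: each record stored together; an aggregate over one field
# still touches every field of every row.
rows = [{"id": i, "price": float(i), "note": "x" * 50} for i in range(1000)]

# Columnar layout: each field stored contiguously; an aggregate reads
# only the column it needs (the core win of Parquet/ORC).
columns = {
    "id": [r["id"] for r in rows],
    "price": [r["price"] for r in rows],
    "note": [r["note"] for r in rows],
}

row_total = sum(r["price"] for r in rows)   # scans whole records
col_total = sum(columns["price"])           # scans one column
assert row_total == col_total
```

Z-Ordering and clustering extend the same principle: co-locating related values so queries skip data they do not need.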
Mastery of dbt (Core/Cloud) and advanced SQL for complex analytical transformations, including performance optimization. Expertise in establishing and enforcing data quality, testing, and governance frameworks (Great Expectations, dbt tests, data contracts).
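The checks such frameworks encode are conceptually simple. The sketch below hand-rolls pure-Python equivalents of three common dbt tests (not_null, unique, accepted_values) for illustration; it does not use the actual dbt or Great Expectations APIs, and the sample data is hypothetical:

```python
def check_not_null(rows, column):
    """Return rows where the column is missing or None (cf. dbt's not_null)."""
    return [r for r in rows if r.get(column) is None]

def check_unique(rows, column):
    """Return rows whose column value repeats (cf. dbt's unique)."""
    seen, failures = set(), []
    for r in rows:
        v = r.get(column)
        if v in seen:
            failures.append(r)
        seen.add(v)
    return failures

def check_accepted_values(rows, column, allowed):
    """Return rows whose value falls outside the contract's allowed set."""
    return [r for r in rows if r.get(column) not in allowed]

data = [
    {"id": 1, "status": "open"},
    {"id": 1, "status": "closed"},   # duplicate id
    {"id": 2, "status": "weird"},    # value outside the contract
]
assert len(check_unique(data, "id")) == 1
assert len(check_accepted_values(data, "status", {"open", "closed"})) == 1
assert check_not_null(data, "id") == []
```

In dbt the same checks are declared in YAML against a model; a data contract makes the allowed set part of the producer-consumer agreement.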
Leadership in defining and implementing DevOps & infrastructure-as-code strategies (GitLab/GitHub CI/CD, Terraform). Proven ability to design and implement comprehensive observability & monitoring solutions (logging, alerting, pipeline performance tracking).
Experience with event-driven architectures (AWS EventBridge, GCP Pub/Sub, Azure Event Grid).
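Managed buses such as EventBridge, Pub/Sub, and Event Grid all follow the publish-subscribe pattern: producers emit typed events without knowing who consumes them. A minimal in-memory sketch (class and event names are hypothetical):

```python
from collections import defaultdict

class EventBus:
    """Toy publish-subscribe bus: producers emit events by type and
    subscribers react, with no direct coupling between the two."""
    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def publish(self, event_type, payload):
        for handler in self._handlers[event_type]:
            handler(payload)

bus = EventBus()
received = []
bus.subscribe("file.landed", received.append)
bus.publish("file.landed", {"path": "raw/orders.csv"})
```

A managed service adds durability, retries, filtering rules, and fan-out across accounts, but the coupling model is the same.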
Expert Python engineering skills, leading best practices in software engineering (version control, modularity, testing).
Architect-level cloud platform expertise (AWS, GCP, or Azure) with deep experience across multiple warehouses (BigQuery, Redshift, Synapse). Experience implementing security and compliance in cloud data environments (RBAC, data masking, encryption, GDPR/CCPA) and cost optimization strategies for cloud data platforms.
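One common masking technique this bullet alludes to is deterministic pseudonymization: hashing identifiers so joins still work but raw PII is never stored. A stdlib sketch, with the column names and salt handling as labeled assumptions:

```python
import hashlib

def mask(value: str, salt: str = "pipeline-secret") -> str:
    """Deterministically pseudonymize a value: equal inputs map to equal
    tokens (so joins and group-bys still work) while the raw identifier
    is not persisted. The salt is hypothetical here; in a real platform
    it would come from a secrets manager, never source code."""
    return hashlib.sha256((salt + value).encode()).hexdigest()[:16]

row = {"email": "user@example.com", "amount": 42}
masked = {**row, "email": mask(row["email"])}
assert masked["email"] != row["email"]          # PII removed
assert mask(row["email"]) == masked["email"]    # deterministic, joinable
```

Warehouses typically offer this natively (dynamic data masking, column-level policies); the sketch only shows the underlying idea.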
Professional Skills
Hands-on experience with real-time analytics and low-latency serving layers (e.g., Apache Flink, Materialize, Rockset).
Practical experience with vector databases (Pinecone, Weaviate, ChromaDB) or semantic search in AI workflows.
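At their core, vector databases rank items by embedding similarity. A brute-force cosine-similarity search in plain Python illustrates the retrieval step; the corpus vectors are made up, and production systems replace the linear scan with approximate-nearest-neighbour indexes:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Hypothetical 3-dimensional embeddings; real ones have hundreds of dims.
corpus = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
}

def search(query_vec, k=2):
    """Brute-force top-k retrieval; vector DBs use ANN indexes instead."""
    ranked = sorted(corpus, key=lambda d: cosine(query_vec, corpus[d]),
                    reverse=True)
    return ranked[:k]

print(search([1.0, 0.0, 0.0]))  # ['doc_a', 'doc_b']
```

In an AI workflow the query vector comes from an embedding model, and the retrieved documents feed a downstream prompt or ranker.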
Solid experience with machine learning pipelines and MLOps (MLflow, Vertex AI, SageMaker, Azure ML).
Demonstrated experience in leading large data teams, driving collaboration with business, analysts, and data scientists, and influencing technical direction.
Proven ability in data product design and domain-driven design in data platforms.