Analytics, DAMA Certified Data Management Professional). Familiarity with big data tools and platforms (e.g., Hadoop, Spark, Kafka). Knowledge of real-time data streaming tools (e.g., Apache Kafka, Flink). Experience with containerization technologies (e.g., Docker, Kubernetes). Understanding of machine learning pipelines and data science workflows. Knowledge of enterprise architecture frameworks (e.g., TOGAF). Key Competencies: Analytical …
data models supporting analytics, reporting, and ML workflows, aligning with established architecture and performance standards. Data Pipeline Development: Hands-on experience with distributed processing tools (Apache Kafka, Airflow, Spark, Flink, NiFi). Skilled in building and orchestrating batch and real-time pipelines on cloud platforms (AWS Glue, GCP Dataflow, Azure Data Factory). Deep understanding of incremental processing, idempotency …
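Several of these postings call out incremental processing and idempotency as core pipeline concepts. As a framework-free illustration of what they mean in practice, here is a minimal sketch using Python's built-in sqlite3; the `events` source table, its columns, and the watermark scheme are all hypothetical:

```python
import sqlite3

# Minimal incremental + idempotent load: read only rows newer than the stored
# watermark, upsert by primary key, and advance the watermark in the same
# transaction. Assumes a hypothetical events(id, payload, updated_at) table.
def incremental_load(conn: sqlite3.Connection) -> None:
    cur = conn.cursor()
    cur.execute("CREATE TABLE IF NOT EXISTS target (id TEXT PRIMARY KEY, payload TEXT, updated_at TEXT)")
    cur.execute("CREATE TABLE IF NOT EXISTS watermark (k INTEGER PRIMARY KEY CHECK (k = 1), ts TEXT)")
    cur.execute("INSERT OR IGNORE INTO watermark (k, ts) VALUES (1, '1970-01-01T00:00:00')")
    (last_ts,) = cur.execute("SELECT ts FROM watermark WHERE k = 1").fetchone()

    rows = cur.execute(  # incremental: only the slice since the last run
        "SELECT id, payload, updated_at FROM events WHERE updated_at > ? ORDER BY updated_at",
        (last_ts,),
    ).fetchall()

    cur.executemany(  # idempotent: key-based upsert, so replays change nothing
        "INSERT INTO target (id, payload, updated_at) VALUES (?, ?, ?) "
        "ON CONFLICT(id) DO UPDATE SET payload = excluded.payload, updated_at = excluded.updated_at",
        rows,
    )
    if rows:
        cur.execute("UPDATE watermark SET ts = ? WHERE k = 1", (rows[-1][2],))
    conn.commit()  # data and watermark commit together
```

Because the upsert and the watermark update commit in one transaction, a crashed run simply replays the same slice on the next attempt and leaves the target unchanged.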
maintaining data pipelines. Proficiency in JVM-based languages (Java, Kotlin), ideally combined with Python and experience in Spring Boot. Solid understanding of data engineering tools and frameworks like Spark, Flink, Kafka, dbt, Trino, and Airflow. Hands-on experience with cloud environments (AWS, GCP, or Azure), infrastructure-as-code practices, and ideally container orchestration with Kubernetes. Familiarity with SQL and …
algorithms. Modern data technology: Exposure to agentic AI patterns, knowledge base systems, and expert systems is a plus. Experience with real-time stream processing frameworks like Apache Kafka, Apache Flink, Apache Beam, or pub/sub real-time messaging systems is a plus. Advanced database and data warehouse expertise: Familiar with diverse database technologies in addition to relational, such …
platforms (AWS, GCP, Azure) and container orchestration technologies (Kubernetes, Docker) at enterprise scale. Proven track record leading and scaling data pipelines using technologies like Apache Kafka, Apache Spark, Apache Flink, or similar streaming frameworks. Deep expertise in database technologies, including both SQL (PostgreSQL, MySQL) and NoSQL (MongoDB, Cassandra, Redis) systems, with experience in data modeling and optimization. Advanced experience …
with cloud platforms (AWS, GCP, Azure) and container orchestration technologies (Kubernetes, Docker). Proven track record in building and scaling data pipelines using technologies like Apache Kafka, Apache Spark, Apache Flink, or similar streaming frameworks. Strong background in database technologies, including both SQL (PostgreSQL, MySQL) and NoSQL (MongoDB, Cassandra, Redis) systems. Hands-on experience with machine learning frameworks (TensorFlow, PyTorch …
Cloud environments. 1+ years' experience using scripting languages such as R, Python, or JavaScript. 1+ years' experience using Big Data technologies and tools (e.g. Spark, Hadoop, Hive, Cassandra, Druid, Flink, Drill, Trino, NoSQL). Experience with a variety of open source and commercial ETL tools for preparing and loading data for AI/ML analytics. Ability to manipulate raw data …
SQL. Experience working in a cloud environment such as Google Cloud Platform or AWS. Hands-on programming experience with the following (or similar) technologies: Kubernetes, Docker; Apache Beam, Apache Flink, Apache Spark; Google BigQuery, Snowflake; Google Bigtable; Google Pub/Sub, Kafka; Apache Airflow. Experience implementing observability around data pipelines using SRE best practices. Experience in processing structured and …
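Where a posting asks for observability around data pipelines in the SRE style, the usual baseline is exporting throughput, error, and latency metrics that SLOs and alerts can be built on. A minimal sketch using the prometheus_client Python library; the metric names and the transform step are illustrative, not from any specific posting:

```python
import time
from prometheus_client import Counter, Histogram, start_http_server

# Illustrative pipeline metrics in the SRE style: throughput, failures, latency.
ROWS_OK = Counter("pipeline_rows_processed_total", "Rows successfully processed")
ROWS_FAILED = Counter("pipeline_rows_failed_total", "Rows that raised during processing")
BATCH_SECONDS = Histogram("pipeline_batch_duration_seconds", "Wall-clock time per batch")

def transform(row: dict) -> None:
    pass  # placeholder for the real transformation logic

def process_batch(rows: list[dict]) -> None:
    with BATCH_SECONDS.time():          # records batch latency into the histogram
        for row in rows:
            try:
                transform(row)
                ROWS_OK.inc()
            except Exception:
                ROWS_FAILED.inc()       # drives an error-rate SLO or alert

if __name__ == "__main__":
    start_http_server(8000)             # Prometheus scrapes /metrics on :8000
    while True:
        process_batch([{"id": 1}])
        time.sleep(5)
```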
and fun. Knowledge: Programming Languages, e.g. Python, Java; SQL Databases, e.g. MySQL, PostgreSQL, SQL Server, MariaDB; NoSQL Databases, e.g. MongoDB, Cassandra; Data Warehousing Solutions, e.g. Snowflake; Data Processing, e.g. Flink; Stream Processing, e.g. Kafka, MSK; Data modeling, ETL/ELT processes, and data quality frameworks; Other data engineering tools, e.g. Kafka Connect, DMS, Talend, Prefect; Cloud Platforms, e.g. AWS …
building complex, highly scalable, reliable data pipelines using the Big Data ecosystem (Spark, Iceberg, Glue Catalog, Kafka, or equivalents). Cloud-native application and cloud services architecture on AWS (MSK, Flink, Bedrock, Lambdas, Spark/EMR). API and microservices architecture. Streaming and event processing architectures and platforms (e.g., Kafka, Flink SQL, Python). Structured and unstructured databases and usage patterns. High-speed …
e.g., Hadoop, Spark). Strong knowledge of data workflow solutions like Azure Data Factory, Apache NiFi, Apache Airflow, etc. Good knowledge of stream and batch processing solutions like Apache Flink, Apache Kafka. Good knowledge of log management, monitoring, and analytics solutions like Splunk, Elastic Stack, New Relic, etc. Given that this is just a short snapshot of the role …
Luton, England, United Kingdom Hybrid/Remote Options
easyJet
CloudFormation. Understanding of the ML development workflow and knowledge of when and how to use dedicated hardware. Significant experience with Apache Spark or other distributed data programming frameworks (e.g. Flink, Hadoop, Beam). Familiarity with Databricks as a data and AI platform or the Lakehouse Architecture. Experience with data quality and/or data lineage frameworks like Great Expectations …
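Data quality frameworks like the Great Expectations mentioned here automate assertions of roughly the following shape. A deliberately framework-free Python sketch, with hypothetical column names and rules, to show the kind of check involved:

```python
# Assertions of the kind a data quality framework automates; the column names
# and rules here are hypothetical.
def check_orders(rows: list[dict]) -> list[str]:
    failures = []
    if not rows:
        failures.append("dataset is empty")
    ids = [r.get("order_id") for r in rows]
    if any(i is None for i in ids):
        failures.append("order_id contains nulls")
    if len(set(ids)) != len(ids):
        failures.append("order_id is not unique")
    if any(r.get("amount", 0) < 0 for r in rows):
        failures.append("amount contains negative values")
    return failures

failures = check_orders([{"order_id": 1, "amount": 10.0}, {"order_id": 2, "amount": 3.5}])
assert not failures, failures   # abort the pipeline run on any violation
```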
focused on large-scale data systems. Strong programming skills in Python and SQL; familiarity with Java/Scala a plus. Hands-on experience with big data frameworks (e.g., Spark, Flink, Hadoop) and workflow orchestration (Airflow, Prefect, Dagster). Proven experience with cloud-based data platforms (AWS, GCP, Azure) and data lake/warehouse technologies (Snowflake, BigQuery, Redshift, Delta Lake …
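The orchestrators named above (Airflow, Prefect, Dagster) all model a pipeline as a scheduled dependency graph of tasks. A minimal sketch using Airflow's TaskFlow API, assuming Airflow 2.4+ (where the `schedule` argument replaced `schedule_interval`); the task bodies are placeholders:

```python
from datetime import datetime
from airflow.decorators import dag, task

# Minimal daily extract -> transform -> load DAG; stand-in logic throughout.
@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def example_etl():
    @task
    def extract() -> list[dict]:
        return [{"id": 1, "amount": 10.0}]   # stand-in for a real source read

    @task
    def transform(rows: list[dict]) -> list[dict]:
        return [{**r, "amount_gbp": r["amount"] * 0.79} for r in rows]

    @task
    def load(rows: list[dict]) -> None:
        print(f"loading {len(rows)} rows")   # stand-in for a warehouse write

    load(transform(extract()))

example_etl()
```

Prefect and Dagster express the same dependency chain with their own decorators and scheduling primitives.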
in data engineering, data architecture, or a similar role, with at least 3 years in a lead capacity. Proficient in SQL, Python, and big data processing frameworks (e.g., Spark, Flink). Strong experience with cloud platforms (AWS, Azure, GCP) and related data services. Hands-on experience with data warehousing tools (e.g., Snowflake, Redshift, BigQuery), Databricks running on multiple cloud …
AWS, GCP, or Azure and their data services. Problem-solving mindset with attention to scalability and performance. Preferred: Experience with real-time data streaming such as Kafka, Spark, or Flink. Knowledge of data governance, security, and compliance practices. Familiarity with BI and visualization tools such as Tableau, Power BI, or Looker. Exposure to CI/CD and DevOps practices …
different approaches. Strong skills in Java/Scala or Python. Familiarity with cloud-native technologies used for scalable data processing. Experience with one or more data processing technologies (e.g. Flink, Spark, Polars, Dask, etc.). Experience with multiple data storage technologies (e.g. S3, RDBMS, NoSQL, Delta/Iceberg, Cassandra, ClickHouse, Kafka, etc.) and knowledge of their associated trade-offs. Experience …
technologies (e.g., Hadoop, Spark). Strong knowledge of data workflow solutions like Azure Data Factory, Apache NiFi, Apache Airflow. Good knowledge of stream and batch processing solutions like Apache Flink, Apache Kafka. Good knowledge of log management, monitoring, and analytics solutions like Splunk, Elastic Stack, New Relic. Given that this …
custom ML feature pipelines. Have experience in observability and monitoring (Prometheus, Grafana, ELK/EFK). Are familiar with open-source data/streaming frameworks such as Apache Spark, Flink, Delta Lake, Kafka, or Airflow. Have deep Python skills or are curious about Rust. Are comfortable creating technical content, workshops, or presenting at meetups/conferences. Have experience deploying …
and guide implementation teams • Deep understanding of Kafka internals, KRaft architecture, and Confluent components • Experience with Confluent Cloud, Stream Governance, Data Lineage, and RBAC • Expertise in stream processing (Apache Flink, Kafka Streams, ksqlDB) and event-driven architecture • Strong proficiency in Java, Python, or Scala • Proven ability to integrate Kafka with enterprise systems (databases, APIs, microservices) • Hands-on experience with …
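The stream-processing expertise requested above (Apache Flink, Kafka Streams, ksqlDB) ultimately centres on the consume-transform-produce loop. A minimal Python sketch using the kafka-python client; the topic names, broker address, and payload fields are made up:

```python
import json
from kafka import KafkaConsumer, KafkaProducer

# Consume raw events, enrich them, and publish to a downstream topic.
consumer = KafkaConsumer(
    "orders.raw",
    bootstrap_servers="localhost:9092",
    group_id="order-enricher",
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    enable_auto_commit=False,   # commit offsets manually, after the write lands
)
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

for message in consumer:
    event = message.value
    event["total_pence"] = int(event["quantity"]) * int(event["unit_price_pence"])
    producer.send("orders.enriched", event)
    producer.flush()
    consumer.commit()           # at-least-once delivery
```

Committing offsets only after the downstream write gives at-least-once delivery; Kafka Streams and Flink wrap the same loop with managed state, windowing, and exactly-once variants.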
Kinesis). Knowledge of IaC (Terraform, CloudFormation) and containerisation (Docker, Kubernetes). Nice to have: Experience with dbt, feature stores, or ML pipeline tooling; familiarity with Elasticsearch or real-time analytics (Flink, Materialize); exposure to eCommerce, marketplace, or transactional environments …
Strong experience working with SQL and databases/engines such as MySQL, PostgreSQL, SQL Server, Snowflake, Redshift, Presto, etc. Experience building ETL and stream processing pipelines using Kafka, Spark, Flink, Airflow/Prefect, etc. Familiarity with the data science stack: e.g. Jupyter, Pandas, scikit-learn, Dask, PyTorch, MLflow, Kubeflow, etc. Strong experience with using AWS/Google Cloud Platform (S3S …
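As a concrete instance of the batch ETL pipelines these postings describe, here is a small PySpark job; the file paths and column names are illustrative:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders-etl").getOrCreate()

# Extract: read raw CSV with a header row, then apply basic cleaning.
orders = (
    spark.read.option("header", True).csv("/data/raw/orders.csv")
    .withColumn("amount", F.col("amount").cast("double"))
    .dropna(subset=["order_id", "amount"])
)

# Transform: aggregate to one row per day.
daily_revenue = orders.groupBy("order_date").agg(
    F.sum("amount").alias("revenue"),
    F.count("*").alias("order_count"),
)

# Load: write the mart as Parquet, replacing any previous run's output.
daily_revenue.write.mode("overwrite").parquet("/data/marts/daily_revenue")
spark.stop()
```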
Sheffield, South Yorkshire, England, United Kingdom Hybrid/Remote Options
Vivedia Ltd
pipelines, data modeling, and data warehousing. Experience with cloud platforms (AWS, Azure, GCP) and tools like Snowflake, Databricks, or BigQuery. Familiarity with streaming technologies (Kafka, Spark Streaming, Flink) is a plus. Tools & Frameworks: Airflow, dbt, Prefect, CI/CD pipelines, Terraform. Mindset: Curious, data-obsessed, and driven to create meaningful business impact. Soft Skills: Excellent communication and …
similar language. Experience with SQL and data modeling concepts. Experience with cloud-based data warehousing solutions such as Redshift, BigQuery, or similar. Experience with ETL tools such as Spark, Flink, Databricks, Snowflake, etc. Experience with messaging systems such as RabbitMQ, Kafka, etc. Knowledge of the underlying cloud infrastructure and how the various data pipeline components fit together. Excellent problem …