framework. Experience with dbt (Data Build Tool), including building data models, tests, validation, and transformations. Thorough understanding of distributed file and table formats (e.g., Parquet, Delta, Iceberg, Hudi). Experience as a Data Engineer in a previous role. Preferred: Experience with IaC solutions using Terraform, Pulumi or similar tools. …
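For a concrete sense of the Parquet handling this kind of role asks for, here is a minimal sketch using pyarrow; the column names and file path are illustrative assumptions, not taken from any listing.

```python
# Minimal Parquet round trip with pyarrow.
# Table contents and the file path are illustrative assumptions.
import pyarrow as pa
import pyarrow.parquet as pq

# Build a small in-memory table.
table = pa.table({
    "order_id": [1, 2, 3],
    "amount": [9.99, 14.50, 3.25],
})

# Write it to a Parquet file, then read it back and inspect the schema.
pq.write_table(table, "orders.parquet")
roundtrip = pq.read_table("orders.parquet")
print(roundtrip.schema)
```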
data services (e.g., S3, Athena, Lambda). Familiarity with data lake architecture and modern ETL practices. Experience working with structured and semi-structured data (e.g., Parquet, JSON). Git version control and CI/CD familiarity. Strong communication and collaboration skills. Desirable: Experience with other AWS services like Redshift, EMR, Step …
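As an illustration of the AWS data-service skills listed above, a minimal sketch that starts an Athena query from Python with boto3; the region, database, table, and results bucket are hypothetical.

```python
# Minimal Athena query submission via boto3.
# Region, database, table, and output bucket are hypothetical.
import boto3

athena = boto3.client("athena", region_name="eu-west-2")

response = athena.start_query_execution(
    QueryString="SELECT event_type, COUNT(*) FROM events GROUP BY event_type",
    QueryExecutionContext={"Database": "analytics"},
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
)

# The query runs asynchronously; poll get_query_execution with this ID
# to check status before fetching results.
print(response["QueryExecutionId"])
```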
Cloud Data Lakehouse platforms (such as Apache Spark, Microsoft Fabric, Databricks, Snowflake) and associated industry-standard/portable data formats (e.g., Delta Lake, Iceberg, Parquet, CSV, JSON, Avro, ORC, and XML). Experience in analysing/understanding a business's enterprise data sources, data volumes, data velocity, data variety and data value …
for automation) using one or more programming languages: Python, Java, or Scala. Knowledge of NoSQL and RDBMS databases. Experience in different data formats (Avro, Parquet). Experience in working with data visualization tools. Experience in GCP tools: Cloud Functions, Dataflow, Dataproc, and BigQuery. Experience in data processing frameworks: Beam …
with experience in Kafka real-time messaging or Azure Stream Analytics/Event Hub. Spark processing and performance tuning. File format partitioning, e.g. Parquet, JSON, XML, CSV. Azure DevOps, GitHub Actions. Hands-on experience in at least one relevant language (e.g., Python), with knowledge of the others. Experience in Data …
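To illustrate the file-format partitioning the listing mentions, a minimal PySpark sketch that writes Parquet partitioned by date; the column names and output path are assumptions for the example.

```python
# Minimal partitioned Parquet write with PySpark.
# Column names and output path are illustrative assumptions.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("partition-demo").getOrCreate()

df = spark.createDataFrame(
    [("2024-01-01", "GB", 10), ("2024-01-01", "US", 7)],
    ["event_date", "country", "clicks"],
)

# Partitioning by a low-cardinality column such as the date lets
# downstream queries prune whole directories instead of scanning all files.
df.write.mode("overwrite").partitionBy("event_date").parquet("/tmp/events")
```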
Cardiff, South Glamorgan, Wales, United Kingdom (Hybrid / WFH options)
True Worth Consulting Ltd
Architecture Experience: Designing scalable data pipelines, automation, and orchestration. Integration & API Knowledge: Strong background in RESTful APIs and working with semi-structured data (JSON, Parquet, XML, YAML). Proven Leadership: Experience in managing and developing data engineering teams. Bonus points for experience in: Cloud & Data Platforms (Snowflake, Power BI …
Azure Service Bus, Function Apps, ADFs. Possesses knowledge of data-related technologies: data warehouses, Snowflake, ETL, data pipelines, PySpark, Delta tables, and file formats (Parquet, columnar). Has a good understanding of SQL and stored procedures. Able to lead development and execution of performance and automation testing for large-scale …
and automation. Proficiency in building and maintaining batch and streaming ETL/ELT pipelines at scale, employing tools such as Airflow, Fivetran, Kafka, Iceberg, Parquet, Spark, and Glue, developing end-to-end data orchestration that leverages AWS services to ingest, transform and process large volumes of structured and unstructured …
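As a sketch of the Airflow-style orchestration described above, a minimal two-task DAG; the task logic, names, and schedule are placeholders, and the `schedule` argument assumes Airflow 2.4 or later.

```python
# Minimal Airflow DAG: ingest, then transform.
# Task bodies, names, and schedule are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def ingest():
    print("pull raw files into the lake")


def transform():
    print("convert raw files to Parquet")


with DAG(
    dag_id="daily_ingest",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # assumes Airflow 2.4+; older versions use schedule_interval
    catchup=False,
):
    ingest_task = PythonOperator(task_id="ingest", python_callable=ingest)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)

    # ingest must finish before transform starts.
    ingest_task >> transform_task
```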
Kubernetes for data services and task orchestration; Airflow for job scheduling and tracking; CircleCI for continuous deployment; Databricks for our data lake platform; Parquet and Delta file formats on S3 for data lake storage; Postgres/Aurora for our relational databases; Spark for data processing; dbt for data …
Experience in data modelling and design patterns; in-depth knowledge of relational databases (PostgreSQL) and familiarity with data lakehouse formats (storage formats, e.g. Apache Parquet, Delta tables). Experience with Spark, Databricks, data lakes/lakehouses. Experience working with external data suppliers (defining requirements for suppliers, defining Service Level …
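For the Delta-table familiarity mentioned here, a minimal sketch using the deltalake (delta-rs) Python package rather than Spark; the data and table path are illustrative.

```python
# Minimal Delta table write/read with the deltalake (delta-rs) package.
# DataFrame contents and the table path are illustrative assumptions.
import pandas as pd
from deltalake import DeltaTable, write_deltalake

df = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})

# Writing creates the table (with its transaction log) if it does not exist.
write_deltalake("/tmp/demo_delta", df, mode="overwrite")

# Each write bumps the table version, which enables time travel.
print(DeltaTable("/tmp/demo_delta").version())
```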
Services (S3, Lambda, Glue, API Gateway, Kinesis, IAM); Integrations (Email, SFTP, API, Webhooks, Streaming); Data Formats and Structures (XML, Excel, CSV, TSV, JSON, Avro, Parquet). Qualifications, Basic Requirements: Self-Starter: ability to take initiative and work independently. Confident Speaker: strong communication skills, comfortable presenting and discussing ideas. Technically Inclined …
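To illustrate one of the formats listed, a minimal Avro round trip using fastavro; the schema and records are made up for the example.

```python
# Minimal Avro write/read round trip with fastavro.
# Schema and records are illustrative assumptions.
from fastavro import parse_schema, reader, writer

schema = parse_schema({
    "name": "Order",
    "type": "record",
    "fields": [
        {"name": "order_id", "type": "int"},
        {"name": "amount", "type": "double"},
    ],
})

records = [{"order_id": 1, "amount": 9.99}, {"order_id": 2, "amount": 14.50}]

# Avro files embed the schema, so readers do not need it separately.
with open("orders.avro", "wb") as out:
    writer(out, schema, records)

with open("orders.avro", "rb") as inp:
    for rec in reader(inp):
        print(rec)
```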
of data quality frameworks, validation techniques, and error handling to ensure high-integrity data processes. Experience working with structured and semi-structured data (JSON, Parquet, Avro, CSV, etc.) in migration contexts, and a deep understanding of legacy versus target data models. Ability to optimize data cleansing and transformation processes …
experience with Matillion. Familiarity with a variety of databases, incl. structured RDBMS. Experience in working with a variety of data formats: JSON, XML, CSV, Parquet, etc. Experience with building and maintaining data dictionaries/metadata. Experience with Linux and cloud environments. Data Visualisation Technologies (e.g. Amazon QuickSight, Tableau …
Starburst and Athena; Kafka and Kinesis; DataHub; MLflow and Airflow; Docker and Terraform; Kafka, Spark, Kafka Streams and KSQL; dbt; AWS, S3, Iceberg, Parquet, Glue and EMR for our Data Lake; Elasticsearch and DynamoDB. More information: Enjoy fantastic perks like private healthcare & dental insurance, a generous work from …
and experience working within a data-driven organization. Hands-on experience with architecting, implementing, and performance tuning of: Data Lake technologies (e.g. Delta Lake, Parquet, Spark, Databricks); API & Microservices; Message queues, streaming technologies, and event-driven architecture; NoSQL databases and query languages; Data domain and event data models; Data …
Managed Service for Apache Flink, Kafka Streams, Google Dataflow, Azure Stream Analytics, or Spark Structured Streaming. Familiarity with data serialization formats (e.g., Avro, Protobuf, Parquet). Hands-on experience with modern data infrastructures, including data warehouses, data lakes, data lakehouses, and NoSQL databases. Knowledge of security, compliance, and governance best …
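Of the streaming stacks named above, Spark Structured Streaming reads naturally from Python; a minimal sketch follows, with the broker address and topic as hypothetical values (it also assumes the spark-sql-kafka connector package is on the Spark classpath).

```python
# Minimal Spark Structured Streaming job reading from Kafka.
# Broker address and topic are hypothetical; requires the
# spark-sql-kafka connector package on the Spark classpath.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("stream-demo").getOrCreate()

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "events")
    .load()
)

# Kafka values arrive as raw bytes; cast to string before processing.
query = (
    events.selectExpr("CAST(value AS STRING) AS body")
    .writeStream.format("console")
    .start()
)
query.awaitTermination()
```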
systems such as Kubernetes. Experience working with data lakes; experience with Spark or Databricks. Understanding of common data transformation and storage formats, e.g. Apache Parquet. Familiar with version control systems such as Git and GitHub. Experience with VCL and BCL would be a plus, but not required. Non-Technical …
Good understanding of cloud environments (ideally Azure), distributed computing, and scaling workflows and pipelines. Understanding of common data transformation and storage formats, e.g. Apache Parquet. Awareness of data standards such as GA4GH and FAIR. Exposure to genotyping and imputation is highly advantageous. Benefits: Competitive base salary, generous pension …
and Lambda. IAM - experience handling IAM resource permissions. Networking - fundamental understanding of VPC, subnet routing and gateways. Storage - strong understanding of S3, EBS and Parquet. Databases - RDS, DynamoDB. Experience doing cost estimation in Cost Explorer and planning efficiency changes. Terraform and containerisation experience. Understanding of a broad range of …
Functions, Datastore, and Cloud Spanner. Experience with message queues (e.g., RabbitMQ) and event-driven patterns. Hands-on experience with data serialization formats (e.g., Avro, Parquet, JSON) and schema registries. Strong understanding of DevOps and CI/CD pipelines for data streaming solutions. Familiarity with containerization and orchestration tools. Excellent …