West London, London, United Kingdom Hybrid / WFH Options
Total Recruitment Group
Location: Remote-first (UK-based) Rate: Up to £550 p/d Contract: 6-12 months (Outside IR35) Tech Stack: Python, FastAPI, GCP, Apache Spark, Apache Beam, Google Cloud Dataflow We're working with a forward-thinking consultancy that helps top companies build and scale high-performance … What You'll Be Doing: Building data pipelines and ETL workflows that process huge datasets Designing, optimizing, and maintaining high-throughput reporting solutions Working with Apache Spark for large-scale data processing Using Apache Beam and Google Cloud Dataflow to manage complex data workflows Developing and improving backend … writing clean, efficient, and scalable code Experience with BigQuery, PostgreSQL, and Elasticsearch Hands-on experience with Google Cloud, Kubernetes, and Terraform Deep understanding of Apache Spark for large-scale data processing Knowledge of Apache Beam & Google Cloud Dataflow for data pipeline orchestration A team-first mindset with …
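For context on the Beam/Dataflow workflow management this listing describes, a minimal Apache Beam pipeline in Python might look like the sketch below. The bucket paths, the assumed comma-separated schema and the per-key sum are placeholders for illustration only, not details taken from the role; switching the runner to DataflowRunner (with project, region and temp_location options) would run the same pipeline on Google Cloud Dataflow.

```python
# Minimal Apache Beam pipeline sketch. Paths and the parsing logic are
# placeholder assumptions, not details from the listing above.
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions


def parse_record(line: str) -> dict:
    """Split a CSV line into a small dict (assumed schema: id,amount)."""
    record_id, amount = line.split(",")
    return {"id": record_id, "amount": float(amount)}


def run() -> None:
    # DirectRunner by default; swap in DataflowRunner plus project/region/
    # temp_location options to execute the same pipeline on Cloud Dataflow.
    options = PipelineOptions(runner="DirectRunner")
    with beam.Pipeline(options=options) as p:
        (
            p
            | "Read" >> beam.io.ReadFromText("gs://example-bucket/input/*.csv")
            | "Parse" >> beam.Map(parse_record)
            | "KeyById" >> beam.Map(lambda r: (r["id"], r["amount"]))
            | "SumPerKey" >> beam.CombinePerKey(sum)
            | "Format" >> beam.MapTuple(lambda k, v: f"{k},{v}")
            | "Write" >> beam.io.WriteToText("gs://example-bucket/output/totals")
        )


if __name__ == "__main__":
    run()
```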
will be working on complex data problems in a challenging and fun environment, using some of the latest Big Data open-source technologies like Apache Spark, as well as Amazon Web Service technologies including Elastic MapReduce, Athena and Lambda to develop scalable data solutions. Adhering to Company Policies … we're looking for: The ability to problem-solve. Knowledge of AWS or equivalent cloud technologies. Knowledge of Serverless technologies, frameworks and best practices. Apache Spark (Scala or PySpark). Experience using AWS CloudFormation or Terraform for infrastructure automation. Knowledge of Scala or an OO language such as Java or … skills, positive attitude, willing to help other members of the team. Experience debugging and dealing with failures on business-critical systems. Preferred: Exposure to Apache Spark, Apache Trino or another big data processing system. Knowledge of streaming data principles and best practices. Understanding of database technologies and …
performance and responsiveness. Stay Up to Date with Technology: Keep yourself and the team updated on the latest Python technologies, frameworks, and tools like Apache Spark, Databricks, Apache Pulsar, Apache Airflow, Temporal, and Apache Flink, sharing knowledge and suggesting improvements. Documentation: Contribute to clear and … or Azure. DevOps Tools: Familiarity with containerization (Docker) and infrastructure automation tools like Terraform or Ansible. Real-time Data Streaming: Experience with Apache Pulsar or similar systems for real-time messaging and stream processing is a plus. Data Engineering: Experience with Apache Spark, Databricks, or … similar big data platforms for processing large datasets, building data pipelines, and machine learning workflows. Workflow Orchestration: Familiarity with tools like Apache Airflow or Temporal for managing workflows and scheduling jobs in distributed systems. Stream Processing: Experience with Apache Flink or other stream processing frameworks is a plus.
Requires a minimum of 2+ years' experience in Hadoop/ML/large dataset handling with DB/DW experience, and advanced experience in Apache Hadoop, Apache Spark, Apache Hive, and Presto, or ETL skills with working experience on databases Oracle/SQL Server/PostgreSQL … Network troubleshooting. Basic understanding of Machine Learning, troubleshooting Kerberos Authentication problems, Java and Python and shell scripting. Expert experience in the Hadoop Ecosystem including Apache Spark, Presto, Data Lake architecture and administration. Prior work experience with AWS services - any or all of EMR, Glue, SageMaker and excellent knowledge …
Software development background, with major experience in: back-end data processing, data lakehouse. Hands-on experience of Big Data open-source technologies such as: Apache Airflow, Apache Kafka, Apache Pekko, Apache Spark & Spark Structured Streaming, Delta Lake, AWS Athena, Trino, MongoDB, AWS S3, MinIO …
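To illustrate the Spark Structured Streaming, Kafka and Delta Lake combination named above, here is a hedged PySpark sketch. The broker address, topic and storage paths are invented, and it assumes the Kafka source connector and Delta Lake packages are available on the Spark classpath.

```python
# Sketch only: broker, topic and paths are illustrative assumptions,
# not details taken from the role description above.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("kafka-to-delta-sketch").getOrCreate()

# Read a Kafka topic as a streaming DataFrame; Kafka delivers key/value as binary.
events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "events")
    .option("startingOffsets", "latest")
    .load()
    .select(col("key").cast("string"), col("value").cast("string"), col("timestamp"))
)

# Continuously append the stream to a Delta table; the checkpoint directory
# lets the query recover its state after a restart.
query = (
    events.writeStream
    .format("delta")
    .outputMode("append")
    .option("checkpointLocation", "s3a://example-bucket/checkpoints/events")
    .start("s3a://example-bucket/delta/events")
)

query.awaitTermination()
```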
Requires a minimum of 2+ years' experience in Hadoop/ML/large dataset handling with DB/DW experience, and advanced experience in Apache Hadoop, Apache Spark, Apache Hive, and Presto, or ETL skills with working experience on databases Oracle/SQL Server/PostgreSQL … Basic understanding of Machine Learning, troubleshooting Kerberos Authentication problems, Java and Python and shell scripting. PREFERRED QUALIFICATIONS - Expert experience in the Hadoop Ecosystem including Apache Spark, Presto, Data Lake architecture and administration - Prior work experience with AWS services - any or all of EMR, Glue, SageMaker and excellent knowledge …
Manchester, North West, United Kingdom Hybrid / WFH Options
Senitor Associates Limited
data platform, presenting knowledge to others Your required skills: Knowledge of AWS or equivalent cloud technologies Knowledge of Serverless technologies, frameworks and best practices Apache Spark (Scala or PySpark) Experience using infrastructure automation tools, such as AWS CloudFormation or Terraform Knowledge of Scala or an OO language, such as … attitude and willingness to help others in the team Experience debugging and dealing with failures on business-critical systems Your desired skills: Exposure to Apache Spark, Apache Trino or another big data processing system Experience with streaming data principles and best practices Understanding of database technologies and standards Exposure …
will be working on complex data problems in a challenging and fun environment, using some of the latest Big Data open-source technologies like Apache Spark, as well as Amazon Web Service technologies including Elastic MapReduce, Athena and Lambda to develop scalable data solutions. Requirements Key Responsibilities: Adhering … positive attitude, willing to help other members of the team. Experience debugging and dealing with failures on business-critical systems. Preferred Attributes: Exposure to Apache Spark, Apache Trino, or another big data processing system. Knowledge of streaming data principles and best practices. Understanding of database technologies and …
Our team values continuous learning, knowledge sharing, and creating inclusive solutions that make a difference. Key Responsibilities Support customers with big data services including Apache Spark, Hive, Presto, and other Hadoop ecosystem components Develop and share technical solutions through various communication channels Contribute to improving support processes and … work week schedule, which may include weekends on rotation. Minimum Requirements Good depth of understanding in Hadoop administration, support and troubleshooting (any two applications: Apache Spark, Apache Hive, Presto, MapReduce, ZooKeeper, HBase, HDFS and Pig) Good understanding of Linux and networking concepts Intermediate programming/scripting …
Responsibilities: Design and develop scalable, high-performance data processing applications using Scala. Build and optimize ETL pipelines for handling large-scale datasets. Work with Apache Spark, Kafka, Flink, and other distributed data frameworks to process massive amounts of data. Develop and maintain data lake and warehouse solutions using … technologies like Databricks Delta Lake, Apache Iceberg, or Apache Hudi. Write clean, maintainable, and well-documented code. Optimize query performance, indexing strategies, and storage formats (JSON, Parquet, Avro, ORC). Implement real-time streaming solutions and event-driven architectures. Collaborate with data scientists, analysts, and DevOps engineers to … Master's degree in Computer Science, Engineering, or a related field. 10+ years of experience in software development with Scala. Hands-on experience with Apache Spark (batch & streaming) using Scala. Experience in developing and maintaining data lakes and warehouses using technologies like Databricks Delta Lake, Apache Iceberg …
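As a rough illustration of the data-lake work described in this listing, the sketch below writes a batch of records to a partitioned Delta table. The role itself is Scala-focused; PySpark is used here only to keep the example short (the DataFrame API is near-identical in Scala), and the paths, columns and filter are assumptions rather than anything specified above.

```python
# Illustrative only: source path, schema and partition column are assumptions.
from pyspark.sql import SparkSession
from pyspark.sql.functions import to_date, col

spark = SparkSession.builder.appName("delta-batch-write-sketch").getOrCreate()

# Extract: read raw Parquet files (could equally be Avro, ORC or JSON).
raw = spark.read.parquet("s3a://example-bucket/raw/trades/")

# Transform: derive a date column to partition on and drop obviously bad rows.
cleaned = (
    raw.withColumn("trade_date", to_date(col("executed_at")))
       .filter(col("notional") > 0)
)

# Load: write to a Delta table partitioned by date, so downstream queries
# filtering on trade_date can prune partitions instead of scanning everything.
(
    cleaned.write
    .format("delta")
    .mode("overwrite")
    .partitionBy("trade_date")
    .save("s3a://example-bucket/lake/trades_delta")
)
```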
platforms, and third-party systems). ETL (extract, transform, and load) data into appropriate data structures and formats. Leverage tools and technologies such as Apache Spark, Hadoop, and cloud-based solutions including data virtualisation and data semantic layers. Implement processes and checks to ensure data accuracy, consistency, and compliance … relational databases (MySQL, PostgreSQL), experience with data warehousing concepts and tools (Snowflake, Redshift), data lakes and data lakehouses. Familiarity with distributed computing frameworks like Apache Hadoop and Apache Spark, and NoSQL databases (such as MongoDB, Cassandra). An understanding of data modelling techniques (relational, dimensional, and NoSQL) and … proficiency in designing efficient and scalable database schemas. Experience with workflow orchestration tools (Apache Airflow, Prefect) and data pipeline frameworks (Apache Kafka, Talend). Familiarity with cloud platforms (AWS, GCP or Azure) and their data services (AWS Glue, GCP Dataflow) for building scalable, cost-effective data solutions. Knowledge …
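Workflow orchestration of the kind mentioned above (Apache Airflow) is typically expressed as a small DAG of dependent tasks. The sketch below assumes Airflow 2.x (older releases use schedule_interval rather than schedule); the DAG id and stubbed callables are placeholders rather than anything specified in the listing.

```python
# Minimal Airflow DAG sketch; task names and callables are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract() -> None:
    """Pull data from a source system (stubbed)."""
    print("extracting...")


def transform() -> None:
    """Clean and reshape the extracted data (stubbed)."""
    print("transforming...")


def load() -> None:
    """Write the result to the warehouse (stubbed)."""
    print("loading...")


with DAG(
    dag_id="example_daily_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    load_task = PythonOperator(task_id="load", python_callable=load)

    # Run the three steps strictly in sequence.
    extract_task >> transform_task >> load_task
```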
for experimentation, analytics event logging, batch data platform, real-time infrastructure, and analytics engineering. Technical Oversight: Oversee and optimise our data infrastructure, including Databricks (Spark/Spark SQL), Kafka/Avro, and dbt-driven data pipelines. Drive initiatives to improve data quality, usability, and the self-service experience … Experience in data engineering, within a leadership or managerial role overseeing multiple teams or squads. Strong expertise in data infrastructure, including experience with Databricks, Apache Spark, Spark SQL, Kafka, Avro, and dbt. Demonstrated success in managing data platforms at scale, including both batch processing and real-time …
London. Hybrid - Contract. Inside IR35 - Umbrella Key Responsibilities: Develop, maintain, and optimize PySpark data processing pipelines in a fast-paced investment banking environment; DataFrames, Spark Streaming, Python and Spark skills are essential, with PySpark experience vital, plus CI/CD (Jenkins, Git). Collaborate with cross-functional teams, including data engineers and analysts, to implement data-driven … solutions tailored for investment banking needs. Leverage PySpark and Apache Spark to efficiently handle large datasets and improve processing efficiency. Qualifications: Proven experience in data processing and automation. Strong proficiency in PySpark and Apache Spark for data pipeline development. Expertise in CI/CD pipelines, Jenkins, Git. Excellent …
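As a flavour of the PySpark DataFrame work this contract describes, a simple batch reporting extract might look like the following sketch; the input path and columns (desk, trade_date, notional) are invented for illustration.

```python
# Hedged sketch of a batch reporting extract; schema and paths are assumptions.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("pyspark-aggregation-sketch").getOrCreate()

trades = spark.read.parquet("s3a://example-bucket/curated/trades/")

# Daily gross notional and trade count per desk, ordered for a simple report.
daily_totals = (
    trades.groupBy("desk", "trade_date")
          .agg(
              F.sum("notional").alias("gross_notional"),
              F.count("*").alias("trade_count"),
          )
          .orderBy("trade_date", "desk")
)

daily_totals.write.mode("overwrite").parquet("s3a://example-bucket/reports/daily_totals/")
```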
learning libraries in one or more programming languages. Keen interest in some of the following areas: Big Data Analytics (e.g. Google BigQuery/Bigtable, Apache Spark), Parallel Computing (e.g. Apache Spark, Kubernetes, Databricks), Cloud Engineering (AWS, GCP, Azure), Spatial Query Optimisation, Data Storytelling with (Jupyter) Notebooks …
Working knowledge of two or more common Cloud ecosystems (AWS, Azure, GCP) with expertise in at least one. Deep experience with distributed computing with Apache Spark and knowledge of Spark runtime internals. Familiarity with CI/CD for production deployments. Working knowledge of MLOps. Design and deployment … data, analytics, and AI. Databricks is headquartered in San Francisco, with offices around the globe and was founded by the original creators of Lakehouse, Apache Spark, Delta Lake, and MLflow. Benefits At Databricks, we strive to provide comprehensive benefits and perks that meet the needs of all of …
NoSQL databases and cloud data services (AWS) to manage and process large datasets. Optimize data warehousing, modeling, and indexing for performance and scalability. Leverage Apache Spark, Airflow, Kafka, or similar technologies to manage and automate workflows. Data Security & Quality Control Ensure data security, compliance, and integrity, implementing best … SQL/NoSQL databases and cloud data platforms (AWS) Understanding of data modelling, data warehousing, and database optimisation. Experience with distributed data processing tools (Apache Spark, Airflow, Kafka, or similar). Proactive approach to identifying and solving data quality issues. Strong project management skills, coordinating with cross-functional …
Product Specialist - Lakeflow Pipelines & Connect (Sr. Specialist Solutions Architect): you will be a deep technical expert in Lakeflow Pipelines (aka Delta Live Tables and Spark Structured Streaming) and Lakeflow Connect (data ingestion) and in how customers can be successful with developing these use-cases in the Lakehouse paradigm. You … Certification and/or demonstrated competence in the Azure, AWS or GCP ecosystem. Demonstrated competence in the Lakehouse architecture including hands-on experience with Apache Spark, Python and SQL. Excellent communication skills, both written and verbal. Experience in pre-sales selling highly desired. About Databricks Databricks is the … data, analytics and AI. Databricks is headquartered in San Francisco, with offices around the globe and was founded by the original creators of Lakehouse, Apache Spark, Delta Lake and MLflow. To learn more, follow Databricks on Twitter, LinkedIn and Facebook. Benefits At Databricks, we strive to provide comprehensive …
Manchester, North West, United Kingdom Hybrid / WFH Options
INFUSED SOLUTIONS LIMITED
a Senior Data & Integration Architect to drive scalable, high-performance data solutions, connecting legacy systems with modern cloud architectures. You'll work with Apache Spark, Databricks, Kafka, Airflow, and Azure, leading data strategy, automation, and integration across the organisation. Your Role Develop and execute a data architecture … and drive innovation. What You Bring Proven expertise in data architecture, ETL, and cloud-based solutions. Strong skills in SQL Server, Azure, Databricks, Apache Spark, Kafka, and Python. Deep understanding of data security, governance, and compliance frameworks. Experience with CI/CD pipelines, Agile methodologies …
Java Spark Engineer. Tier 1 Investment Bank. 10-month contract. Up to £500pd, inside IR35 via umbrella. 3x per week in Canary Wharf. IMMEDIATE STARTERS ONLY. We are currently seeking an experienced Java Spark Developer with strong expertise in big data processing, Core Java, and Apache Spark … and requires candidates with a solid background in financial systems, market risk, and large-scale distributed computing. Key Responsibilities: Develop and optimize scalable Java Spark-based data pipelines for processing and analyzing large-scale financial data. Design and implement distributed computing solutions for risk modeling, pricing, and regulatory compliance. … Ensure efficient data storage and retrieval using Big Data technologies. Implement best practices for Spark performance tuning (partitioning, caching, memory management). Work with batch processing frameworks for market risk analytics. Maintain high code quality through testing, CI/CD pipelines, and version control (Git, Jenkins). Key Requirements …
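The Spark performance tuning practices listed above (partitioning, caching, memory management) translate into code roughly as in this sketch. It is shown in PySpark for brevity even though the role is Core Java, and the table names and the chosen partition count are assumptions rather than recommendations.

```python
# Illustrative tuning sketch; names, sizes and partition counts are assumptions.
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = (
    SparkSession.builder
    .appName("spark-tuning-sketch")
    # Size shuffle parallelism to the data volume rather than the default of 200.
    .config("spark.sql.shuffle.partitions", "400")
    .getOrCreate()
)

trades = spark.read.parquet("s3a://example-bucket/trades/")            # large fact table
instruments = spark.read.parquet("s3a://example-bucket/instruments/")  # small dimension

# Partitioning: repartition on the join key so related rows are co-located
# before an expensive wide operation.
trades = trades.repartition(400, "instrument_id")

# Broadcast join: ship the small dimension to every executor and avoid
# shuffling the large side entirely.
enriched = trades.join(broadcast(instruments), "instrument_id")

# Caching: persist a DataFrame that several downstream aggregations reuse,
# so it is computed once instead of once per action.
enriched.cache()

by_desk = enriched.groupBy("desk").sum("notional")
by_day = enriched.groupBy("trade_date").sum("notional")

by_desk.show()
by_day.show()

# Memory management: release executor memory once the cached data is no longer needed.
enriched.unpersist()
```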
Systems: Strong understanding of distributed systems, microservices architectures, and the challenges of building high-throughput, low-latency systems. Hands-on experience with tools like Apache Kafka, RabbitMQ, Apache Pulsar, and other messaging systems for real-time data streaming. DevOps and Infrastructure Automation: Expertise in DevOps principles, infrastructure-as … maintaining, and optimizing CI/CD pipelines. Big Data & Data Engineering: Strong background in processing large datasets and building data pipelines using platforms like Apache Spark, Databricks, Apache Flink, or similar big data tools. Experience with batch and stream processing. Security: In-depth knowledge of security practices …
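For the messaging-systems experience mentioned above, publishing an event to Kafka from Python might look like the sketch below; it uses the kafka-python client as one possible choice, and the broker address, topic and payload are placeholders.

```python
# Tiny messaging sketch using kafka-python; broker, topic and payload are invented.
import json

from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="broker:9092",
    # Serialize dicts to JSON bytes on the way out.
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

# send() is asynchronous; flush() blocks until outstanding messages are delivered.
producer.send("user-events", value={"user_id": 42, "action": "login"})
producer.flush()
producer.close()
```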
Birmingham, West Midlands, West Midlands (County), United Kingdom
Spectrum IT Recruitment
database design, OLTP, OLAP, and Data Warehousing concepts Experience with MS SQL Server and/or PostgreSQL Desirable: exposure to NoSQL, big data technologies (Apache Spark, Apache Iceberg), and Power BI Knowledge of information security, data security, and data governance If you're passionate about data, problem …
to join our team. As an ETL Developer, you will be responsible for the design, development, and implementation of data processing pipelines using Python, Spark, and other related technologies to handle large-scale data efficiently. You will also be involved in ensuring the integration of data into cloud environments … as Azure, alongside basic DevOps tasks and RDBMS fundamentals. Responsibilities: Develop and maintain ETL pipelines using Python for data extraction, transformation, and loading. Utilize Apache Spark for big data processing to handle large datasets and optimize performance. Work with cloud technologies, particularly Azure, to deploy and integrate data … hands-on experience in key libraries (Pandas, NumPy, etc.) and a deep understanding of Python programming concepts. Solid experience in Big Data Processing using Apache Spark for large-scale data handling. Basic DevOps knowledge and familiarity with CI/CD pipelines for automating workflows. Understanding of Azure Fundamentals …
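A minimal Python ETL step of the kind this role describes, using pandas, might look like the sketch below; the file names, column names and cleaning rules are invented, and writing Parquet assumes pyarrow (or fastparquet) is installed.

```python
# Minimal pandas ETL step; file and column names are illustrative assumptions.
import pandas as pd

# Extract: read a raw CSV export.
raw = pd.read_csv("raw_orders.csv")

# Transform: normalise column names, parse dates and drop rows missing key fields.
raw.columns = [c.strip().lower().replace(" ", "_") for c in raw.columns]
raw["order_date"] = pd.to_datetime(raw["order_date"], errors="coerce")
clean = raw.dropna(subset=["order_id", "order_date"])

# Load: write a tidy Parquet file that a downstream Spark job or Azure
# pipeline can pick up.
clean.to_parquet("orders_curated.parquet", index=False)
```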
Java Spark Developer (Contract to Perm) Location: Canary Wharf, London - 3 days onsite Contract Type: Contract to Perm (inside IR35 via umbrella) Are you a skilled Java Spark Developer with a passion for big data processing? Our client, a leading player in the finance domain, is looking for … their team in Canary Wharf, London. This is an exciting opportunity to work in a dynamic environment where your expertise in Core Java and Apache Spark will make a significant impact. Key Responsibilities: Develop and optimise scalable Java Spark-based data pipelines for processing and analysing large … distributed computing solutions for risk modelling, pricing, and regulatory compliance. Ensure efficient data storage and retrieval using Big Data technologies. Implement best practices for Spark performance tuning, including partitioning, caching, and memory management. Maintain high code quality through testing, CI/CD pipelines, and version control (Git, Jenkins).