stakeholders. Team Player: Ability to work effectively in a collaborative team environment, as well as independently. Preferred Qualifications: Experience with big data technologies (e.g., Hadoop, Spark, Kafka). Experience with AWS and its data services (e.g., S3, Athena, AWS Glue). Familiarity with data warehousing solutions (e.g., Redshift, BigQuery more »
and development and optimization of ETL pipelines. Demonstrated understanding and experience using: Data lake software and Big Data tools (i.e., Kafka, Spark, Databricks and Hadoop); relational NoSQL and SQL databases, including Postgres and Graph; AWS cloud services (i.e., AWS DMS, AWS Lambda, AWS SQS, AWS Step Functions, AWS EventBridge more »
Redwood City, California, United States Hybrid / WFH Options
Karius
Stitch, Airbyte) for integrating internal and third-party data sources - Batch and Stream Processing: Experience in building scalable infrastructure for batch processing (e.g., Spark, Hadoop) and stream processing (e.g., Kafka, Kinesis) for large volumes of data • Developer Toolset: Proficiency in programming languages for data engineering (i.e., Python and SQL more »
Familiarity with event-driven architectures and messaging systems (Kafka, RabbitMQ). Experience with feature stores and model registries. Familiarity with big data technologies (Spark, Hadoop). Knowledge of monitoring and logging tools for machine learning models (Prometheus, Grafana, ELK stack). Significant experience with petabyte scale data sets. Significant experience more »
. Working knowledge of cloud computing platforms (e.g., AWS, Azure, or Google Cloud, etc.). Familiarity with big data technologies (e.g., Elasticsearch, Apache Hadoop, Spark, Kafka, etc.). Experience deploying ML models in production environments using containerization technologies (e.g., Docker, Kubernetes). Publications or contributions to ML-related more »
implementing cloud-based data solutions using AWS services such as EC2, S3, EKS, Lambda, API Gateway, Glue and big data tools like Spark, EMR, Hadoop, etc. Hands-on experience in data profiling, data modeling and data engineering using relational databases like Snowflake, Oracle, SQL Server; ETL tools like Informatica more »
computational modelling and deeply appreciate the challenges. You have written RESTful APIs and/or web apps. You have implemented "Big Data" processing setups (e.g., Hadoop/Spark ecosystem, Databricks, Cassandra, etc.). You can code to an advanced level in Python. You are competent at coding in VBA. You have more »
Familiarity with event-driven architectures and messaging systems (Kafka, RabbitMQ). Experience with feature stores and model registries. Familiarity with big data technologies (Spark, Hadoop). Knowledge of monitoring and logging tools for machine learning models (Prometheus, Grafana, ELK stack). Significant experience with petabyte scale data sets. Significant more »
or Tableau. Experience with real-time analytics and event-driven architectures using tools such as Apache Kafka. Background in big data technologies such as Hadoop, HBase, or Cassandra. more »
required Desired Qualifications: Experience with .NET environments; Hands-on experience with data formats including XML, PCAP, images, and media; Hands-on experience working with Hadoop, Hive, Pig, MapReduce, Spark, RabbitMQ, Kafka, Flume; DevOps experience building and deploying cloud infrastructure with technologies like Ansible, Chef, Puppet, etc. Experience more »
face of ambiguity; Demonstrated understanding of high-scale cloud architecture; Good understanding of scalable distributed computing systems, software architecture, data structures and algorithms using Hadoop, Apache Spark, Apache Storm, etc.; Proficient in network, distributed, asynchronous and concurrent programming. Components of our system that are helpful to be familiar with more »
Basic knowledge of HTML and CSS for web interface integration. Experience with FastAPI or similar frameworks for building APIs. Understanding of Big Data technologies (e.g., Hadoop, Spark, Databricks) for processing large datasets. Basic knowledge of machine learning principles and tools to support data-driven decision-making. Experience with Apache Airflow more »
Washington, Washington DC, United States Hybrid / WFH Options
Expression
degree is preferred. Technical Skills: Proficiency in cloud platforms such as AWS, Azure, or GCP. Hands-on experience with big data tools (e.g., Apache Spark, Hadoop, Kafka). Strong programming and scripting skills in Python, Java, or equivalent languages. Understanding of containerization (Docker, Kubernetes) and CI/CD pipelines. Clearance Requirements more »
Azure Storage, Azure SQL DB, HDInsight, Azure Databricks, Cosmos DB, ML Studio, Azure Functions, and more. Work with data integration tools such as Kafka, Hadoop, Hive, Spark, and others to manage data ingestion, processing, and curation. Contribute to cloud migration processes using tools like Azure Data Factory and Event more »
Annapolis, Maryland, United States Hybrid / WFH Options
Expression
SQL, NoSQL, Graph, etc.) and data architecture (Data Lake, Lakehouse). Knowledgeable in machine learning/AI methodologies. Experience with one or more SQL-on-Hadoop technologies (Spark SQL, Hive, Impala, Presto, etc.). Experience in short release cycles and the full software lifecycle. Experience with Agile development methodology (e.g., Scrum more »
AND QUALIFICATIONS Comfortable working with Linux systems on a daily basis. Deployment technologies such as: Docker, Kubernetes, Knative, Helm, Rancher. Cloud technologies such as: Hadoop, Kafka, HBase, Accumulo. CI/CD pipelines and tooling (GitLab CI/CD, ArgoCD, CircleCI, Jenkins). AWS infrastructure and tooling. Geospatial data and analytics. more »
combination of education and experience REQUIRED QUALIFICATIONS Familiarity with Large Language Model (LLM) architectures and training procedures; Big Data frameworks such as Spark or Hadoop; Data Science frameworks such as Keras, TensorFlow, or Theano; Experience with GPU processing; Proficient in Java and Python; Must know how to use an IDE more »
support the successful ingestion, cleansing, transformation, loading, and display of significant amounts of data, while maintaining and improving the Client's data lake services, Hadoop environment, and Hadoop services. Responsibilities: Designing and implementing large-scale ingest systems in a Big Data environment. Optimizing all stages of the data … sources. Working with Sponsor development teams to improve system and application performance. Providing support to maintain, optimize, troubleshoot, and configure data lake services, the Hadoop environment, and Hadoop services as needed. Organizing and maintaining documentation so others are able to understand and use it. Collaborating with teammates, other … Experience with service-oriented design and architectures. Strong ability to manage competing priorities and communicate with multiple stakeholders. Preferred Qualifications: Hands-on experience with Hadoop and Hadoop services, particularly with large Hadoop clusters. Experience with NiFi flows and deployments. Company EEO Statement Accessibility/Accommodation: If because more »
building and optimising data pipelines and distributed data systems. - Strong expertise in cloud platforms (AWS, GCP, or Azure) and modern data technologies (Spark, Kafka, Hadoop, or similar). - Proficiency in programming languages such as Python, Scala, or Java. - Experience working on AI/ML-driven platforms, with knowledge of more »
Bristol, Avon, South West, United Kingdom Hybrid / WFH Options
Bowerford Associates
optimal data extraction, transformation and loading using leading cloud technologies including Azure and AWS. Leverage Big Data Technologies - you will utilise tools such as Hadoop, Spark and Kafka to design and manage large-scale data processing systems. Extend and maintain the data warehouse - you will be supporting and enhancing more »
similar. Experience with machine learning frameworks (e.g., TensorFlow, Scikit-learn). Strong knowledge of SQL and database management. Familiarity with big data technologies (e.g., Hadoop, Spark) and cloud platforms (e.g., AWS, Azure). Soft Skills: Excellent problem-solving and analytical skills. Strong communication and interpersonal skills. Ability to work more »
Java, Scala, Python, and Golang. Supporting experience to execute against database technologies such as PostgreSQL. Supporting experience to execute against cloud technologies such as Hadoop, Kafka, HBase, Accumulo. Experienced with full software lifecycle development. Preferred Skills and Qualifications: CI/CD pipelines and tooling (GitLab CI/CD, ArgoCD more »