Data Engineer
Join our R&D data team as a Data Engineer: build robust pipelines with AWS, Snowflake, and Python, and empower research with clean, scalable data tooling.
About the Role
We are looking for a skilled Data Engineer to join our growing R&D team within Data Science. Reporting to the Head of AI & DataOps, you will collaborate with the engineering team to improve the robustness and scalability of our data infrastructure. You will also design and build data ingestion pipelines that reliably bring new datasets into our platform using AWS cloud services and Snowflake, and support enhancements to our databases.
This is a hands-on technical role suited for someone who is passionate about writing excellent Python code, working with data, and building internal tooling to empower research and development efforts.
Key Responsibilities
· Build and maintain data ingestion pipelines using AWS services (S3, Lambda, SQS, Step Functions).
· Design and implement ETL/ELT workflows using Snowflake for data warehousing, transformation, and analytics.
· Manage and monitor cloud infrastructure for data pipelines (S3 buckets, IAM policies, CloudWatch alerting).
· Build and support event-driven architectures (SQS triggers, S3 notifications, webhook integrations).
· Define and maintain infrastructure as code (Terraform or CloudFormation).
· Support the broader data science team by improving tooling, workflows, and data infrastructure.
· Contribute to DevOps-related initiatives such as deployment automation, environment management, CI/CD pipelines, and monitoring.
· Maintain high standards of code quality through testing, documentation, and code reviews.
What We're Looking For
· 2+ years of professional experience in software engineering, data engineering, or a related field.
· Hands-on experience with AWS core services (S3, Lambda, SQS, IAM, CloudWatch).
· Strong proficiency in Python, with a focus on building clean, reusable, and scalable code.
· Experience building data pipelines or ingestion systems that handle structured and semi-structured data (JSON, Parquet, CSV).
· Familiarity with a cloud data warehouse (Snowflake, Redshift, or BigQuery).
· Familiarity with data science concepts and workflows (you don't need to be a full-fledged data scientist).
· Exposure to DevOps practices and tools (e.g., Docker, CI/CD pipelines).
· A collaborative mindset and strong communication skills.
· A proactive, ownership-oriented attitude toward problem-solving.
Nice to Have
· Hands-on experience with SQL and database management (e.g., PostgreSQL, MySQL, or similar).
· Experience developing and optimising database structures (SQL), including triggers, stored procedures, and scheduled events, to ensure data reliability and operational automation.
· Experience with Snowflake (stages, pipes, tasks, streams) for automated data ingestion.
· Knowledge of data quality and observability practices (schema validation, data contracts, monitoring).
· Experience with AWS Batch, Apache Airflow, Prefect, or similar orchestration tools.
· Experience working with ORMs like SQLAlchemy.
· Experience in building internal Python packages or CLI tools.
· Knowledge of performance optimisation for large datasets.
· Familiarity with security best practices in code and data workflows.
Department: Technology
Locations: Indore, London, Manchester, Dubai, Remote
Remote status: Fully Remote
About Oxford DataPlan
The Home of Alternative Data. We use alternative data and data science to deliver near real-time estimates of revenue and other key performance indicators for 200+ publicly listed companies globally — updating daily.