Cloud SRE Engineer III

Job Locations: US
Req ID: 2025-8013
Category: Information Technology
Type: Full-Time Regular
Security Access Level: Access 1: US Citizenship Only (No Dual) / CFIUS Approval / Sole US Citizen (DMV & FBI Programs)
Work Schedule: Core Business Hours

Overview

IDEMIA is the global leader in identity and security. Our mission is to create a safe and simple future where identity verification is indisputable, and only you can assert your identity. We are a distributed company leveraging the latest technologies to deliver world-class products in the private and public sectors of finance, telecom, identity, security, retail, sports entertainment, commercial, government, and IoT. We use a variety of technologies and approaches to deliver quality products and services to government agencies and technology companies. IDEMIA is made up of a group of 14,000 diverse people from different nationalities, speaking over 20 different languages. Together, our solutions impact the everyday lives of citizens and nations. In this ever-changing world, protecting your identity is paramount. Join the team that is ensuring one person - one identity.

Responsibilities

· Ensure high availability and reliability of cloud infrastructure through proactive monitoring and incident response

· Implement and maintain self-healing architecture to minimize service disruptions

· Design and implement automation for routine operational tasks and incident remediation

· Configure and optimize monitoring, logging, and alerting solutions using tools such as CloudWatch

· Analyze system performance metrics and implement improvements to meet SLAs and performance targets

· Conduct capacity planning and resource optimization to ensure scalability during peak loads

· Implement and maintain security controls using AWS Organizations, Control Tower, and AWS Config

· Participate in the on-call rotation for production support, with a focus on rapid incident resolution

· Perform post-incident reviews and implement preventative measures

· Collaborate with development teams to improve application reliability and performance

· Implement FinOps practices to optimize cloud costs while maintaining operational excellence

· Create and maintain runbooks and technical documentation for operational procedures

Qualifications

Required Skills:

· Strong experience with AWS services including:

· CloudWatch and CloudTrail for monitoring and audit

· Amazon VPCs

· IAM and AWS Organizations for security and access management

· EKS/Kubernetes for container orchestration

· Lambda and Aurora Serverless

· EC2/Auto Scaling for compute management

· Experience with monitoring tools and observability platforms

· Proficiency in Infrastructure as Code using Terraform

· Strong scripting skills in Python, Bash, or PowerShell for automation

· Experience with incident management and on-call responsibilities

· Knowledge of logging and monitoring solutions (ELK Stack, Splunk)

· Understanding of security best practices and compliance requirements

· Experience with cloud cost optimization and FinOps practices

Desired Skills:

· Experience with the Go programming language

· Knowledge of AWS Step Functions and EventBridge

· Experience with multi-account AWS architecture

· Experience with AWS GovCloud

· Knowledge of microservices architecture and distributed systems troubleshooting

· Experience with government cloud compliance requirements

· AWS certifications (Professional or Specialty level)

· Experience with chaos engineering and resilience testing

Required Education:

Bachelor's Degree in Computer Science, Engineering, or equivalent experience

Required Years of Experience:

6+ years of relevant experience in cloud operations and reliability engineering, with at least 3 years focused on AWS
