IDEMIA is the global leader in identity and security. Our mission is to create a safe and simple future where identity verification is indisputable, and only you can assert your identity. We are a distributed company leveraging the latest technologies to deliver world-class products in the private and public sectors of finance, telecom, identity, security, retail, sports entertainment, commercial, government, and IoT. We use a variety of technologies and approaches to deliver quality products and services to government agencies and technology companies. IDEMIA is made up of a group of 14,000 diverse people of different nationalities, speaking over 20 different languages. Together, our solutions impact the everyday lives of citizens and nations. In this ever-changing world, protecting your identity is paramount. Join the team that is ensuring one person - one identity.
· Ensure high availability and reliability of cloud infrastructure through proactive monitoring and incident response
· Implement and maintain self-healing architecture to minimize service disruptions
· Design and implement automation for routine operational tasks and incident remediation
· Configure and optimize monitoring, logging, and alerting solutions using CloudWatch and similar tools
· Analyze system performance metrics and implement improvements to meet SLAs and performance targets
· Conduct capacity planning and resource optimization to ensure scalability during peak loads
· Implement and maintain security controls using AWS Organizations, Control Tower, and AWS Config
· Participate in on-call rotation for production support with focus on rapid incident resolution
· Perform post-incident reviews and implement preventative measures
· Collaborate with development teams to improve application reliability and performance
· Implement FinOps practices to optimize cloud costs while maintaining operational excellence
· Create and maintain runbooks and technical documentation for operational procedures
· Strong experience with AWS services including:
· CloudWatch and CloudTrail for monitoring and audit
· Amazon VPCs
· IAM and AWS Organizations for security and access management
· EKS/Kubernetes for container orchestration
· Lambda and Aurora Serverless
· EC2/Auto Scaling for compute management
· Experience with monitoring tools and observability platforms
· Proficiency in Infrastructure as Code using Terraform
· Strong scripting skills in Python, Bash, or PowerShell for automation
· Experience with incident management and on-call responsibilities
· Knowledge of logging and monitoring solutions (ELK Stack, Splunk)
· Understanding of security best practices and compliance requirements
· Experience with cloud cost optimization and FinOps practices
· Experience with the Go programming language
· Knowledge of AWS Step Functions and EventBridge
· Experience with multi-account AWS architecture
· Experience with AWS GovCloud
· Knowledge of microservices architecture and distributed systems troubleshooting
· Experience with government cloud compliance requirements
· AWS certifications (Professional or Specialty level)
· Experience with chaos engineering and resilience testing
Bachelor's Degree in Computer Science, Engineering, or equivalent experience
6+ years of relevant experience in cloud operations and reliability engineering, with at least 3 years focused on AWS