Job Requirements
Technical Skills
AWS Expertise:
Understanding of AWS security best practices
Familiarity with the AWS Well-Architected Framework
Strong proficiency with core AWS services (EC2, S3, RDS, VPC, IAM)
Experience with container services (ECS, EKS, ECR)
Knowledge of AWS monitoring and logging (CloudWatch, CloudTrail)
Experience with AWS CLI and SDKs
SRE & DevOps Tools:
Infrastructure as Code: Terraform, CloudFormation, or AWS CDK
Containerization: Docker, Kubernetes, Helm
Configuration management: Ansible, Chef, or Puppet
Version control: Git, GitHub/GitLab
Scripting languages: Python, Bash, or Go
CI/CD tools: Jenkins, GitLab CI, GitHub Actions
Monitoring & Observability:
APM tools: New Relic, Datadog, or AppDynamics
Distributed tracing: Jaeger, Zipkin, or AWS X-Ray
Alert management: PagerDuty, Opsgenie, or similar
Log management: ELK Stack, Splunk, or CloudWatch Logs
Metrics platforms: Prometheus, Grafana, or similar
Technical Fundamentals:
Understanding of distributed systems and microservices
Experience with performance tuning and optimization
Strong Linux/Unix system administration skills
Networking concepts: TCP/IP, DNS, Load Balancing, CDN
Knowledge of security principles and best practices
Database administration: PostgreSQL, MySQL, Redis
Soft Skills:
Excellent written and verbal communication skills in both English and Vietnamese
Detail-oriented with strong documentation skills
Team player with a collaborative mindset
Continuous learning mindset for new technologies
Strong problem-solving and troubleshooting abilities
Ability to work effectively under pressure during incidents
Proactive approach to identifying and solving problems
Experience
Proven track record of improving system reliability and uptime
Experience with 24/7 on-call responsibilities and incident management
2+ years of hands-on AWS experience in production environments
Experience maintaining high-traffic, high-availability systems
2-5 years of experience in DevOps, SRE, or Infrastructure Engineering
Preferred Qualifications
Experience with chaos engineering and failure injection
AWS certifications (SysOps Administrator, DevOps Engineer, or Solutions Architect)
Contributions to open-source DevOps/SRE projects
Experience with AI/ML infrastructure and GPU workloads
Knowledge of SRE practices from Google's SRE book
Experience with GCP and cloud migration projects
Experience with FinOps and cloud cost optimization
Knowledge of compliance frameworks (SOC2, ISO 27001)
Familiarity with automotive industry or vehicle inspection systems
Experience with serverless architectures (Lambda, API Gateway)