Overview

Salary: Negotiable

Job type: Full-time

Experience: 2 years

Openings: 1

Application deadline: 2025-12-05

Posted: 2025-11-07 22:04

Category: Information Technology

Job Description

  • Manage and improve system reliability through SLO, SLI, and SLA practices.
  • Design and implement observability systems (metrics, logs, tracing, alerting) using tools like Prometheus, Grafana, ELK, etc.
  • Build and automate CI/CD pipelines and Infrastructure as Code (IaC) using tools such as Terraform, Ansible, Pulumi, Helm.
  • Collaborate in the analysis, design, and deployment of systems and processes to ensure reliability, observability, and scalability.
  • Optimize system cost, performance (latency, throughput), and security.
  • Operate and optimize Kubernetes clusters (EKS); strong knowledge of Docker, Kubernetes, and Helm is required.
  • Develop internal tools to automate workflows and support other teams.
  • Participate in incident response, root cause analysis, and postmortem reviews, and improve incident handling processes.
  • Support and coordinate with NOC (Network Operations Center) teams.
  • Be part of the on-call rotation when needed.

Requirements

  • 2–5 years of experience in SRE / DevOps / Platform Engineering.
  • Hands-on experience with monitoring and alerting systems (Prometheus, Grafana, ELK, Loki, etc.).
  • Proficient in CI/CD tools (GitLab CI, Jenkins) and familiar with Git workflows.
  • Experience in deploying and managing Kubernetes (EKS is a plus).
  • Understanding of gRPC and the ability to optimize nginx connections and network stacks.
  • Strong Linux background with deep knowledge of kernel, network stack, file system, and processes.
  • Excellent troubleshooting skills — able to analyze issues from OS to application layer.
  • System-thinking mindset, focus on automation, and ability to mentor teammates.
  • Proactive, responsible, and able to work under pressure during incident response.

Nice to Have

  • Experience with AWS (EKS, EC2, RDS, CloudWatch).
  • Strong understanding of networking concepts (TCP/IP, DNS, Load Balancing, CDN).
  • Experience with high availability and distributed systems.
  • Previously built a complete observability stack.
  • Experience in building or optimizing Golang SDKs or internal frameworks.
  • Knowledge of cloud-native networking (CNI, overlay, BGP, eBPF-based load balancing).

Benefits

You'll find this place irresistible

Enjoy top-tier compensation, including:

  • Monthly NET take-home pay that leaves you smiling
  • 13th-month salary
  • Performance bonuses that could boost your income by up to 2 months' salary
  • 24 remote working days per year
  • 12 days of annual paid leave
  • Flexible working time, from Monday to Friday; weekends are yours
  • Company trips and team bonding activities
  • Elevate your creativity and productivity in our modern workspace

Especially:

  • Shine like a rock star in our fast-growing global B2B SaaS squad
  • Blaze a trail to success with our super-fast career track
  • Collaborate with the brightest and coolest minds from across the globe
  • Be yourself, knowing you're valued and groomed to be your absolute best
