Talent.com
DevOps Engineer (Remote)

Freelancing • Seberang Perai, Penang, Malaysia
30+ days ago
Job description

Cloud / DevOps Engineering. Proficient in scripting (e.g., Python, Bash, or PowerShell); Go / Rust is a plus. Strong expertise in Terraform, Terragrunt, Helm, Kubernetes, and Docker.

About Company

Groundup.ai is a Singapore-based AI startup that helps companies reduce unplanned downtime of industrial assets without a steep learning curve or high-risk on-site deployments.

Job Description

  • Architect and manage scalable, secure infrastructure on GCP, Azure, and occasionally OCI / AWS.
  • Implement and manage Infrastructure as Code (IaC), primarily using Terraform and occasionally Terragrunt and Helm.
  • Design and optimize CI/CD workflows using GitHub Actions, Jenkins, and GitHub Enterprise (reusable workflows, OIDC federation).
  • Ensure seamless deployment pipelines from code commit to production for microservices and AI workloads.
  • Manage Docker containers and images using tools such as Portainer.
  • Support canary releases, blue-green deployments, and auto-scaling strategies.
  • Implement and manage serverless deployments on Google Cloud Platform (Cloud Functions, Cloud Run).

Resource Planning & Hardware Estimation

  • Assist in hardware estimation for both on-premise and cloud environments, based on resource requirements such as the number of sensors and storage needs.
  • Ensure robust backup strategies and data redundancy for all infrastructure.
  • Assist the team in auditing the on-cloud and on-premises resources.

Security & Compliance

  • Enforce cloud security best practices: image hardening, secret management, IAM least privilege, SBOMs, and vulnerability scanning.
  • Collaborate on compliance requirements (SOC 2, ISO 27001), and respond to audits and incidents proactively.
  • Configure and manage Cloudflare for enhanced security and performance.
  • Build and maintain observability stacks using Grafana, Prometheus, Loki, Tempo, Datadog, OpenTelemetry, and Sentry.
  • Diagnose and resolve performance bottlenecks across compute, storage, and networking layers.
  • Monitor and optimize cloud spending to ensure cost-efficiency.
  • Develop and implement disaster recovery plans, conducting regular drills to ensure business continuity.
  • Partner with engineers to embed DevOps best practices.
  • Establish and enforce documentation standards for infrastructure, processes, and troubleshooting guides.
  • Use Plane for sprint planning, incident tracking, and delivery visibility.