Neurons Lab Kuala Lumpur, Federal Territory of Kuala Lumpur, Malaysia
AI Cloud Solution Architect & Engineer
Join Neurons Lab as an AI Cloud Solution Architect & Engineer – a unique hybrid role combining strategic solution design with hands‑on engineering execution.
Location: Kuala Lumpur, Federal Territory of Kuala Lumpur, Malaysia
Duration: Part‑time long‑term engagement with project‑based allocations
Reporting: Direct report to Head of Cloud
About the Project
We specialize in serving Banking, Financial Services, and Insurance (BFSI) enterprise customers with stringent compliance, security, and regulatory requirements. This role is perfect for technical professionals who love both the “what” and the “how” – architecting elegant solutions and rolling up their sleeves to code, deploy, and optimize them.
Objective
- Architecture & Design: Gather requirements, design cloud architectures, calculate ROI, and create technical proposals for AI/ML solutions
- Engineering Excellence: Build production‑grade infrastructure using IaC, develop APIs and prototypes, implement CI/CD pipelines, and manage AI workload operations
- Client Success: Transform business requirements into working solutions that are secure, scalable, cost‑effective, and aligned with AWS best practices
- Knowledge Transfer: Create reusable artifacts, comprehensive documentation, and architectural patterns that accelerate future project delivery
KPI
Architecture & Pre‑Sales
- Design and document 3+ solution architectures per month with comprehensive diagrams and specifications
- Achieve 80%+ client acceptance rate on proposed architectures and estimates
- Deliver ROI calculations and cost models within 2 business days of request
Engineering Delivery
- Deploy infrastructure through IaC (AWS CDK/Terraform) with zero manual configuration
- Create at least 3 reusable IaC components or architectural patterns per quarter
- Implement CI/CD pipelines for all projects with automated testing and deployment
- Maintain 95%+ uptime for production AI/ML inference endpoints
- Document architecture and implementation details weekly for knowledge sharing
Quality & Best Practices
- Ensure all solutions pass AWS Well‑Architected Review standards
- Deliver comprehensive documentation within 1 week of architecture completion
- Create simplified UIs/demos for PoC validation and client presentations
Areas of Responsibility
Solution Architecture (40%)
- Elicit and document business and technical requirements from clients
- Design end‑to‑end cloud architectures for AI/ML solutions (training, inference, data pipelines)
- Create architecture diagrams, technical specifications, and implementation roadmaps
- Evaluate technology options and recommend optimal AWS services for specific use cases
Business Analysis
- Calculate ROI, TCO, and cost‑benefit analysis for proposed solutions
- Estimate project scope, timelines, team composition, and resource requirements
- Participate in presales activities: technical presentations, demos, and proposal support
- Collaborate with sales team on SOW creation and customer workshops
Strategic Planning
- Design for scalability, security, compliance, and cost optimization from day one
- Create reusable architectural patterns and reference architectures
- Stay current with AWS AI/ML services and emerging cloud technologies
Cloud Engineering & AI Infrastructure (60%)
Infrastructure as Code Development
- Build and maintain cloud infrastructure using AWS CDK (primary) and Terraform
- Develop reusable IaC components and modules for common patterns
- Implement infrastructure for AI/ML workloads: GPU clusters, model serving, data lakes
- Manage compute resources: EC2, ECS, EKS, Lambda, SageMaker compute instances
Application Development
- Develop Python applications: FastAPI backends, data processing scripts, automation tools
- Create prototype interfaces using Streamlit, React, or similar frameworks
- Build and integrate RESTful APIs for AI model serving and data access
- Implement authentication, authorization, and API security best practices
AI/ML Operations (MLOps)
- Deploy and manage AI/ML model serving infrastructure (SageMaker endpoints, containerized models)
- Build ML pipelines: data ingestion, preprocessing, training automation, model deployment
- Implement model versioning, experiment tracking, and A/B testing frameworks
- Manage GPU resource allocation, training job scheduling, and compute optimization
- Monitor model performance, inference latency, and system health metrics
DevOps & Automation
- Design and implement CI/CD pipelines using GitHub Actions, GitLab CI, or AWS CodePipeline
- Automate deployment processes with infrastructure testing and validation
- Implement monitoring, logging, and alerting using CloudWatch, Prometheus, Grafana
- Manage containerization with Docker and orchestration with Kubernetes/ECS
Data Engineering
- Build data pipelines for AI training and inference using AWS Glue, Step Functions, Lambda
- Design and implement data lakes using S3, Lake Formation, and data cataloging
- Implement automated and scheduled data synchronization processes
- Optimize data storage and retrieval for ML workloads
Security & Compliance
- Implement cloud security best practices: IAM, VPC design, encryption, secrets management
- Build enterprise security and compliance strategies for AI/ML workloads
- Ensure solutions meet regulatory requirements (PCI-DSS, GDPR, SOC2, MAS TRM, etc., where applicable)
- Conduct security reviews and implement remediation strategies
Cost & Performance Optimization
- Optimize cloud spend for compute-intensive AI workloads
- Implement spot instance strategies, auto-scaling, and resource scheduling
- Monitor and optimize GPU utilization, inference latency, and throughput
- Perform cost analysis and implement cost-saving measures
Operations & Support
- Implement disaster recovery procedures for AI models and training data
- Manage backup strategies and business continuity planning
- Troubleshoot and resolve production issues in AI infrastructure
- Provide technical guidance to project teams during implementation
Skills
Cloud Architecture & Design
- Strong solution architecture skills with ability to translate business requirements into technical designs
- Experience in Well‑Architected review and remediation
- Deep understanding of AWS services, particularly compute, storage, networking, and AI/ML services
- Experience designing scalable, highly available, and fault‑tolerant systems
- Ability to create clear architecture diagrams and technical documentation
- Cost modeling and ROI calculation capabilities
Technical Leadership
- Comfortable leading technical discussions with clients and stakeholders
- Ability to guide engineers and share knowledge effectively
- Strong problem‑solving and analytical thinking skills
- Experience with architectural decision‑making and trade‑off analysis
Programming & Development
- Advanced Python programming: object‑oriented design, async programming, testing
- API development with FastAPI, Flask, or similar frameworks
- Frontend development basics: React, etc. (for prototypes and demos with AI code generation tools)
- Shell scripting for automation and deployment
- Git version control and collaborative development workflows
Infrastructure as Code
- AWS CDK (required); CloudFormation experience is valuable
- Terraform (highly preferred) for multi‑cloud or hybrid scenarios
- Understanding of IaC best practices: modularity, reusability, testing
- Experience with infrastructure testing and validation frameworks
AI/ML Infrastructure
- Hands‑on experience with AWS SageMaker: training jobs, endpoints, pipelines, notebooks
- Understanding of ML lifecycle: data preparation, training, deployment, monitoring
- Experience with GPU management and optimization for training/inference
- Knowledge of containerization for ML models (Docker, container registries)
- Familiarity with ML frameworks: PyTorch, TensorFlow, LangChain, LlamaIndex, etc.
DevOps & Automation
- CI/CD pipeline design and implementation (GitHub Actions, GitLab CI, AWS CodePipeline)
- Container orchestration: Docker, Kubernetes, Amazon ECS
- Configuration management and deployment automation
- Monitoring and observability: CloudWatch, Prometheus, Grafana, ELK stack
Communication & Collaboration
- Excellent written and verbal communication in English (advanced level)
- Ability to explain complex technical concepts to non‑technical stakeholders
- Comfortable with client‑facing presentations and technical demos
- Strong documentation skills with attention to detail
- Collaborative mindset with ability to work across functional teams
Problem‑Solving
- Advanced task breakdown and estimation abilities
- Debugging and troubleshooting complex distributed systems
- Performance optimization and tuning
- Incident response and root cause analysis
Knowledge
AWS Cloud Platform (Required)
- AWS Certified Solutions Architect Associate (minimum requirement)
- AWS Certified Solutions Architect Professional or AWS Certified Machine Learning - Specialty (highly preferred)
- Deep knowledge of core AWS services:
  - Compute: EC2, Lambda, ECS, EKS, SageMaker
  - Storage: S3, EFS, EBS, FSx
  - Networking: VPC, Route53, CloudFront, API Gateway, Load Balancers
  - AI/ML: SageMaker, Bedrock, Rekognition, Textract, Comprehend, Lex, Polly
  - Data: RDS, DynamoDB, Redshift, Glue, Athena, Kinesis
  - Security: IAM, KMS, Secrets Manager, Security Hub, GuardDuty
  - DevOps: GitHub Actions, CodePipeline, CodeBuild, CodeDeploy, CloudFormation, CDK, Terraform
AI/ML Technologies
- Understanding of machine learning concepts and model training/deployment lifecycle
- Familiarity with Generative AI technologies: LLMs, RAG, vector databases, prompt engineering
- Knowledge of ML frameworks and libraries: PyTorch, TensorFlow, scikit‑learn, pandas, numpy
- Experience with MLOps practices and tools
- Understanding of model serving patterns: real‑time vs batch inference
Software Development
- Modern software development practices: testing, code review, documentation
- API design principles: RESTful, GraphQL, event‑driven architectures
- Database design and optimization: SQL and NoSQL
- Authentication and authorization: OAuth, JWT, IAM
DevOps & Infrastructure
- Linux/UNIX system administration
- Networking fundamentals: TCP/IP, DNS, HTTP/HTTPS, load balancing
- Security best practices for cloud environments
- Disaster recovery and business continuity planning
Industry Knowledge
- Understanding of cloud consulting delivery models
- Familiarity with agile/scrum methodologies
- Awareness of compliance frameworks: GDPR, HIPAA, SOC2, ISO27001
- Knowledge of FinTech or other regulated industries (a plus)
Additional Knowledge (Preferred)
- Azure or GCP certifications and experience
- Multi‑cloud architecture patterns
- Serverless architecture patterns
- Data engineering and data lake design
- Cost optimization strategies and FinOps practices
Experience
Cloud Engineering & Architecture
- 5+ years in cloud engineering, DevOps, or solution architecture roles
- 3+ years hands‑on experience with AWS services and architecture
- Proven track record of designing and implementing cloud solutions from scratch
- Experience with both greenfield projects and cloud migration initiatives
AI/ML Infrastructure
- 2+ years working with AI/ML workloads on cloud platforms
- Hands‑on experience deploying and managing ML models in production
- Experience with GPU‑based compute for training or inference
- Understanding of AI/ML infrastructure challenges and optimization techniques
Infrastructure as Code
- 3+ years building infrastructure using IaC tools (AWS CDK, Terraform, CloudFormation)
- Experience creating reusable IaC modules and components
- Track record of infrastructure automation and standardization
Software Development
- 4+ years programming experience in Python (required)
- Experience building APIs with FastAPI, Flask, or similar frameworks
- History of creating prototypes, MVPs, or PoC applications
- Comfortable with full‑stack development for demos and prototypes
DevOps & Automation
- 3+ years implementing CI/CD pipelines and deployment automation
- Experience with containerization (Docker) and orchestration (Kubernetes/ECS)
- Linux/UNIX system administration experience
- Monitoring and observability implementation
Client‑Facing Work
- Experience gathering requirements and translating them into technical solutions
- History of presenting technical architectures to clients and stakeholders
- Participation in presales activities, demos, or technical workshops
- Ability to work directly with customers to solve complex problems
Industry Experience (Preferred)
- Consulting or professional services background
- Experience in regulated industries (FinTech, Insurance, Banks)
- Work with enterprise clients on large‑scale implementations
- Startup or fast‑paced environment experience
Seniority level
Mid‑Senior level
Employment type
Full‑time
Job function
Engineering and Information Technology
Industries
IT Services and IT Consulting