Responsibilities
- Design and develop a robust infrastructure observability solution using platforms such as Splunk, Dynatrace, or Elasticsearch to enable real‑time monitoring, logging, and tracing at scale across diverse, distributed client architectures.
- Integrate observability tools with existing service management tools (e.g., ServiceNow) and monitoring systems (Prometheus, Dynatrace, Zabbix) using scripts and integration plugins to collect and correlate metrics, logs, and traces for unified visibility.
- Develop automation workflows for incident response using ServiceNow and infrastructure automation platforms to streamline issue resolution.
- Configure and leverage AI and machine learning to enable predictive analytics, anomaly detection, and automated root‑cause analysis within observability platforms.
- Design and create dashboards, alerts, and reports using SQL‑based query languages to provide actionable insights into system performance, availability, and user experience.
- Work closely with clients to assess their observability needs, provide strategic recommendations, and implement tailored solutions aligned with business objectives.
- Diagnose and resolve issues with observability platforms to ensure platform reliability and performance.
- Provide ongoing support and troubleshooting post‑deployment to address any issues that arise and ensure tools continue to operate effectively.
Qualifications
- 3–5 years of consulting experience in IT Service Management (ITSM) processes, infrastructure resiliency, or IT operations automation.
- Hands‑on experience with observability platforms such as Splunk, Dynatrace, or the ELK Stack.
- Proficiency in Python, Bash, or PowerShell for scripting and automation.
- Familiarity with AI / ML concepts for anomaly detection, event correlation, and predictive analytics.
- Excellent problem‑solving, communication, and client‑facing skills to collaborate with cross‑functional teams and stakeholders.
- Experience in SRE roles with a focus on observability and automation is an advantage.
Seniority level
Mid‑Senior level
Employment type
Full‑time
Job function
Strategy / Planning and Information Technology
Industries
Business Consulting and Services