Overview
Telemetry Operations Leader
The Cloud & AI organization accelerates Microsoft's mission and bold ambitions to ensure that our company and industry are securing digital technology platforms, devices, and clouds in our customers' heterogeneous environments, as well as ensuring the security of our own internal estate. Our culture is centered on embracing a growth mindset, a theme of inspiring excellence, and encouraging teams and leaders to bring their best each day. In doing so, we create life-changing innovations that impact billions of lives around the world.
Microsoft is one of the largest enterprise service companies in the world. Aligning with Microsoft's mission and the focus of the Microsoft Security organization, this role is an integral part of a larger team dedicated to delivering world-class security operations that contain and evict threat actor activities.
Microsoft's mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond. In alignment with our Microsoft values, we are committed to cultivating an inclusive work environment for all employees to positively impact our culture every day.
ROLE OVERVIEW:
The Telemetry Operations Leader drives the operational backbone of the Telemetry Enforcement function, ensuring Microsoft's cyber defense ecosystem has timely, accurate, reliable, and predictable telemetry to support investigations, incident response, detection engineering, and threat hunting. This role stewards the "telemetry services factory": intake, prioritization, access brokering, dataset curation, operational inspection, and continuous improvement.
This leader establishes the operational rhythms, SLAs/SLOs, governance patterns, and monitoring frameworks that enable high-scale throughput while ensuring consistent quality, transparency, and stakeholder alignment across CDO, Security Operations, Data Engineering, Platform teams, and partner organizations.
Success is measured by improvements in cycle time, reliability, adherence to SLAs, clarity of access pathways, reduction of operational friction, and measurable uplift in analyst, investigator, and detection engineer productivity.
Responsibilities
Key Responsibilities:
1. Operational Leadership & Service Delivery
- Own day-to-day operations for telemetry access, brokering, curation, and monitoring, ensuring the function consistently meets SLA/SLO targets and service expectations across clouds, tenants, and data types.
- Maintain and evolve predictable access pathways (e.g., SHIELD patterns, service workflows), reducing friction points and handoffs for analysts and engineers.
- Drive operational excellence through structured inspection rhythms, backlog transparency, and standardization of request types and service catalog items, as highlighted in functional review action items.
2. Intake, Prioritization, & Stakeholder Alignment
- Oversee the end-to-end pipeline for telemetry requests (intake, triage, greenlight, delivery), ensuring the highest-impact datasets and access paths are prioritized.
- Partner with upstream service teams to broker access accurately, escalate gaps (e.g., missing telemetry, schema issues), and drive engineering follow-through when generation is required, aligning with Data Pursuit and upstream responsibilities clarified in internal communications.
- Ensure prioritization decisions are transparent, value-driven, and communicated broadly across the cyber defense ecosystem.
3. Telemetry Monitoring & Operational Health
- Lead the development and operationalization of monitoring frameworks for telemetry coverage, data freshness, critical failures ("quick and dirty monitoring"), and dependency health, consistent with CDO functional review expectations.
- Own dashboards and reporting for service health, cycle time, request volumes, SLA adherence, and failure clusters.
- Ensure rapid escalation paths for critical telemetry failures to Mystic, Ghost, CDO, and other technical owners.
4. Process Engineering & Continuous Improvement
- Systematize the operating model for data discovery, access, brokering, and curation; drive continuous improvements in throughput, first-time-right delivery, and reduction of manual effort.
- Partner with engineering teams to expand telemetry coverage, enhance tooling and automation, and improve dataset curation quality, as described in TE operations and engineering responsibilities.
- Embed governance (RACI, decision logs, intake taxonomies, process documentation) and ensure the function evolves alongside changing investigative and detection needs.
5. Cross-Functional Coordination & Communication
- Serve as connective tissue across CDI, TAP, Detection Engineering, Data Engineering, Platform Engineering, and other stakeholders.
- Deliver clear narratives, executive summaries, and progress updates for leadership, ensuring TEE's story is crisp, data-driven, and aligned to broader CDO priorities.
6. Team Leadership, Enablement & Culture
- Lead a multidisciplinary operations team (SOEs, Data Engineers) responsible for intake, brokering, monitoring, curation operations, and service health management.
- Foster a culture of operational rigor, customer orientation, and continuous improvement.
Success Criteria
- Telemetry access pathways are predictable, fast, reliable, and well-documented.
- SLA adherence and cycle-time metrics improve quarter over quarter.
- Stakeholders express clarity, confidence, and satisfaction with TE's operational predictability.
- Telemetry failures are identified and escalated quickly, with MTTR improvements.
- Request fulfillment quality increases and new datasets unlock investigative and detection value.
- The function operates with high transparency, clear governance, and strong cross-org alignment.
Qualifications
Required Qualifications:
- Doctorate in Statistics, Mathematics, Computer Science, or related field AND 3+ years experience in software development lifecycle, large-scale computing, threat modeling, cyber security, anomaly detection, Security Operations Center (SOC) detection, threat analytics, security incident and event management (SIEM), information technology (IT), or operations incident response
- OR Master's Degree in Statistics, Mathematics, Computer Science, or related field AND 4+ years experience in software development lifecycle, large-scale computing, threat modeling, cyber security, anomaly detection, Security Operations Center (SOC) detection, threat analytics, security incident and event management (SIEM), information technology (IT), or operations incident response
- OR Bachelor's Degree in Statistics, Mathematics, Computer Science, or related field AND 6+ years experience in software development lifecycle, large-scale computing, threat modeling, cyber security, anomaly detection, Security Operations Center (SOC) detection, threat analytics, security incident and event management (SIEM), information technology (IT), or operations incident response
- OR equivalent experience.
Other Requirements: The ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:
- Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.
Preferred Qualifications:
- Experience operating in incident response / cyber defense environments where "incident pace" and role clarity are essential.
- Experience working with security governance models that distinguish risk ownership from execution, and managing the seams between them.
- Demonstrated experience designing and operationalizing cross-org operating models, including RACI, decision rights, escalation, and governance forums.
- Proven ability to run a portfolio of stakeholder relationships and drive structured collaboration frameworks that reduce friction.
- Strong executive communication: ability to synthesize ambiguity into crisp narratives and decision points.
- Operational rigor and systems thinking (service rhythms, governance patterns, repeatable processes).
Security Operations Engineering IC5 - The typical base pay range for this role across the U.S. is USD $139,900 - $274,800 per year. There is a different range applicable to specific work locations, within the San Francisco Bay area and New York City metropolitan area, and the base pay range for this role in those locations is USD $188,000 - $304,200 per year. Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here: https://careers.microsoft.com/us/en/us-corporate-pay
This position will be open for a minimum of 5 days, with applications accepted on an ongoing basis until the position is filled.
Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance with religious accommodations and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations.