New
Senior Site Reliability Engineer
![]() | |
![]() | |
![]() United States, Washington, Redmond | |
![]() | |
OverviewHalo Studios is building the future of the blockbuster Halo series of video games with Unreal Engine. As part of Xbox Game Studios, the Halo franchise encompasses games, novels, comics, licensed collectibles, apparel, and more with a shared vision of heroism, mystery, and wonder. With multiple projects in development, join our team as we forge the next generation of games and experiences in our award-winning sci-fi universe. You will be a key member ofthe Halo StudiosIT Engineering team, responsible forstudio data andIT services and infrastructure within our studio. You will be contributing to the architecture anddesign of new on-prem and cloud infrastructure, while continuing to drive optimization, performance, security, and reliability withcutting edgetechnologies and automation. You will empower artists,developers, and others in our studio by proactivelydesigning technical solutions to maximize their efficiency. Along the way, you will be a trusted voice who shares your knowledge andexpertisewithin our team and other teams in the studio. You will be joining a fast-paced team that constantlyprovidesnew opportunities to learn and grow. Roles at our studio are flexible, and you can work from home up to two days a week in this role. Microsoft's mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.
ResponsibilitiesResponsibilities Architect, implement, and optimizecriticalhybrid and cloud-basedIT infrastructureutilizingInfrastructure as Code(IaC) technologies(e.g.,Docker, Terraform,AKS) to ensurehighavailability, scalability, security, and operational efficiency. Design, scale, and maintain Perforce, Swarm, and build farm infrastructure used by game development teams and automated build environments, to ensure robust, high-performance workflows across distributed game development environments. Designand implementAzure Networking solutionsincluding Site-to-Site tunnels,App Gateway, Private Endpoints/Private Link, DNS,and network security for Azure resources, ensuring secure and reliable connectivity. Architect anddeliver automation solutions to improve service health, manageability, reliability, telemetry,and alerting. Implementdata governance,storage,backup, and disaster recovery solutions for a multi-PetabyteAzure-basedenvironment, ensuringdata integrity,security, and performance. Research, evaluate, and integrate emerging tools and methodologies into the technology roadmap,tocontinuouslyoptimizeefficiency, reliability, and scalability. Produce andmaintainclear andaccuratetechnical documentation anddesignspecificationsthat align withbestpractices. Collaborate with software engineers, project management, and operations teams to improve andoptimizeinfrastructure and evolve services, ensuring alignment with organizational goals. Participate in on-call rotations, lead incident response, and conduct postmortems to identify root causes and implement preventative infrastructure improvements. |