New
Member of Technical Staff
Microsoft | |
United States, Washington, Redmond | |
Oct 24, 2025 | |
|
OverviewThe MicrosoftCoreAIPost-Trainingteamis dedicated toadvancingpost-training methods for both OpenAI and open-source models. Their workencompassescontinual pre-training,large-scale deep reinforcement learningrunning onextensive GPU resources,andsignificant efforts tocurateand synthesizetraining data.In addition, the team employs various fine-tuning approaches to support both research and product development. The team alsodevelops advanced AI technologies that integrate language and multi-modality fora range ofMicrosoft products.The team isparticularlyactivein developingcode-specificmodels, includingthose usedinGithubCopilot and Visual Studio Code, such as code completionmodelandthesoftware engineering (SWE)agentmodels. The team has also produced publications as by-products, includingworksuch asLoRA,DeBerTa, Oscar, Rho-1, Florence, and the open-source Phi models. We are looking for a highly skilledAI Data & Training Technical Staffto join our team and help push the boundaries of large-scale AI. In this role, you'll be at the forefront of creating world-class datasets, training front-tier models, developing scalable data pipelines, and driving experiments that directly impact the performance of cutting-edge language and multimodal models. Our work is at the intersection ofresearch, data engineering, and AI model training,and Products. Our team values startup-style efficiency and practical problem-solving.We areseekinga curious, adaptable problem-solver who thrives on continuous learning, embraces changing priorities, and is motivated by creating meaningful impact.Candidates must be self-driven, able towrite efficientcode and debug training jobs, document findings, anddemonstrateatrack recordin these fields.You may include information about any individualswho canserve asyour referral in yourapplication. Microsoft's mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond. In alignment with our Microsoft values, we are committed to cultivating an inclusive work environment for all employees to positivelyimpactour culture every day.
ResponsibilitiesDesign & Evaluate Datasets- Build high-quality datasets and benchmarks for training AI models; run ablation studies to measure impact and optimize data effectiveness. Advance Model Training- Apply deep expertise in pre-training, post-training, and reinforcement learning (RL) for both language and multimodal models. Develop Data Infrastructure- Create and maintain scalable pipelines for ingestion, preprocessing, filtering, and annotation of large, complex datasets. Data Quality & Analysis- Assess real-world multimodal datasets (text, image, video, audio, code) for quality, diversity, and relevance; identify gaps and propose improvements. Tooling & Workflows- Build lightweight tools for dataset auditing, visualization, and versioning to streamline experimentation. Research & Innovation- Collaborate with cross-functional teams to push research and product boundaries, delivering models that make a real-world impact. Other: Embody our Culture and Values | |
Oct 24, 2025