Staff Machine Learning Engineering (Remote)

Cisco Systems, Inc.
$193,800.00 to $245,300.00
life insurance, vision insurance, parental leave, paid holidays, sick time, 401(k)
United States, Washington, Seattle
Feb 06, 2026
The application window is expected to close on: 02/28/2026
Job posting may be removed earlier if the position is filled or if a sufficient number of applications are received.
This role can be performed remotely from locations within the United States.

Meet the Team
Splunk, a Cisco company, is building a safer, more resilient digital world with an end-to-end, full-stack platform designed for hybrid, multicloud environments. The Splunk AI Platform and Services team provides the core runtime and developer experience that power AI across Splunk and Cisco. We manage large-scale, multi-tenant LLM inference across major cloud providers and build platform services to support these workloads. We also provide VectorDB/RAG services and MCP services that make AI workloads secure, observable, and cost-efficient for product teams. On top of this foundation, we deliver agentic frameworks, SDKs, tools, and evaluation/guardrail capabilities that help teams quickly build reliable GenAI assistants and automation features. You'll join a group that sits at the intersection of distributed systems, ML, and developer experience, grounded in operational excellence and a culture of impact-driven, cross-functional collaboration.

Your Impact
Lead the end-to-end architecture for key areas of the AI Platform: multi-tenant LLM serving (vLLM/Ray), routing and orchestration layers, VectorDB/RAG integration, and agentic/SDK surfaces used by product teams.
Design and drive implementation of high-scale inference services, including parallelism strategies (TP/PP/EP/MoE), autoscaling policies, and cross-region capacity management for GPU/CPU workloads.
Optimize latency, throughput, and cost for large-scale LLM and generative workloads using techniques such as batching, chunked prefills, caching, and mixed precision.
Design and tune distributed inference configurations (TP/PP/EP/MoE) across multi-GPU and multi-node clusters and modern GPU architectures.
Implement platform capabilities such as telemetry, metering and throttling, guardrails, and rollout/rollback to ensure AI services are safe, observable, and multi-tenant by default.
Lead the design of GenAI application services (chat assistants and automation APIs) grounded in robust RAG pipelines, agentic workflows (LangChain/LangGraph or similar), and MCP-based tool ecosystems.
Drive operational excellence with runbooks, readiness checklists, CI/CD safeguards, on-call rotations, and post-incident improvements.
Provide technical mentorship and leadership for senior and mid-level engineers: review designs, guide trade-offs around quality/latency/COGS, and help grow the next generation of tech leads.
Collaborate closely with applied scientists to productionize new models and techniques, ensuring that research prototypes become robust, observable, and cost-efficient services.

Minimum Qualifications:
Bachelor's degree in Computer Science, Engineering, or equivalent practical experience.
8+ years of hands-on experience building and operating backend or distributed systems in production, or 5+ years of experience with a Master's degree, or 3+ years with a PhD.
Proven track record as a technical lead for complex systems: driving architecture, aligning stakeholders, and delivering high-impact projects end-to-end.
Strong proficiency in at least one modern programming language (e.g., Python, Go, or Java) and deep experience with software design, debugging, and performance tuning.
Significant experience with cloud-native architectures (containers, Kubernetes, service discovery, configuration management, CI/CD) and building reliable microservices (REST/gRPC).
Demonstrated ownership of production services at scale, including on-call participation, incident response, and post-incident reviews/RCAs that led to concrete improvements.

Preferred Qualifications:
Hands-on experience running LLM or deep learning inference at scale using frameworks such as vLLM, TensorRT-LLM, Triton Inference Server, or similar.
Deep understanding of GPU and distributed systems performance: latency/throughput trade-offs, pipelining, model parallelism (TP/PP/EP/MoE), mixed precision (BF16/FP8/nvFP4), and profiling tools.
Experience designing and operating RAG systems and GenAI application layers: document ingestion, chunking/embedding strategies, metadata design, hybrid retrieval, context ranking, and evaluation of retrieval quality.
Practical experience with agentic frameworks (LangChain, LangGraph, LlamaIndex, Semantic Kernel, or similar) and multi-agent coordination, including integration with MCP tools and internal/external APIs.
Background building platform or developer experience capabilities (shared services, SDKs, templates, micro-frontends) that are adopted by multiple product teams.
Familiarity with LangSmith or similar evaluation platforms, including experiment design, offline/online evals, hallucination/groundedness metrics, and feedback loops.
Strong knowledge of AWS, Azure, or GCP (EC2/VMs, IAM roles/ARNs/principals, VPC networking, security best practices) for AI workloads.
Experience defining and monitoring dashboards and alerts for high-availability systems using Prometheus, Grafana, or cloud-native tooling.
Excellent communication and collaboration skills: comfortable influencing cross-functional partners and other senior engineers, and explaining trade-offs between quality, latency, and cost to both technical and non-technical audiences.

Why Cisco?
At Cisco, we're revolutionizing how data and infrastructure connect and protect organizations in the AI era - and beyond. We've been innovating fearlessly for 40 years to create solutions that power how humans and technology work together across the physical and digital worlds.
These solutions provide customers with unparalleled security, visibility, and insights across the entire digital footprint. Simply put - we power the future. Fueled by the depth and breadth of our technology, we experiment and create meaningful solutions. Add to that our worldwide network of doers and experts, and you'll see that the opportunities to grow and build are limitless. We work as a team, collaborating with empathy to make really big things happen on a global scale. Because our solutions are everywhere, our impact is everywhere. We are Cisco, and our power starts with you.

Message to applicants applying to work in the U.S. and/or Canada:
The starting salary range posted for this position is $193,800.00 to $245,300.00 and reflects the projected salary range for new hires in this position in U.S. and/or Canada locations, not including incentive compensation, equity, or benefits. Individual pay is determined by the candidate's hiring location, market conditions, job-related skillset, experience, qualifications, education, certifications, and/or training. The full salary range for certain locations is listed below.
For locations not listed below, the recruiter can share more details about compensation for the role in your location during the hiring process.

U.S. employees are offered benefits, subject to Cisco's plan eligibility rules, which include medical, dental and vision insurance, a 401(k) plan with a Cisco matching contribution, paid parental leave, short and long-term disability coverage, and basic life insurance. Please see the Cisco careers site to discover more benefits and perks. Employees may be eligible to receive grants of Cisco restricted stock units, which vest following continued employment with Cisco for defined periods of time.

U.S. employees are eligible for paid time away as described below, subject to Cisco's policies:
10 paid holidays per full calendar year, plus 1 floating holiday for non-exempt employees
1 paid day off for employee's birthday, paid year-end holiday shutdown, and 4 paid days off for personal wellness determined by Cisco
Non-exempt employees* receive 16 days of paid vacation time per full calendar year, accrued at a rate of 4.92 hours per pay period for full-time employees
Exempt employees participate in Cisco's flexible vacation time off program, which has no defined limit on how much vacation time eligible employees may use (subject to availability and some business limitations)
80 hours of sick time off provided on hire date and each January 1st thereafter, and up to 80 hours of unused sick time carried forward from one calendar year to the next
Additional paid time away may be requested to deal with critical or emergency issues for family members
Optional 10 paid days per full calendar year to volunteer

For non-sales roles, employees are also eligible to earn annual bonuses subject to Cisco's policies.

Employees on sales plans earn performance-based incentive pay on top of their base salary, which is split between quota and non-quota components, subject to the applicable Cisco plan.
For quota-based incentive pay, Cisco typically pays as follows:
0.75% of incentive target for each 1% of revenue attainment up to 50% of quota;
1.5% of incentive target for each 1% of attainment between 50% and 75%;
1% of incentive target for each 1% of attainment between 75% and 100%; and
once performance exceeds 100% attainment, incentive rates are at or above 1% for each 1% of attainment with no cap on incentive compensation.

For non-quota-based sales performance elements such as strategic sales objectives, Cisco may pay 0% up to 125% of target. Cisco sales plans do not have a minimum threshold of performance for sales incentive compensation to be paid.

The applicable full salary ranges for this position, by specific state, are listed below:
New York City Metro Area: $212,300.00 - $317,100.00
Non-Metro New York state & Washington state: $193,800.00 - $282,100.00

* For quota-based sales roles on Cisco's sales plan, the ranges provided in this posting include base pay and sales target incentive compensation combined.
** Employees in Illinois, whether exempt or non-exempt, will participate in a unique time off program to meet local requirements.
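
As a worked illustration of the quota-based incentive tiers described above, the following is a minimal sketch, not Cisco's actual calculation. It assumes each rate applies only to the attainment that falls inside its band, which is the plain reading of "for each 1% of attainment between...". The function name `quota_incentive_pct` and the `above_quota_rate` parameter are hypothetical. The posting only says the rate above 100% attainment is "at or above 1%", so the default of 1.0 is the stated floor, not a confirmed figure.

```python
def quota_incentive_pct(attainment_pct: float, above_quota_rate: float = 1.0) -> float:
    """Return incentive earned, as a percentage of incentive target.

    attainment_pct: revenue attainment as a percentage of quota (e.g. 60 for 60%).
    above_quota_rate: ASSUMED rate per 1% of attainment above 100%; the posting
        states only that it is "at or above 1%", so 1.0 is a lower bound.
    """
    if attainment_pct < 0:
        raise ValueError("attainment_pct must be non-negative")

    # (upper bound of the band in attainment points, % of target paid per point)
    tiers = [(50.0, 0.75), (75.0, 1.5), (100.0, 1.0)]

    earned = 0.0
    lower = 0.0
    for upper, rate in tiers:
        if attainment_pct <= lower:
            break
        # Pay only for the attainment points that fall inside this band.
        earned += (min(attainment_pct, upper) - lower) * rate
        lower = upper

    # Above 100% there is no cap; apply the assumed above-quota rate.
    if attainment_pct > 100.0:
        earned += (attainment_pct - 100.0) * above_quota_rate
    return earned


if __name__ == "__main__":
    for attainment in (50, 60, 75, 100, 120):
        print(attainment, quota_incentive_pct(attainment))
```

Under these assumptions the three bands sum to exactly 100% of target at 100% attainment (37.5 + 37.5 + 25), and 60% attainment yields 52.5% of target.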