
Site Reliability Engineer, Platform Solutions, Enterprise Partnerships

The Goldman Sachs Group
United States, Texas, Richardson
Nov 16, 2024

Responsibilities



  • Work across Enterprise Partnership teams to architect, design, and implement strategic business initiatives that enable GS to provide the best customer experience and banking solutions.
  • Collaborate with internal and external product managers and diverse cross-functional teams to ensure the cohesiveness of the overall end-to-end experience.
  • Perform site reliability engineering duties including monitoring SLOs, incident management, troubleshooting, and building/implementing tools related to observability
  • Create and support automation solutions to improve the reliability of the platform and to increase the productivity of the team.
  • Assess monitoring and alert signals to determine impact and risk to the business, and help steer the incident management process and a blameless post-mortem culture.
  • Define, measure, and continuously optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve.
  • Perform timely escalation of critical issues and proactively identify patterns of recurring issues to improve production support
  • Provide primary operational support and engineering for multiple large, distributed software applications.
  • Work with core banking platforms such as CoreCard and Finacle to prioritize their product development roadmaps and align them with GS business and technology initiatives.
  • Work closely with Business and Product to identify market opportunities, build business cases for new features, products, and revenue streams that contribute to our digital strategy
  • Assist Internal Audit to review the effectiveness of internal controls and build more robust controls to prevent customer, operations, regulatory and reputational risk.



Qualifications



  • 5+ years of prior work experience in SRE and production support roles as an individual contributor.
  • Advanced degree in Computer Science, Computer Engineering, or related field
  • Experience with distributed systems design, maintenance, and troubleshooting.
  • Familiarity with database concepts (e.g., SQL Server, Sybase, Oracle, DynamoDB, or PostgreSQL)
  • Experience implementing monitoring systems and dashboards from logs, metrics, and telemetry (Grafana, Prometheus, Splunk, Datadog, Kibana, job schedulers, etc.)
  • Familiarity with retail/consumer banking platforms.
  • Proven ability to de-escalate difficult situations with customers while multitasking between tickets and mentoring your team
  • Ability to successfully manage your time, balancing multiple tasks with varying levels of priority and urgency
  • Strong verbal and written communication skills.
  • Experience with cloud computing and running applications at scale.
  • Ability to pick up new software, frameworks and APIs quickly
  • Experience with the ITIL, DevOps, and SRE ecosystems

