SENIOR MANAGER, SOFTWARE ENGINEERING

Walmart Global Tech

Bengaluru · Top pay · GCC

Position Summary...

What you'll do...

About Us  

As a leader within the Command and Control Center (CCC) of the Walmart Global Technology Reliability Engineering and Operations team, you will manage the Site Reliability Operations (SRO) team, a critical group of software engineers who resolve Major Incidents for all of Walmart's diverse businesses across the US and international locations. You will engage closely with SRE, DevOps and Engineering practitioners to manage mission-critical infrastructure, tools, and processes that ensure the highest levels of availability and reliability of all our websites. As a senior member of the team, you will be expected to work with management, peers, and customers to define and implement CCC's technical vision.

You're right for the job if you are comfortable handling major incident response, leading a technical team of engineers to resolve incidents and restore service across complex, diverse distributed architectures, both online and offline. You will work cross-functionally with a variety of teams and be a core contributor to every significant engineering service or solution that we deliver to our stakeholders. You'll excel if you have enthusiasm for digging deep and a flair for sharp technical communication, prioritization and organization. You will work directly with our SRE and DevOps teams to manage our next-generation “always up” cloud-based e-commerce platform.

What will you do

The Manager is responsible for leading a team of engineers in daily proactive issue detection, resolving issues before they cause impact. You will manage engineer workloads and performance and give clear direction to the SRO team. You are responsible for customer follow-up, controlling difficult calls, providing performance metrics, and demonstrating expertise in Service Management processes and procedures to manage service-impacting incidents. Our goal is to build, scale and guard the systems that delight our customers.

  • Responsible for immediate coordinated response of critical incidents to reduce impact and increase availability. 
  • Responsible for development of critical tools that assist in quick detection and auto-remediation of incidents. 
  • Responsible for development of monitoring and alerting frameworks. 
  • Leads the resolution of high complexity Incidents as required. 
  • Responsible for leadership and communications between the business customer and technology teams. 
  • Identify and recommend processes or system enhancements for the SRO. 
  • Deep understanding of incident management processes and procedures. 
  • Focus on internal and external customer requirements (SLAs and KPIs). 
  • Demonstrate advanced understanding of business processes being supported by assigned system(s) 
  • Develop clear tactical and strategic goals for the CCC related to function, capabilities and capacities. 
  • Make recommendations for improving situational awareness and alerting on potential business impacts, whether from internal or external factors. 
  • Manages the analysis, communication and resolution of incidents. 
  • Manages others in researching and recommending alternative actions for incident resolution. 
  • Analyze trends to proactively prevent incidents and to provide historical summary reports. 
  • Mentor and grow talent within your team to build a best-in-class CCC function. 
  • Stay calm under pressure while orchestrating major incident response for mission-critical systems. 
  • Function as part of a global CCC management team to deliver continuous improvement. 
  • Excellent communication and stakeholder management skills. 
  • Technically strong within infrastructure or software engineering. 
  • Ability to assess system impact and formulate accurate problem statements to distribute across the management and technical communities. 

  

Additional responsibilities include: 

  

  • Develop a deep understanding of the various services and applications that come together to deliver Walmart e-commerce products 
  • Monitor and discover failures/issues in a timely fashion and work with engineers to identify root cause and fix issues 
  • Perform root-cause analysis of complex problems involving multiple parties, networks, hardware, software and cloud technologies. 
  • High focus on collecting and inferring metrics. 
  • Identify and drive the automation of systems that maintain system and application health. 
  • Drives standardization and service-focused instrumentation to resolve break/fix scenarios, engaging broader teams where necessary. 
  • Contributes to command and control related activities focused on restoration of complex outages. 
  • May work independently or as part of a team on more complex projects. 
  • Provides mentoring and guidance to the team. 
  • Networking responsibilities: Understanding and performing TCP dumps, snoop, and other network sniffers. Understands and applies knowledge of most protocols (TCP/IP, HTTP, UDP, etc.) 
  • Application Technologies: Provides recommendations and advice to the team and/or department in the areas of web services, OS, and storage, including being an active liaison to Development, QA and the Business. 
  • Analyzes systems and makes recommendations to prevent possible incidents using knowledge of complex and company-wide systems. 
  • Lead end-to-end audit of monitors and alarms based on subsystem knowledge. 
  • Utilizes time management and project management skills to lead the resolution of incidents in a timely and organized manner, effectively communicating necessary information. May consult directly with developers or third party vendors; provides subject matter expertise. 

What will you bring

  • 10+ years in an infrastructure or systems environment delivering operational excellence to highly complex distributed systems. 
  • Experience in leading and troubleshooting service-impacting incidents across large-scale enterprise systems. 
  • Methodical and systematic problem solving approach, combined with a solid awareness of ownership, initiative and drive. 
  • Experience controlling and leading a team to deliver in high-pressure situations, with clear and concise communication to partners and stakeholders. 
  • Experience of command and control tools in a production environment. 
  • Networking knowledge and understanding of network concepts, such as different protocols (TCP/IP, UDP, ICMP, etc.), MAC addresses, IP packets, DNS, OSI layers, and load balancing. 
  • Understanding of Unix/Linux systems from kernel to shell and beyond, taking in system libraries, file systems, and client-server protocols along the way. 
  • Experience administering Linux systems in a production environment 
  • Programming experience in one or more of the following languages: Go, Python, Java, Ruby, Shell 
  • Bachelor's Degree in Computer Science or a related field, or relevant work experience 
  • Experience with cloud technologies such as Azure or OpenStack 
  • Experience with enterprise monitoring solutions like Prometheus, Graphite, Nagios, Sensu, Splunk, Grafana and Dynatrace. 
  • Experience in data science/machine learning would be advantageous. 

About Walmart Global Tech
Imagine working in an environment where one line of code can make life easier for hundreds of millions of people. That’s what we do at Walmart Global Tech. We’re a team of software engineers, data scientists, cybersecurity experts and service professionals within the world’s leading retailer who make an epic impact and are at the forefront of the next retail disruption. People are why we innovate, and people power our innovations. We are people-led and tech-empowered. We train our team in the skillsets of the future and bring in experts like you to help us grow. We have roles for those chasing their first opportunity as well as those looking for the opportunity that will define their career. Here, you can kickstart a great career in tech, gain new skills and experience for virtually every industry, or leverage your expertise to innovate at scale, impact millions and reimagine the future of retail.
 

We’re back to work

Walmart’s culture sets us apart, and we know being together helps us innovate, learn and grow great careers. This role is based in our Bangalore office for daily work, with flexibility for associates to manage their personal lives.


Benefits:
Beyond our great compensation package, you can receive incentive awards for your performance. Other great perks include 401(k) match, stock purchase plan, paid maternity and parental leave, PTO, multiple health plans, and much more.

Equal Opportunity Employer:
Walmart, Inc. is an Equal Opportunity Employer – By Choice. We believe we are best equipped to help our associates, customers and the communities we serve live better when we really know them. That means understanding, respecting, and valuing unique styles, experiences, identities, ideas and opinions while being inclusive of all people.

Minimum Qualifications...

Outlined below are the required minimum qualifications for this position. If none are listed, there are no minimum qualifications.

Option 1: Bachelor's degree in computer science, computer engineering, computer information systems, software engineering, or related area and 5 years’ experience in software engineering or related area.
Option 2: 7 years’ experience in software engineering or related area. 2 years’ supervisory experience.

Preferred Qualifications...

Outlined below are the optional preferred qualifications for this position. If none are listed, there are no preferred qualifications.

Master’s degree in computer science, computer engineering, computer information systems, software engineering, or related area and 3 years' experience in software engineering or related area.
We value candidates with a background in creating inclusive digital experiences, demonstrating knowledge in implementing Web Content Accessibility Guidelines (WCAG) 2.2 AA standards, assistive technologies, and integrating digital accessibility seamlessly. The ideal candidate would have knowledge of accessibility best practices and join us as we continue to create accessible products and services following Walmart’s accessibility standards and guidelines for supporting an inclusive culture.

Primary Location...

Building 10 (SEZ), Cessna Business Park, Kadubeesanahalli Village, Varthur Hobli, India