Job Description
- Please note that this role cannot offer visa transfer or sponsorship, now or in the future.
In a Managed Services environment, the technical lead acts as the bridge between Level 1/2 support teams and the core engineering/development teams. The primary goal is to ensure system stability, rapid incident resolution, and continuous operational improvement for the Digital Commerce platform.
Below is a breakdown of day-to-day responsibilities, categorized by function, specifically tailored to the Spring Boot (GCP) and Adobe Experience Manager (AEM) stack.
We believe remote is the way forward as we strive to provide flexibility wherever possible. Based on this role’s business requirements, if this is a hybrid position, it will require 3-4 days a week in the client or Cognizant office. Regardless of your working arrangement, we are here to support a healthy work-life balance through our various wellbeing programs.
The working arrangements for this role are accurate as of the date of posting. This may change based on the project you’re engaged in, as well as business and client requirements. Rest assured, we will always be clear about role expectations.
Location: Remote
Role & Responsibilities
Incident & Problem Management
- High-Severity Incident Command: Lead technical troubleshooting for P1/P2 incidents. You are the "go-to" person when the site goes down or checkout fails.
- Root Cause Analysis (RCA): Conduct deep-dive investigations into recurring issues.
- Tech Specifics: Analyze Spring Boot logs (via Splunk/ELK/Cloud Logging) to trace 5xx errors. Investigate Kafka dead-letter queues to find stuck orders or failed messages.
- Query MySQL to verify transactional integrity (e.g., "Why did the order status not update?").
- Inspect MongoDB collections for product catalog inconsistencies.
- SLA Management: Ensure response and resolution times meet the contractual Service Level Agreements (SLAs).
Application & Infrastructure Monitoring
Proactive Health Checks
- Monitor GCP dashboards (Stackdriver/Cloud Monitoring) for CPU spikes or memory leaks in microservice containers.
- Check Redis hit/miss ratios to ensure caching is effective and not causing latency.
- Monitor Spring Batch jobs (usually nightly or intra-day) for inventory syncs or pricing updates. Rerun failed steps and fix data anomalies preventing completion.
- Monitor the replication agents between AEM (Author/Publish) and the commerce engine. Ensure content fragments (images, banners) are rendering correctly on the Node.js front end.
Release, Deployment & DevOps
- Release Gatekeeping: Review deployment plans before they go to production to assess risk.
- Troubleshoot Jenkins build failures.
- Assist with Git merge conflicts or branching strategies during hotfix creation.
- Configuration Management: Manage environment variables and secrets in GCP (Secret Manager) or Spring Cloud Config to ensure lower environments match Production configurations where appropriate.
Continuous Improvement & SME Activities
- Knowledge Management: Create and update "Runbooks" and Knowledge Base (KB) articles for L1/L2 support teams (e.g., "How to restart the Apache web server safely" or "Steps to clear the AEM Dispatcher cache").
- Performance Tuning:
- Identify slow API endpoints in Spring Boot.
- Recommend index improvements for MySQL or MongoDB based on slow query logs.
- Capacity Planning: Review traffic trends (e.g., upcoming holiday sales) and recommend scaling strategies for GCP pods.
Vendor Transition & Knowledge Transfer
- Actively participate in the handoff process from the outgoing vendor, ensuring comprehensive understanding and documentation of existing data pipelines, configurations, and support procedures.
- Identify potential legal risks and develop strategies to mitigate them.
- Conduct regular audits to ensure ongoing compliance and process efficiency.
Technical Skills Required
- Minimum of 12 years relevant work experience
- Primary required Tech Stack: Spring Boot microservices, Spring Batch, Node.js
- Secondary required Tech Stack: Apache, Redis, MySQL, MongoDB, Kafka, Git, Jenkins, Adobe Experience Manager
The annual salary for this position is between $100,890 – $186,500 depending on experience and other qualifications of the successful candidate.
This position is also eligible for Cognizant’s discretionary annual incentive program, based on performance and subject to the terms of Cognizant’s applicable plans.
Benefits: Cognizant offers the following benefits for this position, subject to applicable eligibility requirements:
- Medical/Dental/Vision/Life Insurance
- Paid holidays plus Paid Time Off
- 401(k) plan and contributions
- Long-term/Short-term Disability
- Paid Parental Leave
- Employee Stock Purchase Plan