Key Responsibilities:
· Configure log-based metrics such as error rate, latency, and memory usage, along with retention policies and export settings.
· Monitor health metrics including latency, error rate, CPU/memory utilization, and API response times.
· Track user traffic patterns and analyze system usage.
· Create unified dashboards in Looker Studio or Grafana to present consolidated views of multiple services.
· Integrate billing data, usage metrics, and error dashboards for holistic monitoring.
· Define Service Level Indicators (SLIs), Service Level Objectives (SLOs), and Service Level Agreements (SLAs).
· Provide guidance and decision trees for choosing the appropriate GCP datastore (Cloud SQL, Cloud Spanner, Firestore, Bigtable, BigQuery) based on latency, consistency, schema flexibility, global scalability, and cost.
· Advise on selecting between Kubernetes and Platform-as-a-Service (GKE, Cloud Run, App Engine, Cloud Functions) considering application complexity, control requirements, scalability, and team skill set.

Physical, mental, sensory or environmental demands may be referenced in an attempt to communicate the manner in which this position traditionally is performed.
This position requires hands-on expertise in Google Cloud Platform (GCP), a strong DevOps foundation, and close collaboration with cross-functional teams.
Required Skills:
· Senior candidate with 10+ years of relevant experience.
· Experience developing architecture patterns for: web application frontends, RESTful APIs, backend services using Cloud SQL or Cloud Spanner, stateless app deployments utilizing GKE, Cloud Run, or App Engine, CI/CD integration points, caching layers such as Redis, and queue-based microservices with Pub/Sub or Cloud Tasks.
· Set up comprehensive monitoring and logging solutions for Merchandising Apps, leveraging Cloud Monitoring, Logging, Error Reporting, and Trace.
· Implement centralized dashboards using Cloud Monitoring Workspace or Grafana with a BigQuery backend.
· Create log sinks to BigQuery or Pub/Sub for data processing and retention.
· Develop reusable Terraform modules for: compute resources (GCE, Cloud Run, GKE), IAM policies, Projects, Service Accounts, Cloud SQL and Spanner databases, VPCs, Subnets, Load Balancers, and Firewalls.
· Maintain GitOps workflows for Terraform.
· Design and deploy CI/CD pipelines using Cloud Build and GitHub Actions.
· Design and implement VPC and Shared VPC architectures.
· Configure and manage Load Balancers (Global, Regional, HTTPS, Internal).
· Set up VPC Service Controls to protect against data exfiltration.
It is not typical for candidates to be hired at or near the top of the posted compensation range. In addition to base salary, this role may be eligible for additional compensation such as variable incentives, bonuses, or commissions, depending on the position and applicable laws.
In the U.S. and Canada, available benefits are determined by local policy and eligibility and may include:
· Paid time off based on employee grade (A-F), defined by policy: vacation (12-25 days, depending on grade), company-paid holidays, personal days, and sick leave
· Medical, dental, and vision coverage (or provincial healthcare coordination in Canada)
· Retirement savings plans (e.g., 401(k) in the U.S., RRSP in Canada)
· Life and disability insurance
Important Notice: Compensation (including bonuses, commissions, or other forms of incentive pay) is not considered earned, vested, or payable until it becomes due under the terms of applicable plans or agreements and is subject to Capgemini’s discretion, consistent with applicable laws.
$86,900 - $192,460