Senior DevOps Engineer - ML Engineering Support

Software Engineering | Bengaluru, India | ID: 10559 

Teamwork makes the stream work.

 

Roku is changing how the world watches TV

Roku is the #1 TV streaming platform in the U.S., Canada, and Mexico, and we've set our sights on powering every television in the world. Roku pioneered streaming to the TV. Our mission is to be the TV streaming platform that connects the entire TV ecosystem. We connect consumers to the content they love, enable content publishers to build and monetize large audiences, and provide advertisers unique capabilities to engage consumers.

From your first day at Roku, you'll make a valuable - and valued - contribution. We're a fast-growing public company where no one is a bystander. We offer you the opportunity to delight millions of TV streamers around the world while gaining meaningful experience across a variety of disciplines.

 

About the Role

We are seeking a talented and experienced Senior Software Engineer, DevOps/SRE to join our dynamic team and play a critical role in supporting Machine Learning Engineering activities. The ideal candidate will have a strong background in DevOps practices, cloud infrastructure management, automation, and MLOps tooling, along with team leadership skills.

If you have a proven track record architecting and scaling ML/AI platforms, enjoy solving intriguing system challenges at internet-scale, are innovative at heart, and thrive in building infrastructure that accelerates ML experimentation and deployment — this role might be a great fit for you!


What You’ll Be Doing

  • Provide technical leadership and guidance to DevOps/SRE engineers supporting ML Engineering initiatives; mentor team members in best practices, technologies, and methodologies.
  • Design, implement, and maintain scalable and resilient cloud infrastructure (AWS & GCP) optimized for ML workloads, including GPU/TPU orchestration and distributed training.
  • Partner with ML Engineers to streamline the end-to-end ML lifecycle: data ingestion, feature engineering, training, evaluation, deployment, and monitoring.
  • Build and maintain CI/CD pipelines for ML applications and models using GitHub Actions, GitLab CI/CD, Argo, or Tekton.
  • Integrate with MLOps platforms (e.g., MLflow, Kubeflow, Airflow, SageMaker, Vertex AI) to ensure reproducibility and traceability of experiments.
  • Lead incident response efforts for ML-serving and training infrastructure, minimizing downtime and ensuring high availability.
  • Implement observability practices for ML workloads, including model performance monitoring, drift detection, and metrics via Prometheus, Grafana, and Datadog.
  • Collaborate with security and compliance teams to ensure adherence to data governance, PCI, SOX, and AI/ML data security standards.
  • Optimize system resources for large-scale ML jobs, including auto-scaling GPU clusters, cost optimization, and quota management.
  • Drive continuous improvement across DevOps + MLOps processes; proactively identify areas for enhancement.
  • Maintain clear documentation and foster a culture of knowledge sharing across DevOps, ML, and Data Engineering teams.
  • Participate in a 24x7 on-call rotation, with availability to work with global teams in the event of critical outages.

We’re Excited if You Have

  • 8+ years of experience in DevOps/SRE roles, including at least 2–3 years supporting ML or data-intensive workloads.
  • Strong programming skills in Python or Go; experience building internal tools and automation for ML pipelines.
  • Hands-on experience with Kubernetes, Docker, ECS/EKS/GKE, and service mesh tools such as Istio or Envoy.
  • Familiarity with GPU/accelerator orchestration (NVIDIA GPU Operator, Kubeflow, Slurm, Ray, or similar).
  • Experience with Infrastructure as Code (IaC): Terraform, Helm, Ansible, or CloudFormation.
  • Deep understanding of distributed systems, microservices architecture, and cloud-native design patterns.
  • Exposure to MLOps tools: MLflow, Kubeflow Pipelines, Airflow, Argo, Vertex AI, or SageMaker.
  • Strong proficiency in cloud platforms (AWS and GCP required; Azure a plus).
  • Knowledge of data engineering concepts (object storage like S3/GCS, parquet/ORC, data versioning with DVC or Delta Lake).
  • Experience with networking, security, and compliance (role-based access, VPC design, encryption, auditing).
  • Demonstrated success in cross-functional collaboration with ML, Data, and Product teams.
  • Preferred certifications: Certified Kubernetes Administrator (CKA), AWS Certified DevOps Engineer, Google Professional Cloud DevOps Engineer, NVIDIA Deep Learning Institute courses.
  • AI literacy and curiosity: you have either tried Gen AI in your previous work or outside of work, or are curious about Gen AI and have explored it.
  • BS Degree in Computer Science or equivalent experience.

Benefits

Roku is committed to offering a diverse range of benefits as part of our compensation package to support our employees and their families. Our comprehensive benefits include global access to mental health and financial wellness support and resources. Local benefits include statutory and voluntary benefits which may include healthcare (medical, dental, and vision), life, accident, disability, commuter, and retirement options (401(k)/pension). Our employees can take time off work for vacation and other personal reasons to balance their evolving work and life needs. It's important to note that not every benefit is available in all locations or for every role. For details specific to your location, please consult with your recruiter.

 

The Roku Culture

Roku is a great place for people who want to work in a fast-paced environment where everyone is focused on the company's success rather than their own. We try to surround ourselves with people who are great at their jobs, who are easy to work with, and who keep their egos in check. We appreciate a sense of humor. We believe a small number of very talented people can do more, at lower cost, than a larger number of less talented teams. We're independent thinkers with big ideas who act boldly, move fast, and accomplish extraordinary things through collaboration and trust. In short, at Roku you'll be part of a company that's changing how the world watches TV.

We have a unique culture that we are proud of. We think of ourselves primarily as problem-solvers, which itself is a two-part idea. We come up with the solution, but the solution isn't real until it is built and delivered to the customer. That penchant for action gives us a pragmatic approach to innovation, one that has served us well since 2002. 

To learn more about Roku, our global footprint, and how we've grown, visit https://www.weareroku.com/factsheet.

By providing your information, you acknowledge that you want Roku to contact you about job roles, that you have read Roku's Applicant Privacy Notice, and understand that Roku will use your information as described in that notice. If you do not wish to receive any communications from Roku regarding this role or similar roles in the future, you may unsubscribe here at any time.


Thanks for considering a role at Roku. Take a moment to complete the form below. We ask that you remove any photos from your resume or CV before submitting your application. 

Additionally, providing false, misleading, or inaccurate information or responses will void this application and disqualify you from consideration. If employed by Roku, it will result in the immediate termination of employment regardless of when Roku discovers misleading or inaccurate information.

 
