Site Reliability Engineer (SRE)

at JP CALIBER SDN BHD

Full time

**Job Title: Site Reliability Engineer (SRE)**

**Overview:**
As a Site Reliability Engineer (SRE), you will be responsible for ensuring the reliability, scalability, and performance of critical services. Your role bridges the gap between development and operations by implementing robust system architecture, automation, and proactive monitoring. You will focus on key SRE practices, including Service Level Objectives (SLOs), Service Level Indicators (SLIs), and reducing operational toil. Collaboration with cross-functional teams will be essential in fostering a culture of continuous improvement and accountability.
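For candidates less familiar with the SLO and SLI terminology, the following is a minimal sketch of how an availability SLI might be measured against an SLO and turned into an error budget. The function names, the request counts, and the 99.9% target are illustrative assumptions and do not describe the actual tooling or targets used in this role.

```python
def availability_sli(good_events: int, total_events: int) -> float:
    """Return the fraction of events that were served successfully (the SLI)."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    if not 0 <= good_events <= total_events:
        raise ValueError("good_events must be between 0 and total_events")
    return good_events / total_events


def error_budget_remaining(good_events: int, total_events: int, slo_target: float) -> float:
    """Return the share of the error budget still unspent for the window.

    1.0 means no budget has been used, 0.0 means it is exhausted,
    and a negative value means the SLO has been breached.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    sli = availability_sli(good_events, total_events)  # also validates the counts
    allowed_failure_ratio = 1.0 - slo_target           # e.g. 0.001 for a 99.9% SLO
    actual_failure_ratio = 1.0 - sli
    return 1.0 - actual_failure_ratio / allowed_failure_ratio


if __name__ == "__main__":
    # Hypothetical window: 100,000 requests, 50 of which failed, against a 99.9% SLO.
    good, total, target = 99_950, 100_000, 0.999
    print(f"SLI: {availability_sli(good, total):.5f}")
    print(f"Error budget remaining: {error_budget_remaining(good, total, target):.1%}")
```

In this sketch, a remaining budget near zero would typically be a signal to prioritize reliability work over new releases; how a given team acts on that signal is a policy decision rather than something the code determines.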

**Key Responsibilities:**

Design and implement resilient system architectures to support high availability and scalability.
Develop automation tools and scripts to enhance operational efficiency and minimize manual effort.
Define, monitor, and analyze SLOs and SLIs to maintain system reliability and performance.
Conduct in-depth post-mortem analyses to identify root causes and implement long-term solutions.
Collaborate with development and operations teams to establish best practices in reliability and incident management.
Troubleshoot and resolve issues related to database performance, network connectivity, and deployment failures across Kubernetes and virtual machine environments.
Ensure adherence to Service Level Agreements (SLAs) by maintaining high service delivery standards.
Identify and address performance bottlenecks, providing actionable recommendations for system enhancements.
Maintain comprehensive documentation of processes, incident responses, and operational workflows.

**Qualifications:**

Proficiency in programming languages such as Python, Golang, or Java, with a focus on operational efficiency.
Strong experience in system architecture and design, emphasizing reliability and scalability.
In-depth understanding of SRE principles, including SLOs, SLIs, toil reduction, and incident post-mortems.
Hands-on experience with cloud environments such as AWS, Azure, or Google Cloud.
Expertise in Linux system administration and troubleshooting application support issues.
Familiarity with networking concepts and effective troubleshooting techniques.
Excellent problem\-solving skills and a proactive approach to operational challenges.
Ability to work independently while effectively collaborating within a team environment.

**Preferred Skills:**

Experience with monitoring tools and performance optimization techniques.
Strong scripting and automation capabilities for system administration tasks.
Hands-on knowledge of cloud platform services (AWS, Azure, Google Cloud).
Familiarity with DevOps methodologies, including CI/CD, infrastructure as code, and containerization.

Job Type: Contract

Pay: RM10,000.00 - RM14,000.00 per month

Benefits:

Health insurance
Opportunities for promotion
Professional development

Schedule:

Afternoon shift
Rotational shift

Supplemental Pay:

13th month salary
Performance bonus
Yearly bonus

Work Location: In person

Job Posted:

1 year ago

Job Expire:

1mo 1d
