What Does A Site Reliability Engineer Do (including Their Typical Day at Work)

Alyssa Omandac

Salary, Job Description, How To Become One, and Quiz

Site Reliability Engineers

Site Reliability Engineers develop and manage software systems, typically those that keep internal or public websites running. They also bridge the gap between software development and IT operations.

Salary: $117,000
Education: Bachelor's degree
Personality: The Thinker

Site Reliability Engineers are Software Engineers who specialize in site reliability engineering (SRE). SRE is an approach to developing and managing software systems so that they are reliable and scalable. SRE began at Google in 2003, when the company needed Software Engineers to make its large-scale sites more reliable and efficient. The result was so successful that other big tech companies soon adopted the same practices.

Site Reliability Engineers are now used to develop more efficient software systems for businesses in almost every industry. They may develop better systems for communication, manufacturing, healthcare, and many other areas of business.

What they do

Develop Efficient, Scalable Software Systems

The main duty of a Site Reliability Engineer is to develop efficient, scalable software systems. They work on projects to help ensure that IT products are easy to use and highly adaptable. These systems should require minimal manual input and few operational procedures.

The software systems that Site Reliability Engineers typically work on include websites. These may be local sites that are part of an organization’s intranet and intended only for internal use, or public-facing websites and online services. Along with sites, Site Reliability Engineers help develop more efficient cloud computing solutions, IT infrastructure, and software as a service (SaaS) solutions.

Automate Tasks to Streamline Software Systems

Site Reliability Engineers often develop automated processes to increase the efficiency of software systems. They add automated scripts to existing software to reduce the need for human involvement. The most efficient systems do not require any manual input.
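As a rough illustration of the kind of automated script described above, the following Python sketch checks a list of services and restarts any that report as unhealthy, so that no one has to do it by hand. The function name `restart_unhealthy`, the service names, and the health-check and restart callbacks are all hypothetical and stand in for whatever tooling an organization actually uses.

```python
from typing import Callable, Iterable, List


def restart_unhealthy(
    services: Iterable[str],
    is_healthy: Callable[[str], bool],
    restart: Callable[[str], None],
) -> List[str]:
    """Restart every service that fails its health check.

    Returns the names of the services that were restarted, in order.
    """
    restarted = []
    for name in services:
        if not is_healthy(name):
            restart(name)  # automated fix instead of a manual step
            restarted.append(name)
    return restarted


if __name__ == "__main__":
    # Hypothetical in-memory state used only to demonstrate the function.
    state = {"web": True, "db": False}

    def fake_restart(name: str) -> None:
        state[name] = True

    print(restart_unhealthy(state.keys(), lambda n: state[n], fake_restart))
```

In practice the two callbacks would wrap real health checks and restart commands; keeping them as parameters makes the script easy to test without touching a live system.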

Perform On-Call Monitoring of Software Systems

Site Reliability Engineers may deal with on-call monitoring and business operations. They address any technical issues that arise and walk end-users through troubleshooting scenarios. These tasks are also completed by other IT professionals. However, Site Reliability Engineers engage in on-call duties to learn more about the performance of the software that they work on.

Some organizations limit the amount of time devoted to operations. Site Reliability Engineers spending a significant amount of time on operations and monitoring indicates a lack of efficiency and scalability. When the software solutions perform as expected, Site Reliability Engineers should not need to focus on operations, as their main duties include improving and designing more efficient systems.
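One way to make such a limit concrete is to track the share of working hours spent on operations and compare it with a cap. The Python sketch below assumes a cap of 50% purely for illustration; the figure, the function names, and the parameters are assumptions rather than values taken from this article.

```python
def operations_share(ops_hours: float, total_hours: float) -> float:
    """Return the fraction of total working hours spent on operations."""
    if total_hours <= 0:
        raise ValueError("total_hours must be positive")
    if ops_hours < 0 or ops_hours > total_hours:
        raise ValueError("ops_hours must be between 0 and total_hours")
    return ops_hours / total_hours


def exceeds_cap(ops_hours: float, total_hours: float, cap: float = 0.5) -> bool:
    """Return True when operations work takes more than the allowed share.

    The default cap of 0.5 is an illustrative assumption.
    """
    return operations_share(ops_hours, total_hours) > cap


if __name__ == "__main__":
    # 30 of 40 hours on operations is 75%, which is above the assumed cap.
    print(exceeds_cap(30, 40))
```

A result above the cap would suggest that the systems need more automation or redesign before the team takes on additional operational work.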

Collaborate with Developers to Design New Solutions

When a project requires the development of new software, Site Reliability Engineers typically collaborate with other IT professionals, including Software Developers and Software Engineers. They may also work with the DevOps team, as SRE is often used as part of a DevOps model. Depending on the size of the company, Site Reliability Engineers may work with Release Engineers after the software is developed. They work to ensure that the software meets expectations.

Create Software to Support Other Departments

Site Reliability Engineers may work on the development of software systems intended to support other departments and teams within the organization, such as the IT operations team or the IT support teams. This includes the creation of automated scripts and monitoring software.

Complete Post-Incident Reviews

After an incident, such as an outage, Site Reliability Engineers complete a post-incident review. This allows them to analyze what went wrong and whether their solutions were effective at reaching specific business objectives. They may also compile reviews from other members of the IT department. The post-incident reviews provide Site Reliability Engineers with the information needed to optimize the reliability and efficiency of the software.

What is the job like

Pros

You Can Help Make Life Easier for Others

The work completed by Site Reliability Engineers simplifies the responsibilities of other IT professionals who need to deal with monitoring and on-call duties.

You Get to Improve Existing Software

Many Site Reliability Engineers get significant job satisfaction from improving existing software and seeing the results of their work.

You Can Transition to Other Careers

If you tire of working in the SRE field, you can use your skills to find work in other areas of IT. For example, you may become a Software Developer at a company that does not use SRE practices.

You Are Constantly Learning New Things

Working as a Site Reliability Engineer is intellectually stimulating due to constant exposure to new technologies, software, and other IT professionals.

Cons

No One Understands Your Job

You may struggle to explain what you do in a way that others can understand when making small talk.

Your Solutions May Not Always Be Applied Properly

Site Reliability Engineers may take the blame when a solution fails to produce satisfactory results, even when the failure is due to the solution being misapplied.

Where they work

Information Technology
Healthcare
Manufacturing
Government Agencies


The information technology (IT) industry is the largest employer of Site Reliability Engineers, with many working for companies such as Apple, Google, and Oracle. Site Reliability Engineers are also needed in the healthcare industry to improve websites needed for managing patient records or communicating with other healthcare facilities. Large manufacturers may employ Site Reliability Engineers to improve the websites needed for sharing data between manufacturing plants and vendors. Government agencies employ Site Reliability Engineers to improve the efficiency of sites needed for public services.

How to become one

Step 1: Explore Programming in High School

Aspiring Site Reliability Engineers should begin working on their coding and programming skills in high school. Take computer classes and programming classes at school or online.

Step 2: Earn a Bachelor’s Degree

Site Reliability Engineers need at least a Bachelor’s degree. Most employers prefer to hire candidates with degrees in Computer Science, Software Development, or related fields of study.

Step 3: Look for Entry-Level Work

Many Site Reliability Engineers start as Software Developers, Software Engineers, or System Administrators. You will likely need several years of experience before qualifying for an SRE position.

Step 4: Gain Voluntary Certifications

Certifications are voluntary, but some employers may expect candidates to hold certifications in specific technologies. One of the first certifications that SREs obtain is the SRE Foundation (SREF) certification, which is offered through a variety of learning institutions.

Step 5: Apply for SRE Positions

After gaining work experience and several certifications, start applying for Site Reliability Engineer positions.

Should you become one

Best personality type for this career

The Thinker

People with this personality like to work with ideas that require an extensive amount of thinking. They prefer work that requires them to solve problems mentally.

Site Reliability Engineers are often detail-oriented, as they need to pay attention to minute details to tweak the efficiency of the software. Working in this field also requires curiosity, which helps Site Reliability Engineers look for solutions in unexpected places.

Patience is useful for Site Reliability Engineers when monitoring software or walking people through troubleshooting steps. Site Reliability Engineers also need good verbal and written communication skills to collaborate with other IT professionals.
