Engineer II, Observability & Reliability
Pilot Flying J is the 10th largest privately held company in North America with more than 28,000 team members. As the industry-leading network of travel centers, we have 750+ retail and fueling locations in 44 states and six Canadian provinces. Our energy and logistics division is a top supplier of fuel, employing one of the largest tanker fleets and providing critical services to oil operations in our nation's busiest basins. Pilot Company supports a growing portfolio of brands with expertise in supply chain and retail operations, logistics and transportation, technology and digital innovation, construction, maintenance, human resources, finance, sales and marketing.
Founded in 1958, we are proud to be family owned and consider our team members to be part of the family. Our founding values, people-first culture, and commitment to giving back remain true to us today. Whether we are serving guests, a fellow team member, or a trucking company, we are dedicated to fueling people and keeping North America moving.
All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability or protected veteran status.
Pilot Flying J is part of the Pilot Company family of brands that keeps North America's drivers moving, including E-Z Trip, Mr. Fuel, One9 Fuel Stop, Pride, StaMart and Xpress Fuel.
Do you enjoy working with existing and new software product development teams? This position instruments end-to-end observability and visibility for business-critical systems through log ingestion, metrics, and traces. You will function as a site reliability engineer (SRE), collaborating with product teams, infrastructure SMEs, DevOps engineers, and the proactive monitoring team to provide tailored dashboards of relevant service-level analytics for product stakeholders.
- Work closely with software product development teams (ITSO, Product Owner, SME) to implement monitoring & observability instrumentation within their platforms.
- Drive adoption of best practices in monitoring, alerting, automation, and site reliability.
- Lead or contribute to engineering efforts from design to implementation, focusing on instrumentation of logs, metrics, and traces.
- Drive use of automation in software instrumentation as well as in response to service degradation events.
- Identify and execute on opportunities to implement instrumentation in pre-production environments.
- Proactively pursue continuous improvement and expansion in observability coverage, service reliability best practices, incident management, and problem management.
- Advanced Splunk experience and technical proficiency required.
- Computer science degree preferred.
- 5+ years of IT-related experience, preferably in a DevOps, system administration, and/or developer role.
- 3+ years of cumulative experience in the following technologies: Splunk/ITSI, AWS CloudWatch, APM (AppDynamics), SolarWinds, Grafana, Prometheus, or similar.
- 2+ years of experience in service-oriented architecture (SOA), microservices, and/or API network design paradigms.
- Working knowledge of software development using modern programming languages such as C#/VB (.NET Core), Python, Go, etc.
- Working knowledge of network protocols/technology, databases, and application servers and their roles in service delivery.
- Experience using cloud-native technologies (Kubernetes, OpenTelemetry, GitHub, etc.) in a production environment.
- Nationwide Medical Plan/Dental/Vision
- Employee Fuel Discount
- 401(k) and Flexible Spending Accounts
- Adoption Assistance
- Tuition Reimbursement
- Weekly Pay
All your information will be kept confidential according to EEO guidelines.