Site Reliability Engineers have a deep understanding of how applications function and ensure that those applications run smoothly throughout their lifetime. They work closely with the business, the application development team, and the support teams on automation, take ownership of problems, determine the solutions, and follow through to resolution.
Each team member is directly aligned to and embedded with specific businesses, and works closely with a global team located in SG/HK/LD/NY.
Responsibilities:
Implement best practices to improve reliability, performance, and the software development process, including the buildout of testing environments and the current and future risk architecture.
Deliver engineering solutions that ensure the reliability of the platform, including mandating a BCP strategy and following sound DevOps practices.
Ensure the reliability, availability, and performance of production applications
Troubleshoot issues and discrepancies with the appropriate individuals, teams, or vendors
Assist with major incidents from identification and troubleshooting through service restoration
Qualifications:
Fully proficient in at least one modern structured programming language (Python preferred; Java/C++/C#/Scala beneficial)
Experience troubleshooting ambiguous problems and performing root-cause analysis
Comfortable with a range of current software development tools and practices (testing, source control, build systems, CI/CD, etc.)
SQL and database experience is a strong plus.
Track record of supporting financial systems (ideally trading and/or risk systems) is a plus.
Excellent written and verbal communication skills
Great learning potential with strong academic performance
Education:
Bachelor's Degree in Computer Engineering, Computer Science, or equivalent experience