CIT Group Inc.

  • VP, DevOps Cloud Site Reliability Engineer SRE

    Location US-FL-Jacksonville
    Job ID
    31750
    # Positions
    1
    Job Family
    Information Technology - IT Architecture
    Type
    Full-Time
  • Overview

    Founded in 1908, CIT (NYSE: CIT) is a leading national bank empowering businesses and personal savers with the financial agility to navigate their goals. We believe in helping customers turn their ideas into outcomes. Whether those customers are building a business or building their savings, CIT has the experience and agility to empower them to achieve their goals. At CIT, how we do business is just as important as what we do. Our social responsibility programs focus on driving financial and personal empowerment, supporting the environment and advancing wellness. CIT contributes to communities where we live, work and do business through charitable donations, community investments and employee volunteerism.

    Responsibilities

    We're building a world-class SRE team to deploy, manage, and support the cloud infrastructure we need to scale our internally critical and externally-visible systems. We are looking for a talented engineer with Site Reliability Engineering (SRE) or DevOps Engineering experience to combine software and systems engineering to design, develop, and deliver large-scale, massively distributed, fault-tolerant systems while leveraging proven DevOps methodologies and best practices. This Engineer will use a “Learn-Do-Teach” approach to help themselves and the team continuously improve and share in order to add value to our customers.

    • Ensure our services meet stability, performance and availability requirements
    • Develop and maintain performant, scalable, and maintainable software solutions and tools for internal use
    • Perform proactive troubleshooting and performance analysis of internal services and cloud environments
    • Build robust, self-healing features and automation that reduce operational effort and improve service uptime
    • Deploy and maintain production cloud environments that require 24/7 availability
    • Participate in programs to deploy pre-GA products and code in production and provide direct feedback to product development teams
    • Interact with other product and development teams, gather requirements, and perform analysis to determine appropriate solutions
    • Actively engage in design reviews, code reviews, and operational reviews
    • Engage in and improve the whole lifecycle of services from inception and design, through deployment, operation and refinement
    • Provide support services before “go live” through activities such as system design consulting, developing software platforms and frameworks, capacity planning and launch reviews
    • Maintain services once they are live by providing metrics, measuring and monitoring availability, latency and overall system health
    • Scale systems sustainably through mechanisms like automation and evolve systems by pushing for changes that improve reliability and velocity
    • Practice sustainable incident response and blameless postmortems
    Qualifications:
    • 2+ years SRE, DevOps or equivalent experience engineering and automating a modern cloud environment at scale, such as AWS, GCP, or Azure
    • Bachelor’s or master’s degree preferably in Computer Science, Informatics, Mathematics, or equivalent work experience
    • Minimum 7 years of experience with scripting and automation
    • Strong experience with DevOps pipeline tools – Azure DevOps, Octopus Deploy, TeamCity, Jenkins
    • Hands-on experience with IaaS (VMware vSphere), PaaS (Azure App Service, Azure SQL, Redis), and SaaS (Salesforce)
    • Experience with containers and container orchestrators - Docker, Kubernetes, Mesos
    • Significant experience with Desired State Configuration management tools (Puppet, Azure DSC, Ansible)
    • Experience with build, test, and dependency automation (Maven, npm, NuGet, Bower, Grunt, Gulp, Selenium, VSUnit, Pester, Postman, JMeter, etc.)
    • Familiarity with Cybersecurity Standards (NIST SP 800-53 and STIGs)
    • Experience with modern web technologies such as HTML5, CSS, XML, REST, JavaScript, ReactJS, and Node
    • Experience with different programming languages and frameworks (C#, PowerShell, Python, Bash, etc.)
    • Proficient in Linux/Unix operating systems
    • Very good understanding of network basics - routing & switching, TCP/IP concepts, common protocols like DHCP and DNS, network analysis with tools like Wireshark and Fiddler
    • Knowledge of file systems, quotas, and data storage protocols including NFS, CIFS, FC, and iSCSI
    • Systematic problem-solving approach
    Strong pluses:
    • Experience designing and building custom tools and automation, such as CLIs and internal admin web apps
    • Experience designing for observability, utilizing modern monitoring, logging, and metrics tools
    • Experience with NSX, vSAN, vRealize Suite, and vCloud Director
    • Experience with relational databases (PostgreSQL, MySQL), NoSQL, or time-series databases
    • Experience with message brokers, stream processing, data transformation, and analytics
    • Knowledge and understanding of enterprise IT processes and ITIL
    • VMware Certified Professional (VCP) or higher (VCAP, VCDX)
    • Microsoft Certified: Azure DevOps Engineer Expert
    • Advanced degree or relevant technology certificates (VMware, Cisco, Juniper, Linux and EMC)
    • Experience with Cisco Nexus switches

     
