Job Description
Requirements:
Have a positive approach and work to enable and support those around you
Work effectively in a team-based agile environment to monitor, log, resolve, and
escalate infrastructure issues
Continually look for opportunities to develop solutions through automation;
participate in teams dedicated to continuous improvement
You have worked as a DevOps or Site Reliability Engineer (or similar position) for
at least 6-8 years
Excellent critical, system-level thinking
5+ years of experience with Amazon Web Services
Knowledge of Cloud and System State automation tools (Chef, Puppet, Ansible,
CloudFormation, Terraform)
In-depth experience with Linux and strong networking comprehension
Experience with scripting languages (Bash, Python, Ruby)
Experience with productivity tools and workflow models such as Jira, Scrum/Kanban, Confluence, Request Tracker, Asana, etc.
Great troubleshooting skills with the ability to diagnose issues quickly on the fly
Current with industry trends and best practices
Time and project management skills; able to prioritize and task switch as needed
Team player and collaborator
Strong communication skills; ability to communicate
in clear, concise, unambiguous terms when documenting and troubleshooting
Experience in software development is a plus
Bonus Skills:
Familiarity with container orchestration services, especially Kubernetes
Prior experience with Docker
Experience administering and deploying development and CI/CD tools such as Git,
Jira, GitLab, Jenkins, GoCD, etc.
Familiarity with ISO 27001, security management protocols, intrusion detection,
SOC 1/2, or SSAE 16
Responsibilities:
Maintain and build Bynder's global scaling infrastructure
Troubleshoot and debug network, system, and application issues using tools such
as New Relic, Sumo Logic, packet capture data, and the Linux shell
Help the Development team in their workflow and streamline releases
Advocate operational best practices to Product, Professional Services,
Development, and Support teams
Drive process and company-wide communication, including post-mortems,
incident management, and project documentation
Drive automation, monitoring, and horizontal scalability of key systems
Address high availability concerns, weak points, performance bottlenecks,
manually configured state, and information security issues
Provide feedback and guidance on architecture proposals from across the
organization
Ensure that proactive and efficient monitoring is in place for all our vital
microservices
Key Skills Required:
- Software Development
- Project Management
- Architecture
- Python
- Networking
- Automation
- Ansible
- Confluence
- CI/CD
- Agile Environment
- Amazon Web Services
- Bash
- Communication
- Development
- Documentation
- Git
- GitLab
- Guidance
- High Availability
- Incident Management
- Information Security
- Infrastructure
- Jenkins
- Jira
- Kanban
- Kubernetes
- Linux
- Management
- Microservices
- Orchestration
- Packet Capture
- Proactive
- Productivity
- Professional Services
- Project Documentation
- Scalability
- Scaling
- Security
- Security Management
- Support Teams
- Team Player
- Terraform
- Troubleshooting
- Workflow