Job Description
-Monitor applications and infrastructure to ensure optimal performance and reliability.
-Implement effective log management practices to track and control application behavior.
-Document troubleshooting steps, runbooks, and procedures in an internal knowledge base to promote knowledge sharing within the team.
-Participate in an on-call rotation, addressing issues and resolving incidents as they arise.
-Advocate for and implement best practices in cloud and application security.
*Must-Have:
-Experience with Linux and networking.
-Familiarity with at least one high-level programming language (e.g., Python or Golang).
-Familiarity with monitoring and observability tools, such as Grafana and Prometheus.
-Experience with CI/CD processes and maintaining related pipelines.
-Strong understanding of source control systems (e.g., Git) and branching strategies.
-Familiarity with relational databases, such as PostgreSQL.
-Knowledge of code repositories and deployment tools.
-A flexible, solutions-oriented approach to problem-solving.
*Nice-to-Have:
-Familiarity with Kubernetes
-Familiarity with Docker, Docker Compose
-Familiarity with ELK stack
-Familiarity with Helm charts and ArgoCD