Adfolks LLC has formally joined the Zain Tech family.

Cloud Engineer

Remote
Years of Experience: 4+
Description:
  • Requirements
  • • Bachelor’s degree in Computer Science, related Engineering field, or equivalent experience
  • • 4+ years of experience in public cloud infrastructure, especially Azure and AWS.
  • • Good understanding of cloud infrastructure and the different deployment models
  • • Familiarity with cloud networking and security solutions such as load balancers, firewalls, WAF, CSPM, and security groups
  • • Good understanding of identity and access management solutions such as Active Directory, Azure AD, conditional access, IAM, and other vendor-specific solutions
  • • Good understanding of Linux- and Windows-based systems
  • • Understanding of SQL and NoSQL databases, including IaaS and PaaS models
  • • Experience in policy management, governance, monitoring, and alerting
  • • Knowledge of microservices, DevOps, and IaC (Terraform and Ansible)
  • • Azure AZ-104 or AWS administrator certification would be an advantage
  • • Excellent communication and interpersonal skills
  • Job responsibilities
  • • Assist the application team in deploying various solutions in the cloud environment.
  • • Maintain infrastructure security and governance in line with client requirements and standards.
  • • Support other teams (database, network, security, etc.) in configuring and maintaining their respective solutions.
  • • Actively participate in discussions on new solution implementation, design creation, and other cloud infrastructure topics.
  • • Deploy proofs of concept (POCs), write documentation, and deliver technical presentations.
Requirements:
  • Linux Hosting and Administration
  • • Install, configure, and maintain Linux servers, ensuring optimal performance and security.
  • • Handle Linux-based hosting solutions, including web servers, databases, and other services.
  • • Apply patches and updates to Linux servers as required, and automate routine tasks.
  • • Monitor system performance, troubleshoot issues, and conduct root cause analysis for any server downtime.
  • Kubernetes Operations
  • • Deploy, manage, and maintain containerized applications using Kubernetes.
  • • Create and manage Kubernetes manifests, Helm charts, and operators for complex application architectures.
  • • Scale applications based on resource utilization and requirements.
  • • Monitor the health and performance of Kubernetes clusters and take corrective actions as needed.
  • DevOps Integration
  • • Implement and maintain CI/CD pipelines for automated testing and deployments.
  • • Assist in incorporating containerization and orchestration into the DevOps process.
  • Rancher/OpenShift Expertise (Nice to Have)
  • • Experience in deploying and managing Kubernetes clusters using Rancher or OpenShift.
  • • Implement monitoring, logging, and auto-scaling solutions in Rancher or OpenShift environments.
  • Application Support
  • • Gain a thorough understanding of the applications running within containers to provide first-level application support.
  • • Collaborate with development teams to debug application issues in staging and production environments.
  • Azure Infrastructure
  • • Deploy and manage resources on Azure, including but not limited to VMs, databases, and Kubernetes clusters.
  • • Implement Infrastructure as Code practices using Azure Resource Manager (ARM) templates or Terraform.
  • Monitoring and Alerting Using Open-Source Tools (any one of the following)
  • • ELK Stack
  • o Implement and manage the ELK (Elasticsearch, Logstash, Kibana) stack for real-time log aggregation, monitoring, and analysis.
  • o Customize Kibana dashboards for different system metrics and logs to aid in quick issue resolution.
  • • Grafana
  • o Develop and maintain Grafana dashboards to visualize key performance indicators and system metrics.
  • o Integrate Grafana with other data sources and monitoring tools for comprehensive analytics.
  • • Loki
  • o Set up and manage Loki for aggregating and storing logs.
  • o Integrate Loki with Grafana for unified querying and visualization of metrics and logs.
  • • Prometheus
  • o Deploy and configure Prometheus for monitoring system and application metrics.
  • o Create custom Prometheus queries and alerts to catch anomalies and system performance issues.
  • • Mimir/Cortex (preferable)
  • o Implement Mimir or Cortex for enhanced long-term storage and scalability of Prometheus metrics.