Infrastructure · Networking · Security Engineer

AI-ready systems • CCNP (Enterprise + Automation) • NVIDIA AI Infrastructure

About

I build reliable infrastructure systems that scale under pressure. My focus is network reliability, observability pipelines, and security systems thinking. I work at the intersection of AI-ready networking, automation, and operational resilience, designing systems that fail gracefully and use resources efficiently.

From operating enterprise L2/L3 networks to building AI-aware traffic optimizations and GPU observability correlators, I approach infrastructure as a cohesive system where compute, fabric, and defense layers must work in harmony.

Experience

Independent Infrastructure Consultant (NDA)

Sep 2022 – Present

Remote

Project-based infrastructure and automation work under NDA, covering AI-ready networking, observability, failure analysis, and cost optimization.

  • AI-aware traffic simulations for congestion/latency risk
  • GPU/infrastructure observability correlating compute, network, and workload signals
  • Resilient edge inference under constrained networks
  • Traffic-aware security detection for AI workloads
  • Failure injection testing for faster RCA
  • Cost driver modeling and optimization strategies

Security Network Engineer

Wipro Ltd • Jan 2021 – Aug 2022

Hybrid

  • Operated enterprise L2/L3 networks (BGP, OSPF, SD-WAN, VPNs)
  • Supported hybrid cloud networking (AWS VPCs, IPsec, firewall policy enforcement)
  • Handled on-call escalations and incident response; drove RCA and permanent fixes
  • Automated config/validation workflows (Python, Terraform, PowerShell)
  • Worked within change management and compliance constraints

Projects

AI-Aware Network Traffic Optimization

Fabric

Problem

AI training workloads create unpredictable traffic patterns that lead to congestion and latency spikes.

Build

Traffic simulation engine analyzing GPU communication patterns, predicting congestion points, and optimizing routing decisions.

Outcome

Reduced latency spikes under simulated peak load through predictive traffic shaping experiments.

Stack

Python • Network Simulation • Traffic Analysis • BGP

GPU & Infrastructure Observability Correlator

Compute

Problem

Performance degradation in GPU clusters often stems from network or storage issues, not compute itself.

Build

Multi-layer observability pipeline correlating GPU utilization, network throughput, and workload characteristics.

Outcome

Improved incident pinpointing under load by correlating infrastructure and compute signals.

Stack

Prometheus • Grafana • nvidia-smi • Network Telemetry

Edge Inference Under Constrained Networks

Fabric

Problem

Edge inference nodes face unreliable connectivity and bandwidth constraints while requiring consistent SLAs.

Build

Adaptive inference system with intelligent model caching, request batching, and graceful degradation strategies.

Outcome

Kept the service responsive in high-loss simulations through resilient edge and routing policies.

Stack

ONNX Runtime • Redis • Network QoS • Failover Logic

Traffic-Aware Security Detection for AI Workloads

Defense

Problem

Traditional security tools generate false positives on AI workload traffic patterns (burst transfers, large payloads).

Build

ML-based anomaly detection system trained on legitimate AI traffic patterns with context-aware alerting.

Outcome

Reduced noisy alerts in validation runs while still surfacing real threats in AI traffic patterns.

Stack

Zeek • Suricata • Machine Learning • Flow Analysis

Failure Injection Lab for Faster RCA

Compute

Problem

Teams struggle to identify root causes quickly during outages because they lack knowledge of common failure patterns.

Build

Chaos engineering platform for systematic failure injection with automated symptom cataloging and playbooks.

Outcome

Reduced average incident resolution time from 4 hours to 45 minutes through documented failure patterns.

Stack

Chaos Mesh • Kubernetes • Terraform • Runbooks

Cost Driver Modeling for AI Infrastructure

Compute

Problem

Cloud AI infrastructure costs spiral without visibility into primary cost drivers and optimization opportunities.

Build

Cost attribution system mapping workload characteristics to resource consumption with actionable optimization recommendations.

Outcome

Surfaced cost drivers in modeling exercises, informing GPU sizing and network optimization choices.

Stack

FinOps • Cloud Billing APIs • Data Analysis • Optimization

Education

MSc / Graduate Diploma

Electronic & Computer Technology (IoT) • Dublin • 2022–2025

Capstone: Arcane Guard — AI-Driven Security for IoT Networks

ML/DL-based intrusion detection system with real-time pipeline processing, scalable architecture, and false positive reduction through multi-stage classification.

Bachelor of Science

Network & Technology

Certifications

CCNP Enterprise

In Progress

CCNP ENAUTO

In Progress

NVIDIA AI Infrastructure

In Progress

PNPT

Next

DevSecOps

Next

CCT

Next

Skills

AI Infrastructure

  • GPU Cluster Orchestration
  • NVIDIA Architecture (H100, A100)
  • InfiniBand / RDMA Networking
  • AI Workload Optimization
  • Model Deployment Pipelines
  • Resource Scheduling

Networking

  • BGP, OSPF, SD-WAN
  • Enterprise L2/L3 Networks
  • VPN (IPsec, WireGuard)
  • AWS VPC, Hybrid Cloud
  • Network Observability
  • Traffic Engineering

Automation

  • Python, Bash, PowerShell
  • Terraform, Ansible
  • CI/CD Pipelines
  • Infrastructure as Code
  • Configuration Management
  • Workflow Orchestration

Cloud & Observability

  • AWS, Azure, GCP
  • Kubernetes, Docker
  • Prometheus, Grafana
  • Log Aggregation
  • Distributed Tracing
  • Incident Response

Contact

Interested in infrastructure reliability, AI-ready networking, or systems architecture? Let's connect.