Build the optimization layer for the next generation of AI.
We're a small team solving a hard infrastructure problem. If you've run GPU workloads at scale and know the pain firsthand, we'd love to talk.
We ship to real clusters.
Every line of code runs on production GPU infrastructure. No toy benchmarks, no sandboxed demos — we operate where it matters.
Operators stay in the loop.
We build tools that augment human judgment, not replace it. Every recommendation is reviewable, reversible, and auditable.
Clarity over cleverness.
We explain what our system sees and why it recommends what it recommends. No black boxes — inside or outside the product.
Open roles
Founding Engineer — Platform
We're pre-seed and raising now. This is a founding-team role — equity-heavy, high-ownership, and for someone ready to build before the funding lands. You'll work directly with the founder to ship the core platform: GPU waste scanner, rules engine, and human-in-the-loop remediation workflows. Your code runs on production inference fleets and directly affects $/token costs and capacity decisions for our customers.
Responsibilities
- Design and build the piqc scanner — the Kubernetes-native component that discovers and inspects live inference workloads
- Develop and maintain the rules engine that encodes GPU optimization expertise and surfaces actionable recommendations
- Integrate with Temporal to implement durable, human-in-the-loop remediation workflows
- Build multi-cluster telemetry collection across vLLM, TGI, and other inference servers
- Own the full deployment lifecycle — from local dev to production GPU clusters
Qualifications
- You want to be on a founding team, not employee #40 — you're ready to jump in before the money is in the bank
- You think deeply about inference infrastructure — vLLM, Kubernetes, GPU economics, or fleet operations
- 5+ years of backend or platform engineering experience — you've built infrastructure tooling, not just used it
- Deep familiarity with Kubernetes — controllers, operators, and the scheduling layer
- Experience running or building tooling for GPU workloads (inference, training, or HPC)
- Familiarity with ML serving frameworks — vLLM, TGI, Triton, or similar
- Proficiency in at least one of Go or Python, and comfort working in both
- Strong opinions about observability, reliability, and operational correctness
Solutions Engineer
Work directly with GPU cloud providers and inference platform teams to help them onboard, deploy Paralleliq, and get value from it. You're the bridge between product and customer.
Responsibilities
- Lead technical onboarding for new customers, from cluster access to the first surfaced recommendation
- Diagnose GPU waste patterns in customer environments and translate findings into actionable insights
- Work with the engineering team to close product gaps discovered during customer deployments
- Build repeatable onboarding playbooks and technical documentation
- Run discovery calls and technical demos with prospects at GPU cloud providers and enterprise AI teams
Qualifications
- 3+ years in a solutions engineering, customer engineering, or technical account management role
- Hands-on experience with Kubernetes and cloud infrastructure (AWS, GCP, or Azure)
- Familiarity with AI/ML infrastructure — model serving, GPU utilization, inference optimization
- Ability to read and understand Python and YAML, and to write light scripts for customer environments
- Strong communicator — equally comfortable in a Slack thread and a C-suite demo
- Experience working with early-stage products where the playbook doesn't yet exist
Apply
Don't see your role? Select “General Interest” and tell us what you've built.