Member of Technical Staff, Performance Modeling

netpreme · AI Systems
Apply Now ↗
📍 Santa Clara, CA or Boston, MA · Full-time

About the Role

We are seeking a Member of Technical Staff, Performance Modeling to develop performance models for end-to-end ML systems based on our silicon products. The role focuses on building rigorous, decision-useful models that connect architectural choices to real end-to-end workload behavior.

You’ll work closely with silicon architects and workload teams to explore design tradeoffs, validate performance assumptions, and identify bottlenecks early in the development cycle. This role is well-suited for engineers who enjoy reasoning from first principles, working with incomplete information, and co-exploring the design space as hardware and software evolve together.

This role will be performed onsite from one of our offices in Santa Clara, CA or Boston, MA.

Essential Duties & Responsibilities

  • Build and maintain system-level performance models for rack-scale ML infrastructure built on our custom silicon components.

  • Evaluate the end-to-end value proposition on ML workloads, translating model outputs into actionable guidance for the silicon team.

  • Work day-to-day with silicon architects, system designers, and workload owners to align performance expectations and constraints.

  • Identify performance bottlenecks, scaling limits, and sensitivity points across compute, memory, and interconnects in end-to-end workload settings.

  • Clearly communicate modeling assumptions, limitations, and conclusions to both technical and non-specialist stakeholders.

Qualifications

  • Bachelor’s or Master’s degree in Electrical Engineering, Computer Engineering, or a closely related field.

  • Deep understanding of ML systems: workload sharding, KV caching hierarchies, attention optimizations, and the trade-offs and underlying assumptions involved in deploying ML models at scale.

  • Ability to quickly learn new ML architectures as they are released and build performance models for them.

  • 5–10+ years of experience in performance modeling for computer architectures, accelerators, or high-performance networking systems.

  • Ability to reason across multiple abstraction layers, from architectural details to system-level performance behavior.

Preferred Qualifications

  • PhD in Computer Science, Electrical Engineering, or a related field.

  • Prior experience modeling performance for ML accelerators and/or ML systems broadly.

  • Familiarity with shared memory systems and frameworks (e.g. CUDA VMM).

  • Experience with scale-up and high-bandwidth interconnects (e.g. NVLink or similar technologies).

Compensation & Benefits

  • Competitive compensation commensurate with experience, including base salary, performance-based bonus, and early-stage equity grant

  • Comprehensive benefits including health, dental, vision, and life insurance

  • Well-equipped, sunny offices in Santa Clara, CA and Boston, MA

  • Relocation assistance and visa sponsorship

  • Perks include a daily lunch stipend, 401(k) match, and more

  • A collaborative, continuous-learning work environment with smart, dedicated colleagues engaged in developing the next generation of architecture for high-performance computing

The Opportunity

  • Impact: We are tackling a fundamental challenge at the infrastructure layer: unlocking greater AI capability while dramatically improving efficiency. The work we do here compounds across state-of-the-art AI models, systems, and real-world applications.

  • Timing: Joining now means real ownership of the company and meaningful influence over product direction and execution. You’ll work from first principles, move quickly from insight to execution, and see your contributions directly reflected in what we build.

  • Culture: You’ll work alongside a group of people who care deeply about rigor, clarity, and impact. We value thoughtful disagreement, fast learning, and intellectual fearlessness. This is a place where strong ideas shine, curiosity is encouraged, and growth is a daily practice.

Frequently Asked Questions

Is the salary disclosed for the Member of Technical Staff, Performance Modeling position at netpreme?
The salary for this Member of Technical Staff, Performance Modeling role at netpreme is not publicly listed. Click "Apply Now" to learn more about the compensation package on their official careers page.
Where is the Member of Technical Staff, Performance Modeling position at netpreme located?
This Member of Technical Staff, Performance Modeling role at netpreme is based in Santa Clara, CA or Boston, MA. The role is performed on-site from one of these offices.
Is the Member of Technical Staff, Performance Modeling role at netpreme full-time or part-time?
This is listed as a full-time position. It is posted as a Member of Technical Staff, Performance Modeling role in the AI Systems department at netpreme.
Which team or department does the Member of Technical Staff, Performance Modeling at netpreme belong to?
This Member of Technical Staff, Performance Modeling position is part of the AI Systems department at netpreme. See the full job description for more information about the team structure and responsibilities.
How do I apply for the Member of Technical Staff, Performance Modeling position at netpreme?
Click the "Apply Now" button on this page. You will be redirected to netpreme's official application portal, hosted on Ashby, where you can submit your application directly.
When was the Member of Technical Staff, Performance Modeling job at netpreme posted?
This Member of Technical Staff, Performance Modeling position at netpreme was posted on Feb 6, 2026. Apply as soon as possible — early applications are often reviewed first.