AI Inference Engineer Intern - Model Pruning

Quadric · Software Engineering
📍 Burlingame, California, United States · Temporary

About this role

Quadric has created an innovative general-purpose neural processing unit (GPNPU) architecture. Quadric's co-optimized software and hardware is targeted to run neural network (NN) inference workloads in a wide variety of edge and endpoint devices, ranging from battery-operated smart-sensor systems to high-performance automotive or autonomous vehicle systems. Unlike other NPUs or neural network accelerators in the industry today, which can accelerate only a portion of a machine learning graph, the Quadric GPNPU executes both NN graph code and conventional C++ DSP and control code.

Note: Our preference is for this internship to be based out of our Burlingame, California office. Candidates should be based in the Bay Area or able to relocate for the internship period and available to work on site.

Responsibilities:
Model pruning: Prune models to speed up inference, with retraining to maintain accuracy.

Requirements:
  • MS student in CS or a related field.
  • Proficiency in Python.
  • Experience with model pruning and training in PyTorch.
  • Experience with quantization and vision model accuracy metrics.

At Quadric, we value Integrity, Humility, and Happiness. What we expect from one another is simple and clear: Initiative, Collaboration, and Completion. We are a collaborative team focused on building something extraordinary in the edge computing space. 

The hourly rate for this temporary internship position is $45.00/hour to $60.00/hour. The actual rate offered will depend on a number of factors, including the specific level of the role, years and depth of relevant experience and education, technical skills and competencies, and work location. 

Quadric interns receive hands-on experience working alongside industry experts in AI and semiconductor technology, with access to mentorship and meaningful project ownership from day one.

Founded in 2016 and based in downtown Burlingame, California, Quadric is building the world’s first supercomputer designed for the real-time needs of edge devices. Quadric aims to empower developers in every industry with superpowers to create tomorrow’s technology, today. The company was co-founded by technologists from MIT and Carnegie Mellon, who were previously the technical co-founders of the Bitcoin computing company 21.

Quadric is proud to be an equal opportunity employer. We are committed to creating an inclusive environment where people from all backgrounds can do their best work. We consider all qualified applicants without regard to race, color, religion, sex, gender identity or expression, sexual orientation, national origin, age, disability, veteran status, or any other protected characteristic under applicable law.

If this role resonates with you, we encourage you to apply even if your experience does not perfectly match every qualification. We value potential, curiosity, and a willingness to learn just as much as direct experience. Skills and growth come in many forms, and we would love to hear your story.

By submitting an application, you acknowledge that Quadric will collect and process your personal information as part of the hiring process. Please review our Privacy Policy to understand how we handle your data.

Frequently Asked Questions

Is the salary disclosed for the AI Inference Engineer Intern - Model Pruning position at Quadric?
Yes. The listed hourly rate for this temporary internship is $45.00 to $60.00 per hour. The actual rate offered depends on factors such as the specific level of the role, relevant experience and education, technical skills, and work location.
Where is the AI Inference Engineer Intern - Model Pruning position at Quadric located?
This AI Inference Engineer Intern - Model Pruning role at Quadric is based in Burlingame, California, United States. The posting states a preference for on-site work at the Burlingame office; candidates should be based in the Bay Area or able to relocate for the internship period.
Is the AI Inference Engineer Intern - Model Pruning role at Quadric full-time or part-time?
This is listed as a Temporary position. It is posted as an AI Inference Engineer Intern - Model Pruning role in the Software Engineering department at Quadric.
Which team or department does the AI Inference Engineer Intern - Model Pruning at Quadric belong to?
This AI Inference Engineer Intern - Model Pruning position is part of the Software Engineering department at Quadric. See the full job description for more information about the team structure and responsibilities.
How do I apply for the AI Inference Engineer Intern - Model Pruning position at Quadric?
Apply through Quadric's official application portal, hosted on Workable, where you can submit your application directly.
When was the AI Inference Engineer Intern - Model Pruning job at Quadric posted?
This AI Inference Engineer Intern - Model Pruning position at Quadric was posted on May 22, 2026. Apply as soon as possible, since early applications are often reviewed first.