Performance Optimization Is Rarely the Hard Part.
Making It Scale Across Variants Usually Is.

February 25, 2026

The AURIX™ TC4x PPU enables vector-accelerated computation for data-intensive workloads.
In practice, however, scaling PPU-optimized software across different data sizes, vector widths, and verification environments takes more than manual optimization.

In a joint project with TASKING, emmtrix worked on an automated, tool-assisted workflow for generating and verifying PPU-optimized functions.

The focus was on making PPU optimization scalable and predictable:

  • varying data dimensions
  • 256-bit and 512-bit PPU configurations
  • numerical accuracy
  • verification in simulation and on hardware, including timing and coverage
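
To make the idea of a variant sweep concrete, here is a minimal sketch in plain Python. It is not the emmtrix or TASKING workflow and uses no PPU intrinsics: `dot_lanes` is a hypothetical stand-in that only emulates the lane-wise accumulation of a vector kernel, the 32-bit element size is an assumption, and the tolerance is an illustrative value. The sketch checks each combination of vector width and data size against a scalar reference, including sizes that are not a multiple of the lane count.

```python
import math
import random


def dot_reference(a, b):
    """Scalar reference: accurately rounded sum of products."""
    return math.fsum(x * y for x, y in zip(a, b))


def dot_lanes(a, b, lanes):
    """Emulate a vector kernel with `lanes` partial sums and a scalar tail.

    Hypothetical stand-in for a PPU-optimized function (assumption, not real PPU code).
    """
    acc = [0.0] * lanes
    full = len(a) - len(a) % lanes  # number of elements covered by full vectors
    for i in range(0, full, lanes):
        for k in range(lanes):
            acc[k] += a[i + k] * b[i + k]
    total = sum(acc)  # horizontal reduction of the lane accumulators
    for i in range(full, len(a)):  # leftover elements that do not fill a vector
        total += a[i] * b[i]
    return total


def sweep(sizes, vector_bits, elem_bits=32, abs_tol=1e-9, seed=0):
    """Check every (vector width, data size) variant against the reference."""
    rng = random.Random(seed)  # fixed seed keeps the sweep repeatable
    results = {}
    for bits in vector_bits:
        lanes = bits // elem_bits  # e.g. 256 bits -> 8 lanes for 32-bit elements
        for n in sizes:
            a = [rng.uniform(-1.0, 1.0) for _ in range(n)]
            b = [rng.uniform(-1.0, 1.0) for _ in range(n)]
            error = abs(dot_lanes(a, b, lanes) - dot_reference(a, b))
            results[(bits, n)] = error <= abs_tol
    return results


if __name__ == "__main__":
    for variant, ok in sweep([1, 7, 8, 17, 100], [256, 512]).items():
        print(variant, "PASS" if ok else "FAIL")
```

In a real setup the emulated kernel would be replaced by the generated function running in simulation or on hardware, and timing and coverage would be recorded alongside the accuracy check.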


The key takeaway:
Scalable PPU optimization is not only about writing efficient code.
It is about establishing workflows that make performance repeatable across variants.
