How to use explicit SIMD in matrix multiplication?

I have a project that does extensive matrix and vector multiplication. I have implemented it in C++, where the matrix multiplication is carried out using AVX-512. I am now trying to port it to Rust and found nalgebra, along with the blog post about explicit SIMD with nalgebra ("SIMD Array-of-Structures-of-Arrays in nalgebra and comparison with ultraviolet", The rustsim organization).

I haven’t found any actual code examples of explicit SIMD matrix multiplication using nalgebra, so I tried to put one together. Below is the code that I have.

use nalgebra as na;
use na::{Matrix3x1};
use packed_simd::{f32x4};

fn main() {
    let m1: Matrix3x1<f32x4> = na::matrix![
                                        f32x4::new(1.0, 2.0, 3.0, 4.0);
                                        f32x4::new(5.0, 6.0, 7.0, 8.0);
                                        f32x4::new(9.0, 10.0, 11.0, 12.0);
                                   ];
    let mut m2: Matrix3x1<f32x4> = na::matrix![
                                        f32x4::new(1.0, 2.0, 3.0, 4.0);
                                        f32x4::new(5.0, 6.0, 7.0, 8.0);
                                        f32x4::new(9.0, 10.0, 11.0, 12.0);
                                    ];
    let m3 = m1+m2;
    println!("m3 is: {:?}", m3);

    let m4 = m2.transpose();
    println!("m2 shape is: {:?}", m2.shape());
    println!("m4 shape is: {:?}", m4.shape());
    println!("m1 shape is: {:?}", m1.shape());
    let m5 = m4*m1;
    println!("m5 is: {:?}", m5);
}

But I get the following compiler error:

  --> src/main.rs:79:17
   |
79 |     let m5 = m4*m1;
   |                 ^^ expected struct `Simd`, found struct `Matrix`
   |
   = note: expected struct `Simd<_>`
              found struct `Matrix<Simd<_>, Const<3_usize>, Const<1_usize>, ArrayStorage<Simd<[f32; 4]>, 3_usize, 1_usize>>`

What am I doing wrong? Also, if the code that was used to create the linked blog post could be posted, that would be extremely helpful as a source of examples.

I spoke with sebcrozet on the Discord channel and was told that my error was in using the packed_simd crate directly, and that I needed to use the SIMD types from the simba crate instead. Sure enough, that was my issue. The working version of the code is below.

use nalgebra as na;
use na::{Matrix3x1};
use simba::simd::f32x4;

fn main() {
    let m1: Matrix3x1<f32x4> = na::matrix![
                                        f32x4::new(1.0, 2.0, 3.0, 4.0);
                                        f32x4::new(5.0, 6.0, 7.0, 8.0);
                                        f32x4::new(9.0, 10.0, 11.0, 12.0);
                                   ];
    let m2: Matrix3x1<f32x4> = na::matrix![
                                        f32x4::new(1.0, 2.0, 3.0, 4.0);
                                        f32x4::new(5.0, 6.0, 7.0, 8.0);
                                        f32x4::new(9.0, 10.0, 11.0, 12.0);
                                    ];
    let m3 = m1+m2;
    println!("m3 is: {:?}", m3);

    let m4 = m2.transpose();
    let m5 = m4*m1;
    println!("m5 is: {:?}", m5);
}

with a Cargo.toml containing the following line under [dependencies]:

simba = { version = "*", features = ["packed_simd"] }
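
For anyone wondering what m5 holds: as I understand the AoSoA layout from the blog post, each lane of the f32x4 is an independent 3x1 vector, so m2.transpose() * m1 computes four separate dot products at once, one per lane. Below is a minimal sketch of that arithmetic using only the standard library. F32x4 and dot3 are hypothetical stand-ins I made up for illustration. They are not simba or nalgebra types, and this is not how either crate is implemented.

```rust
use std::ops::{Add, Mul};

/// Hypothetical stand-in for a 4-lane SIMD scalar (NOT the real simba type).
#[derive(Clone, Copy, Debug, PartialEq)]
struct F32x4([f32; 4]);

impl Add for F32x4 {
    type Output = F32x4;
    // Lane-wise addition: lane i of the result is a[i] + b[i].
    fn add(self, rhs: F32x4) -> F32x4 {
        let mut out = [0.0; 4];
        for i in 0..4 {
            out[i] = self.0[i] + rhs.0[i];
        }
        F32x4(out)
    }
}

impl Mul for F32x4 {
    type Output = F32x4;
    // Lane-wise multiplication: lane i of the result is a[i] * b[i].
    fn mul(self, rhs: F32x4) -> F32x4 {
        let mut out = [0.0; 4];
        for i in 0..4 {
            out[i] = self.0[i] * rhs.0[i];
        }
        F32x4(out)
    }
}

/// Hypothetical helper: (1x3 row) * (3x1 column) on lane-packed scalars.
/// Each lane is treated as its own independent 3-vector.
fn dot3(a: &[F32x4; 3], b: &[F32x4; 3]) -> F32x4 {
    a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
}

fn main() {
    // Same values as m1 and m2 in the post.
    let m1 = [
        F32x4([1.0, 2.0, 3.0, 4.0]),
        F32x4([5.0, 6.0, 7.0, 8.0]),
        F32x4([9.0, 10.0, 11.0, 12.0]),
    ];
    let m2 = m1;

    let m5 = dot3(&m2, &m1);
    println!("m5 is: {:?}", m5);
}
```

With these inputs the four lanes come out as 107, 140, 179 and 224. For example, the first lane is 1*1 + 5*5 + 9*9. A real SIMD type does each of those lane-wise operations in a single instruction instead of a loop, which is where the speedup comes from.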