Build a Vector from an Iterator

I meant to give a shot at issue 731 on github to get my hands into nalgebra. I thought it could be completed with the computation of a permutation matrix that solves a vector. I have encountered the need before in an algorithm that prescribed to reorder a basis according to the images of a certain vector. Nothing complicated, it works well for a fixed dimension, but making it generic seems troublesome.

I merely reproduced the computation suggested in issue #731, that is Vector to Iterator to Vec to Iterator, and then meant to turn it back to a Vector.

fn argsort<N: Scalar, D: Dim, S: Storage<N, D>>(v: &Vector<N, D, S>) -> Vector<N, D, S>
where
    N: Ord,
{
    let v_as_vec = Vec::from_iter(v.iter().enumerate());
    v_as_vec.sort_by_key(|(_, &val)| val);
    let argsort = v_as_vec.into_iter().map(|(pos, _)| pos);
    Vector::from_iterator(argsort)
}

That’s how it happened:

error[E0034]: multiple applicable items in scope
  --> src/main.rs:30:13
   |
30 |     Vector::from_iterator(argsort)
   |             ^^^^^^^^^^^^^ multiple `from_iterator` found
   |
   = note: candidate #1 is defined in an impl for the type `nalgebra::Matrix<N, R, C, <nalgebra::DefaultAllocator as nalgebra::allocator::Allocator<N, R, C>>::Buffer>`
   = note: candidate #2 is defined in an impl for the type `nalgebra::Matrix<N, R, nalgebra::Dynamic, <nalgebra::DefaultAllocator as nalgebra::allocator::Allocator<N, R, nalgebra::Dynamic>>::Buffer>`
   = note: candidate #3 is defined in an impl for the type `nalgebra::Matrix<N, nalgebra::Dynamic, C, <nalgebra::DefaultAllocator as nalgebra::allocator::Allocator<N, nalgebra::Dynamic, C>>::Buffer>`
   = note: candidate #4 is defined in an impl for the type `nalgebra::Matrix<N, nalgebra::Dynamic, nalgebra::Dynamic, <nalgebra::DefaultAllocator as nalgebra::allocator::Allocator<N, nalgebra::Dynamic, nalgebra::Dynamic>>::Buffer>`

I sort of tracked this to from_iterator_generic that is defined as Self::from_data(DefaultAllocator::allocate_from_iterator(nrows, ncols, iter)).

Is there anything to do from there? I feel like the documentation didn’t prepare me for this kind of error there. (of course the argsort could just return a mere Vec, trying to figure things out there).

Hi!

Generic programming with nalgebra is not very easy yet due to the lack of specialization and const-generics. Here is a working version of the code you are attempting to write:

fn argsort<N: Scalar, D: Dim, S: Storage<N, D>>(v: &Vector<N, D, S>) -> VectorN<usize, D>
where
    N: Ord,
    DefaultAllocator: Allocator<usize, D>
{
    let mut v_as_vec = Vec::from_iter(v.iter().enumerate());
    v_as_vec.sort_by_key(|(_, val)| val.clone());
    let argsort = v_as_vec.into_iter().map(|(pos, _)| pos);
    Vector::from_iterator_generic(v.data.shape().0, v.data.shape().1, argsort)
}

You can see a few differences with your version:

  • The return value is VectorN<usize, D>. First the scalar type must be usize because that’s what you are returning (you are note returning a vector of N). Second, we use a VectorN here because the result type cannot have the storage type S. This is caused by the fact that S can be anything, including a non-owned storage (that contains a pointer to another vector). Using VectorN lets nalgebra automatically deduce the correct owned storage.
  • I added the trait bound DefaultAllocator: Allocator<usize, D>. This tells the compiler how to allocate a vector with scalar type usize and dimension D. This is what makes the vector construction work for both stack-allocated and heap-allocated vectors.
  • I used the Vector::from_iterator_generic(...) instead of from_iterator. We must use the _generic version of the constructor because the dimension of the vector have a generic type D. Because this D is generic, the vector could very well have a dynamic size or a static size. To handle both cases generically the _generic constructor will take the generic dimensions of the vector as its parameters.

This version of the code works but could be better. By calling Vec::from_iter you are performing a heap allocation of a Vec. This is inefficient, especially if v is a low-dimensional statically-sized vector. A better solution is to build the result VectorN right at the beginning, and sort it directly, instead of using the intermediate Vec:

fn argsort<N: Scalar, D: Dim, S: Storage<N, D>>(v: &Vector<N, D, S>) -> VectorN<usize, D>
where
    N: Ord,
    DefaultAllocator: Allocator<usize, D>
{
    let (nrows, ncols) = v.data.shape();
    let mut argv = Vector::from_iterator_generic(nrows, ncols, (0..nrows.value()));
    argv.as_mut_slice().sort_by_key(|i| v[*i].inlined_clone());
    argv
}

As you can see, we start by creating a Vector with the same dimensions as the input, filled with [0, 1, 2, 3, etc.]. Then we sort this vector by indexing v to get the sorting key. Finally, the .inlined_clone() is just like .clone() except that it will be inlined even in debug mode.

With this solution, no heap allocation will be executed when the input vector has a fixed dimension.

1 Like