Hello everyone!

I just discovered ITensor, and i think it's a fantastic library: thank you so much for making it available for julia too!

I was trying to figure out if I can use it to improve the full contraction between a multidimensional array and 5 vectors (currently this is what my research project requires).

For now I got the best performance with julia's "LoopVectorization" package (I've tried several).

Here is the full contraction (this is julia code). Of course this is a no-sense example which is only intended to illustrate the situation:

```
function contraction(D,A,a,b,c,f,e,indices)
@turbo for i = 1:D
a[i] = A[i,indices[1],indices[2],indices[3],indices[4]]
b[i] = A[i,indices[5],indices[6],indices[7],indices[8]]
c[i] = A[i,indices[9],indices[10],indices[11],indices[12]]
f[i] = A[i,indices[13],indices[14],indices[15],indices[16]]
e[i] = A[i,indices[17],indices[18],indices[19],indices[20]]
end
s = 0.0
@turbo for q in 1:D, l in 1:D, k in 1:D, j in 1:D, m in 1:D
s += A[m,j,k,l,q]*a[q]*b[l]*c[k]*f[j]*e[m]
end
return s
end
```

These are the performance:

```
indices = rand(1:5,20);
A = rand(5,5,5,5,5);
a = zeros(Float64,5)
b = zeros(Float64,5)
c = zeros(Float64,5)
f = zeros(Float64,5)
e = zeros(Float64,5)
D = 5
@btime contraction($D,$A,$a,$b,$c,$f,$e,$indices)
736.952 ns (0 allocations: 0 bytes)
29.147187769886045
```

It is nothing more than a full scalar contraction between a multidimensional array and 5 vectors. I wrote it in such a way that there are no memory allocations.

I have read the documentation, but I still don't understand if and how such a contraction can be implemented with ITensor. The best thing would be to do this in the GPU, especially when the dimension is much higher than 5 (as in this no-sense example).

Can anyone give me some advice about it? Thanks a lot in advance, and have a nice day everyone!