Today I was working on some MATLAB codes. One of them scales each row of a sparse matrix to make its 1-norm equal to 1. The matrix I worked on was 20000×20000, with 153306 nonzero entries.
This is the first version:
This did not stop in 20 seconds, until I pressed Control-C.for k = 1:N s = sum(P(k,:)); if s ~= 0 P(k,:) = P(k,:)/s; end end
This is the second version:
This finished within one second.P = P'; for k = 1:N s = sum(P(:,k)); if s ~= 0 P(:,k) = P(:,k)/s; end end P = P';
[All entries of matrix P are known to be positive, so I simply use
sum
to get the 1-norm of a row/column.]The reason is simple. MATLAB internally stores sparse matrices in compressed column format[1]. Therefore, it is many times more expensive to extract or insert a row than column.
ps. For dense matrices, MATLAB also uses column-major formats, following the FORTRAN convention, though MATLAB is itself written in C (and the GUI in Java) [1].
References
[1] J. Gilbert, C. Moler and R. Schreiber. Sparse matrices in MATLAB: Design and implementation. SIAM Journal on Matrix Analysis, 1992.
S = sum(P,2);
ReplyDeleteS(S == 0) = 1;
P = P./S(:,ones(1,N));
(PS. On my machine, for N = 7000, non-sparse matrices, your original method took 4.513336 seconds; your second method 3.934098 seconds; this took 1.564700 seconds; all identical output.)