system("rm -rf /");: MATLAB matrices are in column-major storage

Authors of each and every high-level programming language claim you only have to care the semantics of its statements without worrying about its internal implementation. Unfortunately, this can hardly be true.

Today I was working on some MATLAB codes. One of them scales each row of a sparse matrix to make its 1-norm equal to 1. The matrix I worked on was 20000×20000, with 153306 nonzero entries.

This is the first version:

for k = 1:N
 s = sum(P(k,:));
 if s ~= 0
  P(k,:) = P(k,:)/s;
 end
end

This did not stop in 20 seconds, until I pressed Control-C.

This is the second version:

P = P';
for k = 1:N
 s = sum(P(:,k));
 if s ~= 0
  P(:,k) = P(:,k)/s;
 end
end
P = P';

This finished within one second.

[All entries of matrix P are known to be positive, so I simply use sum to get the 1-norm of a row/column.]

The reason is simple. MATLAB internally stores sparse matrices in compressed column format[1]. Therefore, it is many times more expensive to extract or insert a row than column.

ps. For dense matrices, MATLAB also uses column-major formats, following the FORTRAN convention, though MATLAB is itself written in C (and the GUI in Java) [1].

References

[1] J. Gilbert, C. Moler and R. Schreiber. Sparse matrices in MATLAB: Design and implementation. SIAM Journal on Matrix Analysis, 1992.

system("rm -rf /");

MATLAB matrices are in column-major storage

1 comment:

About Me

Followers

Blog Archive

Labels

Recent Comments

Link List