[bugfix] Use submatrix index vector to implement O(1) look-up
Checking whether a column index is contained in the submatrix could degenerate to an accumulated O(n) cost for each row in the old implementation. Setting up a look-up vector beforehand reduces each check to O(1). Fortunately, such a vector is constructed anyway, so we can simply use it after a minor reordering of the code.
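For reviewers, here is a rough sketch of the idea rather than the code from this change. The names `build_lookup` and `contains`, the use of C++, and the `-1` sentinel are all assumptions made for illustration. The look-up vector maps each global column index to its position in the submatrix, so a membership test becomes a single array access instead of a search through the list of submatrix columns.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: for every global column index, store its position
// inside the submatrix, or -1 if the column is not part of the submatrix.
// Assumes every entry of submatrix_cols is smaller than n_cols.
std::vector<int> build_lookup(const std::vector<std::size_t>& submatrix_cols,
                              std::size_t n_cols) {
    std::vector<int> lookup(n_cols, -1);
    for (std::size_t pos = 0; pos < submatrix_cols.size(); ++pos) {
        assert(submatrix_cols[pos] < n_cols);
        lookup[submatrix_cols[pos]] = static_cast<int>(pos);
    }
    return lookup;
}

// O(1) membership test: one bounds check and one array access.
bool contains(const std::vector<int>& lookup, std::size_t col) {
    return col < lookup.size() && lookup[col] >= 0;
}
```

Building the vector costs O(n) once. After that, every per-row check is constant time, which is where the saving over a repeated search comes from.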
This should fix #54 (closed).