It's just not "normal practice", plus, to be honest, LDST already has LE/BE and byte-reverse. The issue is that we didn't think of this 6-12 months ago, it's only just come up, and there's no spare encoding space.

Effectively, considering the regfiles as an SRAM and allowing the data within them to be either LE or BE encoded is something that needs its own dedicated MSR bit. What i am inclined to suggest here is that any kind of in-register byteswapping be performed explicitly by using bitmanip operations. i started adding some of those at the sv/bitmanip page.

CRs (which caused merry hell to implement) being the guide here, i am very reticent to go down this route, not least because of the time pressure that we are under.

If you take a zero-initialized vector, and use a byte-load instruction svp64-prefixed with ELWIDTH=h ELWIDTH_SRC=default to load and zero-extend each byte of a string into a half-word, and then store the registers holding the vector in memory M, should you get the string's bytes in M or M bytes? should it not depend on endianness?

Now, if your maxvl is 3, which pair of consecutive bytes in memory is guaranteed to have zeros, M or M?

Does the answer change if maxvl is 4, but vl is still 3?

Does any of this conflict with any of the desirable properties you and I brought up?

This does suggest that, in order to maintain the property I suggested, the position and iteration order of sub-register elements may have to be affected by vl, or even by both vl and maxvl, depending on endianness. this is unavoidable as long as you retain the property of in-memory indexing equivalent to that of arrays, which also fits in with the notion of using neighboring memory areas for neighbor vector elements, as the natural expansion of a sequential for-loop vector load or store would use. this does mean, however, that loading the vectors above from memory into a scalar 64-bit register will land element at opposite ends depending on endianness.
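To make the last point concrete, here is a minimal Python sketch. It is only a model and does not use any SVP64 or Power ISA semantics: the three halfword values and the padding to four elements are made up for illustration. It lays the elements out in memory in array order, then reads the same 8 bytes back as one 64-bit scalar under each byte order.

```python
import struct

# Hypothetical data: the bytes 'A', 'B', 'C' zero-extended to halfwords (vl = 3).
elements = [0x0041, 0x0042, 0x0043]

def load_as_scalar(elems, byteorder):
    """Store halfwords in array order, then read the 8 bytes as one scalar."""
    fmt = ("<" if byteorder == "little" else ">") + "4H"
    padded = elems + [0] * (4 - len(elems))   # unused fourth element is zero
    memory = struct.pack(fmt, *padded)        # element i occupies bytes 2i..2i+1
    return memory, int.from_bytes(memory, byteorder)

mem_le, reg_le = load_as_scalar(elements, "little")
mem_be, reg_be = load_as_scalar(elements, "big")

print(mem_le.hex(), hex(reg_le))
print(mem_be.hex(), hex(reg_be))
```

In this model element 0 always sits at the lowest memory address, but in the scalar it ends up in the least significant 16 bits for LE and in the most significant 16 bits for BE.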
GCC seems to regard vector types just like arrays, when it comes to memory layout, so indexing them works like indexing arrays.

now you store that register in memory, and load it back, as a vector of 8 bytes, or as a vector of 4 halfwords, onto a different register. then you compare both registers as scalars, and they should compare equal, is that what you mean, or is there more to it, i.e., something about their maintaining the same relative positions regardless of endianness?
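The store-then-reload comparison can be tried out in a toy Python model. Everything here is an assumption made for illustration, not the SVP64 definition: the register is a plain 64-bit integer, and the hypothetical `elem0_at_lsb` flag chooses at which end of the register a vector load places element 0.

```python
def store_scalar(reg, byteorder):
    """Store a 64-bit register to memory as a single scalar."""
    return reg.to_bytes(8, byteorder)

def load_halfword_vector(memory, byteorder, elem0_at_lsb):
    """Load 4 halfwords in array order and pack them into a 64-bit register."""
    elems = [int.from_bytes(memory[2 * i:2 * i + 2], byteorder) for i in range(4)]
    reg = 0
    for i, value in enumerate(elems):
        # Hypothetical placement rule: element 0 at the LSB end or the MSB end.
        shift = 16 * i if elem0_at_lsb else 16 * (3 - i)
        reg |= value << shift
    return reg

original = 0x0001_0002_0003_0004

le_roundtrip = load_halfword_vector(store_scalar(original, "little"), "little", True)
be_lsb_first = load_halfword_vector(store_scalar(original, "big"), "big", True)
be_msb_first = load_halfword_vector(store_scalar(original, "big"), "big", False)

print(hex(le_roundtrip), hex(be_lsb_first), hex(be_msb_first))
```

Under these assumptions the two registers compare equal in LE when element 0 sits at the LSB end, while in BE they only compare equal if element 0 is placed at the MSB end, which is one way of reading the suggestion that element position may have to depend on endianness.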
to illustrate, say you have an 8-byte string, or a 4x16-bit (x,y,z,w) tuple held in a 64-bit register.

The property I mentioned in comment 3 may seem a given for 64-bit ELWIDTH; I failed to mention here that the concern was about sub-register element types.

As for the property you wrote about in comment 4, jacob, I'm having some trouble figuring out just what you mean.