Vector registers can be initialized, loaded, and saved in a variety of ways. For optimal performance, it is critical that the local memory that is used to load or save the vector registers be vector-aligned.
The AI Engine-ML has two 256-bit load units and one 256-bit store unit, each operating on aligned addresses.
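
As an illustration of what vector alignment means in practice, the following is a minimal sketch in standard C++ rather than target-specific kernel code. It assumes that a 256-bit access implies a 32-byte alignment boundary. The names `kVectorAlignBytes`, `is_vector_aligned`, `all_chunks_aligned`, and `aligned_buf` are hypothetical and exist only for this example. The toolchain for the actual device may provide its own alignment facilities, which should be preferred where available.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: a 256-bit load/store unit implies a 32-byte alignment boundary.
constexpr std::size_t kVectorAlignBytes = 256 / 8;

// Number of int32 elements that fit in one 256-bit access.
constexpr std::size_t kLanes = kVectorAlignBytes / sizeof(std::int32_t);

// Returns true when ptr sits on a 32-byte boundary.
inline bool is_vector_aligned(const void* ptr) {
    return reinterpret_cast<std::uintptr_t>(ptr) % kVectorAlignBytes == 0;
}

// Hypothetical local buffer, declared so that its first element is 32-byte aligned.
alignas(kVectorAlignBytes) static std::int32_t aligned_buf[64];

// Checks that every 256-bit chunk of `count` elements starting at `base`
// begins on an aligned address. Only addresses are inspected; no data is read.
inline bool all_chunks_aligned(const std::int32_t* base, std::size_t count) {
    assert(count % kLanes == 0);  // the range must be a whole number of chunks
    for (std::size_t i = 0; i < count; i += kLanes) {
        if (!is_vector_aligned(base + i)) {
            return false;
        }
    }
    return true;
}
```

Declaring the buffer with `alignas` moves the alignment guarantee to the point of definition, so a check such as `all_chunks_aligned(aligned_buf, 64)` succeeds, whereas a pointer offset by a single element (`aligned_buf + 1`) fails the same check.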