In a previous post I told about the Kogge-Stone-style move generator I had implemented. Unfortunately, I forgot that I must stuff the generated moves somewhere in the shared memory. If a position takes 64 bytes, I have only 960 bytes left below the 1 kB limit. If a move takes 4 bytes, than that means that I can have at most 240 moves in the memory. Obviously, generating all the moves at once is not an option if I want to do more than 1 ply searches. I might have to use the one-move-at-a-time move generation routine that I mentioned earlier. I got it working on the CPU at least.
The main idea is to have an index or a pointer to where we left in during the move generation. We generate a move, make the move, call recursive search, unmake the move, and then generate the next move based on the pointer. The challenge is to construct the pointer so that it can be incremented to get the next move. There can be 16 pieces per side, so 4 bits determine the piece. Then there can be at most eight directions in which to move the piece (knights, queens, and kings). That takes 3 bits. Finally, the sliding pieces can go at most seven steps in one direction. That takes 3 more bits. So in total the pointer has 10 bits. With 10 bits one can represent 1024 numbers. That is a lot more than there are moves in a position. So skipping over the invalid pointer values is important for efficiency.
In addition, I have turned my attention to the 0x88-board representation. It is memory friendly, because it does not require large look-up tables. Basically, what is required, is a list of directions for each piece type. I am looking for an elegant solution to combine the 0x88-board representation with the pointer-based move generator. There are lots of interesting problems to solve.