It’s not quite that simple, but it is certainly not provably correct either.
The thing is that the CC3000 will only interrupt as a result of chip select being asserted, and the current host driver basically blocks, spinning on variables. So there is no forward progress in the foreground task while first the CC3000 /INT signal is handled (it is simply a handshake to say the SPI bus is ready, not a normal asynchronous interrupt), and then the DMA completion interrupt is handled at the end of the SPI transfer.
The external flash code is vanilla polled pure-foreground code.
So while there is no obvious way for the code to do anything other than run one or other foreground code path to completion, there is zero protection against different/illegal behavior.
My personal suspicion is that the lack of a global mutex around the CC3000 and flash SPI accesses may allow other events (e.g. ones triggered by timer ticks) to interject themselves mid-transaction.
My current plan is to add simple spin-lock mutex code around the two competing SPI transactions, at the whole transaction level, and see if that remedies the problem. However, first I need to create a reliable test-bed to reproduce the OTA failure in captivity, and I won’t be able to start that for 10+ hours.
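To make the plan concrete, here is a minimal sketch of what a transaction-level lock could look like, assuming a toolchain with C11 `<stdatomic.h>`. None of these names (`spi_bus_lock`, `spi_owner_t`, etc.) come from the existing host driver or flash code; they are hypothetical and only illustrate the idea.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bus owners; illustrative names, not from the firmware. */
typedef enum { SPI_OWNER_NONE, SPI_OWNER_CC3000, SPI_OWNER_FLASH } spi_owner_t;

static atomic_flag spi_bus_busy = ATOMIC_FLAG_INIT;
static spi_owner_t spi_bus_owner = SPI_OWNER_NONE; /* written only by the lock holder */

/* Try once to claim the bus. Never blocks, so it may be called from an ISR. */
bool spi_bus_try_lock(spi_owner_t who)
{
    if (atomic_flag_test_and_set(&spi_bus_busy)) {
        return false;               /* someone else is mid-transaction */
    }
    spi_bus_owner = who;
    return true;
}

/* Spin until the bus is free. Foreground use only: spinning inside an
 * ISR would deadlock if the interrupted foreground code holds the lock. */
void spi_bus_lock(spi_owner_t who)
{
    while (!spi_bus_try_lock(who)) {
        /* busy-wait */
    }
}

/* Release the bus at the end of the whole transaction. */
void spi_bus_unlock(spi_owner_t who)
{
    assert(spi_bus_owner == who);   /* catch a release by the wrong owner */
    spi_bus_owner = SPI_OWNER_NONE;
    atomic_flag_clear(&spi_bus_busy);
}
```

The intent is that each flash operation and each CC3000 exchange would be bracketed by `spi_bus_lock()` / `spi_bus_unlock()` at the whole-transaction level, from chip select asserted to chip select released, rather than per byte.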
In the meantime, if someone can post a simple, scriptable, Linux-based test that triggers this problem, it would be a great help to all working on this issue.
EDITED TO ADD:
I stand corrected: there are completely unsolicited events (all housekeeping in nature, nothing useful like “hey, there’s data on this here socket”), which the CC3000 signals by pulling IRQ low. These are perfectly capable of clashing with an ongoing flash operation (erase, read, write, verify etc). The effects may also include unpredictable/bad behaviour from the CC3000, because the flash SPI access will still be active when the CC3000 CS goes active…
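Since the IRQ can arrive at any time, one possible way to keep it from colliding with a flash transaction is to defer it rather than service it immediately. The sketch below is only an illustration under assumptions: a single core, with `cc3000_irq_handler` standing in for the real interrupt handler and `cc3000_service_event` standing in for whatever the host driver actually does on IRQ. All names here are hypothetical, and on real hardware masking the interrupt line for the duration of the flash access may be the simpler option.

```c
#include <stdbool.h>

/* All names are hypothetical stand-ins, not the real driver's symbols. */
static volatile bool flash_active = false;       /* flash owns the SPI bus */
static volatile bool cc3000_irq_pending = false; /* IRQ seen while flash was busy */
static int cc3000_serviced = 0;                  /* counter for demonstration only */

/* Stand-in for the driver's real IRQ work (assert CS, read the event...). */
static void cc3000_service_event(void)
{
    cc3000_serviced++;
}

/* Would be called from the CC3000 IRQ line's interrupt handler. */
void cc3000_irq_handler(void)
{
    if (flash_active) {
        cc3000_irq_pending = true;  /* do not touch the bus mid-flash-transaction */
        return;
    }
    cc3000_service_event();
}

/* Bracket every flash erase/read/write/verify with these two calls. */
void flash_begin(void)
{
    flash_active = true;
}

void flash_end(void)
{
    flash_active = false;
    if (cc3000_irq_pending) {       /* catch up on the deferred event */
        cc3000_irq_pending = false;
        cc3000_service_event();
    }
}
```

With this arrangement an unsolicited event that lands during a flash operation is serviced only once the flash transaction has released the bus, so the CC3000 CS never goes active while the flash CS is still asserted.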