In order to ease programming of heterogeneous architectures with explicitly managed memory hierarchies such as the CELL processor, we propose a solution adopting the BSP model as implemented in the parallel programming language NestStep. This allows the programmer to write programs with a global address space and run them on the slave processors (SPEs) of CELL while keeping the large data structures in main memory. We have implemented the run-time system of NestStep on CELL and report on performance results that demonstrate the feasibility of the approach. The test programs scale very well as their execution time is mostly dominated by calculations, and only a fraction is spent in the various parts of the NestStep runtime system library. The library also has a relatively small memory footprint in the SPE's software managed local memory.