**The evolution of N-body systems, like galactic nuclei or globular clusters, is a source of different astrophysical phenomena which can be simulated using computational algorithms. Due the theoretical scale complexity of this simulations, O2 and even O3, serial programming (non-parallel) does not suffice: We need to use techniques and tools related to High Performance Computing, including the coding for massively parallel processors, the so-called Graphics Processing Units (GPU), in CUDA.**

To analyse the dynamical evolution of a dense stellar system including relaxation, the most accurate technique at our disposal are direct-summation integrators. This is a very expensive method because we integrate all gravitational forces for all stars at every time step, while carefully taking into account the exchange of energy and angular momentum between them without making any a priori assumptions about the system.

One example is the family of dynamical codes for particle systems with relaxation processes of Sverre Aarseth, which have been developed for more than 50 years. This approach relies on the improved Hermite integration scheme. Since these approaches integrate Newton’s equations directly, all Newtonian gravitational effects are included naturally. However, these codes have historically been written to integrate globular clusters and not galactic nuclei, where we know that a massive black hole dominates the gravitational potential within a critical radius, a so-called “semi-Keplerian system”.

The black hole leads to extremely important astrophysical phenomena, like the tidal disruption of stars that come too close to it, or to the capture of compact objects via the emission of gravitational waves, as predicted by the theory of general relativity.

The basic Hermite algorithm of these direct-summation integrators is not suited to cope with this situation, and leads to energy errors. If one however splits the forces and takes into account the deviation introduced by the massive black hole by integrating the different kind of orbits using simple Kepler analysis, the error can be taken care of and avoided.

We are working on this problem by developing from scratch a high-modular Graphical Processor Units (GPU) code. The program is finished in its bare-bones dynamical form, i.e. it integrates a dense stellar system from a pure dynamical standpoint, but thanks to the hard-wired modularity of the code, new physics can and is being added to it, such as tidal disruptions, the role of stellar collisions and gravitational radiation processes. We also are developing a GPU-parallel version which, thanks to our access to one of the largest GPU clusters in the world, will allow us to deliver the first realistic simulations of globular clusters.