I haven’t tried it with an API yet. With a local model it’s super quick though; it adds maybe a couple of seconds at most.
If you have a GPU and want to try it with a local model, there’s a plugin for Godot and Unity called “NobodyWho”. They’ve implemented the RAG approach out of the box, with examples as well, so it wasn’t something I came up with.