OP’s question is about any audio that strikes the listener as being a “real” sound. It doesn’t have to be long. It doesn’t have to be a song.
Because it just has to be “a” “real sound”, I think there is an inherent measure of subjectivity. I might think a sound sounds like something, and you might not.
I think I’d approach this differently. I’d just pick a short time frame (maybe 0.5 s) and generate noise at 64 kbit/s (the PCM bitrate) for that duration.
What percentage of those clips would have waveforms with any shape whatsoever within the range of human perception? (That is, what percentage of random noise could possibly represent a limited physical system interacting with the atmosphere in a way the human ear could perceive?)
Then, of those, what percentage subjectively “sound like a thing”?
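
For a rough sense of the scale involved, here is a minimal Python sketch of the half-second clip idea. It assumes that 64 kbit/s means 8 kHz sampling at 8 bits per sample (that split is my assumption, as is every name in the code), and it only counts the possible clips and generates one of them. It says nothing about which ones would sound "real".

```python
import math
import random

BITRATE_BPS = 64_000     # 64 kbit/s, the PCM bitrate mentioned above
DURATION_S = 0.5         # the half-second time frame
BITS_PER_SAMPLE = 8      # assumption: 8-bit samples, so 8 kHz sampling

total_bits = int(BITRATE_BPS * DURATION_S)      # bits in one clip
num_samples = total_bits // BITS_PER_SAMPLE     # samples in one clip


def random_clip(rng: random.Random) -> bytes:
    """Return one clip of uniformly random 8-bit samples."""
    return bytes(rng.getrandbits(BITS_PER_SAMPLE) for _ in range(num_samples))


# Every bit pattern is a distinct clip, so there are 2**total_bits of them.
# That number is too large to print comfortably, so count its decimal digits.
digits = math.floor(total_bits * math.log10(2)) + 1

clip = random_clip(random.Random())
print(f"{total_bits} bits per clip, {num_samples} samples")
print(f"number of distinct clips has {digits} decimal digits")
```

Whatever fraction of those clips "sounds like a thing", it is a fraction of a space far too large to enumerate, so any real answer would have to come from sampling and listening rather than from counting.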