Comment on Pandas
Kausta@lemm.ee 4 months ago
You haven’t seen anything until you need to put a 4.2 GB gzipped CSV into a pandas dataframe, which works without any issues, I should note.
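For reference, a minimal sketch of that kind of load, assuming a hypothetical file name; pandas infers gzip compression from the extension, and the chunked variant is one way to keep peak memory down:

```python
import pandas as pd

# pandas reads gzipped CSVs directly; compression is inferred from the .gz
# extension (the file name here is made up for illustration).
df = pd.read_csv("big_dump.csv.gz")

# For files this size, reading in chunks keeps peak memory lower.
chunks = pd.read_csv("big_dump.csv.gz", chunksize=1_000_000)
df = pd.concat(chunks, ignore_index=True)
```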
QuizzaciousOtter@lemm.ee 4 months ago
I really don’t think that’s a lot either. Nowadays we routinely process terabytes of data.
Kausta@lemm.ee 4 months ago
Yeah, it was just a simple example. Although using plain pandas (without something like Dask) to load terabytes of data into a single dataframe at once may not be the best idea, even with enough memory.
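A minimal sketch of the Dask route mentioned above, assuming a hypothetical directory of gzipped CSVs; Dask builds a lazy, partitioned dataframe instead of materializing everything at once:

```python
import dask.dataframe as dd

# Gzip is not splittable, so blocksize=None keeps one file per partition
# (the path pattern here is hypothetical).
ddf = dd.read_csv("data/part-*.csv.gz", compression="gzip", blocksize=None)

# Work is deferred until .compute() is called on a (much smaller) result.
row_count = ddf.shape[0].compute()
```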
whotookkarl@lemmy.world 4 months ago
It’s good to see the occult is still alive and well
thisfro@slrpnk.net 4 months ago
I raise you thousands of gzipped files (more than 20 GB in total) combined into one dataframe. Frankly, my work laptop did not like it all that much, but most basic operations still worked fine.
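A minimal sketch of combining many gzipped CSVs into one dataframe, assuming hypothetical paths; concatenating once at the end avoids repeatedly reallocating a growing dataframe:

```python
import glob
import pandas as pd

# Collect every gzipped CSV under data/ (paths are made up for illustration).
files = glob.glob("data/*.csv.gz")

# Read each file and concatenate a single time at the end.
df = pd.concat((pd.read_csv(f) for f in files), ignore_index=True)
```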