Comment on Pandas
tequinhu@lemmy.world 2 months ago
It really depends on the machine that is running the code. Pandas will always have the entire thing loaded in memory, and while 600 MB is not a concern for our modern laptops running a single analysis at a time, it can get really messy if the person is not thinking about hardware limitations.
naught@sh.itjust.works 2 months ago
Pandas supports lazy loading and can read files in chunks. Hell, even regular ole Python doesn’t need to read the whole file at once with `csv`.
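For anyone curious, here is a minimal sketch of both approaches. The sample data, the column names, and the two helper functions are made up for illustration; the in-memory string stands in for a large file on disk. The `chunksize` parameter of `pd.read_csv` and `csv.DictReader` are the real pieces doing the work.

```python
import csv
import io

import pandas as pd

# Hypothetical sample data standing in for a large CSV file on disk.
SAMPLE_CSV = "id,value\n" + "".join(f"{i},{i * 2}\n" for i in range(10))


def sum_with_pandas_chunks(source, chunksize=4):
    """Sum the 'value' column while holding only one chunk in memory."""
    total = 0
    # With chunksize set, read_csv returns an iterator of DataFrames
    # instead of one DataFrame containing the whole file.
    for chunk in pd.read_csv(source, chunksize=chunksize):
        total += int(chunk["value"].sum())
    return total


def sum_with_csv_module(source):
    """Sum the 'value' column one row at a time with the standard library."""
    total = 0
    # csv.DictReader pulls rows from the file object as it is iterated.
    for row in csv.DictReader(source):
        total += int(row["value"])
    return total


pandas_total = sum_with_pandas_chunks(io.StringIO(SAMPLE_CSV))
csv_total = sum_with_csv_module(io.StringIO(SAMPLE_CSV))
print(pandas_total, csv_total)
```

With a real file you would pass a path (or an open file handle for the `csv` version) instead of the `StringIO` object. Either way, peak memory is bounded by the chunk or row size rather than the file size, as long as the aggregation itself does not accumulate everything.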
tequinhu@lemmy.world 2 months ago
I didn’t know about lazy loading, that’s cool!
Then I guess that the meme doesn’t apply anymore. Though I will state that (from my anecdotal experience) people who can use pandas’ most advanced features* are also comfortable with other data processing frameworks (usually more suitable to large datasets**)
*Anything beyond the standard `groupby`-`apply` can be considered advanced, from the places I’ve been
**I feel the urge to note that 60 MB isn’t a large dataset by any means, but I believe that’s beside the point
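For reference, this is roughly what I mean by the standard `groupby`-`apply` pattern. It is only a sketch: the DataFrame, its column names, and the per-group function are invented for illustration.

```python
import pandas as pd

# Hypothetical data: two groups with a numeric column.
df = pd.DataFrame(
    {
        "group": ["a", "a", "b", "b"],
        "value": [1, 3, 10, 14],
    }
)

# Split the rows by 'group', apply a function to each group's 'value'
# column, and combine the per-group results into one Series.
value_range = df.groupby("group")["value"].apply(lambda s: s.max() - s.min())
print(value_range)
```

The result is a Series indexed by group label, with one value per group.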