Laravel: Seed 50k Rows Faster - Create vs Insert vs Chunk

Let's compare the performance of three ways to add a lot of data to the database.

Comments

To avoid loading the whole array into memory, use PHP's "yield" (generators) and Laravel lazy collections.
You can call ->chunk() on a lazy collection.
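
A minimal sketch of that idea, assuming a hypothetical users.csv in storage and made-up column positions (the generator and LazyCollection parts are standard PHP/Laravel):

use Illuminate\Support\Facades\DB;
use Illuminate\Support\LazyCollection;

LazyCollection::make(function () {
    // Stream the file row by row instead of loading everything up front.
    $handle = fopen(storage_path('app/users.csv'), 'r');
    while (($row = fgetcsv($handle)) !== false) {
        yield ['name' => $row[0], 'email' => $row[1]];
    }
    fclose($handle);
})
    ->chunk(500) // 500 rows per INSERT statement
    ->each(fn ($chunk) => DB::table('users')->insert($chunk->all()));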

ClCfe

Tip: don't use array_push() to add a single element to an array; it's about twice as slow as "$array[] = $element;".
So the Twitter example could be improved even more :)
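
For reference, the two forms being compared (with placeholder variable names):

array_push($rows, $row); // function call on every iteration
$rows[] = $row;          // direct append (the faster form, per the tip above)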

vaultdweller

Sometimes my static/initial data comes from an XLSX/CSV file provided by the business owner, e.g. a list of predefined cities, services, and so on.
Before importing, it needs to be transformed and converted conditionally. In those situations:

I prefer to use a console command to import the initial data, so I can freely enjoy Eloquent's magic and helpers. Obviously, executing this command takes a while.

But once it's successfully imported, I export the data as an SQL file, and from then on (until the data source changes significantly) I use that SQL file in a seeder with this command:


With this approach I benefit from Eloquent's magic and helpers AND get pretty good speed on every db:seed execution :)
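
One common way to run an exported dump from a seeder looks roughly like this; the file path and class name are assumptions, not necessarily the commenter's exact command:

use Illuminate\Database\Seeder;
use Illuminate\Support\Facades\DB;

class InitialDataSeeder extends Seeder
{
    public function run(): void
    {
        // Replays the exported dump in one go, much faster than re-mapping the XLSX/CSV each time.
        DB::unprepared(file_get_contents(database_path('seeders/sql/initial_data.sql')));
    }
}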

mhyeganeh

Thanks for the credit!

And yes, you pronounced my name right 😆

rodolfovmartins

The issue with using array_chunk is that you load the whole data set into memory and then load it into the database, so the overall memory footprint is larger (potentially double) if you do it in 3 steps:
1 - Import CSV
2 - Map CSV --> Array
3 - Insert in chunks

When I have to bulk load like this, I typically insert as I'm building up the "mapped" array, instead of after complete hydration, to avoid using that extra memory.
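
A rough sketch of that pattern, flushing a buffer every 500 rows instead of hydrating the whole CSV first (file name and column mapping are made up):

use Illuminate\Support\Facades\DB;

$buffer = [];
$handle = fopen(storage_path('app/products.csv'), 'r');

while (($row = fgetcsv($handle)) !== false) {
    // Map the CSV row and insert as soon as the buffer fills,
    // so the full data set is never held in memory.
    $buffer[] = ['sku' => $row[0], 'name' => $row[1]];

    if (count($buffer) === 500) {
        DB::table('products')->insert($buffer);
        $buffer = [];
    }
}

if ($buffer !== []) {
    DB::table('products')->insert($buffer); // flush the remainder
}

fclose($handle);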

AndrewJudd

I've found Carbon to be slow in these cases. So instead of using "now()", I use date('Y-m-d H:i:s', time()). This makes it much faster in my experience.
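
If Carbon formatting is still wanted, one alternative (an assumption on my part, not benchmarked in the video) is to compute the timestamp once outside the loop and reuse it; $rows here is a placeholder for the mapped data:

$now = now()->toDateTimeString(); // computed once, not once per row

$rows = array_map(
    fn ($row) => $row + ['created_at' => $now, 'updated_at' => $now],
    $rows
);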

SamBennettWebDeveloper

Thanks for videos like this, analysing the basics that help us optimize our own work.

Gabriel-izkp

I always use upsert for large amounts of data (or arrays) if the table has the unique indexes available.
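
For context, upsert() takes the rows, the unique-by column(s), and the columns to update on conflict; the table and columns here are made up:

use Illuminate\Support\Facades\DB;

DB::table('cities')->upsert(
    $rows,              // array of row arrays, e.g. mapped from a CSV
    ['code'],           // column(s) covered by a unique index
    ['name', 'region']  // columns to update when the unique key already exists
);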

Stoney_Eagle

Try adjusting the chunk size (50, 100, 250) and use DB transactions for each chunk. That helps me reduce the time even more.
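
Roughly what that looks like; the chunk size and table are assumptions worth benchmarking per project:

use Illuminate\Support\Facades\DB;

foreach (array_chunk($rows, 250) as $chunk) {
    // One multi-row INSERT per chunk, wrapped in a transaction as suggested above.
    DB::transaction(fn () => DB::table('users')->insert($chunk));
}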

grzesiekb

This is a post from Rodolfo Martins, but his last name was swapped with that of Rodolfo Vasconcelos, a different fellow developer. I wonder why Martins misled us about Vasconcelos's Twitter.

mikulcek

I never really thought about inserting such an amount all at once, though.

sixpathdev_official

Good video. Would using createMany with chunk solve the performance problem by inserting many records in one database query? The performance is probably close to insert, but it would still be possible to use a factory, for example.

olivierperrier

Incorrect comparison.
Try a similar trick for `Create`.
Try `Generators`.
If MySQL worked without bugs, we'd see a different picture.
Mass insert is always faster.
But try `Insert` in the body of a `foreach`, one row at a time. It's very slow.

webdev

Wow, nice info, sir. I usually use chunking a lot when updating records or fetching large result sets. What I want to ask is: do you ever put seeding logic in a migration? I mean literally in the migration file, not in a seeder file. When working on a running web application with a small team, my team lead prefers running php artisan migrate with the seeding inside it, rather than php artisan migrate followed by php artisan db:seed --class=BlaBlaSeeder.
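
For reference, calling a seeder from inside a migration usually looks something like this (the seeder class name is a placeholder taken from the comment above):

use Illuminate\Database\Migrations\Migration;
use Illuminate\Support\Facades\Artisan;

return new class extends Migration
{
    public function up(): void
    {
        // ... schema changes ...

        Artisan::call('db:seed', [
            '--class' => 'BlaBlaSeeder',
            '--force' => true, // required when running in production
        ]);
    }
};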

mohammadiqbalramadhan

I hadn't come across this ::insert method before. Is it similar to

NathanBudd

It would be nice if something similar existed for update or delete.
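
The closest built-in equivalents I'm aware of are upsert() for bulk updates and a whereIn() delete; a sketch with assumed table and columns:

use Illuminate\Support\Facades\DB;

// Bulk "update": rows matching the unique key are updated, the rest inserted.
DB::table('users')->upsert($rows, ['email'], ['name']);

// Bulk delete in a single query instead of deleting models one by one.
DB::table('users')->whereIn('id', $staleIds)->delete();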

jeremiebergeron

What if I have to fetch products after filtering through 70-80k products?
Is Laravel suitable for such e-commerce projects?

ammadsahb

Can we get the inserted IDs from a bulk insert? I need those IDs for other related tables.

pilotcoding

What if I want to use array_chunk with a factory?
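
One way to combine the two (a sketch, assuming a standard User model factory): build the raw attribute arrays without saving, then insert in chunks:

use App\Models\User;
use Illuminate\Support\Facades\DB;

// raw() returns plain attribute arrays; nothing is saved yet.
// Note: created_at/updated_at are not filled here, add them if the table needs them.
$rows = User::factory()->count(50000)->raw();

foreach (array_chunk($rows, 500) as $chunk) {
    DB::table('users')->insert($chunk);
}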

agungprasetyonugroho

Nice tutorials, as always.
Can you please share tips to speed up reading a CSV with at least 10 million rows,
and at the same time inserting the rows into the DB while filtering out any duplicates?
I always get a timeout error during this process :(
Thanks in advance
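
Not a full answer, but one sketch of the approach (file, columns, and unique key are assumptions): stream the CSV lazily so memory stays flat, and let the database handle duplicates via upsert(); running it from a console command or queued job avoids HTTP timeouts.

use Illuminate\Support\Facades\DB;
use Illuminate\Support\LazyCollection;

LazyCollection::make(function () {
    $handle = fopen(storage_path('app/import.csv'), 'r');
    while (($row = fgetcsv($handle)) !== false) {
        yield ['email' => $row[0], 'name' => $row[1]];
    }
    fclose($handle);
})
    ->chunk(1000)
    ->each(function ($chunk) {
        // Rows whose unique key (email) already exists are updated, not duplicated.
        DB::table('subscribers')->upsert($chunk->all(), ['email'], ['name']);
    });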

erwinyonathan