I am trying to update many rows (100 000+) in my database but it's taking a while (over 10 mins and still not finished). I'm wondering if this is intended behavior or is there something wrong in my code. To prevent the database from hanging while performing the update I've been told to update one row at a time, not sure if this is how it should be implemented.
I am updating images in my song table to be null if those songs were played in my playlist table
private function updateBlogSongs ($blog_id) {
$db = Yii::app()->db;
$affectedRows = 0;
$sql = "SELECT *
FROM `firstdatabase`.song s
INNER JOIN `seconddatabase`.playlist p ON s.name LIKE p.song_name";
$dataReader = $db->createCommand($sql)->query(); // Rows from the song table that were played in the given blog
$row = $dataReader->read();
while ($row != false) {
$sql = "UPDATE `firstdatabase`.song s
SET s.image = NULL
WHERE s.song_id = " . $row['song_id'];
$affectedRows += $db->createCommand($sql)->execute();
$row = $dataReader->read();
}
return $affectedRows;
}
Edit: after reading The Dog's comment I made some changes: With 500 000 rows in the song table it takes about 10 minutes if I increase my batchSize to 10000 (was taking 8 hours with the code above). At 250 at the batch size it's taking about 50 minutes. I chose 250 because the query takes about 1 second to run, and it's taking 10+ seconds to run at 10000 batch size (constraint is 1 second). I would like to make it faster but not sure what else to change
$batchSize = 250;
$lastSongID = 0;
$rowIndex = 0;
$affectedRows = 0;
$sql = "SELECT max(song_id) FROM `firstdatabase`.song";
$lastSongID = intval($db->createCommand($sql)->query()->read()['max(song_id)']);
echo($lastSongID . ' songs in table.' . PHP_EOL);
echo('Updating songs...' . PHP_EOL);
while($rowIndex <= $lastSongID) {
$startTime = microtime(true);
$sql = "UPDATE `firstdatabase`.song
SET image = NULL
WHERE song_id in (
SELECT song_id
FROM (
SELECT song_id, name
FROM `firstdatabase`.song
WHERE song_id > " . $rowIndex . "
LIMIT " . $batchSize . "
) s
INNER JOIN (
SELECT DISTINCT song_name
FROM `seconddatabase`.playlist
) p ON s.name LIKE p.song_name
ORDER BY s.song_id ASC
)";
$affectedRows += $db->createCommand($sql)->execute();
$rowIndex += $batchSize;
$endTime = microtime(true);
$elapsedTime = round($endTime - $startTime, 2);
}
This is really more a question for the SQL world instead of the PHP world but here's my recommendations:
Don't do this one row at a time in a while loop. Make a more complex update statement that can do it all in one database hit. Database commands are the slowest part of your php code, you want to limit the number of calls you do to the database.
When you are confident that you can get the operation done in one sql command, or even if you don't think it is possible then pull your code into a stored procedure in the database. Having complex sql queries as stored procedures can help a lot with maintaining your code.
Make sure you have indexes on your tables. You need to make sure your queries hit those indexes for best performance.
Here's an option for the single query:
update `firstdatabase`.song
set image = null
where song_id in (
select s.song_id
from `firstdatabase`.song s
INNER JOIN `seconddatabase`.playlist p
ON s.name LIKE p.song_name"
);
Obviously we don't have access to your database so you'll need to make changes where necessary but hopefully it can get you on the right track.
EDIT: Try replacing your second code set with the following:
$lastSongID = 0;
$rowIndex = 0;
$affectedRows = 0;
$sql = "SELECT max(song_id) FROM `firstdatabase`.song";
$lastSongID = intval($db->createCommand($sql)->query()->read()['max(song_id)']);
echo($lastSongID . ' songs in table.' . PHP_EOL);
echo('Updating songs...' . PHP_EOL);
$startTime = microtime(true);
$sql = "
update `firstdatabase`.song
set image = null
where song_id in (
select s.song_id
from `firstdatabase`.song s
INNER JOIN `seconddatabase`.playlist p
ON s.name LIKE p.song_name"
)";
$affectedRows += $db->createCommand($sql)->execute();
$endTime = microtime(true);
$elapsedTime = round($endTime - $startTime, 2);
If it works, then let me know the time it takes to run, if it doesn't work, is it an issue with the SQL (again I can't see the tables so I'm guessing).
Our community is visited by hundreds of web development professionals every day. Ask your question and get a quick answer for free.
Find the answer in similar questions on our website.
Do you know the answer to this question? Write a quick response to it. With your help, we will make our community stronger.
PHP (from the English Hypertext Preprocessor - hypertext preprocessor) is a scripting programming language for developing web applications. Supported by most hosting providers, it is one of the most popular tools for creating dynamic websites.
The PHP scripting language has gained wide popularity due to its processing speed, simplicity, cross-platform, functionality and distribution of source codes under its own license.
https://www.php.net/
Yii is a simple yet high performance generic component framework based framework. It is known for its high performance, but above all, it is famous for its simplicity. This framework appeared in December 2008. It allows you to use third-party code, and its Gii code generator allows you to quickly create basic structures from which you can build your own solutions.
https://www.yiiframework.com/
Welcome to the Q&A site for web developers. Here you can ask a question about the problem you are facing and get answers from other experts. We have created a user-friendly interface so that you can quickly and free of charge ask a question about a web programming problem. We also invite other experts to join our community and help other members who ask questions. In addition, you can use our search for questions with a solution.
Ask about the real problem you are facing. Describe in detail what you are doing and what you want to achieve.
Our goal is to create a strong community in which everyone will support each other. If you find a question and know the answer to it, help others with your knowledge.