Writers Creek

Profiling a data organizer

by James on May.18, 2009, under MySQL, PHP

For the past week or so I have been building a data organizer. Here is what it does:

  1. Collects data from various sources.
  2. Checks for duplicates, and stores for further checking.
  3. Spawns multiple threads to check the stored data.

Initially I used text files to store the data, and checked it for duplicates using the array_unique() function.

This script took a lot of memory, because in order to remove the duplicates correctly I needed every chunk of data to be stored in an array first. The constant file reading and writing took the CPU to 90% usage as well.
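To make the memory problem concrete, here is a minimal sketch of the in-memory approach. The sample values are made up for illustration; the real input format is not shown in this post.

```php
<?php
// Hypothetical sample data standing in for the collected chunks.
$fragments = ['alpha', 'beta', 'alpha', 'gamma', 'beta'];

// array_unique() can only drop repeats once the complete array is in memory.
// It preserves the original keys, so array_values() reindexes the result.
$unique = array_values(array_unique($fragments));

print_r($unique);
printf("Peak memory: %d bytes\n", memory_get_peak_usage());
```

With five short strings the peak is tiny, but it grows with the size of the full data set, since nothing can be discarded until everything has been loaded.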

I then used the usleep() function to pause the process between fragments and reduce the constant CPU usage. I was able to bring it as low as 20% by making the script sleep for 1 millisecond after collecting each data fragment. However, this alone wasn’t enough, and I still had to do something about the memory, so I created a MySQL database and started storing the data in it.
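The throttling step can be sketched roughly like this. Note that usleep() takes microseconds, so a 1 millisecond pause is usleep(1000). The collectFragment() function is a hypothetical stand-in for the real collection code.

```php
<?php
// Hypothetical stand-in for the real collection step, which is not shown here.
function collectFragment(int $i): string
{
    return "fragment-$i";
}

// usleep() takes microseconds, so 1000 means 1 millisecond.
$pauseMicroseconds = 1000;

$collected = [];
for ($i = 0; $i < 5; $i++) {
    $collected[] = collectFragment($i);
    // Give up the CPU briefly before collecting the next fragment.
    usleep($pauseMicroseconds);
}

echo count($collected), " fragments collected\n";
```

The trade-off is that every pause adds to the total run time, so the delay is worth tuning against how much CPU the script is allowed to use.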

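MySQL can check for duplicate entries and discard them (for example, through a unique index on the column), so the script no longer has to hold every chunk of data in memory. That solved the memory issue.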
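One common way to let MySQL discard duplicates is a UNIQUE key combined with INSERT IGNORE. The sketch below uses PDO; the table name, column names and connection details are all assumptions for illustration, not the actual schema from this project.

```php
<?php
// Sketch only: table and column names are assumptions.
function createTableSql(): string
{
    // VARCHAR(191) keeps the unique index within the key-length limit of
    // older InnoDB row formats when the column uses utf8mb4.
    return 'CREATE TABLE IF NOT EXISTS fragments (
        id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
        content VARCHAR(191) NOT NULL,
        UNIQUE KEY uniq_content (content)
    )';
}

// Returns true if the row was stored, false if MySQL discarded it as a duplicate.
function storeFragment(PDO $pdo, string $content): bool
{
    // INSERT IGNORE downgrades the duplicate-key error to a warning and skips the row.
    $stmt = $pdo->prepare('INSERT IGNORE INTO fragments (content) VALUES (?)');
    $stmt->execute([$content]);

    // One affected row means inserted; zero means the unique key rejected it.
    return $stmt->rowCount() === 1;
}

// Usage, assuming a reachable MySQL server and placeholder credentials:
// $pdo = new PDO('mysql:host=localhost;dbname=organizer;charset=utf8mb4', 'user', 'secret',
//     [PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION]);
// $pdo->exec(createTableSql());
// var_dump(storeFragment($pdo, 'alpha')); // first call stores the row
// var_dump(storeFragment($pdo, 'alpha')); // second call is discarded
```

Because the uniqueness check happens inside the database, the script only ever needs one fragment in memory at a time.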

Next I switched to prepared statements and was able to speed up the script's performance by 40%.
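The idea behind preparing is to build the statement once and then execute it many times with different values. Here is a rough sketch with PDO; the fragments table with a unique key on content is an assumption, as is the function name.

```php
<?php
// Sketch only: table and column names are assumptions.
// Returns the number of rows that were actually inserted.
function storeFragments(PDO $pdo, iterable $fragments): int
{
    // Prepare once, outside the loop, so the statement is not rebuilt per row.
    $stmt = $pdo->prepare('INSERT IGNORE INTO fragments (content) VALUES (:content)');

    $inserted = 0;
    foreach ($fragments as $content) {
        // Execute many times: only the bound value changes on each call.
        $stmt->execute([':content' => $content]);
        $inserted += $stmt->rowCount();
    }

    return $inserted;
}

// Usage, assuming an existing PDO connection to MySQL in $pdo:
// $pdo->setAttribute(PDO::ATTR_EMULATE_PREPARES, false);
// echo storeFragments($pdo, ['alpha', 'beta', 'alpha']), " rows inserted\n";
```

PDO's MySQL driver emulates prepared statements by default, so setting PDO::ATTR_EMULATE_PREPARES to false is what asks for real server-side prepares. How much speed that buys depends on the workload; the 40% above is what I measured for this script.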

Conclusion

  • Using MySQL is a good option if you're dealing with a huge amount of data and you don’t want duplicates.
  • If you're going to run a lot of INSERT, UPDATE or SELECT queries, then preparing statements speeds up performance.
  • usleep() calls can decrease the CPU load; the function pauses the process for a given number of microseconds.