I decided to install postgresql in a zone on my home server this weekend to play around with PostgreSQL some. So I decided to export my syswatch db from MySQL and import it into PostgreSQL. So I immediatly did a mysqldump -uroot -p syswatch > syswatch.dump. This gave me a file (i actually did each table to its own file but the concept is the same), that I then tried to import in to PostgreSQL without modifying it. (I knew this would not work but wanted to see exactly what wouldn't work). So I ended up editing one of the smaller tables to create it. Here are some of my notes:
- Even though in mysqlshow a column may show varchar, it is outputed as char. When defining a table in PostgreSQL this will be the full size of what ever you defined it as. For example 255 would be the size of the value plus white space till it filled the 255. So I changed the char(255) to varchar(255), much better
- The datetime value needs changed to timestamp
- int and bigint in mysqldump will have a integer after it for the size, this is not valid in PostgreSQL
- auto_increment is not valid, either user serial or a sequence. I chose to use sequences to replace my auto_increments.
- If you are using a recent version of mysql, make sure to use the --skip-opts option on the mysqldump. If you don't you will get the extended-insert type insert statements which PostgreSQL does NOT like, and running a 3,000,000 line file through sed to change the ),( to ); insert into blah values ( takes for ever and a day.
- The back ticks around the table names on the inserts need removed
That is all for now, doing a reexport of all the tables with --skip-opts so each insert statement is on it's own line. Will see if this makes it easier to get it up and running.
The whole purpose for this, other than trying to learn PostgreSQL is to see if PostgreSQL is any faster than MySQL.
Earlier I had started a job to clean up the PostgreSQL database on our main spam machine, here are some stats:
To reindex 6.4 million messages, 7+ hours
To reimport the header information of those messages back in to PostgreSQL: 38 hours 28 minutes, 29 seconds.
What is the outcome of this? The Spam DB went from 80+ gig down to 7.5Gig and the searchs are MUCH faster now. Hopefully this will fix some of the problems we have been having over the last week with people trying to release their spam messages.
Found out last night that our SPAM server's database was a little corrupted.. It is a PostgreSQL DB and it was over 80 gigs in size. So today I have spent so far 3+ hours getting it rebuilt.. First deleted (moved out of the way) the entire postgres directory and reinstalled it and populated some config info back in to it. Now comes the long task of reindexing the spam queue. Right now there are 5.5 million messages (just from the last 8 days) in the queue and the reindex has been running for only about 30 minutes, and it is only 8% done. going on that, looks like I will be here for a while. but since it is pouring the rain down outside, not much I can do any ways.
Well I think this is the latest post I have ever done. I watched the MacWorld 2006 video this evening, which was cool, and decided that I would run a vacuumdb on the Spam database (housed in a PostgresDB on a 8way UltraSparcIII with 32 GB of ram Sun F12k Domain) since I was running out of room in the file system that holds it. (Currently the DB is over 50GB in size). The last time I ran this it took about 2 hours to do it, it has now been running for 4 hours and it still going... To top that off, for what ever reason Adelphia's Digital Cable decided to go out tonight, I still get all the analog channels, just non of the digital cables work, which makes me mad as I was going to tape a couple of movies tonight, but all I got was black. Tried cycling the power on the reciever, nothing, tried changing to different channels, nothing. Even went in to the service mode (SA Explorer 3100, hold down the + on the front of the receiver and then when the "email light" comes on release and press the info button, 21 pages of goodness) and everything looked fine. So I guess I will wait and see what happens later today.
Anyways, back to the Spam DB, as I have been sitting here, it took 2.3 hours just to
go through one table. Does any one know how to make the vacuumdb run faster? The box is almost completely idle... I have done similar stuff in MySQL and it takes no where as long as this does. (Granted I am not as familiar with Postgres as I am with MySQL.)