Details on the DB server
by lmbedore
Well, I rolled up my sleeves, got forensic on that ass, and set to work on finding out why the database server crashed.
The crash was due to a kernel paging error at a memory address ~14MB before the end of our swapfile. What's odd about this is that there's essentially no way in hell there could be a bad sector on this drive... It's a Quantum Atlas 10K, after all. It was Vahman's main system drive until he got an Atlas 10K-II. In other words, it's thoroughly tested. Nevertheless, I have done exhaustive tests on the drive and swap partition to make sure it's in perfect condition (and it IS).
I found out today (by RTFM'ing) that Enhanced Real Time Clock Support is more-or-less required for safe multi-processor operation. I recompiled the kernel with that enabled, and now we're just waitin' to see whether it crashes again.
Now, realistically, I shouldn't even worry about swap space on a Dual Xeon database server that's only using 9.8MB of RAM anyway. I've never seen it above 11MB in use. That's all I have to say when some jackass recommends I run MS SQL Server on NT. :) It's all about efficiency.
I don't know why I'm even making all this info public, other than the fact that I like it when service providers are honest with me. Now I'm being honest with you, so I hope you like it. :)
Once the DB server has gone 2-3 weeks without crashing, we'll open up database access to users who want it (and have a decent use for it). If you want to make an SQL-driven CGI chat system, I'll choke you till you're pale blue. :)
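For anyone who wants to check their own SMP box for the RTC option mentioned above, here is a minimal sketch, not the exact procedure used on this server. It assumes the option appears in the kernel's .config file under the symbol CONFIG_RTC (the name used by kernels of this era); the function name kernel_option_state is made up for illustration.

```python
import re


def kernel_option_state(config_text, option="CONFIG_RTC"):
    """Return the value of a kernel option ('y', 'm', ...) from .config text.

    Returns 'n' when the option is absent or commented out as
    '# CONFIG_RTC is not set'. CONFIG_RTC is assumed to be the symbol
    behind "Enhanced Real Time Clock Support".
    """
    # Match only an uncommented "OPTION=value" line, anchored at line start,
    # so CONFIG_RTC does not accidentally match a longer symbol name.
    pattern = re.compile(r"^%s=(\S+)\s*$" % re.escape(option), re.MULTILINE)
    match = pattern.search(config_text)
    return match.group(1) if match else "n"


if __name__ == "__main__":
    sample = "CONFIG_SMP=y\n# CONFIG_RTC is not set\n"
    print("SMP:", kernel_option_state(sample, "CONFIG_SMP"))
    print("RTC:", kernel_option_state(sample))
```

To run it against a real kernel tree, read the .config from wherever your source lives (commonly somewhere like /usr/src/linux/.config, but that path is an assumption) and pass its contents in as config_text.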