Morgan Tocker
[Most Recent Entries]
[Calendar View]
[Friends]
Below are the 20 most recent journal entries recorded in
Morgan Tocker's LiveJournal:
[ << Previous 20 ]
| Tuesday, August 11th, 2009 | | 2:31 pm |
Analyzing Joyent's Accelerator for MySQL
The Joyent Accelerator for MySQL is apparently 2-4 times faster than an EC2 instance, but there's no mention of configuration, database size or even what the queries were in their record breaking performance tests. I assume that Joyent has tuned their MySQL install, so it only seems fair not to use the default configuration on the EC2 test. If you look at Vadim's EBS benchmarks (particularly random read/write) it looks like they may have a very good product, but instead we're left with the impression that they have something to hide. 1,245 transactions per second isn't very much these days if they are simple queries. There are plenty of mysql installations doing ten times that. I'll update this post if they respond to me on twitter. I'd honestly like to dissect and look at what they're doing. Update: Joyent provided an this: @morgo watch for a blog post this week with the details. configurations and results of the internal test available now on the site #mysql | | Monday, August 10th, 2009 | | 11:03 am |
Planet MySQL: What are you going to do with voting data?
I'm a fan of voting on planet mysql posts - but I've also realized that since I'm logged in, it allows someone to keep a pretty valuable silo of information on what interests me. If I vote up a Kickfire or Infobright post, should I expect a sales call to follow? There's nothing in the FAQ about what happens with the data. | | Friday, August 7th, 2009 | | 2:21 pm |
Understanding the MySQL forks I put together the following diagram to explain what the origin of the current MySQL forks and deltas looks like:
But in the distributed revision control world we live in, it's never really that simple. Here are
some other notes:
- XtraDB is an InnoDB fork and "mostly" only a storage engine. It has a future in MariaDB
and Drizzle, but may not make it into MySQL (about to be owned by Oracle, who also owns InnoDB).
- OurDelta collects more patches than just the Percona patches, and enables the PBXT storage engine. There's also a repository of OurDelta for 5.1 - binaries just aren't out yet.
- Like XtraDB, PBXT has a future in Drizzle, MariaDB and *maybe* also MySQL (Oracle may be more
friendly towards PBXT than XtraDB, but Oracle is not known to be friendly so who knows!).
I left PBXT absent from the diagram for not having much of a lineage with previous MySQL releases (flames
welcome if you disagree).
- XtraDB draws a lot from the Google patches for MySQL/InnoDB (not present on the diagram either).
- Drizzle is now incompatible with everything else in MySQL-land terms
(replication, partitioning, etc), and they're happy to be. In storage engine terms,
they're still compatible though, which leads me to describe Drizzle as a completely new
"userland", with storage engine options still remaining very similar.
- MariaDB is both a new storage engine (Maria) and a userland "delta" of MySQL. I call it a delta
since it is trying to keep binary-format compatibility with MySQL wherever it can. This means that
it makes a good retrofit, but it limits what they can do without changing things like the .frm format, etc.
- The InnoDB plugin has a weird history. Oracle announced it as an optional replacement
for the 5.1 InnoDB but not much has become of it since introduction. I would have expected
5.4 to be based off the plugin, but it's not (at least at this point).
- MySQL is the real loser in the direction the patches are moving. While all other
forks are free to share amongst themselves (where compatible) and take from MySQL, MySQL will only
accept a patch if the author signs a Contributor's License Agreement. They need to do this - otherwise
they can't sell OEM copies, which still make a large chunk of sales.
Until recently, MySQL didn't accept any InnoDB patches unless they came downstream from Oracle, and
Oracle keeps InnoDB development as a closely guarded secret - which makes influencing it very difficult.
Comments welcome. | | Friday, June 5th, 2009 | | 1:43 pm |
Hidden gems in 5.1
I think 5.1 gets some bad press for not being a compelling upgrade. It's not the big features that make the difference, but the subtle ones. I wanted to highlight some of the these that may make your life easier when it's time to upgrade: * Prepared statements can now use the query cache. BUG #735* InnoDB auto_increment insertion is more scalable. Manual Page* The long_query_time can be set to values less than 1 second. BUG #6238* SHOW PROFILES is available for everyone, not just community users! * Creating Triggers no longer requires the SUPER privilege. Let's hope for 1000 more of these in MySQL 5.4. | | Thursday, May 28th, 2009 | | 3:58 pm |
Meetup.com can burn!
In one week, the MySQL/Sun sponsored Meetup.com subscriptions expire officially. While it's good to see that plenty of offers are available for sponsorship, I am kind of surprised that Meetup.com hasn't received much bad press out of this. To recap what happened: - (some years ago) Meetup.com introduced a great service for organizing meetups, and it was free! - Then they decided that they couldn't make money this way, so they switched to a paid-service-only model. - Most meetups on other topics started using things like Upcoming and Facebook, but MySQL negotiated a deal to have all MySQL meetups sponsored. - Meetup.com decided they could make more money out of MySQL by charging for the groups individually, so they canceled the sponsorship deal. I'm not against Meetup.com charging, but the way they've essentially made a business is by Bait and Switch. I don't think we (as Open Source citizens) should encourage that, since one of the things we strive for is freedom from the vendor lock-in that makes this possible. | | 9:35 am |
Efficient way to copy large amounts of data?
Dear Lazyweb, Yesterday I tried to Rsync a MySQL data directory from serverA to serverB on the same network. I thought that if out of a few hunded gigabytes maybe 2% changed, this should work, right? Wrong. Rsync is designed to minimize bandwidth, so in my case it was *much* quicker to wipe the data and start again (I feel this is something I should have known earlier, but it doesn't hurt to try and share your mistakes). Which gets me thinking - it doesn't have to be this way. Bit-torrent works in a similar way to Rsync, but it's certainly not network efficient. Are there any projects similar to Rsync that are using network-hungry algorithms to try and make sure that two directories are in sync with the goal just being as fast as possible? | | Sunday, April 26th, 2009 | | 10:41 pm |
| | Wednesday, April 8th, 2009 | | 3:18 pm |
Very Simple Introduction to Using XtraBackup on Max OS X
I've started using Xtrabackup to backup MySQL on my MacBook. Here's an example of a quick backup and restore: 1. Download the latest .tar.gz from Percona: $ cd /tmp $ wget http://www.percona.com/mysql/xtrabackup/0.5/xtrabackup-0.5-macos.x86_64.tar.gz $ tar -xzf xtrabackup-*-macos.x86_64.tar.gz $ cd xtrabackup*
2. This directory should contain innobackupex-1.5.1 and xtrabackup. You need to install these into a directory that appears in your $PATH. In my case, I am going to group it install it where my MySQL binaries are located (/usr/local/mysql/bin): $ ls innobackupex-1.5.1 xtrabackup $ cp * /usr/local/mysql/bin/
3. Create a directory where you want your backup to go. In this case it's just a demo - so I'll use my tmpdir. $ mkdir -p /tmp/backup
4. Test running the backup: $ innobackupex-1.5.1 /tmp/backup/
Quick Explanation:innobackupex-1.5.1 is a Perl script that insures that all of your non-InnoDB tables and other MySQL meta data is backed up. You can think of it as a wrapper around xtrabackup, which backs up the data inside InnoDB. 5. Check the data is backed up: $ cd /tmp/backup $ ls 2009-04-08_15-12-52 $ cd 2009-04-08_15-12-52 $ ls backup-my.cnf mysql-stderr xtrabackup_binlog_info employees mysql-stdout xtrabackup_checkpoints ibdata1 test xtrabackup_logfile mysql
6. Attempt a recovery: innobackupex-1.5.1 --copy-back /tmp/backup/2009-04-08_15-12-52
Disclaimer: In case you didn't know it, I work for Percona - the company that wrote xtrabackup. | | 1:48 pm |
Who is going to make MySQL easier to use?
This may appear a bit of a rant - but it's really intended as more of an observation from having trained people how to use MySQL, and noticing that everyone seems to make the same beginner mistakes. If you read the " Continued MySQL Values" on the MySQL Website, you'll notice that the third one in the list is:
- The best and the most-used database in the world for online applications
- Available and affordable for all
- Easy to use
- Continuously improved while remaining fast, secure and reliable
- Fun to use and improve
- Free from bugs
Note that "Fun to use" doesn't sound much like a database, and "free from bugs" will always be a distant dream ;).
It was Easy to Use that got me into MySQL, but I think this is one of the goals that has lost focus over the years. If you look at bugs.mysql.com, there are a lot of annoying little S5 (Feature Requests) that would probably take the right person only a few minutes to fix. The sort of things I am talking about are:
- SHOW SLAVE STATUS - Has at least one annoying ease of use bug. If it refuses to connect to the master because it shares the same server id, it won't show you that here. You have to go to the log file.
- SHOW SLAVE STATUS Also shows you the 'last error' it incurred, with no way to clear this error. It confuses beginners
- What is the difference between this wait_timeout and interactive_timeout thing, and why is it that when I set wait_timeout to 0 it converts to wait_timeout =1 and disconnects me? - I would prefer wait_timeout=0 to mean unlimited - not that I can think of too many good reasons to use it.
- The anonymous user serves almost no practical use.
- Changing InnoDB log file size requires you to do some shutdown and rename trickery, when this should be automatically done for you.
- When connecting to a server on -h localhost mysql decides that "oh, you want the socket file", and due to misconfiguration it might go to the wrong location and tell you it can't connect even though your server is running.
- The error log file is very bad at offering levels of configurable verbosity, or consistency which would allow you to grep through it for errors easier. Often it tells me an error code, which I am supposed to look up in perror - but most beginners don't know about perror.
- ...
But they are probably never going to be fixed. I think it is sad to see that many MySQL users have now seen that these issues don't get fixed, so they no longer submit feature request bug reports. My theory is that:
- MySQL/Sun is busy implementing new features to snag new customers. If they update a feature (such as the subquery optimizations in 6.0) it is normally going to be one of those big compelling features that all of their customers are demanding.
- The small number of Community Contributors are most likely going to be writing patches to scratch an itch (either better performance or diagnostics). It's not going to be one of these annoying little things they long ago discovered and will no longer care about.
It's good to see that Drizzle has actually been very good at attacking a large number of items on my list. It's just sad to have to wait until Drizzle is ready, because this part of MySQL isn't broken, it just needs some love. | | Wednesday, March 18th, 2009 | | 9:39 am |
New Amazon EC2 Features - Reserved Instances
So you can now reserve EC2 instances - which brings the cost down to about 6.7cents/hour averaged over a year. It's a great idea. I wonder if the next step will be to allow more customized instance types at a price premium, provided a reservation is made for 1-3 years. I could certainly use a machine with faster IO for MySQL boxes. | | Wednesday, February 11th, 2009 | | 5:33 pm |
I wish I had more levels of verbosity in logging I've been working as a Ruby on Rails developer the last couple of months. It's interesting to see how my impression of MySQL changes when I'm on the other side - and using a development environment I am less familiar with. Here are a two things I wished I could have been able to do:
- When
--log-warnings=2 is enabled, log all statements the server receives that cause warnings or syntax errors.
- When
--log-warnings=2 is enabled and --some-other-setting, log all statements which return empty results.
Not that it caused me too much pain - but I think I could have benefited. I think I've read something about both of these before too... anyone know if it was in the Drizzle or Google patches? | | Friday, January 30th, 2009 | | 4:42 pm |
MySQL Consulting Companies
Has anyone else noticed that almost all of the consulting companies that support MySQL (and blog) start with a 'P'? ( Percona, Pythian, Proven Scaling). I think OpenQuery needs a name change to keep up with the market. Even though 'O' is the letter next to 'P' in the alphabet, it requires more than one bit flip - so it's not that close. | | Sunday, January 25th, 2009 | | 6:08 pm |
My take on the Sun Database Group visa issues
Seeing this post make the news today really interested me - since I had the (dis)pleasure of being personally involved. In the Australian spirit of 'giving word to the underdog', let me provide some clarity. But before I do - my kudos to Kaj for already following up and correcting himself on the gray details. The person in question ('KV') was not going to Australia to speak at a conference, but deliver a public training course. To do this, you do need a business visa. Heck, you need a visa if you want to teach in the USA - so those speculators calling Australia some draconian system that doesn't understand Open Source is just wrong. Some departments know it very well. A lot more than my adopted home of Quebec ;) The only advantage the USA has over Australia, is that speaking at conferences can be done with the VISA waiver system. But then again, in Australia's defense the visas are acquired online, and much easier to acquire than my USA one was ;) Why was local_mysql_activist upset?local_mysql_activist runs a business in Australia training on Open Source technologies. On the same trip KV was due to teach in Sydney - KV was supposed to teach in Canberra. And it was during that week in Canberra, local_mysql_activist had *already scheduled a class* that would no doubt compete for potential customers. But competition is good, no? Well, yes. But Sun can survive a lot longer on classes that only half fill than local_mysql_activist can. In the spirit of healthy competition, they could have picked the week before/after - but that's their choice. There are also other cities in Australia that come to mind before Canberra, which would have filled up. The real problem with the Sun/MySQL course is that it never actually ran. It was canceled at the last minute due to low numbers, and customers were offered credits/refunds. They probably didn't have enough time to book in to local_mysql_activist's class - so in the end he was the real loser. The show did go on!At the time I was also in the same training group at MySQL/Sun - and I happen to hold an Australian passport. When KV was put-off by local_mysql_activist - the show still went on. People in Australia still got their MySQL DBA course. I think someone somewhere should acknowledge this bit in their stories. I don't support the way local_mysql_activist went about things. Involving a government is just a messy, messy, disaster. But I can fully understand his frustrations. What I would have done is capitalized on what MySQL's can't do - offer a completely reputable, third party criticism on which features work and which ones don't; i.e. Students used to ask me questions about what guides we had on migrating from Oracle to MySQL. I used to tell them there was no really definitive guide, but if there was - don't you think we have a conflict of interest in producing it!? Visa issues suckVisa issues suck. I completely agree with Kaj on this one, having gone through both a Canadian and a USA visa myself. Governments in general just don't know how to deal with the fact that you can be employed in a different country to where you work. Initially I was told by an immigration official that I wouldn't require a Canadian visa since my travel loosely met the definitions of a 'business traveler' here on business from Australia, as long as I never received employment in Canada. That was until I came back in the country one day, and they detained me in immigration for a few hours, while they questioned me about my work - and eventually decided they were going to let me in, but I had 50 days to get a proper visa or get out. I only just made it ;) [Disclaimer - I no longer work for Sun Microsystems. If I wasn't so busy at my new job, I would have checked my RSS reader and replied earlier!] | | Wednesday, December 24th, 2008 | | 3:46 pm |
Heading to India
I'm about to leave for India for a few weeks - so if you're looking for me in Montreal - I won't be there. What brings me there? this.. | | Thursday, December 18th, 2008 | | 9:03 am |
IO scheduling in the 2.6 kernel
I was surprised by even the gap I saw on Vadim's post on the improvements of using the Noop IO scheduler. I've been changing my thoughts on what to set the scheduler to lately, and it's all leaning to Noop as the default. An explanation first:IO Schedulers (aka elevators) are a method of trying to get the best possible performance out of your disk subsystem as possible. Since your disk is essentially a mechanical device - it has a difference in performance between whether or not you are performing actions sequentially - or when you are performing actions randomly. And this difference can be huge! Last time I tested, a typical 7200RPM consumer hard drive could write 60MB/s sequentially, but performance dropped to only a few MB/s when I started trying to write small pieces of random data. So how do the IO schedulers work?They achieve this (mostly) by doing request reordering and merging, and by trying to read platters in one continuous direction. They may even detect that you are writing sequential blocks, and slightly delay an operation in order to 'save cost'. Each IO scheduler will have different algorithms regarding how they do this reordering. For example, on a desktop Operating System you are probably more concerned about your MP3s not skipping than about the maximum sustained performance. Death to schedulersThe problem with using techniques like IO scheduling is that the Linux kernel is pretty dumb to all the layers below it. Hard drives themselves have their own scheduling mechanisms, and if you are running a RAID controller *it* will have it's own scheduling mechanisms. The last point is important - If you are doing scheduling when you have a RAID controller, from Linux's perspective it's probably all one big block device. The scheduler is making all sorts of assumptions about blocks being aligned on disk and it's WRONG WRONG WRONG - you probably have some sort of striping. So all the IO scheduler is doing is adding latency (bad) and to probably applying some partial serialization to writes (double bad). So in that case, it's better to tell Linux to mind it's own business. In which case you want the Noop scheduler. If you are curious where to learn more, I think the best references to learn more about scheduling have been some of the talks by the Youtube guys, and an earlier post by Domas Mituzas. | | Wednesday, December 17th, 2008 | | 1:13 pm |
Why you don't want to work in a travel job for Sun
I left Sun over a month ago. Despite my manager approving my expense report, the accounting team has refused it. Now I'm supposed to pay a $6,000 Amex bill while I wait for them to figure out what they are going to do. | | Wednesday, December 10th, 2008 | | 10:05 am |
On Synergy: Culture conflicts between Sun and MySQL
Working at Sun was my first acquisition experience. I guess it was what I expected; managers hyping it up about being a "perfect match", and how much the two companies had in common. It was kind of interesting to see this even turned up a notch after they received additional "Sun management training". Anyway, I digress.... I'll state upfront I consider my experience a bad one (but I'll save the personal stories for another day). Here was an issue I saw while training Sun staff on how to user MySQL: Sun's has a conflict of interest in selling hardware.MySQL (InnoDB) doesn't actually *work* on big computers. It only scales up to about 4-8 CPU cores, and then it hits all sorts of internal bottlenecks. Most architectures work around this by using many small machines rather than one big one (aka "scale out"). But for Sun the profits are larger on selling *bigger* hardware. Most of Sun's bigger hardware (SPARC) has many more CPU cores, but each of these cores are infact slower than most Intel/AMD cores. So it doesn't work. I'm not sure that the "old guard" of Sun Sales people will take to selling smaller, lower margin systems. I can predict them still trying to continue to either sell bigger machines (and suggest deploying Oracle), or sell bigger machines that are actually unsuited to MySQL[1]. I remember hearing a Clayton Christensen talk on when Intel launched Celeron - and they sold them out of a completely different office. That sounded smart. I think the idea of using commodity hardware installing DRBD+Heartbeat was the hardest to explain to Sun employees in HA classes. They didn't see why someone wouldn't buy a $5,000-$10,000 SAN and be done with it (Note: I should point out DRBD has other advantages besides being a low-cost SAN replacement). [1] This review is just one example. The analysis is even more interesting. | | 9:49 am |
Eating your own dog food.
I'm pretty happy to hear that the MySQL Website, and the MySQL Bugs system are powered by 5.1. I think this is a real step forward from when 5.0 was released. I just want to know when the support.mysql.com website will use 5.1. It has a lot heavier requirements, and with contracted SLAs to customers Sun would be making a real commitment if it were to upgrade that. | | Friday, November 28th, 2008 | | 2:16 pm |
There's nothing point one about 5.1
MySQL 5.1 is GA. Yay! A lot of new features have been added, and the numbering convention of just adding a .1 doesn't really explain that. If I had of numbered it, I probably would have called it "6.0". In some ways MySQL has done both themselves (and DBAs) a small injustice. While working at MySQL I met a lot of customers that tended to be conservative - they don't install first releases, but instead wait for the second release[1]. In the case of 5.1, just be aware that there will be quite a few more features, and with it will be more bugs. I think it's more stable than 5.0 - but you will still need to do plenty of testing. I'm happy to see it finally released though - 3 years in the making! But again if I had it my way, it would have been good to see a real "Point 1" release to 5.0. There were a lot of new features introduced in late 2005 that only required small addition. Changes that were large enough that the current 'no new features in a GA release' rule restricted, but not big enough to break 99% of applications[2]. I can only hope that 6.0 is 5.1's "point 1 release", and not just a deluge of new features 2 years late. Partitioning could be awesome if things like the "can't mix storage engine" limitation were lifted. Quickly. I probably will be waiting at least until Percona and OurDelta update to 5.1 GA, and perhaps another month after that. [1] I think Oracle causes this - having traditionally offered much stronger second editions. [2] For example; I now don't have to use the SUPER privilege for triggers, but I still have no way of using SIGNAL in a stored procedure. It's a shame that for both of these the compile cache is still per-connection. | | Monday, November 24th, 2008 | | 1:39 pm |
Dead-simple server monitoring solutions
Dear Lazyweb. So, here's my thoughts: * Create a basic php/rails/insert your application language choice page. * Have it do a simple SELECT 1+1 from MySQL. * Print the results to screen. Are there any third party (independently hosted) monitoring tools out there that (for free or cheap) I can use to then connect to this page, and make sure the results are as expected? |
[ << Previous 20 ]
|