Newsfeeds
Planet MySQL
Planet MySQL - http://www.planetmysql.org/
-
Farewell CHM, hello EPUB!
For a long time, the MySQL Documentation Team has been providing CHM files for most MySQL documentation we publish. Like many other formats, CHM-format docs can be downloaded from http://dev.mysql.com/doc. CHM (Compiled HTML Help) has been the de facto standard help file format on Windows since 1997, but the technology behind it is outdated and has all kinds of quirks. The successor format introduced with Windows Vista is AP Help, but it hasn't taken off in practice so far. So, with CHM being outdated and AP Help spread anything but widely, lots of vendors have started providing documentation on Windows in PDF or HTML format.Building CHM-format documentation is a challenge of its own. I'll not go into details here, so let me just state that it requires a dedicated Windows box (or VM), and while it can be automated using Power Shell commands, there's no way to find out whether or not a CHM file was built correctly, except by manual inspection. This makes it different from all other documentation formats where technical QA is done (successfully) in an automated fashion.With the increasing complexity and size of our documentation (the MySQL 5.1 Manual contains more than 1.6 million words now!), providing CHM has become more and more of a pain, because builds tend to break more often. We've stopped shipping CHM with the MySQL Server on Windows months ago because we simply couldn't guarantee that the help file shipped with the software would work. Also, we're running short on hardware resources, so we'd rather stop wasting the resources we have on building a format that's of limited use, anyway.This is why we'll stop providing CHM for any of the documentation we publish.To alleviate potential pains anyone might have with this decision, let me tell you that we've started providing EPUB-format docs. EPUB (see http://en.wikipedia.org/wiki/EPUB) is an open standard format for screen readers, mobile or not, and is fairly easy (and not resource-intensive!) to compile. Thanks to Lenz for suggesting to build EPUB!Go to http://dev.mysql.com/doc to grab MySQL documentation in EPUB format. To read EPUB on desktop machines, I use a Firefox add-on, unsurprisingly called epubreader, which loads EPUB documents fast and renders them nicely. That said, please be aware that EPUB can't do anything about the fact that the MySQL Reference Manual is huge, so downloading it to a mobile device can take a while. The MySQL 5.1 Manual is currently a whopping 15 MB!
-
VLDB 2010
I will be at VLDB 2010 next week. If anyone on this blog is attending and wants to catch up to discuss start ups and innovation in DB, NoSQL, Big Data etc drop me a line and I will try to meet up.
-
XS4ALL offer IPv6 connectivity to retail customers
Good news. I was told by a colleague that the Dutch ISP XS4ALL is offering IPv6 connectivity to its retail customers. You can see here although the comments are in Dutch. They also provide a list of ADSL routers which should work for their service. The Cisco name may not be surprising but this is good publicity for Draytek and AVM FRITZ!box for their products. Hopefully it will also stimulate other SOHO router providers into the act to get their names on the list. Let us hope that more ISPs start to offer this sort of service to their customers. I’m not sure if there’s a list of residential ISPs in each country which provide IPv6 connectivity. If not it might be worth making one and updating it as new providers off this service. I checked my ISP, Jazztel, and was not really surprised to see that neither technical support or the sales staff really new what I was talking about when I asked if they were planning on offering IPv6 support. That’s unfortunate, but I don’t think any other Spanish ISP is any better. That is any other residential ISP in Spain. I expect some of the larger bigger ISPs are likely to off this to business customers.
-
Integrating MySQL and Hadoop - or - A different approach on using CSV files in MySQL
We use both MySQL and Hadoop a lot. If you utilize each system to its strengths then this is a powerful combination. One problem we are constantly facing is to make data extracted from our Hadoop cluster available in MySQL. The problem Look at this simple example: Let’s say we have a table customer: CREATE TABLE customer { id UNSIGNED INT NOT NULL, firstname VARCHAR(100) NOT NULL, lastname VARCHAR(100) NOT NULL, city VARCHAR(100) NOT NULL, PRIMARY KEY(id) } In addition to that we store orders customers made in Hadoop. An order includes: customerId, date, itemId, price. Note that these structures serve as a very simplified example. Let’s say we want to find the first 50 customers, that placed at least one order sorted by firstname ascending. If both tables were in MySQL we could use a single SQL statement like: SELECT DISTINCT c.id, c.firstname FROM customer c JOIN ORDER o ON c.id = o.customerId ORDER BY c.firstname ASC LIMIT 50 Having the orders in Hadoop we have basically two options: We write a Map-Reduce job that reads all customers from MySQL and joins them with the orders stored in Hadoop’s HDFS. The output is sorted by firstname ascending. From the result we use only the first 50 entries. We write a Map-Reduce job to extract all distinct customerIds, write them to a table in MySQL and use a SELECT with a JOIN. In most cases option 2 will be the better choice if we have a non-trivial number of rows in our customer table. And that’s for three reasons: MySQL is not optimized for streaming rows. As our Map-Reduce job would always have to read the whole table, we would stream a lot. You cannot easily write something like a LIMIT clause in Map-Reduce. Even if you could, you’d likely have to read through all customer entries anyway. So the amount of data processed by the Map-Reduce job is significant higher if you use aproach 1. If you just started to move to Hadoop, most of the data structures like categories, product information etc. are still kept in MySQL and most of the business logic relies on SQL. In most application you would not move all your data to Hadoop anyway. So storing Hadoop’s result in MySQL simply integrates better with your existing application. So, storing Map-Reduce results in MySQL seems to be the better option most of the time. But you still have to write all customerIds extracted by the Map-Reduce job into a table. And that is a performance killer. Even if you use HEAP tables it puts a lot of pressure on MySQL. Other options like CSV storage engine are not feasible since they do not provide any keys. And joining without a key is never a good idea. Introducing: MySQL UDF csv_find() and csv_get() One of the big advantages of Map-Reduce is that it produces output sorted by whatever we want. So we could output sorted CSV files. And we could perform binary search on these sorted CSV files. Great! I have written two MySQL User Defined Functions (UDF) that provide find and get functionality on sorted CSV files. How to use it? Taking our example from above we transfer the resulting CSV file from HDFS to the local filesystem of our MySQL server and write a query like this: SELECT * FROM customer WHERE csv_find(‘/tmp/myHadoopResult.csv’, customer.id) = 1 ORDER BY firstname ASC LIMIT 50 And this is a lot faster than inserting the Map-Reduce result into a table. It might even be faster than our original SELECT statement where we assumed both tables customer and order are in MySQL since we are not using a JOIN at all. More on performance later on. How does it work? On initialization of csv_find the CSV file will be loaded into memory using mmap. And since the first column of the CSV file is sorted in ascending order we can simply use binary search on each call to csv_find. If you need to access other columns of a CSV use csv_get(<file expression>, <key expression>, <column expression>). Example: SELECT customer.lastname, csv_get(‘purchases.csv’, customer.id, 2) AS price FROM customer assuming that column 2 contains the price of a product purchased. Prerequisites The following assumptions are made and must be met by your CSV files: Column delimiter is ‘\t’ and row delimiter is ‘\n’. You can change this at compile time. The first column must be sorted in UTF-8 binary ascending order. “binary” means that it has to be sorted by byte value and not by a specific collation. For example ‘ä’ (0xc3 0xb6) comes after ‘z’ (0x7a). In bash you would sort a file like this: LC_ALL=C sort < input.csv > ordered.csv Remember that sorting comes for free in Map-Reduce. No escaping is done. If you need it, you could do the following: First, escape everything in your CSV, say by replacing ‘\n’ with ‘\\n’ and then use csv_find or csv_get like this: csv_find(<file expression>, REPLACE(<key expression>, ‘\n‘, ‘\\n’)) Some MySQL APIs (at least JDBC) treat results of an UDF as binary data. You have to explicitly cast the return value of csv_get like this: SELECT CAST(csv_get(<file expression>, <key expression>, <column expression>) AS CHAR) For more information take look into the source code documentation. Usage patterns other than integrating with Hadoop We use csv_find and csv_get not only to integrate with Hadoop but to integrate multiple MySQL servers. To make data from one MySQL server available in another you could export it like this: SELECT * FROM customer WHERE <some condition> ORDER BY BINARY id ASC INTO OUTFILE ‘/tmp/customer.csv’ Then copy the file over to the other MySQL server (or use NFS). Of course you could use FEDERATED storage engine. We decided not to because it has/had some glitches. Another useful application is to replace complicated JOINs or SUBSELECTs. MySQL is good at performing some JOINs but really poor at a lot others, especially SUBSELECTs. A brief performance evaluation First we create a test CSV file: #> for a in $(seq 1000000 2000000); do echo $a >> /tmp/random.csv; done
-
Micro-benchmarking pthread_cond_broadcast()
In my work on group commit for MariaDB, I have the following situation: A group of threads are going to participate in group commit. This means that one of the threads, called the group leader, will run an fsync() for all of them, while the other threads wait. Once the group leader is done, it needs to wake up all of the other threads. The obvious way to do this is to have the group leader call pthread_cond_broadcast() on a condition that the other threads are waiting for with pthread_cond_wait(): bool wakeup= false; pthread_cond_t wakeup_cond; pthread_mutex_t wakeup_mutex Waiter: pthread_mutex_lock(&wakeup_mutex); while (!wakeup) pthread_cond_wait(&wakeup_cond, &wakeup_mutex); pthread_mutex_unlock(&wakeup_mutex); // Continue processing after group commit is now done. Group leader: pthread_mutex_lock(&wakeup_mutex); wakeup= true; pthread_cond_broadcast(&wakeup_cond); pthread_mutex_unlock(&wakeup_mutex); Note the association of the condition with a mutex. This association is inherent in the way pthread condition variables work. The mutex must be locked when calling into pthread_mutex_wait(), and will be obtained again before the call returns. (Check the man page for pthread_cond_wait() for details). Now, when I think about how these condition variables work, something strikes me as somewhat odd. The idea is that the broadcast signals every waiting thread to wake up. However, because of the associated mutex, only one thread will actually be able to wake up; this thread will obtain a lock on the mutex, and all other to-be-awoken threads will now have to wait for this mutex! Only after the first thread releases this mutex will the next thread wakeup holding the mutex, then after releasing the third thread can wake up, and so on. So if we have say 100 threads waiting, the last one will have to wait for the first 99 threads to each be scheduled and each release the mutex, one after the other in a completely serialised fashion. But what I really want is to just let them all run at once in parallel (or at least as many as my machine has spare cores for). There is another way to achieve this, by simply using a separate condition and mutex for each thread, and have the group leader signal each one individually: Waiter: pthread_mutex_lock(&me->wakeup_mutex); while (!me->wakeup) pthread_cond_wait(&me->wakeup_cond, &me->wakeup_mutex); pthread_mutex_unlock(&me->wakeup_mutex); Group leader: for waiter in <all waiters> pthread_mutex_lock(&waiter->wakeup_mutex); waiter->wakeup= true; pthread_cond_signal(&wakeup_cond); pthread_mutex_unlock(&wakeup_mutex); This way, every waiter is free to start running as soon as woken up by the leader; no waiters have to wait for one another. This seems advantageous, especially as number of cores increases (rumours are that 48 core machines are becoming commodity). "Seems" advantageous. But is it really? Let us micro-benchmark it. For this, I start up 5000 threads. Each thread goes to wait on a condition, either a single shared one, or distinct in each thread. The main program then signals the threads to wakeup, either with a single pthread_cond_broadcast(), or with one pthread_cond_signal() per thread. Each thread records the time they woke up, and the main program collects these times and computes how long it took between starting to signal the condition(s) and wakeup of the last thread. (Here is the full C source code for the test program). I ran the program on an Intel quad Core i7 with hyperthreading enabled, the most parallel machine I have easy access to. The results is the following: pthread_cond_broadcast(): 46.9 msec pthread_cond_signal(): 17.6 msec Conclusion: pthread_cond_broadcast() is slower, as I speculated. I would expect the effect to be more pronounced on systems with more cores; it would be interesting if readers with access to such systems could try the test program and comment below on the results.


