The Acumen software design blog provides you with helpful articles about MySQL.

Several web sites provide a good overview.

Official MySQL Services Page

MySQL Wikipedia

NoSQL’s popularity has been on the rise over the last five years, with favorites like MongoDB, CouchDB, Cassandra, and Redis on the forefront. With NoSQL’s strengths in its fantastic performance and the ability to store and query denormalized data sets, there are a number of reasons to use NoSQL over SQL technologies like MySQL or Microsoft SQL Server.

Denormalized Data Structures

NoSQL allows developers to store denormalized data structures in document form. While many find this more enjoyable and easier to work with, there is a bit of an adjustment for someone come from a structured SQL background. For example, take a look at the following documents inside of the same MongoDB collection:

{
	firstName: "Kerry",
	lastName: "Ritter",
	city: "St. Louis",
	state: "Missouri",
	serverExperience: [
		"Apache",
		"IIS"
	],
	programmingLanguages: [
		"PHP",
		"C#",
		"JavaScript"
	]
}
{
	firstName: "Rob",
	lastName: "Wagnon",
	city: "St. Louis",
	state: "Missouri",
	serverExperience: [
		"Apache",
		"Nginx",
		"IIS"
	],
	databaseExperience: [
		"Microsoft SQL Server",
		"MySQL"
	]
}
{
	firstName: "Dave",
	lastName: "Mueller",
	city: "St. Louis",
	state: "Missouri",
	serverExperience: [
		"Apache",
		"Nginx",
		"IIS"
	],
	programmingLanguages: [
		"PHP",
		"C#"
	]
}

A similar normalized database structure similar to the following 7 tables:

Person (ID, FirstName, LastName, State)
ProgrammingLanguages (ID, Title)
ServerTypes (ID, Title)
DatabaseTypes (ID, Title)
Person_ProgrammingLanguages (PersonID, ProgrammingLanguageID)
Person_ServerExperience (PersonID, ServerTypeID)
Person_DatabaseExperience (PersonID, DatabaseTypeID)

So now, for just a basic structure, we have to put some effort in to create these tables and their relationships, while using MongoDB prevents this by allowing a denormalized document storage method.

The amount of tables grows considerably the more variable your data becomes; if we add a collegeDegreesObtained array to list the degrees held by each person, we would have to create two more tables: to manage the types of degrees and the connecting table between the degree and person. In NoSQL, we simply add an array to the document and we are done. We do not risk breaking any queries, having errors from NULL data, or having to manage default column values.

Key points:

  • Less work overhead for one-to-many relationships
  • Less work overhead for adding new data to an entity

Querying

Querying in MongoDB also tends to be a little simpler when doing the more basic lookups. For example, to find someone with the first name Kerry, we do the following:

db.people.find({ firstName: "Kerry" })

This translates to the following in SQL:

SELECT * FROM Person WHERE FirstName = "Kerry"

Not much difference in the amount of work. However, if we want to find someone with knowledge of the PHP scripting language and the Nginx server, we would do the following in NoSQL:

db.people.find({ programmingLanguages: "PHP", serverExperience: "Nginx" })

This would translate to the following query in SQL:

SELECT * FROM Person p
INNER JOIN Person_ProgrammingLanguages ppl ON p.ID = ppl.PersonID
INNER JOIN ProgrammingLanguages pl on ppl.ProgrammingLanguageID = pl.ID
INNER JOIN Person_ServerExperience pse ON p.ID = pse.PersonID
INNER JOIN ServerTypes se on pse.ServerTypeID = se.ID
WHERE pl.Title = "PHP" AND se.Title = "Nginx"

This example demonstrates the simplicity of querying MongoDB documents in comparison to querying complex relational data in MySQL. As stated before, SQL query size and overhead will grow much faster with new data, while MongoDB queries and documents will still relatively small and manageable.

Key points:

  • Simpler querying mechanism when searching for relational data (in SQL) values

Performance

There are a number of benchmark studies out there demonstrating the speeds of various NoSQL implementations vs SQL implementations. While we do not have any formalized studies, I did some testing personally and found that when inserting millions of documents into MongoDB and inserting a corresponding row into Microsoft SQL Server table, MongoDB took under half of the time. Querying was a similar story; MongoDB cut about half of the time to find a piece of data in a very large dataset. However, MongoDB eats up a lot of memory, so make sure to be cautious of that.

If you are not sure whether NoSQL will offer you performance advantages, simply search the web for comparisons and you will find a number of them.

Key points:

  • NoSQL typically offers better performance and speed
  • NoSQL tends to use and require a lot of RAM, so be cautious

Drawbacks

There are some drawbacks to using NoSQL; it is not a perfect solution to everything. Each implementation has its own issues. In our development of a genealogical search, we used MongoDB and found these issues.

Disk space consumption: MongoDB tends to take up a lot of space in comparison to the amount of data. There are some solutions to this problem (TokuMX, which will be discussed in a later blog), but MongoDB itself isn’t the most space-efficient solutions.

Document size restriction: The max size of a MongoDB document is 16MB. This was a particularly complex issue to work around if your document is a company with an array of subdocuments containing information on employees. If that list gets too large, you will have to split the company document and then re-connect them with an aggregation pipeline.

Pagination using skips and limits have very bad performance: Say your company has 10,000 people subdocuments. To skip the first 9000 is a serious performance hit as it seems to run your query parameters against each row. This is a completely ineffective method for paginating large datasets, but there are some workarounds, such as using a $gte parameter on the last page’s item ID.

When to use NoSQL

While this choice is always up to the project’s needs and the developers, NoSQL should very much be considered if the data is going to be very relational. Also, NoSQL can have some great performance benefits when used correctly. If you’ve never worked with NoSQL before, its barrier to entry is relatively low and implementations like MongoDB have great communities who can help you when you’re stuck.

Our PHP development team encountered many situations where remote connections to mysql take 5 seconds to return responses.  In some cases, we’ve rebuilt servers after tearing our hair out.  This has happened on Windows, Linux, PHP, apache, WordPress, joomla, Zabbix, and so many cases it makes me want to scream.  Of course, people love to point to Windows as the problem.  Lots of other people say optimize mysql.  Others say to configure your long_query_time parameter and check the logs.  None of this matters, because it isn’t Windows, it isn’t Linux, it isn’t PHP, it isn’t apache, it isn’t WordPress, it isn’t Zabbix, and it isn’t an optimization issue.

It is a reverse DNS problem.  Not a DNS problem, a reverse DNS problem.  A reverse DNS problem on your mysql server.  Which is absurd.  Why on earth would mysql server check for reverse DNS?  Why?

Because mysql wants to look at the host name for GRANT lookups.  Jeez.  Who cares.  I’d rather just fail the security than have all my applications wait five or ten seconds for every operation.

Solution

You can do one of two things to fix this:

1. Work with your network administrator to create reverse DNS PTR records.  Maybe you can get this done, maybe you can’t.

1b.  Create a host entry in the hosts (/etc/hosts or c:\windows\system32\drivers\etc\hosts).  This is good for both forward and reverse DNS.  Just never forget that you made it three years from now when you rename servers.

2.  Configure mysql with skip-name-resolve as described here.  For people that can’t control their reverse DNS PTR records – or can’t get help having them configured – this is the best option.

Want to make this really crazy?  If you have a multiple subnet network work with multiple DNS servers for hundreds of domains and split-horizon DNS for internal networks, you may have the extra fun of configuring conditional forwards for reverse DNS.  If you want to know how to do that, just ask in the comments.  (It’s not hard, so google may have already helped you.  You just have to know you need it.)

My apologies that this is an unusually snarky post.  It’s only consumed 80+ hours cumulative of my life.

Keywords

MySQL, PHP Development