The Acumen software design blog provides you with helpful articles about SQL.

Several web sites provide a good overview.

Official Microsoft SQL Server Training Overview

SQL Wikipedia

NoSQL’s popularity has been on the rise over the last five years, with favorites like MongoDB, CouchDB, Cassandra, and Redis on the forefront. With NoSQL’s strengths in its fantastic performance and the ability to store and query denormalized data sets, there are a number of reasons to use NoSQL over SQL technologies like MySQL or Microsoft SQL Server.

Denormalized Data Structures

NoSQL allows developers to store denormalized data structures in document form. While many find this more enjoyable and easier to work with, there is a bit of an adjustment for someone come from a structured SQL background. For example, take a look at the following documents inside of the same MongoDB collection:

{
	firstName: "Kerry",
	lastName: "Ritter",
	city: "St. Louis",
	state: "Missouri",
	serverExperience: [
		"Apache",
		"IIS"
	],
	programmingLanguages: [
		"PHP",
		"C#",
		"JavaScript"
	]
}
{
	firstName: "Rob",
	lastName: "Wagnon",
	city: "St. Louis",
	state: "Missouri",
	serverExperience: [
		"Apache",
		"Nginx",
		"IIS"
	],
	databaseExperience: [
		"Microsoft SQL Server",
		"MySQL"
	]
}
{
	firstName: "Dave",
	lastName: "Mueller",
	city: "St. Louis",
	state: "Missouri",
	serverExperience: [
		"Apache",
		"Nginx",
		"IIS"
	],
	programmingLanguages: [
		"PHP",
		"C#"
	]
}

A similar normalized database structure similar to the following 7 tables:

Person (ID, FirstName, LastName, State)
ProgrammingLanguages (ID, Title)
ServerTypes (ID, Title)
DatabaseTypes (ID, Title)
Person_ProgrammingLanguages (PersonID, ProgrammingLanguageID)
Person_ServerExperience (PersonID, ServerTypeID)
Person_DatabaseExperience (PersonID, DatabaseTypeID)

So now, for just a basic structure, we have to put some effort in to create these tables and their relationships, while using MongoDB prevents this by allowing a denormalized document storage method.

The amount of tables grows considerably the more variable your data becomes; if we add a collegeDegreesObtained array to list the degrees held by each person, we would have to create two more tables: to manage the types of degrees and the connecting table between the degree and person. In NoSQL, we simply add an array to the document and we are done. We do not risk breaking any queries, having errors from NULL data, or having to manage default column values.

Key points:

  • Less work overhead for one-to-many relationships
  • Less work overhead for adding new data to an entity

Querying

Querying in MongoDB also tends to be a little simpler when doing the more basic lookups. For example, to find someone with the first name Kerry, we do the following:

db.people.find({ firstName: "Kerry" })

This translates to the following in SQL:

SELECT * FROM Person WHERE FirstName = "Kerry"

Not much difference in the amount of work. However, if we want to find someone with knowledge of the PHP scripting language and the Nginx server, we would do the following in NoSQL:

db.people.find({ programmingLanguages: "PHP", serverExperience: "Nginx" })

This would translate to the following query in SQL:

SELECT * FROM Person p
INNER JOIN Person_ProgrammingLanguages ppl ON p.ID = ppl.PersonID
INNER JOIN ProgrammingLanguages pl on ppl.ProgrammingLanguageID = pl.ID
INNER JOIN Person_ServerExperience pse ON p.ID = pse.PersonID
INNER JOIN ServerTypes se on pse.ServerTypeID = se.ID
WHERE pl.Title = "PHP" AND se.Title = "Nginx"

This example demonstrates the simplicity of querying MongoDB documents in comparison to querying complex relational data in MySQL. As stated before, SQL query size and overhead will grow much faster with new data, while MongoDB queries and documents will still relatively small and manageable.

Key points:

  • Simpler querying mechanism when searching for relational data (in SQL) values

Performance

There are a number of benchmark studies out there demonstrating the speeds of various NoSQL implementations vs SQL implementations. While we do not have any formalized studies, I did some testing personally and found that when inserting millions of documents into MongoDB and inserting a corresponding row into Microsoft SQL Server table, MongoDB took under half of the time. Querying was a similar story; MongoDB cut about half of the time to find a piece of data in a very large dataset. However, MongoDB eats up a lot of memory, so make sure to be cautious of that.

If you are not sure whether NoSQL will offer you performance advantages, simply search the web for comparisons and you will find a number of them.

Key points:

  • NoSQL typically offers better performance and speed
  • NoSQL tends to use and require a lot of RAM, so be cautious

Drawbacks

There are some drawbacks to using NoSQL; it is not a perfect solution to everything. Each implementation has its own issues. In our development of a genealogical search, we used MongoDB and found these issues.

Disk space consumption: MongoDB tends to take up a lot of space in comparison to the amount of data. There are some solutions to this problem (TokuMX, which will be discussed in a later blog), but MongoDB itself isn’t the most space-efficient solutions.

Document size restriction: The max size of a MongoDB document is 16MB. This was a particularly complex issue to work around if your document is a company with an array of subdocuments containing information on employees. If that list gets too large, you will have to split the company document and then re-connect them with an aggregation pipeline.

Pagination using skips and limits have very bad performance: Say your company has 10,000 people subdocuments. To skip the first 9000 is a serious performance hit as it seems to run your query parameters against each row. This is a completely ineffective method for paginating large datasets, but there are some workarounds, such as using a $gte parameter on the last page’s item ID.

When to use NoSQL

While this choice is always up to the project’s needs and the developers, NoSQL should very much be considered if the data is going to be very relational. Also, NoSQL can have some great performance benefits when used correctly. If you’ve never worked with NoSQL before, its barrier to entry is relatively low and implementations like MongoDB have great communities who can help you when you’re stuck.

When we use the word “schema”, we are usually referring to the overall design definitions of tables, views, stored procedures and other objects in a database. Within Microsoft SQL Server, however, there is an object called “schema” which permits other objects (such as tables, views and stored procedures) to be logically grouped together. Every table must belong to a schema, and by default, the default schema name for objects in SQL Server is “dbo.” 

When one is confronted with a list of tables in a database and all of the tables are in a single schema, it is easy to get frustrated when viewing a list of tables in the database. The list can contain several hundred table names, and searching to finding what one wants can be awkward.

In the past, designers of databases in Microsoft SQL Server almost always let all objects remain in the “dbo” schema, and according to the rules of T-SQL (Microsoft’s flavor of SQL), if, say, a table were referenced by its table name alone without including the schema name, it was assumed, even by the SQL Server query engine, to be in the dbo schema. Referring to the table Clients was the same as referring to the table dbo.Clients. There was no other schema than the dbo schema, and there was no other table named Clients.  This works well for databases with few objects, such as tables, in them.

In early versions of Microsoft SQL Server, the schema name could only be the name of a database user. Users could not be removed if there were objects in the schema having that user’s name, and this did not work out well in practice. If database designers used schemas other than dbo (not very common), imaginary users were created so that there was no undesirable linkage between users and schemas. So there were users named Mary and Joe, but also users name Purchasing. No one would ever log in as the Purchasing user, but the user/schema name was used to group objects together which primarily dealt with purchasing.

By creating schemas with meaningful names, it is possible to group tables in clusters, often representing the same clusters of tables that appear in database diagrams that show tables as they relate to each other through referential integrity. 

It is possible to grant security rights by schema, and often this turns out to be a logical classification on which to grant or deny such rights. 

Tearing apart and reconstructing an existing database to group tables into schemas is not, in my opinion, something that should necessarily be done; however, these concepts should be born in mind for future database development work. 

SQL Server 

Description:

We have encountered the unfortunate situation where a company has a third-party application and the performance problems associated with this application cannot be solved despite many in-house efforts to do so and many attempts to get cooperation from the vendor in troubleshooting the root cause of the problem.

The vendor has often come back with demands for additional or more powerful hardware or greater network bandwidth where none of the suggestions fix the problem. The company using the software doesn’t know what to do next, especially when the software product is well-known in the industry or is from a major software vendor. Customers don’t know where to go next to seek solutions when the vendor will not cooperate.

When a problem is addressed without cooperation from the third-party vendor, the vendor will usually no longer support the application. This is not an acceptable solution even if solutions are found. The vendor must be persuaded to help in order to effect a real solution. The customer is held back by its own lack of confidence in dealing with the problem to demand a solution.

Many third-party vendors hide behind flimsy policies regarding customer information to refuse to share information they have gathered from other customers which would clearly show that the problem is with their software. They use excuses and tactics which can often be easily disproven without their cooperation. For example, third party vendors often assert that a customer is using their product in a way that is different than other customers; even if there is a problem, the vendor cannot fix the problem without adversely affecting other customers. I have often found that this is nonsense.

SQL Server Problem:

The problem starts when the vendor allocates resources for development of their product. The vendor has or acquires a great deal of knowledge of the business process necessary for the product and designs a great user interface to implement the required functionality.

Because there isn’t a lot to “see” from good data architecture work during the development phase, the vendor has never seriously considered the database as an important part of the application requiring expert resources for design or optimization.

This is true even with Microsoft, where the SharePoint product clearly has many continuing design and tuning issues. (Microsoft continually gets feedback from within the company on this product and in a mind-blowing use of compartmentalization and denial continues to refuse to address these issues.)

Some companies can be persuaded or embarrassed into taking these issues seriously when presented with undeniable evidence and customers who refuse to accept nonsense for excuses.

There are many things that can be done even after deployment. These remedial actions can be implemented as a result of experienced analysis and interpretation of performance monitor traces, results from dynamic management views and SQL Server Profiler traces. Acumen has the experience to do this, and to teach you to do this as well.

As a result of this an engagement where we analyze the use of SQL Server in a third party application, Acumen can provide the following, where appropriate.

Steps:

  1. A written report targeted to a particular application that proves that problems with the application are harmfully affecting performance, if such is the case.
  2. Knowledge transfer from Acumen to you so that you have confidence to pressure the vendor to provide a fix and evidence that the required fixes will not negatively affect performance.
  3. An Acumen consultant can participate in conference calls (or in-person visits) where Acumen becomes your advocate for receiving fixes to your legitimate issues.
  4. Occasionally, Acumen may have previously experiences with this software to defeat claims or implications from the vendor that the problem is yours alone.

Building additional SQL views in Aptify Association and Membership Management Software can sometimes require you to use SQL temporary tables in your view.

If you then attempt to create a database object from said view:

You will get an Aptify Exception error that will include “Views or functions are not allowed on temporary tables. Table names that begin with ‘#’ denote temporary tables.”

To get around this problem:

Take the T-SQL involving the temp table and port it into a table-based function that is called by the view.

The issue comes from the way Aptify enumerates everything returned by the SELECT statement and cannot handle temp tables in a view. Functions don’t have the same restrictions, though, and can be used to return the needed data.

See our Network Page or Contact Us to find more ways we can help you!