A Newbie Guide to Databases

September 21 2015
 


Databases are an incredibly useful tool for storing data. Similar to a library storing books in an organized structure consisting of categories – such as history or art – a database imposes order on your data so that you can quickly construct the information you are looking for. Different types of databases use different structures to organize the data. The kind of database you need to use depends on the nature of your data and how you plan to use it.

A Brief History of Databases

In 1956, IBM introduced hard disk drives that could directly retrieve the information you needed, without having to trawl through all the data stored on the drive. This required data to be organized so that the computer could find and retrieve the correct record. IT giants of the day – IBM and UNIVAC – developed the early database systems.

Structured query language (SQL) is a way of managing information held in databases. It has been very widely used since its development in the 1970s, and it is still a favorite database manipulation tool today. It’s even been extended into Object Query Language (OQL), which you can use to handle databases that contain objects, such as audio files, images, maps, videos and other data that aren’t simply numbers and characters. However, there are ways to deal with databases that go far beyond simple SQL or OQL commands. The name given to methods of database management that don’t use SQL is “NoSQL.”

Types of Databases

Databases come in many forms. Understanding the distinct differences between them will help you to understand how to use each one to store and retrieve your data.

Relational Databases are Useful for Data with a Clear Structure

Relational databases organize data into tables, with joins linking the tables together. For example, consider a company database that includes two tables: one that stores details about each employee, and one which stores information about the various departments. Each row of the first table contains information about one employee, including the number of the department where the employee works. Each row of the second table contains information about a department, including the department name, number, and location. The department number, which appears in both tables, links the tables together into a single database. Thanks to the join between the two tables, you can look up the location of the department where a particular employee works using this database.

 

 

Figure 1 – Relational tables between Employees and Departments

Non-Relational Databases Have a More Flexible Structure

Non-relational databases aren’t based on a series of joined tables like relational databases are. As they do not store data in the traditional rows-and-columns format, non-relational databases are harder to visualize, but they can be useful in some circumstances. For example, because non-relational databases use specialized frameworks to store data, they are often used for big data applications.

Extracting Information from Databases

To unlock the secrets of databases, you need to learn to speak their language. For relational databases, that language is usually SQL – structured query language. Some companies, including Microsoft and Oracle, have their own languages for handling databases. These proprietary languages are still based on the standard version of SQL, so you need to understand the basics of how the language works.

What is SQL?

If you want to look up or edit information stored in a database, you’ll need to write SQL scripts that tell your computer what you want to do with the data.

SQL is useful for retrieving or updating information in a relational database. You can also use SQL commands to manipulate data in a Microsoft Access database – technically the data storage in this system is not relational, but you do not need to worry

SQL vs. NoSQL: Pros and Cons

NoSQL has several advantages over SQL. NoSQL databases do not have to contain information in every row or every column. It is also easier to quickly add information to a NoSQL database.

When you are building a relational database, you first need to define the schema – the structure of the database. For example, if you want to create a database that stores customer data, such as names, phone numbers and email addresses, then you need to define these categories in the database before you start adding data to it. If you later decide that you want to add some more categories, such as customers’ usual orders, then you will need to define a new schema that includes those categories. You then have to migrate all the data in the existing database to the new schema, which for large databases can take a long time. This can lead to downtime, during which your customers cannot access functions on your site that depend on the database. Though such periods are necessary for upgrading your service, they can also damage your business’s reputation with the public.

NoSQL databases instead use dynamic schema. You can insert data into this kind of database without having to stop and redefine the database’s structure or cause any downtime. If you do not know what kind of data you’ll need to add into your database, then a NoSQL database could be a convenient choice.

 Source: http://db-engines.com/en/ranking

Business Needs determine Scalability Needs

NoSQL databases scale more conveniently than their SQL counterparts. For relational databases, scaling is vertical. That means that as you add more data to create a larger database, you need a bigger server to handle it, and those larger servers can be expensive. It is technically possible to scale a relational database across multiple servers instead (horizontal scaling), but it is tricky to set up.

Horizontal scaling is a breeze with NoSQL databases. That means you can spread your database across multiple servers, which gives you the option of using a cheap cloud server rather than a single dedicated server.

What’s the History of NoSQL?

NoSQL had been around since 2009 when developers met to discuss the possibility of using open-source, non-relational databases to manage complex collections of online data. Since that time, four main types of NoSQL databases have been developed: key value, column-oriented, document stored and graph-based.

Which Database Server Should You Use?

There are various types of SQL and NoSQL database servers, each with its set of use cases.

  • MySQL: As one of the most popular database servers, MySQL is an easy option for developers who are new to using databases. There are many tools and libraries to help you achieve whatever you want to use. Many major industry players use MySQL, including Verizon Wireless and Zyme.
  • PostgreSQL: PostgreSQL is an advanced, open-source database management system. It’s not as easy to get started with as MySQL, but it’s the go-to solution if you need your database to perform elaborate procedures, or you need to migrate it to another solution, such as Oracle, in the future. PostgreSQL is used by many major companies, such as Skype.
  • MSSQL: The Microsoft SQL server is useful for data warehousing and analytics. The industry penetration of MSSQL is wide; if you use a Microsoft server, then it is likely to be running the Microsoft SQL Server

In addition to the SQL databases listed above, four main types of NoSQL databases have been developed:

  • Key Value: A key-value database has a hash table containing keys to particular items of data so that they can be easily retrieved. They are a good choice for large databases, particularly when you need to be able to perform quickly small read and write operations and are simple to implement. Oracle BDB is an example of a key value database.
  • Column Oriented: If you need to store immense amounts of data distributed between many machines, then a column-oriented database is an excellent choice.
  • Document Stored: A document stored database is like a key-value database taken to the next level, as you can associate several nested values with each key. They are great for allowing you to efficiently query the database.
  • Graph Based: Able to store data across multiple machines, a graph-based database has a much more flexible structure than the traditional structure of tables made up or rows and columns. This type of database is useful in social networking applications.

Is NoSQL the Future of Databases?

Despite the name, the aim of NoSQL is not to consign SQL to the history books. In fact, NoSQL stands for “not only SQL”. There are still plenty of applications where SQL is useful, such as those where you know that the structure of the data will not change. However, in situations where the amount of data you have to handle grows rapidly and could require a change in the database structure, a NoSQL database is the right choice.

NoSQL offers the flexibility that modern businesses need to help them manage massive amounts of complex data. As big data becomes increasingly important in a broad range of industries, NoSQL databases provide a format in which many different types of data can be stored and analyzed, allowing companies to gain insight into all aspects of their business.

Interested in learning more about database performance and monitoring? Read more here.

Omed Habib
Omed Habib is a Director of Product Marketing at AppDynamics. He originally joined AppDynamics as a Principal Product Manager to lead the development of their world-class PHP, Node.js and Python APM agents. An engineer at heart, Omed fell in love with web-scale architecture while directing technology throughout his career. He spends his time exploring new ways to help some of the largest software deployments in the world meet their performance needs.

Thank you! Your submission has been received!

Oops! Something went wrong while submitting the form