I recently helped to develop a small website to share images. I decided to use the new FILESTREAM feature of SQL Server 2008 to store the images. Along the way I encountered some gotchas, so I wrote this article to document them here.
Usually, I am the last person to want to store files in a relational database. Windows comes with a datastore highly optimized for storage of arbitrary blobs of binary data called the NTFS filesystem. However, SQL Server 2008 has a really nice feature where you can store the content of varbinary fields directly on the filesystem. This is a best of both worlds approach. Each blob in the varbinary column of your table is stored in a separate file on your filesystem. However, you can access the contents of this file through a SQL query.
The complete example can be found here. The zip archive contains both the SQL script to create an example database, and a sql backup file you can restore on your system.
Enabling FILESTREAM on your SQL Server instance.
There are two steps to do this. The first is enabling it either at install time via the installer, or via a vbscript after the fact. The second is turning it on via a system stored procedure.
Enabling After the Fact
You probably did not enable filestream support when you installed SQL Server 2008, unless you went out of your way to enable every feature. If you did already enable it, running this script won’t hurt anything.
The vbscript is available via the Microsoft SQL Server Engine Code Sample Project on Codeplex. That page does a good job of documenting all the command line arguments. The main concern most developers using this will have is the named instance of the SQL server you wish to enable filestream support on. Most developers trying this out are doing it from their machine. They are probably using the express edition of SQL Server 2008 which has a default instance name of SQLEXPRESS. The script defaults to attempting to enable filestream support on the named instance MSSQLSERVER. Therefore you will most likely have to specify the option “/Instance:SQLEXPRESS”.
So once you verify the named instance you want to enable filestream support on, you run the following command:
cscript filestream_enable.vbs [/Instance:InstanceName]
That was simple. Now we move on to turning on filestream support.
Turning on FILESTREAM support
Ok now fire up sql management studio or sqlcmd and execute the following:
EXEC sp_configure filestream_access_level, 2 RECONFIGURE GO
Once again, you can run this as many times as you want without negative side effects.
Enabling FILESTREAM on Your Database
Now filestream is completely enabled on your sql server instance. However, you have to enable it on your database. We do this by adding a filestream FileGroup. In a SQL Server database, all the data and log files, .mdf and .ldf files respectively are part of one or more filegroups. With filestreams, you have a special filegroup that is a collection of directories that contain the filestream files.
Many application developers are ignorant of the specifics of sql server files and filegroups. We take these for granted because we can just type “CREATE DATABASE foo” and we get a data file and log file with sane enough defaults created automatically. We then let the DBA figure out how to setup the filegroups on staging, UAT, and production. However, we need to get down and dirty when we want to use a filestream. Luckily, with a little dynamic SQL, we can create a filestream filegroup and file without thinking too much.
When making a data file in sql server, even if that file is really a directory containing filestream data, you have to specify an absolute path. I prefer the convention of storing my filestreams alongside my mdf and ldf files. On the SQL Server 2008 R2 Express instance on my laptop, that is “C:Program FilesMicrosoft SQL ServerMSSQL10_50.SQLEXPRESS2K8R2MSSQLDATA.” However, the exact data folder is different per each instance of SQL server. After some research, I took Walden’s slightly hackish suggestion of using the sysfiles table. This suggestion assumes you did not customize your files and filegroups in your CREATE DATABASE statement.
DECLARE @FileStreamPath VARCHAR(MAX) SELECT @FileStreamPath=REPLACE(fileName, DB_NAME() + '.mdf', 'FileStreamPhotos') FROM sysfiles WHERE fileid=1
DECLARE @sqlString varchar(max) SET @sqlString = 'ALTER DATABASE [dbWithFileStream] ADD FILE ( NAME = PhotoStore, FILENAME = ''' + @FileStreamPath + ''' ) TO FILEGROUP FileStreamGroup_Photos'
EXEC sp_sqlexec @sqlString GO
Its dirty but it works.
Creating Your FILESTREAM stored VARBINARY column
There’s no point to a filestream if your not going to store VARBINARY columns in it. To store a varbinary column in a filestream, All we have to do is add the keyword FILESTREAM to our column definition, as illustrated below.
CREATE TABLE photos ( id UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL CONSTRAINT PK_photos PRIMARY KEY NONCLUSTERED CONSTRAINT DF_photos_id DEFAULT NEWID(), photo VARBINARY(MAX) <span style="color: #ff0000;"><strong>FILESTREAM</strong></span> NOT NULL GO
Getting data into and out of this column is another topic for another article. For now know that all methods for getting data in and out of normal VARBINARY columns works, plus new better ones.
FILESTREAM is certainly “ready for prime time,” in that it is safe and performs well. However, the feature feels very 1.0-ish. My adventures in getting it enabled were mostly due to not knowing there were two distinct steps to enabling it. If I needed to store a large amount of binary data for a project in the future, I’d certainly consider using SQL server filestreams again. However, I’d probably lean towards writing the files to the filesystem myself, and store the path to the file in a varchar column.