Dealing with rejection in git, resetting your master to theirs and starting over

You haven’t really started using git until you’ve gotten your first patch rejected. Due to the nature of git, merging is easier, but rejection is harder. Well, if I had been using branches it wouldn’t have been that big of a deal, but that’s beside the point.

You see, I developed a love/hate relationship with a wonderful NoSQL database called MongoDB, which led me to contribute patches to it. I loved it because it’s awesome for a few specific tasks I need to do, but I found its Windows support a little lacking. It ran and performed great on Windows, and even ran as a proper NT service. However, it needed a little spit and polish. It was also a great excuse to do some hardcore Windows system programming and add some logging calls. Ask any sysadmin who ever had to support a program I wrote: I love verbose debug logs.

So every once in a while one of my pull requests got rejected. The first few times, it took me a while to deal with it. I don’t mean it took me a while emotionally to deal with the fact that my code is anything but perfect. I mean it took me a while to reset my git repo against the main 10gen repo. Sure, I could have deleted it and started over, but I wanted to learn how to use git properly.

So in the end I asked on the Long Island Linux User Group’s mailing list and got a helpful reply from Mark Drago. I actually linked to my reply to his reply, since I made a correction. So without further ado, here is what I did.

OK, a little more ado. My repo is referred to as origin in the config and 10gen’s is referred to as 10gen. I wanted to save my master branch’s current state to a newly named branch, delete my master, and pull from 10gen’s master. In the end, my problem was that I never did a git fetch. The final sequence of commands I typed was:

  • git checkout master
  • git checkout -b oldmaster
  • git branch -d master
  • git fetch 10gen  #Needed to get the list of branches from 10gen.
  • git checkout remotes/10gen/master
  • git checkout -b master
  • git push -f origin master

And that was it. I was free to continue to write code.

A really simple guide to creating a SQL Server table with a FILESTREAM-backed column

I recently helped to develop a small website to share images. I decided to use the new FILESTREAM feature of SQL Server 2008 to store the images. Along the way I encountered some gotchas, so I wrote this article to document them here.

Usually, I am the last person to want to store files in a relational database. Windows comes with a datastore highly optimized for storage of arbitrary blobs of binary data called the NTFS filesystem. However, SQL Server 2008 has a really nice feature where you can store the content of varbinary fields directly on the filesystem. This is a best of both worlds approach. Each blob in the varbinary column of your table is stored in a separate file on your filesystem. However, you can access the contents of this file through a SQL query.

Example downloads

The complete example can be found here. The zip archive contains both the SQL script to create an example database and a SQL backup file you can restore on your system.

Enabling FILESTREAM on your SQL Server instance

There are two steps to do this. The first is enabling it either at install time via the installer, or via a vbscript after the fact. The second is turning it on via a system stored procedure.

Enabling After the Fact

You probably did not enable filestream support when you installed SQL Server 2008, unless you went out of your way to enable every feature. If you did already enable it, running this script won’t hurt anything.

The vbscript is available via the Microsoft SQL Server Engine Code Sample Project on Codeplex. That page does a good job of documenting all the command-line arguments. The main concern for most developers will be the name of the SQL Server instance on which you wish to enable filestream support. Most developers trying this out are doing it on their own machine, probably with the Express edition of SQL Server 2008, which has a default instance name of SQLEXPRESS. The script defaults to attempting to enable filestream support on the instance named MSSQLSERVER, so you will most likely have to specify the option “/Instance:SQLEXPRESS”.

So once you verify the named instance you want to enable filestream support on, you run the following command:

cscript filestream_enable.vbs [/Instance:InstanceName]

That was simple. Now we move on to turning on filestream support.

Turning on FILESTREAM support

OK, now fire up SQL Server Management Studio or sqlcmd and execute the following:

-- Access levels: 0 = disabled, 1 = T-SQL access only, 2 = T-SQL and Win32 streaming access
EXEC sp_configure filestream_access_level, 2
RECONFIGURE
GO

Once again, you can run this as many times as you want without negative side effects.
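
If you want to sanity-check that the change actually took effect, you can peek at the server configuration. This is just a quick verification query; nothing later in the walkthrough depends on it:

SELECT name, value, value_in_use
FROM sys.configurations
WHERE name = 'filestream access level'
GO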

Enabling FILESTREAM on Your Database

Now filestream is completely enabled on your SQL Server instance. However, you still have to enable it on your database. We do this by adding a filestream filegroup. In a SQL Server database, all the data and log files (.mdf and .ldf files, respectively) are part of one or more filegroups. With filestreams, you have a special filegroup that is a collection of directories that contain the filestream files.

Many application developers are ignorant of the specifics of SQL Server files and filegroups. We take these for granted because we can just type “CREATE DATABASE foo” and we get a data file and a log file with sane enough defaults created automatically. We then let the DBA figure out how to set up the filegroups on staging, UAT, and production. However, we need to get down and dirty when we want to use a filestream. Luckily, with a little dynamic SQL, we can create a filestream filegroup and file without thinking too much.

When making a data file in SQL Server, even if that file is really a directory containing filestream data, you have to specify an absolute path. I prefer the convention of storing my filestreams alongside my .mdf and .ldf files. On the SQL Server 2008 R2 Express instance on my laptop, that is “C:\Program Files\Microsoft SQL Server\MSSQL10_50.SQLEXPRESS2K8R2\MSSQL\DATA”. However, the exact data folder differs for each instance of SQL Server. After some research, I took Walden’s slightly hackish suggestion of using the sysfiles table. This suggestion assumes you did not customize your files and filegroups in your CREATE DATABASE statement.
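
One gotcha before running the script below: the ADD FILE statement assumes the filestream filegroup already exists. If your database does not have one yet, create it first, using the same database and filegroup names the script uses:

ALTER DATABASE [dbWithFileStream]
ADD FILEGROUP FileStreamGroup_Photos CONTAINS FILESTREAM
GO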

-- Build the filestream container path next to the existing .mdf,
-- then add it to the filestream filegroup with a bit of dynamic SQL.
DECLARE @FileStreamPath VARCHAR(MAX)
SELECT @FileStreamPath = REPLACE(fileName, DB_NAME() + '.mdf', 'FileStreamPhotos')
FROM sysfiles
WHERE fileid = 1   -- fileid 1 is the primary data file (the .mdf)

DECLARE @sqlString VARCHAR(MAX)
SET @sqlString =
'ALTER DATABASE [dbWithFileStream] ADD FILE
(
    NAME = PhotoStore,
    FILENAME = ''' + @FileStreamPath + '''
) TO FILEGROUP FileStreamGroup_Photos'
EXEC sp_sqlexec @sqlString
GO

It’s dirty, but it works.
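
If you want to confirm the filestream container really was created, list the database’s files; the new entry shows up with a type of FILESTREAM:

SELECT name, type_desc, physical_name
FROM sys.database_files
GO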

Creating Your FILESTREAM-stored VARBINARY column

There’s no point to a filestream if you’re not going to store VARBINARY columns in it. To store a varbinary column in a filestream, all we have to do is add the keyword FILESTREAM to our column definition, as illustrated below.

CREATE TABLE photos (
id UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL
CONSTRAINT PK_photos PRIMARY KEY NONCLUSTERED
CONSTRAINT DF_photos_id DEFAULT NEWID(),
photo VARBINARY(MAX) FILESTREAM NOT NULL
)
GO

Getting data into and out of this column is a topic for another article. For now, know that all the methods for getting data into and out of normal VARBINARY columns work, plus some new and better ones.
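
Just to prove the point, a plain INSERT and SELECT against the table above behave exactly as they would for a regular VARBINARY(MAX) column. The literal here is obviously just throwaway test data, not a real photo:

INSERT INTO photos (photo)
VALUES (CAST('not really a photo' AS VARBINARY(MAX)))

SELECT id, DATALENGTH(photo) AS photoSizeInBytes
FROM photos
GO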

Conclusions

FILESTREAM is certainly “ready for prime time,” in that it is safe and performs well. However, the feature feels very 1.0-ish. My adventures in getting it enabled were mostly due to not knowing there were two distinct steps to enabling it. If I needed to store a large amount of binary data for a project in the future, I’d certainly consider using SQL Server filestreams again. However, I’d probably lean towards writing the files to the filesystem myself and storing the path to each file in a varchar column.