Giving back to #sqlfamily

I’m way late to the T-SQL Tuesday party, but I’d like to add my opinions on giving back to the SQL community. I think most of the other authors have covered the “why,” so I will talk in terms of the more pragmatic “how.”

So how does one give back to the community? Well, there are the traditional ways, like answering questions on #sqlhelp and the forums, as well as speaking at meetings, conferences, and SQL Saturdays. Those are all well and good, but there are a few other ways.

First, give us your code. I’m a developer, not a DBA, and I was a Linux admin before I became a .NET developer. Therefore, giving back through open source is something I have been “raised” to do. So write some code, consider putting it under an open source license, and distribute it, preferably on github.

Secondly, tweak the free code out there. Is there an open source SQL script that you like? Did you change it? Send the changes to the author. Just be warned that not all the free scripts created by members of this community are open source, and not all authors will incorporate your changes. For example, Adam Machanic probably won’t accept your changes to sp_whoisactive. Brent Ozar, OTOH, will accept a patch to sp_blitz (which is also not open source). BTW, you can thank me for being able to save output from that script to global (## prefixed) temp tables. Ola Hallengren will accept patches for his maintenance scripts, and Richie Rump will accept pull requests for his statistics visualizer.

Thirdly, curate dba.stackexchange.com. Edit the questions for grammar and spelling. If you ask a question yourself, try to ask it in a way that it becomes a canonical question. Thinking about blogging about something? Instead, consider self-answering a question there. I’ve seen stackoverflow and serverfault greatly improve the level of bingleable development and operations knowledge. The Q&A format is better than forums for things that don’t need to be a discussion. Let’s foster that here. BTW, all the stackexchange data is available under a Creative Commons license. You can download it and query it offline.

Finally, I’d like to take the time to thank everyone in the community who has helped me. I won’t single anyone out, because there are too many of you. I’ll just try to keep paying it forward.

Microsoft, please open source sqlcmd, Sqlps, SMO, and LogParser

Microsoft has open sourced part of the .NET framework. This is exciting news, although honestly not all that unprecedented. There was a time when AT&T defended the C language as its intellectual property. These days, the language is effectively in the public domain. (Note: IANAL, the preceding was not legal advice. UNIX is a trademark of the Open Group.) This led to a few people asking if SQL Server would ever be open sourced. The answer is probably no for the time being. However, I do think certain components of it could be open sourced.

Command line tools

sqlcmd is a great command line shell for SQL Server. However, it could be better. The same goes for bcp and the other tools. Microsoft isn’t going to spend the time making sqlcmd read the contents of the EDITOR environment variable when SQLCMDEDITOR is not set, to match the behavior of unix command line tools. It’s not going to add a simple switch to generate query results as CSV, or send output to the clipboard. It’s not going to allow you to colorize RAISERROR messages based on severity. However, members of the community most certainly would do that. Also, I think if sqlcmd got enough pull requests, Microsoft might realize that PowerShell is not the be-all and end-all of the command line.
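
For instance, the closest I can get to CSV output and clipboard support today is stringing switches together and piping to clip.exe. This is only a rough sketch; the server, database, and query are placeholders, and it doesn’t handle commas or quotes inside values:

# -s sets the column separator, -W strips trailing padding, -h -1 drops the header rows.
# Piping to clip.exe is the closest thing to a "send to clipboard" switch.
sqlcmd -S localhost -d tempdb -W -s "," -h -1 `
    -Q "SET NOCOUNT ON; SELECT name, create_date FROM sys.tables" | clip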

SQLPS

By SQLPS, I am referring to all the PowerShell modules, providers, etc. for SQL Server. There are some great cmdlets in that collection. However, can we get Invoke-SqlCmd to do parameter substitution? Yes, I know it’s unlikely that a PowerShell script will be run by an untrusted user, and if it was, that user probably could already run arbitrary SQL. However, as a matter of best practice, and in acknowledgement that there are third party solutions for serving web pages from PowerShell, one should be allowed to parameterize their queries. Invoke-SqlCmd2, written by Chad Miller and contributed to by others including myself, allows you to do this. Wouldn’t it be nice if a future version of the built-in cmdlet had the same feature? If it were open source, I swear on my honor as a developer, I’d submit the patch.
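
To be concrete about what I mean by parameter substitution, here is a minimal sketch of a parameterized query using raw ADO.NET from PowerShell; this is the behavior I’d like the built-in cmdlet to expose. The connection string and query are placeholders:

# The value is bound as a parameter, never concatenated into the SQL text.
$conn = New-Object System.Data.SqlClient.SqlConnection 'Server=localhost;Database=master;Integrated Security=SSPI'
$conn.Open()
$cmd = $conn.CreateCommand()
$cmd.CommandText = 'SELECT name FROM sys.databases WHERE name = @dbName'
$null = $cmd.Parameters.AddWithValue('@dbName', 'tempdb')
$reader = $cmd.ExecuteReader()
while ($reader.Read()) { $reader['name'] }
$conn.Close()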

SMO

SQL Server Management Objects (SMO) is a bit of a legacy solution. However, it still has valid use cases. It provides, amongst other things, a .NET API for backing up and restoring databases. I don’t actually use SMO, but many DBAs do, and I might some day. I’m sure developers that use SMO have patches they’d like to submit.
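
For the curious, here is a rough sketch of the sort of thing SMO gets used for: a scripted backup from PowerShell. The server name, database name, and backup path are placeholders, and the partial-name assembly loads are the lazy (deprecated) way to do it:

# Load SMO from the GAC; the Backup class lives in the SmoExtended assembly.
[void][System.Reflection.Assembly]::LoadWithPartialName('Microsoft.SqlServer.Smo')
[void][System.Reflection.Assembly]::LoadWithPartialName('Microsoft.SqlServer.SmoExtended')
$server = New-Object Microsoft.SqlServer.Management.Smo.Server 'localhost'
$backup = New-Object Microsoft.SqlServer.Management.Smo.Backup
$backup.Action   = [Microsoft.SqlServer.Management.Smo.BackupActionType]::Database
$backup.Database = 'AdventureWorks'
$backup.Devices.AddDevice('C:\Backups\AdventureWorks.bak', [Microsoft.SqlServer.Management.Smo.DeviceType]::File)
$backup.SqlBackup($server)   # performs the backup synchronously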

LogParser

LogParser is actually not part of the SQL Server code base, but it’s used by SQL Server DBAs. LogParser claims to be intended for dealing with log files, and it does indeed let you run SQL queries against IIS log files and other text formats. However, it also lets you read from and write to SQL Server using the bulk copy mechanisms. It’s a great ETL tool. It exposes a COM object, and can be automated with PowerShell. I’ve asked Scott Hanselman about this, and he says the main thing Microsoft needs is an internal sponsor to make this happen.
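
As a taste, here is roughly how I drive it from PowerShell today. The install path and log locations are placeholders, and the COM ProgIDs are written from memory, so treat this as a sketch:

# Run a SQL-ish query over IIS logs with the command line tool and emit CSV.
& 'C:\Program Files (x86)\Log Parser 2.2\LogParser.exe' `
    "SELECT cs-uri-stem, COUNT(*) AS Hits FROM C:\inetpub\logs\LogFiles\W3SVC1\*.log GROUP BY cs-uri-stem ORDER BY Hits DESC" `
    -i:IISW3C -o:CSV

# The same engine is scriptable through its COM object.
$logQuery    = New-Object -ComObject MSUtil.LogQuery
$inputFormat = New-Object -ComObject MSUtil.LogQuery.IISW3CInputFormat
$records = $logQuery.Execute('SELECT TOP 5 cs-uri-stem FROM C:\inetpub\logs\LogFiles\W3SVC1\*.log', $inputFormat)
while (-not $records.atEnd()) {
    $records.getRecord().getValue(0)   # first column of the current record
    $records.moveNext()
}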

If LogParser were open source, a proper .NET layer could be added to it, support for more data sources could be contributed, and the SQL syntax could be improved.

Conclusion

Will Microsoft open source any of these tools? I think they could safely do so without hurting their current revenue streams. Having these tools open sourced would benefit people who use SQL Server, but not in a manner that would allow people to, for example, use Express instead of Standard Edition or Standard instead of Enterprise. It would, however, require Microsoft to dedicate some resources to the SQL Server client tools. I think these tools need some love anyway, and those resources should be dedicated.

The case for open sourcing the SQL Saturday Website

My name is Justin Dearing. I write software for a living. I also write software for free as a hobby and for personal development. When I’m not writing code, I speak at user groups, events and conferences about code and code related topics. One such event is SQL Saturday. I haven’t spoken in a while because I became a dad in June. However, my daughter is 9 months old now and the weather is warm, so I feel comfortable attending a regional SQL Saturday or two. Last night I submitted to SQL Saturday Philadelphia. The submission process (I mean the mechanical process of using the website to submit my abstract) was annoying, as usual. What really got me going, though, was when I realized two things:

  • My newlines were not being preserved, so the asterisks that were supposed to punctuate bullet points were not at the beginnings of lines.
  • I could not edit my submission once submitted.

I like bullet points, a lot. However, I digress. In response to my anger, I complained on twitter that the site should be open sourced, so that I, the end user, could create a better experience for myself and my fellow SQL Saturday speakers.

I got three retweets. At least I wasn’t completely alone in my sentiment. I complained again in the morning, started a conversation, and eventually Tim sent this out:

So the site was being rewritten, but it would not be open sourced. Should I have been happy at that point, or at least patiently awaited the changes? One could presume that session editing and submission would be improved. At the very least, things would get progressively better as the code was revised. If the federal government could pull off the ObamaCare site, with some hiccups, why can’t a group of DBAs launch a much smaller website with much simpler requirements and lower load? I’d be willing to bet they will. I’d be willing to bet that this site will suck a lot less than the old site, and that it will continue to progress. I’m sure smart people are working on it, and a passionate BoD is guiding the process. At the very least I’ll withhold judgement until the new site is live.

Despite my confidence in the skills of the unknown (to me) parties working on the site, there are only so many hours in the day and only so many things a team of finite size can do. However, a sizable minority of PASS’s membership are .NET developers. Many of them speak at SQL Saturdays. They have to submit to the site. Some of them will no doubt be annoyed at some aspect of the site. Some of them might fix that annoyance, or scratch their itch in OSS parlance, if the site were open source and there was a process to accept pull requests.

I’m not describing a hypothetical nirvana. I’ve seen the process I describe work, because I’ve submitted a lot of patches to a lot of OSS projects. I’ve submitted a patch to sp_blitz (not actually open source, as Brent will be the first to state) and Brent accepted it. I’ve contributed to NancyFX. I once contributed a small patch to PHP to make it consume WCF services better. I’ve contributed to several other OSS projects as well.

Perhaps you’re saying SQL Server is a Microsoft product, not some hippie Linux thing. Perhaps you share the same sentiment as Noel McKinney:

However, as I pointed out to Noel, the mothership’s (editor’s note: Noel has stated to me he meant Microsoft) beliefs are not anti-OSS. Microsoft has fully embraced open source. You can become an MVP purely for OSS work, without any speaking or forum contributions; one of the authors of NancyFX is an example of such a recipient. F#, ASP.NET and Entity Framework are all open source. Just this week Microsoft open sourced Roslyn. As a matter of fact, I’ve even submitted a patch to the nuget gallery website, which is operated by Microsoft and owned by the Outercurve Foundation. The patch was accepted, and my code, along with the code of others, was pushed to nuget.org.

So I’ve already submitted source code for a website owned and operated by an independent organization set up by Microsoft, they’ve already accepted it, and the world seems a slightly better place as a result. So I ask the PASS BoD to consider releasing the SQL Saturday website source code on github, and I ask the members of PASS to ask their BoD to release the source code as well.

Split testing using nginx proxy cache

My company recently discovered the joys of using nginx as a reverse proxy cache server. This allowed us to significantly reduce the load on our application servers. Of course, as soon as we got this setup working nicely, a request for A/B testing came down the pipeline.

There are some obstacles to conducting A/B testing while using nginx as a reverse proxy cache server.

Obstacle 1: Lack of “sticky” sessions in the free nginx product. While there is support for session affinity as part of the nginx commercial subscription, that product didn’t suit our needs. Without sticky sessions, each page load could potentially go to a different upstream server, which would render many tests unusable and make the site feel disjointed.

Obstacle 2: Since pages were being cached by nginx, all requests received the same cached response. This meant we couldn’t serve different versions of the same page to different test groups.

Obstacle 3: To keep code complexity down, we didn’t want to have to modify our application to be aware of other tests we were performing.

We were able to overcome these obstacles using only the default modules that were part of nginx 1.4.x.

The following are snippets of our server config. The file exists entirely in the nginx http context. I won’t go into the configuration of nginx outside of this file, as that information is readily available elsewhere. I’m going to jump around a bit to ease the explanation. The file will be shown in its entirety at the bottom.

upstream upstreamServerA {
    server upstreamServerA.net;
}

upstream upstreamServerB {
    server upstreamServerB.net;
}

The first thing to do is define our upstream server groups. In this setup we have defined two server groups (upstreamServerA and upstreamServerB), each with a single server. Each upstream server group represents a version of the site we are testing. We could increase the number of tests by adding more upstream server groups. The server definitions are shown with standard .net domain names for ease of reading; in practice these should be the IP addresses or hostnames of your application servers.

split_clients "seedString${remote_addr}${http_user_agent}${date_gmt}" $upstream_variant {
    50%               upstreamServerA;
    50%               upstreamServerB;
}

Here we make use of one of nginx’s default modules, ngx_http_split_clients_module. The idea is to set up the split percentages for our tests. What’s actually happening is that nginx creates a string composed of the seed string “seedString” concatenated with the client IP address, the client’s user agent, and the current time. Nginx then hashes this string into a number. The lower 50% of the number range gets assigned upstreamServerA and the upper 50% gets assigned upstreamServerB. The result is saved in the $upstream_variant variable. This value is only used when the client doesn’t yet have a sticky cookie, which in practice means each client’s first request.

map $cookie_sticky_upstream $upstream_group {
    default             $upstream_variant;
    upstreamServerA     upstreamServerA;
    upstreamServerB     upstreamServerB;
}

With this segment we check for the presence of a cookie named “sticky_upstream” in the client request. The goal is to set the variable $upstream_group based on this cookie. If the value of the cookie is “upstreamServerA”, we set $upstream_group to “upstreamServerA”; we do the same for “upstreamServerB”. If the value of the cookie is neither of these, or the cookie is not present, we fall back to the value of the $upstream_variant variable defined in the previous segment.

Now we can define our server context.

server {
    listen       80;
    server_name  upstreamServer.com;

    location / {
        #Snipped for brevity
    }

    location /admin {
        #Snipped for brevity
    }

    error_page   500 502 503 504  /50x.html;
    location = /50x.html {
        root   /usr/share/nginx/html;
    }
}

We are defining two locations here, “/” and “/admin”. We treat “/admin” differently because we want all admin requests to go to a single upstream server. This may not be needed in all setups, but I thought I’d show how to accomplish it.

The first thing we want to do in the “location /” context is to set the “sticky_upstream” cookie.

add_header Set-Cookie "sticky_upstream=$upstream_group;Path=/;";

This will make all subsequent requests from the client “stick” to the same upstream server group.

proxy_pass http://$upstream_group;

Now we tell nginx to use the value of the $upstream_group variable as the upstream server group.

proxy_cache_key "$scheme$host$request_uri$upstream_group";

This tells nginx to cache responses based on the $scheme, $host, $request_uri, and (the important bit for this post) the $upstream_group, so that each test variant gets its own cache entries.
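
With the cookie, the proxy_pass, and the cache key in place, a quick sanity check from a client is to watch the cookie come back and get replayed on the next request. Here is a rough sketch using PowerShell’s Invoke-WebRequest; the hostname is the example one from this config:

# First request: split_clients picks an upstream and nginx sets the sticky_upstream cookie.
$first = Invoke-WebRequest -Uri 'http://upstreamServer.com/' -SessionVariable sticky
$first.Headers['Set-Cookie']    # e.g. sticky_upstream=upstreamServerA;Path=/;

# Later requests in the same session replay the cookie, so the map wins over split_clients.
$second = Invoke-WebRequest -Uri 'http://upstreamServer.com/' -WebSession $sticky
$sticky.Cookies.GetCookies('http://upstreamServer.com')['sticky_upstream'].Value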

As I discussed briefly, what if we want to send all admin interactions to a single upstream server group? Let’s look at the “location /admin” context:

set $upstream_admin upstreamServerB;
add_header Set-Cookie "sticky_upstream=$upstream_admin;Path=/;";

proxy_pass http://$upstream_admin;

We define the variable $upstream_admin and set it to "upstreamServerB", then set the client's "sticky_upstream" cookie equal to it. The final bit is to tell nginx to use the value of $upstream_admin as the upstream server.

The file in its entirety can be found below:

upstream upstreamServerA {
    server upstreamServerA.net;
}

upstream upstreamServerB {
    server upstreamServerB.net;
}

#split clients by the following percentages 
#  according to remote IP, user agent, and date
split_clients "seedString${remote_addr}${http_user_agent}${date_gmt}" $upstream_variant {
    50%               upstreamServerA;
    50%               upstreamServerB;
}

#override if "sticky_upstream" cookie is present in request
#  this assures client sessions are "sticky"
#  this also allows us to manually set an upstream with a cookie
map $cookie_sticky_upstream $upstream_group {
    default             $upstream_variant;    #no cookie present use result of split_clients
    upstreamServerA     upstreamServerA;    #use cookie value
    upstreamServerB     upstreamServerB;    #use cookie value
}

server {
    listen       80;
    server_name  upstreamServer.com;
    
    location / {
        #Set the client cookie so they always get the same upstream server
        #  during this session
        add_header Set-Cookie "sticky_upstream=$upstream_group;Path=/;";
        
        #Set the upstream server group as defined in the above map
        proxy_pass http://$upstream_group;
        proxy_redirect          off;
        
        # Cache
        proxy_cache one; #use the "one" cache
        proxy_cache_valid  200 302  60m;
        proxy_cache_valid  404      1m;
        add_header X-Cache-Status $upstream_cache_status;
        proxy_ignore_headers X-Accel-Expires Expires Cache-Control;
        
        # Don't cache if our_auth cookie is present
        proxy_no_cache $cookie_our_auth;
        proxy_cache_bypass $cookie_our_auth;
        proxy_set_header        X-Real-IP       $remote_addr;
        proxy_set_header        Host            $host;
        proxy_set_header        X-Forwarded-For $proxy_add_x_forwarded_for;
        
        #Set Cache Key based on scheme, host, request uri and upstream group
        proxy_cache_key "$scheme$host$request_uri$upstream_group";
    }

    location /admin {
        set $upstream_admin upstreamServerB;
        add_header Set-Cookie "sticky_upstream=$upstream_admin;Path=/;";
        proxy_pass http://$upstream_admin;
        proxy_redirect  off;
        proxy_set_header        X-Real-IP       $remote_addr;
        proxy_set_header        Host            $host;
        proxy_set_header        X-Forwarded-For $proxy_add_x_forwarded_for;
    }

    # redirect server error pages to the static page /50x.html
    error_page   500 502 503 504  /50x.html;
    location = /50x.html {
        root   /usr/share/nginx/html;
    }
}

Creating a minimally viable Centos instance for SSH X11 Forwarding

I recently needed to set up a CentOS 6.4 VM for Java development. I wanted to be able to run Eclipse STS on said VM and display the X11 windows remotely on my Windows 7 desktop via Xming. I saw no reason for the CentOS VM to have a local X11 server. I’m quite comfortable with the Linux command line, so I decided to share briefly how to go from a CentOS minimal install to something actually useful for getting work done.

  • /usr/bin/man The minimal install installs man pages, but not the man command. This is an odd choice. yum install man will fix that.
  • vim There is a bare bones install of vim included by default that is only accessible via vi. If you want a more robust version of vim, yum install vim.
  • X11 forwarding You need the xauth package and fonts. yum install xauth will allow X11 forwarding to work. yum groupinstall fonts will install a set of fonts.
  • A terminal For absolute minimal viability, yum install xterm will give you a terminal. I prefer terminator, which is available through rpmforge.
  • RpmForge (now repoforge) CentOS is based on Red Hat Enterprise Linux. Therefore it focuses on being a good production server, not a developer environment. You will probably need rpmforge to get some of the packages you want. The directions for adding rpmforge to your yum repositories are here.
  • terminator This is my terminal emulator of choice. Once you’ve added rpmforge, yum install terminator.
  • gcc, glibc, etc. Honestly, you can usually live without these if you stick to precompiled rpms and you’re not using gcc for development. If you need to build a kernel module, yum install kernel-devel gcc make should get you what you need.

From here, you can install the stuff you need for your development environment for your language, framework, and scm of choice.

When your PowerShell cmdlet doesn’t return anything, use -PassThru

The other day I was mounting an ISO in Windows 8 via the Mount-DiskImage cmdlet. Since I was mounting the disk image in a script, I needed to know the drive letter it was mounted to so the script could access the files contained within. However, Mount-DiskImage was not returning anything. I didn’t want to go through the hack of listing drives before and after I mounted the disk image, or explicitly assigning the drive letter. Both would leave me open to race conditions if another drive was mounted by another process while my script ran. I was at a loss for what to do.

Then, I remembered the -PassThru parameter, which I am quite fond of using with Add-Type. You see, certain cmdlets, like Mount-DiskImage and Add-Type, don’t return pipeline output by default. For Add-Type, this makes sense: you rarely want to see a list of the types you just added, unless you’re exploring the classes in a DLL from the command line. However, for Mount-DiskImage, defaulting to no output was a questionable decision IMHO.
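
To illustrate with Add-Type: by default nothing comes back, but -PassThru hands you the System.Type objects for whatever was just compiled.

# Without -PassThru this emits nothing; with it you get the compiled type back on the pipeline.
Add-Type -TypeDefinition 'public class DemoThing { public int Answer = 42; }' -PassThru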

Now in the case of Mount-DiskImage, -PassThru doesn’t return the drive letter. However, it does return an object that you can pipe to Get-Volume, which does return an object with a DriveLetter property. To figure that out, I had to ask on stackoverflow.
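
Putting it together, the pattern looks something like this (the ISO path is a placeholder):

# Keep the object -PassThru returns and pipe it to Get-Volume to get the drive letter.
$mount = Mount-DiskImage -ImagePath 'C:\ISOs\example.iso' -PassThru
($mount | Get-Volume).DriveLetter    # e.g. E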

tl;dr: If your PowerShell cmdlet doesn’t return any output, try -PassThru. If you need the drive letter of a disk image mounted with Mount-DiskImage, pipe the output through Get-Volume.

For a more in depth treatise on -PassThru, check out this Scripting Guy article by Ed Wilson (blog|twitter).

Getting the Drive Letter of a disk image mounted with WinCdEmu

In my last post, I talked about mounting disk images in Windows 8. Both Windows 8 and Server 2012 include native support for mounting ISO images as drives. However, in prior versions of Windows you needed a third party tool to do this. Since I have a preference for open source, my tool of choice before Windows 8 was WinCdEmu. Today, I decided to see if it was possible to determine the drive letter of an ISO mounted by WinCdEmu with PowerShell.

A quick search of the internet revealed that WinCdEmu contained a 32 bit command line tool called batchmnt.exe, and a 64 bit counterpart called batchmnt64.exe. These tools were meant for command line automation. While I knew there would be no .NET libraries in WinCdEmu, I did have hope there would be a COM object I could use with New-Object. Unfortunately, all the COM objects were for Windows Explorer integration and popped up GUIs, so they were inappropriate for automation.

Next I needed to figure out how to use batchmnt. For this I used batchmnt64 /?.

C:\Users\Justin>"C:\Program Files (x86)\WinCDEmu\batchmnt64.exe" /?
BATCHMNT.EXE - WinCDEmu batch mounter.
Usage:
batchmnt <image file> [<drive letter>] [/wait] - mount image file
batchmnt /unmount <image file>         - unmount image file
batchmnt /unmount <drive letter>:      - unmount image file
batchmnt /check   <image file>;        - return drive letter as ERORLEVEL
batchmnt /unmountall                   - unmount all images
batchmnt /list                         - list mounted

C:\Users\Justin>

Mounting and unmounting are trivial. The /list switch produces some output that I could parse into a PSObject if I so desired. However, what I really found interesting was batchmnt /check. The process returns the drive letter as its ERRORLEVEL, that is, the ExitCode of the batchmnt process. If you have ever programmed in a C-like language, you know your main function can return an integer. Traditionally 0 means success and a non-zero number means failure. However, in this case 0 means the image is not mounted, and a non-zero number is the ASCII code of the drive letter. Getting that code in PowerShell is simple:

$proc = Start-Process -Wait `
    "C:\Program Files (x86)\WinCDEmu\batchmnt64.exe" `
    -ArgumentList '/check', '"C:\Users\Justin\SQL Server Media\2008R2\en_sql_server_2008_r2_developer_x86_x64_ia64_dvd_522665.iso"' `
    -PassThru;
[char] $proc.ExitCode

The Start-Process cmdlet normally returns immediately without output. The -PassThru switch makes it return information about the process it created, and -Wait makes the cmdlet wait for the process to exit, so that information includes the exit code. Finally, to turn that ASCII code into the drive letter, we cast it with [char].

Setting the Visual Studio TFS diff and merge tools with PowerShell

I recently wrote this script to let me quickly change the diff and merge tools TFS uses from PowerShell. I plan to make it a module and add it to the StudioShell Contrib package by Jim Christopher (blog|twitter). For now, I’m sharing it as a gist and posting it on this blog.

The script supports Visual Studio 2008-2012 and the following diff tools:

Enjoy!

Announcing SevenZipCmdLine.MSBuild

This was a quick and dirty thing born out of necessity: I needed to make zip files of PoshRunner so I could make its chocolatey package.

I made MSBuild tasks for creating 7zip and zip files out of the $(TargetDir) of an MSBuild project. There is a nuget package for it. Simply include it in your project via nuget and build from the command line with the following command:

%windir%\microsoft.net\framework\v4.0.30319\msbuild __PROJECT_FOLDER__\__PROJECT_FILE__ /t:SevenZipBin,ZipBin

This will create project.zip and project.7z in __PROJECT_FOLDER__\bin\Target. To see how to override some of the defaults, look at this msbuild file in PoshRunner.

Source code is available via a github repo, and patches are welcome!

PoshRunner now on SourceForge and Chocolatey

I’ve been periodically hacking away at PoshRunner. I have lots of plans for it, some of which are rewriting parts of it in C++, allowing you to log output to MongoDB, and total world domination! However, today’s news is not as grand.

The first piece of news is that I made a PoshRunner SourceForge project to distribute the binaries. To download the latest version, click here. Secondly, there is now a PoshRunner chocolatey package, so you can install it via chocolatey. Finally, there is not a lot of documentation on PoshRunner.exe, so here is the output of poshrunner -help.

Usage: poshrunner.exe [OPTION] [...]

Options:
   --appdomainname=NAME                                     Name to give the AppDomain the PowerShell script executes in.
   --config=CONFIGFILE                                      The name of the app.config file for the script. Default is scriptName.config
   -f SCRIPT, --script=SCRIPT                               Name of the script to run.
   -h, --help                                               Show help and exit
   --log4netconfig=LOG4NETCONFIGFILE                        Override the default config file for log4net.
   --log4netconfigtype=LOG4NETCONFIGTYPE                    The type of Log4Net configuration.
   --shadowcopy                                             Enable Assembly ShadowCopying.
   -v, --version                                            Show version info and exit
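
For example, a hypothetical invocation based on the options above (the script name and AppDomain name are made up) would look like this:

# Run MyScript.ps1 in its own named AppDomain with assembly shadow copying enabled.
poshrunner.exe -f .\MyScript.ps1 --appdomainname=MyScriptAppDomain --shadowcopy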