I have managed to work around the issue of not having mod_rewrite on my IIS hosting account. Since I can use both ASP.NET and PHP on a single site, I took advantage of the free UrlRewriter library and web.config settings. It’s actually pretty cool. I have created a new Health & Fitness Clickbank Mall site that is all done using PHP, but I use a web.config file along with UrlRewriter to handle rewriting the URLs correctly. The funky thing is that all of the URLs look like they are for aspx files, but they are actually all served up by my index.php file. It may not be ideal or pretty, but it sure seems to work!
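To make the idea concrete, here is a minimal sketch of what a rewrite rule of this kind does, written in Python purely for illustration. The real site uses UrlRewriter rules in web.config, not this code, and the pattern, the `page` query parameter, and the `rewrite` function name are all assumptions rather than the actual configuration.

```python
import re

# Hypothetical rule: a request for "/<name>.aspx" is handled by index.php.
# The pattern and the "page" parameter are illustrative assumptions.
RULE = re.compile(r"^/(?P<page>[\w-]+)\.aspx$", re.IGNORECASE)


def rewrite(path):
    """Return the internal path that would actually serve the request."""
    match = RULE.match(path)
    if match is None:
        return path  # no rule applies, so the path is left unchanged
    return "/index.php?page=" + match.group("page")
```

Under these assumptions a visitor sees an address ending in .aspx, while the request is answered by the single PHP script. Anything that does not match the rule, such as a stylesheet, passes through untouched.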
Okay, okay, so I know I’m a bit behind the curve on this one, but I just wrote my first code in .NET that uses regular expressions! Over the years I’ve used regular expressions in AWK, sed, and most recently PHP. I understand the power of regular expressions, but have never gotten to a level of being very proficient in their use. About a year or so ago when I decided to learn PHP so I could hack together some web sites I realized I needed to use regular expressions if I wanted to get anywhere. I used regular expressions within PHP to parse HTML pages. I found it to be pretty painless and felt that PHP was the way to go if I needed any regular expression processing.
Fast forward to today… I’m building a new ASP.NET web site (top secret, can’t give you any details :)) and came up with an idea for some cool functionality, but the only reasonable way to provide the functionality was going to require the use of regular expressions. My first thought was "how can I call PHP from ASP.NET?". Then I remembered that .NET has support for regular expressions in the System.Text.RegularExpressions namespace. I decided that I’d give it a look.
There are certainly some syntactic differences between PHP and .NET around how to invoke a regular expression and process its results, but (as you would hope) the syntax for defining the regular expression itself is pretty much the same. After a few Google searches for example code, I was able to get my code to do what I wanted.
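As a rough illustration of that point, here is a small sketch in Python, a third language chosen only to keep the example short. The pattern is a made-up example of the HTML-scraping kind mentioned earlier, not the code from my site, and `extract_links` is a hypothetical helper name. I would expect the pattern string to carry over to PHP or .NET largely unchanged, with only the surrounding calls differing (for instance `preg_match_all` in PHP or `Regex.Matches` in .NET).

```python
import re

# Illustrative pattern: capture the href value of each anchor tag.
# Fine for quick scraping; it is not a full HTML parser.
HREF = re.compile(r'<a\s+[^>]*href="([^"]*)"', re.IGNORECASE)


def extract_links(html):
    """Return the href values of all anchor tags found in the given HTML."""
    return HREF.findall(html)
```

The part that changes from language to language is how you run the pattern and walk the matches, which is exactly where I needed the example code.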
Now that I have regular expressions working in .NET, I have one less reason to consider using PHP! Not that PHP is bad, but my comfort and skill is definitely more along the .NET side of the world.
This morning I made a frightening discovery: Googlebot has been unable to successfully read any of the pages on my new ASP.NET web sites!! UGH! I made this discovery by accident, but I’m sure glad I did. I was using the free Google Keyword Tool and decided to enter the URL of a page on one of my sites. Now, the thing you need to know is that my sites are written with ASP.NET and I’m using an open source URL rewriter called URLRewriter.Net so I can have meaningful URLs without actually having long URLs.
So, I entered one of my meaningful URLs into the Google keyword tool for it to analyze. Instead of displaying keywords about ipods like I expected, it listed a bunch of "error, asp.net error, vpscript error, …" keywords. That was not a good sign. I did a few more tests and got the same results. I did a few Google searches for URLRewriting and Googlebot and discovered that what I was seeing was not unique to me. Sure enough, there is a bug in ASP.NET and I was being bitten hard by it. The good news is that this explained why none of my pages were showing up in the Google index!
I did some more searches and research to find a solution. Luckily I was able to find a posting by someone who came up with a very quick and easy way to solve the problem. All I had to do was add an App_Browsers folder to my ASP.NET project and add a file called browserfile.browser with the following contents:
<browsers>
  <browser refID="Mozilla">
    <capabilities>
      <capability name="cookies" value="true" />
    </capabilities>
  </browser>
</browsers>
I did this, published my site, and re-ran my tests with the Google keyword tool. The result was a thing of beauty! Needless to say I immediately made the same changes to my other ASP.NET sites and got those re-published as well. Hopefully this means I will start to get some pages showing up in the Google index soon.
This was very frustrating for me to discover. I had felt pretty good about my previous decision to use ASP.NET instead of PHP; but after running into this issue I’m not so sure it was a good decision. The URL rewriting in Apache works without issue. If I encounter another issue like this with ASP.NET I think I’ll have no choice but to rewrite my sites using PHP….COME ON MICROSOFT!!!
P.S. - A more detailed explanation of the issue along with a more complex solution can be found here and on the URLRewriter forum.