
DotNet Url Rewriting and Caching Engine

[ 13 ] 25/06/2005 | Matteo G.P. Flora

Playing with URL rewrite

I came across URL rewriting in .NET a little while ago, while trying to get rid of the querystrings of a classic-ASP CMS I developed. The CMS worked great (see Pitagora / WOE Web Content Management System) but had the problem that afflicts a lot of ASP scripts: every page URL looks something like

http://myhost/mypage.asp?pagetoget=MyVeryGoodPage.

While working on this I ended up with a fancy all-purpose URL rewriting and caching engine for any programming language. Strange as it seems, this rather complex thing was the simplest solution I found to handle a very simple task. These are the stories of the Starship Enterpri… ahem… of my implementation.

The Main Idea

…or how I got into this mess. Basically I have a very good CMS (Pitagora / WOE Web Content Management System, shameless self-promotion intended) that serves every page through a URL like http://myhost/template.asp?page=Title_Of_Page. While this approach is really useful and easy to remember (I only need to remember the title of the page I want to access), it is a huge problem for search engines and web statistics, because ALL pages actually call only template.asp. What I really wanted was something that could:

  • translate http://myhost/template.asp?page=Title_Of_Page
  • into http://myhost/Title_Of_Page.aspx while using the CMS.

I didn't want to use a mere redirect, mainly because it uses a 302/303 HTTP status code, which means a normal redirect that search engines do not follow. And I obviously wanted it to be automatic, with no need to manually create an aspx file for each page. When I first looked into URL rewriting in .NET I thought I was really OK. Since I was using ASP I already had a Wintel machine, and I can program in ASP.NET, so in my imagination this had to be as simple as 1 + 1. Of such stuff is Hell made.

How to crash a Wonderful Idea

I started simple (they all do). The beautiful thing is that, if I implement the trick correctly, MyFakePage.aspx does not need to REALLY exist on the server; I only need to implement the Application_BeginRequest method in Global.asax. Application_BeginRequest is fired BEFORE the actual page is processed, so I can SIMULATE that the page exists and do my redirect job fast and neat. So I'd have something like this:

protected void Application_BeginRequest(Object sender, EventArgs e)
{
    // e.g. "/vox/Title_Of_Page.aspx"
    string mypath = Request.ServerVariables["PATH_INFO"];
    // Last path segment without the leading slash: "Title_Of_Page.aspx"
    string filename = mypath.Substring(mypath.LastIndexOf("/") + 1);
    // Hand the request over to the CMS template
    string destinationUrl = "/myCMS/template.asp?page=" + filename.Replace(".aspx", "");
    Server.Transfer(destinationUrl);
}

OK. That's all, folks. It is all well and done, it's OK… ACK!!! It is not. It seems that the HttpHandler configured in machine.config gets angry if you try to execute an ASP page this way. You cannot. That's all. Well, I can modify machine.config here, but what about my HOSTED site? This could be the end of a dream. To make things even worse, the same problem afflicts the RewritePath method. I had to quickly devise a new method, maybe LESS neat but functioning.
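Whatever mechanism ends up serving the page, the path-to-query translation stays the same, so it can be pulled out into a small pure function that is easy to try outside IIS. What follows is only a sketch: the class name FriendlyUrlMapper, the method ToCmsUrl and the template location /myCMS/template.asp are assumptions made for illustration, not part of the system described here.

```csharp
using System;

public static class FriendlyUrlMapper
{
    // Assumed location of the CMS template; adjust to your own site.
    private const string TemplateUrl = "/myCMS/template.asp?page=";

    // Maps "/vox/Title_Of_Page.aspx" to "/myCMS/template.asp?page=Title_Of_Page".
    // Returns null when the path does not look like a fake .aspx page.
    public static string ToCmsUrl(string path)
    {
        if (string.IsNullOrEmpty(path)) return null;

        // Keep only the last path segment, without the leading slash.
        string fileName = path.Substring(path.LastIndexOf('/') + 1);

        const string extension = ".aspx";
        if (fileName.Length <= extension.Length ||
            !fileName.EndsWith(extension, StringComparison.OrdinalIgnoreCase))
        {
            return null;
        }

        // Strip only the trailing extension, so a title containing "aspx" survives.
        string title = fileName.Substring(0, fileName.Length - extension.Length);

        // Escape the title so characters such as '&' cannot alter the query string.
        return TemplateUrl + Uri.EscapeDataString(title);
    }
}
```

Returning null for anything that is not a fake .aspx page lets the caller leave images, stylesheets and the like alone. Removing only the trailing extension also avoids mangling a page title that happens to contain the letters "aspx", which a plain Replace would do.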

Fetching what I need

Well, all in all an ASP page is something that is sent via HTTP. All I have to do is fetch it as a web browser would and have it spit back to the client. I was starting to hope again… First of all, let's get that damn page and store it in a string:

// Requires: using System.IO; using System.Net;
private string getPage(string URL)
{
    string res = "";
    try
    {
        WebRequest wrGETURL = WebRequest.Create(URL);
        using (WebResponse response = wrGETURL.GetResponse())
        using (StreamReader objReader = new StreamReader(response.GetResponseStream()))
        {
            res = objReader.ReadToEnd();
        }
    }
    catch (WebException)
    {
        // Fetch failed: fall through and return an empty string
    }
    return res;
}

And now we can kindly use it over and over and over…

protected void Application_BeginRequest(Object sender, EventArgs e)
{
    string mypath = Request.ServerVariables["PATH_INFO"];
    string filename = mypath.Substring(mypath.LastIndexOf("/") + 1);
    string destinationUrl = "/myCMS/template.asp?page=" + filename.Replace(".aspx", "");
    string destinationHost = Request.ServerVariables["HTTP_HOST"];
    // WebRequest.Create needs an absolute URL, scheme included
    Response.Write(getPage("http://" + destinationHost + destinationUrl));
    Response.End();
}

This could be the end of the article, like the machine.config problem above, but once again I'm stuck with the fact that I'm on a hosted environment. A hosting environment with PAID BANDWIDTH. I'm sure you get my point. Every time a page is requested, it is fetched over and over again by the ASP.NET process. That can really lead to disaster. I need another trick…

Maybe Akamai started from here too…

The more I thought about all this, the more I was convinced that a cache of some kind would do a neat job. Besides making everything in this article work, it would give the CMS a cache for its content. Not only that: it would make the system a URL rewriting and caching engine for EVERY language. Why wouldn't you use it with PHP (for example) or Python, etc., as long as they are on the same server? And why limit it to the SAME server? You can change the destinationHost string and fetch the content from anywhere in the world, as far as I know.

Well… we were talking about the cache… I know that ASP.NET has a beautiful set of caching features, but they are in the Page object, and I didn't want to use it. So I went for the dear old Application object. The trick is easy: is the page in the cache? Yes? OK, spit it to the browser. No? OK, get it, store it and spit it to the browser. Like this:

protected void Application_BeginRequest(Object sender, EventArgs e)
{
    string mypath = Request.ServerVariables["PATH_INFO"];
    string filename = mypath.Substring(mypath.LastIndexOf("/") + 1);
    string destinationUrl = "/myCMS/template.asp?page=" + filename.Replace(".aspx", "");
    string destinationHost = Request.ServerVariables["HTTP_HOST"];
    string destinationAddress = "http://" + destinationHost + destinationUrl;
    if (Application[destinationAddress] == null)
    {
        // Not cached yet: fetch it, store it, send it
        Application[destinationAddress] = getPage(destinationAddress);
        Response.Write(Application[destinationAddress].ToString());
        // Room for a trailing HTML comment noting where the page came from
        Response.Write("");
    }
    else
    {
        // Already cached: send the stored copy
        Response.Write(Application[destinationAddress].ToString());
        Response.Write("");
    }
    Response.End();
}

That's all. I can now navigate to every fake page and get it from the CMS. I can get it stored so that I have a cache, and I can even add a comment on the bottom line to remind me where it comes from (open the resulting HTML and you'll see).

I need to change a page!

You should have noticed that if you want to change a page you need to restart the application to erase the Application values. Once again, being hosted, I needed to find some sort of sideways route to handle this (and I'm seriously thinking of housing ;). A very quick way is to catch when a particular aspx page is called (suppose we call it clearall.aspx) and do the job there. Something like this:

if (mypath.EndsWith("/clearall.aspx"))
{
    Application.RemoveAll();
    Response.Write("Done it!");
    Response.End();
}

That should handle all the stuff pretty easily.
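The fetch-store-clear logic of the last two sections can also be sketched outside ASP.NET, which makes it easier to experiment with. This is a minimal sketch, not the system that runs on my site: the class name PageCache and its methods are made up for illustration, and the fetch delegate simply stands in for getPage. It also adds a Remove method for dropping a single page, something the Application-based version above does not do.

```csharp
using System;
using System.Collections.Generic;

public sealed class PageCache
{
    private readonly Dictionary<string, string> pages = new Dictionary<string, string>();
    private readonly object gate = new object();
    private readonly Func<string, string> fetch;

    // "fetch" stands in for getPage(url); it is injected so the cache can be
    // exercised without a web server.
    public PageCache(Func<string, string> fetch)
    {
        if (fetch == null) throw new ArgumentNullException("fetch");
        this.fetch = fetch;
    }

    // Returns the cached page, fetching and storing it on the first request.
    // The lock is held during the fetch, so concurrent requests wait their turn.
    public string GetOrFetch(string address)
    {
        lock (gate)
        {
            string html;
            if (!pages.TryGetValue(address, out html))
            {
                html = fetch(address);
                pages[address] = html;
            }
            return html;
        }
    }

    // Drops one page so that the next request fetches a fresh copy.
    public bool Remove(string address)
    {
        lock (gate) { return pages.Remove(address); }
    }

    // Drops everything, in the spirit of Application.RemoveAll().
    public void Clear()
    {
        lock (gate) { pages.Clear(); }
    }
}
```

Holding the lock while fetching keeps the sketch short and guarantees each page is fetched only once, at the price of serializing requests; on a busy site that trade-off would probably need rethinking.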

Playing with Content

Please remember that we have the whole page in a string! That means we can alter it as easily as calling Replace("oldVar", "newVar"). We can, for example, modify all HREFs to reflect a new position with a Replace("href=\"", "href=\"/mynewdir/"), or even substitute the title of the site we are thieving the content from with our own (ok, ok, that's too nasty ;) I promise not to do that…).
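As a small worked example of the HREF trick, here is a sketch that prefixes only root-relative links (those starting with a slash) and leaves absolute URLs alone. The names LinkRewriter and PrefixRootLinks and the "/mynewdir" value are assumptions chosen for illustration.

```csharp
using System;

public static class LinkRewriter
{
    // Prefixes root-relative links (href="/...") with a new base directory.
    // "newBase" is a hypothetical value such as "/mynewdir".
    public static string PrefixRootLinks(string html, string newBase)
    {
        if (html == null) throw new ArgumentNullException("html");
        if (newBase == null) throw new ArgumentNullException("newBase");

        // Normalize the base so exactly one slash separates it from the link.
        string prefix = newBase.TrimEnd('/') + "/";

        // Plain string replacement: every href="/ becomes href="/mynewdir/
        return html.Replace("href=\"/", "href=\"" + prefix);
    }
}
```

Being a plain Replace, this is as blunt as it looks: it also catches protocol-relative links (href="//host/…") and it prefixes again if run twice on the same string. For anything beyond a quick hack an HTML parser is the safer tool.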

Conclusion

While this article is FAR from being a guru's vision or anything like that, I hope it gives you some good ideas for experimenting with this approach. I do not know whether this "stripped down" version of the actual system I use is safe (I am a little concerned about the memory allocated by the Application object after a while), but it is a good start for implementing something better. Please avoid sending me flame mails about how my code style sucks. I know it. I'm good at inventing things, not at the fine carpentry of stylish coding… And if you create a polished version I'd be more than glad to take a look at it!

It has helped me a lot, it was fun to write, and right now my site (Lastknight Dot Com) runs on it (well, on a more sophisticated, database-driven version of it, but this was the core idea). As far as I know it works, and my statistics are reporting the ASPX pages correctly and nicely. It's fun to have all the stats computed on PAGES (top views, time on page, etc.) as if I weren't using a CMS at all… It takes the best of both worlds. And Google is handling each page as a static one. That means I have more than ONE PAGE in the whole domain, right now. Astonishing, isn't it?

Have my greetings. I'll be most pleased to hear about your experiments with this, as well as to correct the thousands of errors I made in this article that you'll kindly point out.


Comments (13)


  1. Anonymous says:

    You are doing a wonderful thing here on the Internet. I wish you the very best. Kindest regards.

  2. anteojo says:

    Hello:

    How do I install this?

    Thanks!

  3. anteojo says:

    Can you send me global.asax?

    Your example doesn't work on my server.

    Thanks

  4. Matteo says:

    Due to some fairly deep abuse of this technique for SEO purposes I am NOT releasing the finished application anymore. People skilled enough to know a little more than script-kiddie level programming will certainly be able to use the information provided. This article is intended to be used by professionals, not by occasional programmers in search of SEO fraud.

  5. Anonymous says:

    Hi everyone A big thank you for this wonderful site, it has helped me immensely

  6. Anonymous says:

    Good morning, I am new to this site. I have just learned about this site. I am going to read on and it’s very interesting to know

  7. Anonymous says:

    This is a great web site. I have some great web pages myself if you are interested to share. But I should not go on about my site too much, that is not fair, right?

  8. Anonymous says:

    Thank you for the great web site – a true resource, and one many people clearly enjoy

  9. Corny says:

    Very good, thank you very much!

  10. Maxgrante says:

    Hi Matteo, congratulations on the excellent article. For anyone who would like to dig deeper into URL rewriting on Linux with Apache (or on Windows, but still with Apache), I'd point you to my article: http://www.massimo-caselli.com/2006/01/08/mod_rewrite-apache-sviluppo-siti-internet/

    Bye everyone, Max

  11. Will says:

    I created a URL rewriting script that supports regular expressions. Basically it is an extension of this script and others. Initially it was written in VB .NET and I can provide C# examples also.

    http://www.willasrari.com/blog/application-beginrequest-code-in-globalasax/00043.aspx

  12. Arun sharma says:

    We need to create the resource file first. These are the steps:

    1. Create a text file, e.g. "abc.en-us.txt".
    2. Here "abc" is the resource name; keys such as a, b and c can hold different text values.
    3. Open the Visual Studio command prompt and go to the website's directory, then into the particular project folder, e.g. c:/Websites/Project1.
    4. Type "resgen abc.en-us.txt" and press Enter; this writes the resource file.

    Now we need to make changes in global.asax. Here is my code:

    <%@ Application Language="C#" %>
    <%@ Import Namespace="System.Resources" %>
    <%@ Import Namespace="System.Threading" %>
    <%@ Import Namespace="System.Globalization" %>

    void Application_Start(object sender, EventArgs e) 
    {
        // Code that runs on application startup
        Application["abc"] = ResourceManager.CreateFileBasedResourceManager("abc", Server.MapPath("."), null);     
    }
    
    protected void Application_BeginRequest(Object sender, EventArgs e)
    {
        try
        {
            Thread.CurrentThread.CurrentCulture = new CultureInfo(Request.UserLanguages[0].ToString());
        }
        catch (Exception ee)
        {
            Thread.CurrentThread.CurrentCulture = new CultureInfo("en-us");           
        }
        Thread.CurrentThread.CurrentUICulture = Thread.CurrentThread.CurrentCulture;     
    
    }
    
    void Application_End(object sender, EventArgs e) 
    {
        //  Code that runs on application shutdown
    
    }
    
    void Application_Error(object sender, EventArgs e) 
    { 
        // Code that runs when an unhandled error occurs
    
    }
    
    void Session_Start(object sender, EventArgs e) 
    {
        // Code that runs when a new session is started
    
    }
    
    void Session_End(object sender, EventArgs e) 
    {
        // Code that runs when a session ends. 
        // Note: The Session_End event is raised only when the sessionstate mode
        // is set to InProc in the Web.config file. If session mode is set to StateServer 
        // or SQLServer, the event is not raised.
    
    }
    

    How to use this in a web page, i.e. an .aspx page:

    Here is the source code:

    using System;
    using System.Globalization;
    using System.Threading;
    using System.Resources;

    public partial class Default : System.Web.UI.Page
    {
        ResourceManager oResourceManager;

        protected void Page_Load(object sender, EventArgs e)
        {
            // Resource manager stored by Application_Start in global.asax
            oResourceManager = (ResourceManager)(Application["abc"]);
            if (!Page.IsPostBack)
            {
                Label1.Text = oResourceManager.GetString("a");
                Label2.Text = oResourceManager.GetString("b");
                Label3.Text = oResourceManager.GetString("c");
            }
        }
    }

    Hope this code will help.

  13. dario says:

    Good article!!!
