Yesterday I was surfing the web looking for inspiration to redesign the armoredcode.com layout, digging through some web design template websites.
A lot of CMSes use the generator meta tag for statistical purposes. Wouldn’t it be fun to have something automated that tells you the server operating system and the CMS version with a single HTTP GET?
Yesterday, after dinner, I relaxed a bit and started gengiscan.
Imagine you’re in the early stage of your security assessment over a web application. You’ve got your target and you want to know more about it.
You’re not one of those testers who fire up a point-and-click automated tool and call themselves security experts. You want to attack the site like a pro, and of course you’re authorized to do so.
Let’s make an assumption now: your target is a CMS-powered website, so you’re interested in stuff like privilege escalation on the content management backend and something basic like cross-site scripting, but the real deal would be finding SQL injections and getting into the backend database.
So your first problem is to understand which kind of backend software is serving your target. For statistical purposes, CMSes often expose the software used in the backend in an HTML meta tag called “generator”.
Please note that this is not a rule the CMS has to follow in order to work. A lot of WordPress- or Drupal-powered websites don’t use the generator meta tag and they work like a charm.
We will look for this information and, if we’re lucky enough, we can have the backend software auto-detected simply by looking at the HTML code.
The response headers we rely on may also be absent, in which case our autodetection fails and we have to fall back on the venerable scan and service detection techniques. Remember, the goal here is to gain as much information as possible with a single HTTP GET request in a script. We don’t want to be intrusive during our detection.
Let’s start with the ruby gem skeleton
So the first stage was to set up rvm, creating a gemset for my project in order to leave my ruby environment as clean as possible.
Now, with bundler installed, you can ask it to create your rubygem skeleton, the gengiscan one in my case.
The github repository doesn’t exist yet, so you must create it and then set your remote origin to point to it.
Now it’s time for the first commit.
If you need a good explanation of how to set up your project dependencies, you can check this Yehuda Katz blog post about it.
Make sure to follow the instructions you can find here and set up rspec for your gem.
Now you’re ready to go and you can start by writing your tests first.
webmock is a rubygem that can be used to write tests that mock the target web server’s behaviour. In short, you can stub a request for a specific URL, specifying the request headers and telling what the response should contain both in the body and in the headers.
My first idea was to use mechanize to handle the HTML parsing and retrieve the information I need, but mechanize and webmock don’t fit together so well.
The problem was that the mechanize agent’s get method didn’t honor webmock disabling outbound HTTP connections, causing the testing framework to raise an exception.
The Ruby standard Net::HTTP API behaves completely differently: it honors webmock’s stubbed requests and lets the tests pass.
So the testing framework changed the underlying architecture. Truth be told, this is the first time testing has punched my decisions so hard in software I wrote.
gengiscan will be based on the standard Net::HTTP ruby library, with nokogiri used to parse the HTML. Mechanize uses nokogiri as well, so using it raw over the response HTML code sounds like a feasible solution.
A basic rspec test for gengiscan detecting the CMS declared in the mocked up response page.
The idea is that the gengiscan API gives me back a ruby Hash with the detected information.
Of course, as we write the spec first, running rake spec fails, giving us the opportunity to code the detect API to make the tests pass.
The gengiscan API
I decided to move the URI to scan from the class initializer to the detect method, so in the latest version (0.30 as I’m writing this) you have to use the API this way:
hash would be a ruby Hash containing the HTTP status code, the server operating system family, the X-Powered-By response value and the generator meta tag.
All the other values I need come straight from the HTTP response, so I won’t use nokogiri for them, just what Net::HTTP returned. To get the generator meta tag value I do need to parse the HTML page.
Nothing magic here, we ask for the web page using Net::HTTP from ruby standard library and we take some values from the response we have.
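The header-level part of that lookup can be sketched with nothing but the standard library. This is a minimal sketch under stated assumptions, not gengiscan’s actual code: the helper name header_fingerprint and the hash keys :status, :server and :powered_by are hypothetical, chosen only for illustration.

```ruby
require 'net/http'

# Hypothetical helper (not part of gengiscan): collect the header-level
# fingerprint from a Net::HTTP response object.
# The key names :status, :server and :powered_by are assumptions.
def header_fingerprint(response)
  {
    status: response.code.to_i,           # Net::HTTP exposes the code as a String
    server: response['Server'],           # nil when the server hides the header
    powered_by: response['X-Powered-By']  # header lookup is case-insensitive
  }
end
```

Given a response obtained with Net::HTTP.get_response(URI(url)) against a target you are authorized to test, header_fingerprint(response) returns a Hash whose values are simply nil for every header the server chose not to send.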
The generator meta tag is retrieved using the get_generator_signature private method.
If we’re lucky enough and the developer did his job the right way, there will be only one generator meta tag in the HTML code we get as response; otherwise gengiscan will take the last one appearing in the DOM.
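The “last one wins” behaviour can be illustrated with a small standard-library sketch. Note the swap: gengiscan parses the page with nokogiri, while this example uses a regular expression instead so that it runs without extra gems. The method name generator_signature is hypothetical, and a regular expression is a simplification that a real HTML parser handles more robustly.

```ruby
# Hypothetical stand-in for a generator lookup (not gengiscan's own method):
# returns the content of the last <meta name="generator"> tag found in the
# page, or nil when the page declares none.
# Assumption: regex matching is good enough for this illustration.
def generator_signature(html)
  signature = nil
  # Walk every <meta ...> tag in document order.
  html.scan(/<meta\b[^>]*>/i) do |tag|
    # Skip meta tags that are not the generator one.
    next unless tag.match?(/\bname\s*=\s*["']generator["']/i)
    content = tag[/\bcontent\s*=\s*["']([^"']*)["']/i, 1]
    # Later tags overwrite earlier ones, so the last occurrence wins.
    signature = content if content
  end
  signature
end
```

Feeding it the body of the response is enough; a page without the tag yields nil, which is the cue to fall back on other fingerprinting techniques.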
We need nothing more. The following are some examples of gengiscan detecting backend CMSes:
As you may notice, we don’t always get all the information we’d like, but we automate the fingerprint step with very simple ruby code that can be reused as much as we like.
Something to remember is that:
- a fingerprint using a single HTTP GET may not be as accurate as a full intrusive scan with nmap (or the scanning tool you like most);
- servers may hide some information, so our fingerprint attempt will fail and we must recover by surfing the app, looking at the web page extensions, looking in the HTML comments, …
- ruby is fun to use as a programming language for security tools: its HTTP API is damn good and it is built for the web, no question;
- you don’t want to go further in evaluating web site insecurities unless you’re authorized to. Playing fair is the first rule here on my blog.
Of course gengiscan can be improved with more robust error checking and more fine-grained fingerprinting techniques.
But we will look at this in an upcoming project; you will know more about it if you stay tuned to armoredcode.com.