I ran into an issue today with a site that lets you pull its data into your own site through an iframe. Apparently IE has a security feature that refuses sessions from external sites loaded in an iframe. That sounds great, but the workaround is simply to send a header with the request.
DISCLAIMER: I do not condone doing this. There are better, more legal ways to get content from someone. But sometimes your boss asks you to do it. DO NOT STEAL CONTENT.
For this week's Thursday Code Tip I will show how to use PHP to crawl a website and gather content. First we select the URL to crawl:
$sURL="http://www.defvayne23.com/";
Next we get the content of the page:
$sContent=file_get_contents($sURL);
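One thing worth knowing: file_get_contents() returns false (and raises a warning) when the page cannot be fetched, and fetching an http:// URL requires allow_url_fopen to be enabled. Here is a minimal sketch of a safer fetch. The fetchPage() helper and its timeout parameter are my own assumptions for illustration, not part of the original tip:

```php
<?php
// Hypothetical helper: fetch a page and return null instead of false on failure.
function fetchPage(string $sURL, int $iTimeout = 10): ?string
{
    // Stream context so a slow server cannot hang the script.
    // The 'http' options only apply to http:// and https:// URLs.
    $rContext = stream_context_create(['http' => ['timeout' => $iTimeout]]);

    // "@" suppresses the warning file_get_contents() raises on failure;
    // we check the return value ourselves instead.
    $sContent = @file_get_contents($sURL, false, $rContext);

    return $sContent === false ? null : $sContent;
}
```

You could then call fetchPage($sURL) in place of the plain file_get_contents() line and stop early when it returns null.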
Now we use a regular expression to get what we want. You can learn patterns here. Below we search for the text within an H1 tag.
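To make the H1 search concrete, here is one way it could look with preg_match(). This is only a sketch: the extractFirstH1() helper, the pattern, and the sample markup are my own assumptions, and a regex like this only copes with simple, well-formed HTML.

```php
<?php
// Hypothetical helper: return the text of the first <h1> in $sContent, or null.
function extractFirstH1(string $sContent): ?string
{
    // Match <h1 ...>...</h1>. The "i" flag ignores case, "s" lets "." span
    // newlines, and the lazy ".*?" stops at the first closing tag.
    if (preg_match('/<h1\b[^>]*>(.*?)<\/h1>/is', $sContent, $aMatches) === 1) {
        // Drop nested tags (e.g. a link inside the heading) and trim whitespace.
        return trim(strip_tags($aMatches[1]));
    }

    return null;
}

// Sample markup standing in for the crawled page.
$sSample = '<html><body><h1 class="title"><a href="/">My Site</a></h1></body></html>';
echo extractFirstH1($sSample), "\n"; // prints "My Site"
```

In the crawler you would pass $sContent instead of the sample string. For messier pages, a real HTML parser such as PHP's DOMDocument is usually a sturdier choice than a regex.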