Me Against Google

Update: This update is up at the top so that some people may keep their blood pressure in check. Since I posted this, Google, or at least Matt Cutts on his personal blog, did indeed choose to communicate with me, saying that they did try to send a warning email. They posted it, and it was a good email, with sufficient information that if I had actually received it, it would have been perfectly reasonable as something to orient me to fix the problem on the TOA site. Over on Matt’s blog, I suggested making warning email content available through the Google Webmaster Tools interface. I think that is being considered. And, best of all, Google re-indexed the TOA site on 2006/12/05. Back to the original post:

====

Google has done a lot of cool and useful things. But in one area, Google has failed badly in setting its policy. It has to do with when Google decides to de-index a website, that is, to remove all references to an entire domain from any search returns that it provides. You might think that a company that prides itself upon advanced textual analysis and automated decision-making algorithms might provide helpful warning messages to webmasters concerning problems found in their sites. You would be wrong. [Google’s Matt Cutts says that Google did send a warning email that I did not receive. — WRE] If Google decides that there is a problem, they will de-index the entire site, never attempt to communicate with the webmaster concerning their action, and (this is the big problem part) they refuse to tell a webmaster what the problem was or where the problem occurred, whether the webmaster deliberately created the problem or was the victim of some of the all-too-common website cracking that happens nowadays. Google’s policy of keeping problems secret is harmful, and in fact favors cheaters over honest webmasters.

How do I know this? Because my website got cracked and Google decided to de-index it.

No pages from your site are currently included in Google’s index due to violations of the webmaster guidelines. Please review our webmaster guidelines and modify your site so that it meets those guidelines. Once your site meets our guidelines, you can request reinclusion and we’ll evaluate your site.

I got to that message about the TalkOrigins Archive (TOA) sometime early Friday morning. A post on the talk.origins newsgroup cc’d to my email gave me a heads-up that the TOA no longer appeared in the Google index. That site is also my responsibility, so I started looking into the problem.

First, I just tried looking at Google’s public information about the TOA. That was all happy news, with a cute check-mark graphic and the note that the TOA was successfully indexed last on November 27th. I then claimed the TOA as my website through Google’s Webmaster Tools and verified it. Then, I got the rather different site summary that is quoted at the beginning of this message.

The TOA is, essentially, the same sort of site it was back in 1995. We have mostly HTML files. There is a small amount of Javascript used to protect email addresses of contributors from spammers. There is the feedback section, which is based on Perl CGI. There are a few PHP pages, notably for providing RSS. There are some PDF files. And, of course, some JPG, GIF, and PNG images. We don’t have a “search engine optimization” consultant. We don’t do third-party advertising. (The closest thing to an ad is our link on the front page that goes to our group weblog, the Panda’s Thumb.) Our content is original and has received numerous awards and recognition.

So, what, precisely, was causing Google to not like us anymore? The essential lesson here is that Google would not tell us. That isn’t mere caprice; that is Google policy. I tried to find a more extensive explanation, since the first thing was labeled “Summary”. No such luck. That is not available via Google Webmaster Tools. There was a listed phone number for Google, so I tried calling them. After proceeding as directed through their phone menu, I got a recorded message that all issues to do with indexing were handled through the web site and that Google did not offer any customer service via the phone. There was, of course, nothing further available from the web site, with the exception of the form for requesting reinclusion of a site. That doesn’t tell you anything about your problem in particular.

My mission, whether I liked it or not, was to find and fix whatever problem the TOA might have, with no guidance as to what the problem was and nothing at all about where to start looking. Since the TOA site is 5,000+ separate pages, that could be quite the task. I started with the default site page. I pulled it up in my browser. It looked completely unexceptionable. I then opened up a “page source” view. There, I did find something wrong. At the bottom of the page, buried within an ASP function that prevented it from being visible on browsers, was a block of bad links, links that had nothing to do with the TOA. Checking the file on the server, I found that it was changed on 2006/11/18. There was no corresponding entry in the TalkOrigins Archive Delegation (TOAD) change log, where the authorized TOA volunteers note each change made to the site. We had been cracked.
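
One way to narrow that kind of search is to let the file system do the first pass: walk the document root and flag anything modified after the last authorized change recorded in the change log. Below is a minimal Perl sketch of that idea, not the procedure actually used; the document root and the cutoff date are placeholders.

#!/usr/bin/perl
# Sketch: list site files modified after the last change recorded in the
# change log. The path and date below are made-up values for illustration.
use strict;
use warnings;
use File::Find;
use Time::Local;

my $docroot   = '/home/toa/public_html';              # hypothetical document root
my $last_good = timelocal( 0, 0, 0, 17, 10, 2006 );   # assumed last logged change: 2006-11-17

find( sub {
    return unless -f $_;
    my $mtime = ( stat $_ )[9];
    printf "%s\t%s\n", scalar localtime($mtime), $File::Find::name
        if $mtime > $last_good;
}, $docroot );

Anything that turns up without a matching entry in the change log is a candidate for closer inspection.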

Within ten minutes, I had the bad stuff out of the default page and uploaded the clean file to the server. I informed the TOAD group of what I had found out and requested that Douglas Theobald check his local copy of the files for any further cracked files. He found none. Douglas suggested that I post something to the Google Webmaster Help Group, which I did. I then entered the reinclusion request, adding a clarification concerning the three stipulations that Google requires one to acknowledge via a checkbox, without which one cannot submit the reinclusion request. Let’s have a look at what Google says on that form:

Reinclusion statement

By submitting this form, I acknowledge that:

* I believe this site has violated Google’s quality guidelines in the past.

* This site no longer violates Google’s quality guidelines.

* I have read and agree to abide by Google’s quality guidelines.

Tell us more about what happened: what actions might have led to any penalties, and what corrective actions have been taken. If you used a search engine optimization (SEO) company, please note that. Describing the SEO firm and their actions is a helpful indication of good faith that may assist in evaluation of reinclusion requests. If you recently acquired this domain and think it may have violated the guidelines before you owned it, let us know that below. In general, sites that directly profit from traffic (e.g. search engine optimizers, affiliate programs, etc.) may need to provide more evidence of good faith before a site will be reincluded.

Something to note here is that this goes beyond the mechanics to issues of ethics, what with all that emphasis upon “good faith”. Google can, of course, apply any standard they like, and include or exclude sites at their pleasure. Of course, Google doesn’t want to appear to be capricious; with the above statement, Google obviously wants to cast itself as a judge of the moral worth of sites, implying that they themselves are worthy of the role of judge in a court of equity. They absolutely invite evaluation of their own actions in an ethical framework, and my opinion is that Google doesn’t measure up in the sphere of how they handle de-indexing decisions. Certainly, they have the responsibility to keep their index from giving unwarranted weight to cheaters. That is not at issue here. What is at issue is their treatment of webmasters whose sites have acquired problems that may — or may not — actually be of their making.

Their stipulations for submitting a reinclusion request require an admission of guilt on the part of a webmaster who, as I found myself, could be the victim of a third party. Google’s policy of obscuring their reasons for de-indexing makes it much harder for honest, but cracker-victimized webmasters to return their sites to a state that is acceptable to Google. In fact, Google’s policy is far more burdensome upon honest webmasters than it is upon cheaters — the cheaters know what they have done that is out of compliance, and the honest webmasters have no such knowledge of where the problem may lie.

So I had to clarify my response to Google’s stipulations:

The TalkOrigins Archive has never deliberately violated Google’s quality guidelines. Our site has operated since 1995 in the same way, well before the origin of Google, and will continue to provide quality information to our readers even if Google ceases to exist as an entity. We never needed Google’s quality guidelines in order to make a quality website, and we would not lower our quality if Google decided to impose guidelines that were injurious to the standards that we have ourselves set and maintained.

I was extremely lucky. The damage to my site was limited and in the first place that I happened to look. Other honest webmasters might not be so lucky. They may have to undertake an arduous process of vetting pages, essentially having to second-guess the mind of the cracker in trying to locate a problem that Google knows the exact location of. Does that sound anything like equitable to you? It sure doesn’t to me.

As I said in my post to the Google Webmaster Help group, the Google policy of obscuring de-indexing decisions is harmful.

Wesley R. Elsberry

Falconer. Interdisciplinary researcher: biology and computer science. Data scientist in real estate and econometrics. Blogger. Speaker. Photographer. Husband. Christian. Activist.

50 thoughts on “Me Against Google”

  • 2006/12/04 at 4:47 am
    Permalink

    That’s not Google’s fault… that’s yours. Your ISP will do the same if it finds that your server has been hacked and is being used as a zombie. This is what you get for using Microsoft. Stop using crappy Microsoft products, which can be hacked by a 4 year old, and start using a decent server like Linux with Apache.

    Then you wouldn’t have these problems. You should also know how to administer your server so that you don’t get hacked. Again, not Google’s fault and more your own fault.

    As a web developer, I’d say google did the right thing. You didn’t secure your server and potentially put their data at risk. Your negligence is why you got yanked.

  • 2006/12/04 at 4:53 am
    Permalink

    I appreciate your dilemma; indexing by Google is almost equivalent to having telephone service (not a necessity, but a practical necessity of doing business).

    However, you might want to consider what would happen if the phone company noticed that someone was leeching your phone line and making excessive amounts of phone calls (and you had some sort of “unlimited call” service). The phone company would likely notify you that you had somehow violated your terms of service (you were only supposed to have phone calls from one business, not enough calls for a whole call center), and it would be up to you to fix the situation while they disconnected you (not your phone connectivity, just your outgoing “unlimited” call service). Of course you may have had no knowledge of the situation and people were doing things without your knowledge, or perhaps you colluded with your neighbors to save on their phone costs. The phone company has no way of knowing which is the case, and their special service (outgoing unlimited phone calls) was at their pleasure, not your right. So although you might be equivalently pissed about this, and denigrate the phone company for their apparent lack of morality and for burdening the little guy with figuring out the situation (which he may not be equipped to do), the situation is not too different.

    Your site is still connected to the internet, people can reach you, and other search engines and web sites still link to you. Google can do nothing about that. Perhaps this illustrates more the “cold” nature of the corporate structure that the internet business has become, and less the “evilness” of a certain large search company.

    On the other hand, sometimes we somehow expect other folks to just recognize us as the “good guy” when in fact we are just another anonymous person in the sea of deception that is life, whom others don’t have the time to vet. It’s a common personal conceit that we all fall victim to at one time or another, and it reflects less on the coldness of the world and more on its vast nature and our small, insignificant place in it.

  • 2006/12/04 at 5:18 am
    Permalink

    Actually, while I feel for you (I’ve been through the anti-spam equivalent), Google’s actions do make sense. Let me rephrase that. They are unreasonable to reasonable people like ourselves, however they are completely appropriate considering the population of unreasonable people.

    I’m betting that it’s simply a matter of numbers. If Google did anything to help webmasters avoid/solve any kind of problem, then webmasters in general would rely on it. Things like: “if they notice, we’ll fix it then” or “let google do the work of determining if it’s good enough”. That sort of thing.

    The only way to ensure that Google doesn’t wind up being used and abused is to ensure that they never assist any webmaster in these cases.

    So I’m completely on-board with them — an uncomfortable place to be since I recently started avoiding Google like the plague for the sole reason that I don’t like what they do with the data that people provide to them.

  • 2006/12/04 at 5:30 am
    Permalink

    suck it up
    just re-read their terms of service and fix the problem.

    They don’t have enough time to deal with all the technically incompetent people like you.
    Do you realise how many people that would be if just 0.01% of websites needed help?

  • 2006/12/04 at 5:31 am
    Permalink

    whiner!

  • 2006/12/04 at 6:17 am
    Permalink

    Quit whining; Google responds quickly. Clean up your act and move on.

  • 2006/12/04 at 6:38 am
    Permalink

    Sorry felt bad about the snide remark above.

    I can empathize. I recently had several sites that I host find their way into Google hell. I wrote a small 15-line Perl script (using File::Find) to hunt through thousands of pages and return all URL patterns.

    Filtered out local domains and easily found the problem: a runaway forum that spammers had zeroed in on. Looks like they were also using a Perl script :{

    They had uploaded over 3000 links – the bastards.
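
    A rough sketch of that kind of scan (in the same spirit, though not the commenter’s actual 15-line script) might look like the following; the list of domains treated as local is a placeholder:

    #!/usr/bin/perl
    # Sketch: walk a local copy of a site and report every off-site link,
    # so an injected block of spam links stands out for review.
    use strict;
    use warnings;
    use File::Find;

    my $docroot = shift @ARGV or die "usage: $0 /path/to/site\n";
    my @local   = qw(example.org example.com);    # placeholder list of "our" domains

    find( \&scan, $docroot );

    sub scan {
        return unless -f $_ && /\.(?:html?|php)$/i;
        open my $fh, '<', $_ or return;
        my $page = do { local $/; <$fh> };        # slurp the whole file
        close $fh;
        # crude href extraction; enough to flag injected off-site link blocks
        while ( $page =~ m{href\s*=\s*["']?https?://([^/"'\s>]+)}gi ) {
            my $host = lc $1;
            next if grep { $host eq $_ or $host =~ /\.\Q$_\E$/ } @local;
            print "$File::Find::name\t$host\n";
        }
    }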

  • 2006/12/04 at 7:43 am
    Permalink

    “Their stipulations for submitting a reinclusion request require an admission of guilt on the part of a webmaster who, as I found myself, could be the victim of a third party.”

    If you allow a third party to deface/alter/hack your site IT IS YOUR FAULT!!! Where do you get off claiming victim here (to Google’s policies)? Google should ban you, in order to protect users from whatever garbage might show up on your site, since, clearly you have no control over it.

  • 2006/12/04 at 8:46 am
    Permalink

    Hi Wesley, my name is Matt Cutts and I’m the head of the webspam team at Google. I can confirm that Google de-indexed talkorigins.org because it was hacked and therefore talkorigins.org had phrases such as “rape sex,” “animal porn,” and “beastiality” on your main page.

    After Google removed the site, we made the penalty visible in our webmaster console so that you could definitively tell that your site had a penalty, as you mention above that you saw. My records also indicate that we tried to contact talkorigins.org by email to alert the site that it had been hacked.

    You did the right thing by filing a reinclusion request. I see that the porn/sex words have been removed from your site, so I’m revoking the penalty now.

    I’ll also do a more lengthy post at my blog at http://www.mattcutts.com/blog/ as well to talk about this situation.

  • 2006/12/04 at 9:06 am
    Permalink

    hi there,

    ok, just a heads-up: this has been discussed a lot over at Slashdot, and in the interests of trying to educate you a little, I thought I’d post the link to the discussion.

    http://slashdot.org/articles/06/12/03/2049202.shtml

    now, before you go there, remember: with free speech there are opinions you like and those you don’t; admire those you accept and denounce those you don’t, but opinions are just that. In there, you’ll find the reasons why Google didn’t inform you about the problem, or why it did what it did. Try to filter out the wheat from the chaff and you’ll find your reason.

    I don’t personally find your topics that interesting, but that’s not really the point; the point is to help you, regardless of what your site says and does.

    chris.

  • 2006/12/04 at 12:27 pm
    Permalink

    Worth mentioning is that Google’s PageRank makes links from the front page (or any other page that is frequently linked to as an entry page) worth more than links from pages that are buried deep within the link structure of the site in question.

    Which means that, most of the time, any of these links will be on one of the entry pages.

  • 2006/12/04 at 12:37 pm
    Permalink

    When we were delisted, we filed a reinclusion request. I was told by an employee at Google that such requests are ignored; they are just there to make webmasters feel that they have done something.

    You may have also noticed that, by filing the reinclusion request, you absolve Google of all legal responsibility or damages from having delisted you.

  • 2006/12/04 at 3:25 pm
    Permalink

    I believe that this is why digital resources need to be cataloged in the manner that monographs (books) and other physical resources are when they are acquired by a library. I just completed a paper on this, and I used the recent de-indexing of UD by Google as an example. (BTW, I argue in my paper for the cataloging of a site like UD despite what I think of the material therein. I did not think that UD should have been de-listed by Google, either.)

    The Internet is not always “forever” yet there is a backlog of electronic communication just left out there and librarians are not hopping on it. Some even think that websites and blogs, being electronic (and therefore “magic” in some way) will index themselves in a coherent manner, when in fact, even though Google and other search engines provide this service (of a sort), the methods and algorithms are not transparent, as you point out. (Cataloging rules are transparent—inasmuch as one can understand the bureaucracy of regulations in AACR2, LCSH, etc. Hey, it’s all designed.)

    A cataloger makes resources available via a subject search, as opposed to a keyword search which is much less accurate. Weblogs are where the crucial discussions are happening and they deserve to be cataloged, preserved, and archived. How we go about doing this is a question that I shall be raising again and again as I work toward my MLS. Certainly a question like this is something that I’d like to bring up when I get my paper back. In the meantime I wish you all the best, Wes.

  • 2006/12/04 at 4:22 pm
    Permalink

    I want to thank Matt Cutts for dropping in to mention his reply. I replied at his weblog. I’ll quote that here:

    Matt,

    I think that the message you show as a warning is excellent. It clearly states what is wrong, with enough information to permit a webmaster to locate the problem.

    I only wish that I had actually received it.

    Before I made my complaint, I checked my incoming email. There was no sign there of an attempt to contact me from Google. Lunarpages.com, where the TOA is hosted, forwards email to my account.

    This morning, I learned of this post, so I re-checked my steps. No, still nothing in my incoming mail. I looked for strings from within the warning, to see if the text came through without an obvious “google” connection. No luck on that, either.

    I rely upon the Lunarpages email forwarding, but given this post, maybe I was wrong to do so. I logged into the domain’s Lunarpages webmail interface for the first time. I searched for anything with “google.com” in the from field. I searched for strings from within the warning message quoted above. Still nothing.

    Bummer. That just leaves examining the SMTP records on my local email account. Google’s message should have been relayed by Lunarpages, so I looked for that in the SMTP logs. Still nothing.

    My SMTP logs, BTW, do show rejects for hosts like wr-out-0708.google.com, which is apparently blacklisted at spamcop.net. The following shows rejects on the 28th with a “google.com” domain. I haven’t checked these for spoofs, but I assume that’s what’s up with these:

    Nov 28 02:14:49 […][63343]: ruleset=check_relay, arg1=wr-out-0708.google.com, arg2=64.233.184.250, relay=wr-out-0708.google.com [64.233.184.250], reject=550 5.7.1 Rejected see: http://spamcop.net/bl.shtml?64.233.184.250

    Nov 28 11:39:34 […][84035]: ruleset=check_relay, arg1=py-out-1314.google.com, arg2=64.233.166.175, relay=py-out-1314.google.com [64.233.166.175], reject=550 5.7.1 Rejected see: http://spamcop.net/bl.shtml?64.233.166.175

    Nov 28 11:39:38 […][84034]: ruleset=check_relay, arg1=py-out-1314.google.com, arg2=64.233.166.175, relay=py-out-1314.google.com [64.233.166.175], reject=550 5.7.1 Rejected see: http://spamcop.net/bl.shtml?64.233.166.175

    Nov 28 12:18:47 […][85490]: ruleset=check_relay, arg1=wr-out-0708.google.com, arg2=64.233.184.244, relay=wr-out-0708.google.com [64.233.184.244], reject=550 5.7.1 Rejected see: http://spamcop.net/bl.shtml?64.233.184.244

    Nov 28 12:23:01 […][85593]: ruleset=check_relay, arg1=wr-out-0708.google.com, arg2=64.233.184.240, relay=wr-out-0708.google.com [64.233.184.240], reject=550 5.7.1 Rejected see: http://spamcop.net/bl.shtml?64.233.184.240

    Nov 28 12:23:01 […][85592]: ruleset=check_relay, arg1=wr-out-0708.google.com, arg2=64.233.184.250, relay=wr-out-0708.google.com [64.233.184.250], reject=550 5.7.1 Rejected see: http://spamcop.net/bl.shtml?64.233.184.250

    Nov 28 18:24:09 […][96244]: ruleset=check_relay, arg1=wr-out-0708.google.com, arg2=64.233.184.250, relay=wr-out-0708.google.com [64.233.184.250], reject=550 5.7.1 Rejected see: http://spamcop.net/bl.shtml?64.233.184.250

    Nov 28 18:30:36 […][96470]: ruleset=check_relay, arg1=wr-out-0708.google.com, arg2=64.233.184.250, relay=wr-out-0708.google.com [64.233.184.250], reject=550 5.7.1 Rejected see: http://spamcop.net/bl.shtml?64.233.184.250

    It would be ironic, though, if the warning message that could have short-circuited this whole affair was blocked because of spam filtering.

    As for entitlement, I don’t think that I was out of bounds given the information I had to work with. Google certainly isn’t responsible for fixing the bad stuff that is on my site. I never said it was. Having a third party mess with the site caused the problem in the first place. Having tried to work with Google once the problem became known to me resulted in… nothing. Not until I complained about what happened.

    I do feel a bit better to know that Google made an attempt at contact before de-indexing our site. And it is good to know that the site is scheduled for re-indexing within a couple of days, rather than the couple of weeks mentioned on the Webmaster Help Group. I wish Matt and the rest of the folks at Google success in making the process better in the future. If, when I did claim the TOA site via Google Webmaster Tools on Dec. 1, the text of that warning that you quote above had been waiting for me, I would have had no complaint to make. It seems to me that if Google is willing to send that level of information via email, then making it available to the verified owner of a site via the Webmaster Tools interface should not be a problem, either.

    And, Adam, neither Google nor you can tell, from the problem itself, whether it represents a deliberate cheat or an honest person who has been victimized. Google is entirely correct to protect their index by pulling sites that are not in compliance with their guidelines. I never said otherwise. My complaint was about the situation that Google’s policy of obscuring the de-indexing decision creates in the aftermath: cheaters have an advantage over honest webmasters, since the cheaters know where in their pages the bad stuff lies, and the honest webmaster does not have that knowledge. Whether or not you accept that I qualify as an honest webmaster, the policy as it currently stands obviously puts honest webmasters at a clear disadvantage.

  • 2006/12/04 at 4:34 pm
    Permalink

    So, wait, your shitty site was hacked, you never noticed but google did and took the appropriate action and now you are whining about it?

    You’ll have to excuse me if I stopped reading after the first two paragraphs as they provided me with all I need to award you the ‘not-so-intelligent design award’. The rest was just rambling on and on and on about what a hard time you had contacting Google. You fail at being a web master, and at life.

  • 2006/12/04 at 4:36 pm
    Permalink

    Hi Simson, it’s not true that Google ignores reinclusion requests. People on my team do process them, and while we don’t approve every request (e.g. if a site is still spammy), we do approve lots of sites for reinclusion.

  • 2006/12/04 at 4:46 pm
    Permalink

    I believe that this would certainly be the webmaster’s problem and responsibility. However, why couldn’t Google provide, in the webmaster tools section of their site, a log of the activity that violated policy?

    If they can find the error, they can log the error and offer it to the webmaster. I’m sure they could handle the minor increase in resource usage to provide that help, couldn’t they?

  • 2006/12/04 at 5:05 pm
    Permalink

    Wesley,

    I think an avenue of exploration is being overlooked here. Based on the “lively” discussion between AIG, ICR and TOA on the Google Groups thread you linked, I had a notion:

    Perhaps TOA was cracked on purpose, and made to be de-listed from Google as some form of retaliation. Granted, the damage was only to your index page, but it bears exploration.

    If you haven’t already taken up the scouring of server logs, you might want to look into that to see if there is any trace of when and how your site was cracked.

    Overall, a sobering story.

  • 2006/12/04 at 6:01 pm
    Permalink

    Randy,

    It is possible that the TOA was cracked for political reasons. However, the TOA gets enough traffic anyway that it would be a prime spot for a cracker seeking to improve their search engine rank to target. There are, I think, far more competent crackers (you may read that two ways) outside the antievolution contingent than within, so I think it is likely that the TOA was cracked simply because it was a fat target.

  • 2006/12/04 at 6:13 pm
    Permalink

    Matt Cutts,

    If you are still monitoring comments here, please note that I have added a link to your weblog post in my post above, noting your account of sending a warning email.

  • 2006/12/04 at 7:00 pm
    Permalink

    First, I’ll say that I absolutely agree with your criticisms of Google. Finding guilt and passing summary judgment without giving the accused any opportunity for discovery or to mount a basic defense is unconscionable.

    That being said, I cannot ignore an overlooked benefit of the ordeal — your site had been hacked, and but for Google’s action, you most likely would never have discovered the violation. Though I object to the means, you appear to have come out the better as a result.

  • 2006/12/04 at 9:53 pm
    Permalink

    Wesley
    You were hacked once. Nowhere do I see reference to your having implemented a strategy to prevent its happening again. Fine, maybe you don’t want the bad guys to know step-by-step how you are attempting to stop them.

    By now you should have had some forensic analysis done to detect their entry mechanism.
    You need to look for back doors, root kits, etc that they may have gifted you with.
    You need to be up to date with patches.
    You need to disable all unused services and ports.
    You need to keep track of current zero day exploits and work-arounds.
    You need to ensure that they can’t use social engineering against your authorised site controllers.
    You need advice from a trusted source to manage all the above on an ongoing basis.

    If you fail to do so, then presumably it will only be a matter of time before you find out if Google has implemented the improvements in their Webmaster Help Group interface that you would like to see. They will try again.

    It’s a valuable site, hopefully you can find the necessary help.

    John

  • 2006/12/04 at 10:09 pm
    Permalink

    Xeno Said:

    This is what you get for using Microsoft. Stop using crappy Microsoft products which can be hacked by a 4 year old and start using a decent server like Linux with Apache.

    Then you wouldn’t have these problems. You should also know how to administer your server as well so you don’t get hacked. Again, not Googles fault and more your own fault.

    If you knew anything about Wesley and the sites he runs you’d take back that comment. It is true that Wesley doesn’t use Linux and Apache. He uses FreeBSD and Lighttpd. Most of the websites he manages run off of that setup, including this one here. Talkorigins.org is the only one that is on a commercial hosting company, which now appears to have been hacked.

  • 2006/12/04 at 11:33 pm
    Permalink

    I think that was a drive-by sneering. I doubt he’ll see your reply, Reed.

  • 2006/12/05 at 12:48 am
    Permalink

    There’s your problem. You (or someone running your email) uses the Spamcop BL to block email. That’s just a bad move if you care about receiving email.

    Here’s what Spamcop says about their BL:

    “The SCBL aims to stop most spam while not blocking wanted email. This is a difficult task. It is not possible for any blocking tool to avoid blocking wanted mail entirely. Given the power of the SCBL, SpamCop encourages use of the SCBL in concert with an actively maintained whitelist of wanted email senders. SpamCop encourages SCBL users to tag and divert email, rather than block it outright. Most SCBL users consider the amount of unwanted email successfully filtered to make the risks and additional efforts worthwhile.

    “The SCBL is aggressive and often errs on the side of blocking mail. When implementing the SCBL, provide users with the information about how the SCBL and your mail system filter their email. Ideally, they should have a choice of filtering options. Many mailservers operate with blacklists in a “tag only” mode, which is preferable in many situations.”

    In other words, if you use this to block email outright, expect to lose email you care about. Guess what? You lost email you care about.

  • 2006/12/05 at 1:43 am
    Permalink

    I’ve already explained why the SpamCop setting is not likely to have interfered with a warning email from Google over on Matt Cutts’ weblog:

    – The timestamps don’t match the timestamp Matt gave.

    – The addresses Matt used were Lunarpages.com aliases, and thus forwarded from Lunarpages to my local email server. The rejections do not correlate with Lunarpages.com incoming mail.

  • 2006/12/05 at 3:03 am
    Permalink

    Wow, what is up with all the condescending comments on this page? Shows how mature you guys are.

  • 2006/12/05 at 3:17 am
    Permalink

    This is an interesting topic, and I was once a victim of site hacking myself… but I was able to fix it, and I didn’t receive any notification from Google either. Anyway, my case was not as bad as this one, because I fixed it right away, so it never got to the point where Google had to remove my site from the search results.

  • 2006/12/05 at 2:43 pm
    Permalink

    Too bad that the email addresses that Google used weren’t monitored (or valid, I suppose). If they were, that would have saved you a lot of stress.

  • 2006/12/05 at 6:09 pm
    Permalink

    Google did you a service, they told you that your stupid website was hacked. Now you’re mad at them!? Did yah00 tell you that? No, google did the right thing and told you about problems with YOUR website.

    You should be ashamed of yourself!

  • 2006/12/05 at 6:55 pm
    Permalink

    I just read Matt Cutts’ response to this situation… What more could you ask Google to do?

  • 2006/12/05 at 7:01 pm
    Permalink

    Actually, according to Cutts, Google is considering doing the “more” that I suggested: making the warning information available via Webmaster Tools. Seems to me like that would be an improvement.

  • 2006/12/05 at 7:28 pm
    Permalink

    I just now checked, and the TOA has been re-indexed by Google. I want to thank Google for their promptness in re-indexing the site.

  • 2006/12/05 at 8:19 pm
    Permalink

    Hi Wesley,

    Glad things are resolved. I know just where you are coming from. I run a popular FSBO (real estate) site that got hacked when a patch to my CPanel scripts never applied properly. My traffic plummeted, and it took me a while to realise what the cause was.

    The parasites had placed a small piece of code in my index page and also in an include page. When visiting my main page everything loaded, but there was a strange URL that flashed on the status bar while it loaded. Despite having anti-virus software, I never got any warning. It turns out it was a trojan.

    When someone goes to click a link to my site from Google, they get a huge warning from stopbadware.org advising them to go elsewhere. Over the course of two weeks this changed from still showing a link to my site on the warning to showing no link at all. Contacting stopbadware was easy, but you have to wait for them to crawl your site and check that it’s clean. I failed the crawl because I missed the include page that was hacked, and was therefore sent to the back of the queue. I changed all passwords and checked all files on the server.

    All in all it took a month for the warning message to be removed. I applaud the efforts to thwart spam, malware, etc., but it’s hard when you strive to run a legitimate site and get hit hard when hackers strike.

  • 2006/12/06 at 11:32 am
    Permalink

    It’s not entirely in anyone’s hands to avoid getting hacked. Of course everyone is wide awake and keeps an eye on their sites for any kind of spamming, but sometimes something slips past you. The main thing is getting the issue solved.

  • 2006/12/06 at 11:40 am
    Permalink

    The more ethical policy would be to ban just the one page or pages that violate the webmaster guidelines – BUT NOT THE ENTIRE SITE.

    Suppose a hacker had created just one obscure page and got that page onto Google – and did not link to it from any of the other pages on the site.

    With a large site, it would be virtually impossible to tell WHICH page had been spammed and got you de-indexed.

  • 2006/12/06 at 7:26 pm
    Permalink

    Isn’t it funny how some posters act like their sites are SO locked down and will never get hacked, when we ALL know that isn’t true.

  • 2006/12/06 at 8:06 pm
    Permalink

    Hi All,

    I’m a bit sad to see that some people are being rude to the owner of the cracked web site. If you put yourself for one instant in the place of someone who has to deal with such a cracking, you’ll see it’s not a typical situation and it’s definitely difficult to deal with.

    Please let’s consider that no system is 100% secure, and a team of motivated and patient crackers will certainly be able to crack any popular system.

    I know what I’m talking about, as I was formerly an IT Architect & Security consultant for years, and I was the guy who had to design and put in place, in 2000, the Secure and Private Internet Services of S.W.I.F.T. (http://www.swift.com). S.W.I.F.T. is an institution that requires a high level of security, as it provides the telecommunication infrastructure for banks all over the world to exchange banking messages (the famous SWIFT BIC ID/CODE).

    So everybody may be at the mercy of a cracking. Even Google is not immune to cracking:
    For instance: http://www.pcworld.com/article/id,127454-page,1-c,google/article.html
    BTW: Strange URL: http://www.google.com/intl/xx-hacker

    Now, to come back to the discussion on this blog.

    I think Austringer is among the privileged ones who were alerted (or who should have been, if he really was alerted) when his web site was blocked from Google’s index.

    I don’t want to go over the exact reason, but at least I want to point out that I used to get some web pages delisted from Google, and I was never alerted about that. It’s not an isolated case.

    So for my part, the real issue I’d like to talk about is why the Google Webmaster service alerts some people when their web sites are penalized, but not others.

    I used to get some of my web pages excluded from Google’s index. I was never alerted by the Google Webmaster service.

    When I tried to understand what was going on with the Google Webmaster tool, I just got a message stating that my web site is not indexed by Google, which is a little bit limited and didn’t give me any clue as to the reason.

    Of course, I had to analyze what happened during the last couple of days and conclude for myself that the web page was removed because I did some 3-4 promotions through some high-PR web sites and/or because I submitted to several directories within a short time. OK, I understand that I may have gone against some Google T&C by doing some white hat promotional activities, and I accepted the consequences of having some web pages delisted.

    However, even if I don’t expect the exact reason to be given in the alert, it would be so helpful and useful if I could at least have been alerted by a simple message that my web site had been penalized or blocked.

    Unfortunately, the real thing is that the Google Webmaster tool doesn’t send any message about a penalty, at least not to me, and the same goes for several of my friends.

    It’s very frustrating, and after such things happened more than once, some friends finally asked for my help in developing a kind of monitoring to check for the existence of a specific web site in Google’s index, a cron task that is scheduled every 15 minutes.

    Surely we’re not alone in being obliged to do so. It’s the natural way of adaptation. It’s just about creating some comfortable ways to ease the life of webmasters. If Google doesn’t give us that, we’ll build it ourselves.
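
    A bare-bones sketch of the kind of check described above might look like this (hypothetical: the domain, the “no results” phrase, and the very idea of scraping a results page are assumptions, and scraping may conflict with Google’s terms of service):

    #!/usr/bin/perl
    # Sketch: fetch a Google "site:" query and warn if no results come back.
    # Meant to be run from cron, e.g. "*/15 * * * * /usr/local/bin/index-check.pl"
    use strict;
    use warnings;
    use LWP::UserAgent;

    my $domain = 'example.org';    # placeholder domain to watch
    my $ua     = LWP::UserAgent->new( agent => 'index-check/0.1', timeout => 30 );
    my $res    = $ua->get("http://www.google.com/search?q=site:$domain");

    die "fetch failed: " . $res->status_line . "\n" unless $res->is_success;

    if ( $res->decoded_content =~ /did not match any documents/i ) {
        # assumed wording of the no-results page; mail or log an alert here
        warn scalar localtime() . ": $domain appears to have dropped out of the index\n";
    }
    else {
        print scalar localtime() . ": $domain still shows results in the index\n";
    }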

    So, to come back to the main question: why not systematically alert the web site owner (e-mail registered and ownership verified) whenever a penalty happens and the web site or some web pages are removed?

    Thanks.

    VK

  • 2006/12/07 at 11:29 pm
    Permalink

    Wes, there are a number of posts to the TO feedback re: Google. I hope you can respond to at least one with a summary comment.

  • 2006/12/08 at 12:22 am
    Permalink

    I’ll be posting something to the newsgroup. There will be news.

  • 2006/12/31 at 10:14 am
    Permalink

    Dude,

    Why don’t you read Google’s reply about this situation:

    http://www.mattcutts.com/blog/how-google-handles-hacked-sites/

    1) They did try to email you

    2) They reported the problem in their webmaster control panel.

    Seriously, if you get hacked and have NO idea 10 days later, you are an idiot and should take your web server and throw it out the window.

    Even a bad webmaster / system admin should know about and be able to resolve any attack in less than one hour. I can’t believe you are complaining so much about it. It’s your fault, and Google tried to help you; even then you were too dumb to get it!

    I get tired of crybabies like you on the internet.

  • 2006/12/31 at 9:27 pm
    Permalink

    Dude,

    Why don’t you read the reply at Matt Cutts’ blog? If you had, you would know that not only did I read it, but I was an active participant in the comments on that thread. Also, you might have read Matt’s comment there saying that he agreed with some of my points…

    I think that will do. The New Year approaches, and I would prefer to have the old one pass with a smaller serving of vitriol than a larger.

  • 2007/01/22 at 3:05 am
    Permalink

    I can’t believe I’m reading these disrespectful and rude remarks telling you it’s “your fault” for “allowing” a hacker to deface your site! That’s like blaming a rape victim for wearing shorts. How disgusting. I’m with you, Wesley. While it has provided massive opportunity, Google is also capricious and frightening in its policies of hammering the “little guy.” Since Google IS about as indispensable as the phone company, it needs to start providing better customer service – especially to those of us who are handsomely lining its pockets through Adsense and Adwords.

    Thanks for the good and necessary blog, and I’m so glad to see that Google/Matt Cutts actually responded to you, since your site HAS been around forever and has provided a great service (I recall reading it way back when).

  • 2007/03/03 at 11:59 am
    Permalink

    I don’t seem to have been hacked, but I was delisted from Google. They show only 10 old static pages of mine, and none of my blog content. What’s worse, they now show -0- incoming links.

    And there’s no information anywhere I can find out why, nor any instructions on how to fix it.

  • 2007/11/21 at 9:06 pm
    Permalink

    Seems to me that it’s natural selection. Google kills the stupid. No reason to complain over a blatantly obvious truth. Your appeals to a higher power at Google are clearly the sign of an inferior intellect.

  • 2007/11/21 at 9:26 pm
    Permalink

    Nope, Michael has a point. Google intentionally keeps information transfer to a minimum to avoid gaming of its system. There are arguments both ways, as has been covered earlier in the thread. We don’t want a world in which crackers and web spammers end up manipulating a search engine like Google. Nor is it in the best interests of anyone that legitimate administrators be left completely in the dark as to what is going on. How to balance those two conflicting issues has not yet been satisfactorily addressed.

Comments are closed.