In a post over at Ycombinator, the one and only Matt Cutts offered a short but definitive answer to the question about sites built on Amazon’s Web Services. The poster states, “It’s still not hard to slam some AWS-related keywords into Google and get these bogus results, though.” And Matt’s response? “Someone else already reported this. There’s been some weird stuff going on with AWS-type pages, e.g. see http://news.ycombinator.com/item?id=2103401 for example. I don’t know the exact cause, but I know the indexing team is aware of this issue and working on it.”
What exactly this means, and when it will happen, I don’t know, but there is no doubt that Google has taken notice and is in the process of taking a harder stance on Amazon clones. The business model of many a mini-site developer is about to change. We saw Epik lose most of their sites to de-indexing, and many of them are still not in Google’s index. I think it was a taste of things to come for the quick site.
On another note, Matt also hinted at the possibility that Google would soon allow users to remove a domain name from their personal search results. That trend could let Google track users’ blacklists to build its own database of dislikes. I’m not sure I like the idea, and this comment on the site summed it up well:
It depends on what Google does with the data. Since you have to be logged into your Google account to do this, my guess is the Google team will use these blacklists to gather data about commonly reported spam sites. If this is the case, then the potential for abuse is VERY HIGH. I can hear my old boss now – all you need to do is hire a team in the Philippines, India, Russia and a few others – disguise their IPs and then let them go… (he was talking about a different situation, but the same thing could work here) the work would be done by real people, in the thousands, over what would look like natural trends… and for not a lot of cost (we are talking comparatively here)
So again depends on what Google does with the data, but I can already see the spammers in the highly competitive markets strategizing on how Google might use this data and how they could easily scam it – because anything that uses user input is soooooooooo easy to manipulate – just ask Twitter.
I think he is just saying that some spammer is ranking well for ‘aws’ keywords, not that sites built on AWS are the issue. I think he is talking about a specific case …
“An enterprising spammer has won huge on Amazon Web Services (AWS)-related keywords”
He has also given the Google Search Query for this case…
That was the point of that particular Ycombinator thread, but there have been a lot of threads about the trouble with Google recently. I gathered that the problems fell into a few main categories, including:
1) Sites that scrape content – both experts-exchange and e-freedom are examples of this. Virtually 100% of the content on those sites is data downloaded from Stack Overflow. As far as I know it’s not strictly illegal (as Stack’s license allows for public use), but it definitely annoys a lot of people who expect others to act with integrity.
2) “Content farms” – sites like eHow, wikiHow, Mahalo – based on creating huge amounts of very low-quality content in order to boost their traffic. Often crowdsourced – Calacanis, for example, pays people in “mahalo bucks” for the articles they write; I think eHow and others *might* have similar profit sharing…
3) “Review sites” that exist as little more than shells for affiliate offers, or wrappers for AdWords – throw a handful of EzineArticles up and watch the AdSense roll in.
I don’t know that an Amazon store is necessarily a problem. People searching for “triathlon bikes” would probably consider it a bit of a service to find a well-laid-out list of exactly what they were looking for.
Shane – editorial note. Only a minority of sites were de-indexed across the network. Certain portfolio holders were more impacted than others, depending on the SEO practices used for individual portfolios.
As discussed on the Epik blog, and reviewed during a webcast with owners earlier this month: (1) indexing is improving, and (2) we have an aggressive innovation program for making product portals more effective.
Comprehensive development works. This much we know for certain.
Mass development also works, but does require some amount of engagement, either from users or owners. Epik is out in front in addressing both types of engagement.