This post covers a couple of miscellaneous items I wanted to get out before writing the final post in this series.
“Static pages” and “RESTful development”
Mike Wurzer pointed out that my post on how IDX index fishing works is technically inaccurate. I asserted there that IDX index fishing requires the creation of static pages on the IDX site showing all the listing information. He explained that current web development methods focus on creating sites that are easily indexed, and that such sites tend to include links (what I have previously called ‘sitemaps’) that point to dynamic content. In this way, they get the benefits of having static, indexed pages, as I described them in Part IV, without actually storing static pages. He described this as “REST” (which stands for “representational state transfer”). I did a little (admittedly, a very little) reading about “RESTful” development on Wikipedia. I also visited a number of web sites that deliver database-driven content; RESTful techniques appear to be in wide use throughout the web.
Based on these facts, I’m prepared to revise the implications of my earlier comments. There I said that “IDX sites are not naturally prone to be indexed by web search engines” and that “some brokers with IDX sites are intentionally using the listings of other brokers in IDX to fish for indexing.” The implication is that brokers using these techniques are somehow departing from the norm. In fact, after Mike’s helpful comments and a little research of my own, this appears to be the norm in the design of sites driven by database content. I’d be curious if anyone can offer evidence to the contrary.
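To make the idea of crawlable links to dynamic content concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions, not a description of how any particular IDX vendor builds its site: the listing records, the `/listings/<id>` path pattern, and the function names are all hypothetical. The point is that each listing gets a stable address a crawler can follow, while the page itself is assembled from the database only when requested.

```python
# Hypothetical listing records standing in for an IDX database.
# The field names and values are assumptions made up for this sketch.
LISTINGS = {
    "1001": {"address": "12 Oak St", "price": 250000},
    "1002": {"address": "48 Elm Ave", "price": 315000},
}


def listing_url(listing_id):
    """Return a stable, crawlable path for one listing."""
    return f"/listings/{listing_id}"


def render_listing(path):
    """Build the page for a path on request; no static file is stored."""
    listing_id = path.rsplit("/", 1)[-1]
    record = LISTINGS.get(listing_id)
    if record is None:
        return None  # unknown listing, nothing to show
    return f"<h1>{record['address']}</h1><p>${record['price']:,}</p>"


def index_page():
    """Return a plain list of links that a search engine crawler can follow."""
    return "\n".join(
        f'<a href="{listing_url(i)}">{LISTINGS[i]["address"]}</a>'
        for i in sorted(LISTINGS)
    )
```

To a search engine, the page at `/listings/1001` looks no different from a static page, even though it exists only as a row in the database until someone asks for it.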
Users posing as Google robots
Victor Lund made a comment on an earlier post noting a blog post that purports to show how anyone can pose as Google and thus view a site’s indexed content, even if it’s behind a registration (which could have impacts on VOWs, if true). I reviewed the post, and it appears that the site has to allow this conduct. In other words, an IDX site that required registration before making certain content available, or a VOW, which is required to make visitors register and log in before showing them VOW content, would prevent the technique described there. (The comments to that blog post clarify this point.)
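As I understand it, the technique depends on a site deciding what to show based on the “User-Agent” label a visitor’s browser sends, which the visitor can set to anything. The following Python sketch shows why the site has to allow the conduct. It is an assumption-laden illustration, not the code of any real site: the function name and its parameters are hypothetical.

```python
def may_serve_listing(user_agent, session_is_registered, trust_user_agent):
    """Decide whether to show protected listing content to a visitor.

    user_agent: the User-Agent header the visitor sent (visitor-controlled).
    session_is_registered: True if the visitor has registered and logged in.
    trust_user_agent: True if the site chooses to wave through anything
        that calls itself Googlebot (the hypothetical permissive setup).
    """
    if session_is_registered:
        return True
    # A permissive site takes the visitor's word for who they are.
    if trust_user_agent and "Googlebot" in user_agent:
        return True
    return False


# A visitor can send this string from an ordinary browser.
SPOOFED = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
```

With `trust_user_agent` set to `True`, the spoofed visitor gets in; with it set to `False`, only a registered session does, which is the position a VOW is supposed to be in.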
In any event, with IDX index fishing, we’re talking about broker sites that want to be indexed, not ones that are avoiding it. Under the VOW policies, a VOW-operator may not allow the VOW to be indexed by web search engines.
More IDX index fishing may make it ineffective
Mike Wurzer pointed out on an earlier post that permitting IDX index fishing may actually lead to the method becoming less effective. This, as it turns out, may be a very important point.
Web search engines like Google apparently have “duplicate content filters” – these are designed to screen out web pages that contain content very similar or identical to pages on other web sites. The web search engines are trying to prevent what some call ‘search engine spam’ – dressing up the same content on multiple sites in order to increase search engine rankings or traffic.
I found several easy-to-understand summaries of duplicate content filters. (Here’s one, though I don’t claim it’s accurate.)
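I don’t know how Google’s filter actually works, and nothing here should be read as describing it. But a common textbook way to measure whether two pages are near-duplicates is to break each into overlapping runs of words (“shingles”) and compare the two sets. The Python sketch below does that; the function names, the three-word shingle size, and the sample listing text are my own assumptions for illustration.

```python
def shingles(text, k=3):
    """Return the set of overlapping k-word sequences in the text."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def similarity(text_a, text_b, k=3):
    """Jaccard similarity of the two shingle sets, from 0.0 to 1.0."""
    a, b = shingles(text_a, k), shingles(text_b, k)
    if not a or not b:
        return 0.0  # too little text to compare
    return len(a & b) / len(a | b)


# Made-up listing remarks as they might appear on two brokers' IDX sites.
site_one = "Charming three bedroom home with updated kitchen and large fenced yard"
site_two = site_one + " Call today"
unrelated = "Downtown loft near transit, with exposed brick walls"
```

Two IDX sites displaying the same listing remarks would score at or near 1.0 on a measure like this, even if each wraps the text in its own branding, while unrelated listings score near 0.0.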
The upshot: If IDX index fishing techniques appear on every broker IDX site, or even most of them, brokers may find that all their listing pages are being filtered as duplicate content by the likes of Google. Consequently, if IDX index fishing becomes widespread, it may also become largely ineffective. (Of course, I expect clever web designers and SEO contractors will look for ways around this ‘problem.’)
On the other hand, according to an article on the topic on Google’s Webmasters/Site owner help, it looks as though Google may try to ‘pick a winner’ among the many sites that have duplicate content. In that event, I’d be worried that one broker with clever SEO might effectively monopolize the top spot. Such an advantage is still likely to be temporary, though, as other brokers, SEO experts, and aggregators like Realtor.com will be spending time and money to overcome it.
One more copyright issue
In response to my post about the legality of web search engines indexing web sites, Rob Hahn posted a comment to which I want to respond. Here is an excerpt:
Something I’m curious about… is the difference between the Kelly case… and the case of listings. A listing, after all, is more or less a compilation of facts about a property. It isn’t a creative work, like an artistic photograph. Furthermore, what of the relationship between the seller and the listing agent? Supposedly, the listing agent is a fiduciary of the seller, whose home is the ultimate property at issue. Wouldn’t the seller hold the ultimate IP to descriptions, likeness, facts of the house, and any grant of license to the listing broker is dependent on the seller? It’s a tricky area, but how would you go about unravelling the original and base IP at issue — that of the seller? How does that impact the whole IDX/Search issue? I rather think there’s an impact here but maybe I’m overthinking it.
It’s important to distinguish the rights of the seller under the listing contract, as the owner of the property and perhaps principal in an agency relationship with the listing broker, from the rights of the listing broker/agent in the listing record under copyright law as author of creative content. Copyrights in the listing photographs and textual descriptions of the property belong to the human being who created them – usually, but certainly not always, the listing agent. (There are some exceptions – probably not important here.) And photos and descriptive text of the kind common in MLSs are both creative works from the perspective of copyright law. (So too is the “compilation” of the facts in the MLS database, but that’s a topic for another post.)
Generally, then, the seller would have intellectual property rights in the listing record only if the listing contract (or some other writing) transfers them to the seller from the listing broker/agent – or, of course, if the seller had created the content in the first place.
We could spend a long series of posts on copyright issues in the industry, but I think readership would drop dramatically in that event… Maybe I’ll see if I can summarize in a single post somewhere down the road.