Deborah Madey Broker

Peninsula Realty Group (732) 530-6350

Does NAR think Google is an Enemy? Making Sense of IDX Policy

THE RULE {in the Enhanced IDX (Internet Data Exchange) Policy} …… Participants must protect IDX information from misappropriation by employing reasonable efforts to monitor and prevent “scraping” or other unauthorized accessing, reproduction or use of the MLS database……

In order for Google to provide search engine results pages (SERPs), Google must “crawl” websites and “index” the information it finds. Google does this for billions of web pages. When the above rule was adopted in 2005, IDX data was delivered to Participants (REALTORS®) only in “framed” solutions. In these “framed” solutions, Google is unable to “crawl” and “index” property information. Technology has evolved, and IDX Solution Providers can now deliver “crawlable” and “indexable” compliant previews of MLS data. It is this “crawling” and “indexing” of the actual MLS data that has changed the landscape. Google can now see and deliver search results based upon IDX data such as MLS numbers and street addresses. This creates a need for the above rule to be clarified or amended in order to provide clear guidelines.

UNDERSTANDING & DISSECTING THE RULE:

When Google crawls and indexes IDX Participant websites, is this a violation of the above rule?

Is this a misappropriation of the IDX information?
Is “indexing” the same as “scraping”?
What is scraping?
Is this an unauthorized access, or use of the MLS database?
What was the original purpose of the rule? Does that need continue to exist?

FURTHER QUESTIONS TO EXPLORE:

Is indexing good for buyers? for sellers?
Do IDX Participants (REALTORS®) favor or oppose indexed MLS data? Why?
Who benefits if IDX data is indexed?
Who opposes the indexing of IDX data?
Is the prohibition of indexed MLS data a restraint of trade?

SHOULD THE RULE BE:

Left in place and clarified?
Amended to reflect changes?
Abolished?

Wherever the term “Google” appears, it shall mean any search engine, including, but not limited to, Yahoo.

THE STORY OF PAULA HENRY AND JAY THOMPSON

Paula Henry, a REALTOR® with Red Door Real Estate, had her website reported for allowing the MLS data on her website to be indexed. The Metropolitan Indianapolis Board of REALTORS® (MIBOR) issued a cease and desist order to Paula and her broker for both of their sites. If they failed to comply, their feeds would be cut. The entire decision rested upon the fact that the MLS data on Paula’s website could be indexed. The information displayed was compliant with all IDX rules. Paula posted her story to a blog on Agent Genius, and the subject gathered instant attention with hundreds of comments. Cliff Niersbach, Hilary Marsh, and Todd Carpenter from NAR became engaged online and through emails. This led to NAR’s invitation to Paula Henry and Jay Thompson to address the Multiple Listing Issues & Policies Forum & Committee at NAR Midyear. Jay Thompson is the Broker-Owner of Thompsons Realty in Arizona and is well versed in data feeds, blogs, and IDX rules.

Paula Henry’s site: hometoindy.com
Jay Thompson’s site: http://www.thompsonsrealty.com/

UNDERSTANDING & DISSECTING THE RULE

When Google crawls and indexes IDX Participant websites, is this a violation of the above rule?

What is scraping? What is indexing?
Is “indexing” the same as “scraping”?
Is this a misappropriation of the IDX information?
Is this an unauthorized access, or use of the MLS database?
What was the original purpose of the rule? Is the rule relevant today?
So, what is the problem?

SCRAPING is often viewed as controversial. Scraping is a technique for extracting data from a website through the use of a bot (software robot). The extracted data is stored and often studied, manipulated, or reformatted. Scraping has been challenged as illegal, both successfully and unsuccessfully, and it is frequently a violation of a website’s Terms of Service (TOS). Not all scraping is malicious, however. Comparison shopping sites, for example, scrape pricing data from multiple sources and display it on one website.

INDEXING a website may be compared to making an index of words and placing this list of words in the back of a book. Google “indexes” for the purpose of being able to bring up the most relevant pages for the user, similar to a giant Dewey Decimal System. Data is cached for quick retrieval only. Data is not manipulated.
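The book-index comparison above can be sketched in a few lines of Python. This is only a toy illustration of the general idea, not how Google or any real search engine is built. The page URLs, addresses, and MLS numbers below are invented for the example.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each lowercase word to the set of page URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical listing pages (made up for illustration only).
pages = {
    "https://example.com/listing/1": "123 Main Street, Red Bank. MLS 1000001. Three bedroom colonial.",
    "https://example.com/listing/2": "45 Oak Avenue, Red Bank. MLS 1000002. Two bedroom ranch.",
}

index = build_index(pages)
print(search(index, "123 Main Street"))  # pages matching a street address
print(search(index, "1000002"))          # pages matching an MLS number
```

The index stores only which words appear on which pages. A search for a street address or an MLS number points the user back to the original page, which is the behavior the book-index comparison describes.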

INDEXING is not SCRAPING. While a few individuals may define scraping and indexing as the same, the overwhelming majority will say they are not. Both are processes of gathering information from websites, and both use bots (software robots) to crawl websites, but the actions taken with the data bear no similarity. While not all scraping is ill-conceived, it is often presumed to be bad, while indexing is considered to be good.

MISAPPROPRIATION? UNAUTHORIZED ACCESS? If indexing is good and scraping is bad, but both use a bot (software robot) to access websites, is Google’s access of websites through the use of a bot a misappropriation or unauthorized access?

WHAT WAS THE ORIGINAL PURPOSE? It appears that the rule was written to protect the IDX feed from malicious offenders.

IS THE RULE RELEVANT TODAY? The need to protect data from manipulation remains, but this rule fails to provide that protection.

SO WHAT IS THE PROBLEM? The current rule not only fails to protect the MLS database from malicious manipulation; its ambiguity also leads to inconsistent enforcement decisions, and its enforcement can be unduly restrictive against Participants (REALTORS®).

Comments from Cliff Niersbach, NAR’s Vice President of Board Policy & Programs:

“The Center for REALTOR® Technology (”CRT”) advised that while the intent of “scrapers” may be malicious, and the intent of “indexers” good, the two practices from the Web server’s view appear to be the same.”

“It should be understood that the focus of the rule in question has never been on blocking indexing by search engines. That potential effect was not contemplated when the rule was adopted by the NAR Board of Directors in 2005 and only came to light in the past few weeks. Simply put, the issue is whether – and how – indexing by search engines can be accommodated while at the same time clearly and objectively distinguishing that functionality from the scraping the IDX policy prohibits to protect MLS databases from misuse and misappropriation.”
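The CRT observation quoted above, that scrapers and indexers look alike from the Web server’s side, can be illustrated with a site’s robots.txt file, the conventional way a website tells bots what they may crawl. The sketch below uses Python’s standard `urllib.robotparser` module. The robots.txt contents, the bot name “SomeOtherBot”, and the example.com URL are assumptions made up for this illustration and do not come from any MLS or IDX vendor.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: welcome Googlebot, keep other bots out of listings.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /listings/
"""


def make_parser(robots_txt):
    """Build a parser from robots.txt text without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser, user_agent, url):
    """Return True if the rules allow this self-declared user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = make_parser(ROBOTS_TXT)
listing_url = "https://example.com/listings/1000001"  # made-up listing page

googlebot_allowed = may_fetch(parser, "Googlebot", listing_url)
other_bot_allowed = may_fetch(parser, "SomeOtherBot", listing_url)

print("Googlebot allowed:", googlebot_allowed)
print("SomeOtherBot allowed:", other_bot_allowed)
```

Two limits are worth noting. The bot’s name is self-declared, so a scraper can simply claim to be Googlebot, and robots.txt is a voluntary convention that a malicious bot is free to ignore. A rule file like this can therefore express which bots a site intends to welcome, but it cannot by itself tell a well-behaved indexer from a scraper.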

FURTHER QUESTIONS TO EXPLORE

Is indexing good for buyers? for sellers?
Do IDX Participants (REALTORS®) favor or oppose indexed MLS data? Why?
Who benefits if IDX data is indexed?
Who benefits if IDX data is not indexed?
Who opposes the indexing of IDX data?
What about Third Party Aggregators? (3PA)
What about Realtor.com?
Is the prohibition of indexed MLS data a restraint of trade?

BUYERS search for real estate through search engines. Buyers enter addresses, keywords, and occasionally even MLS numbers. Buyers believe that Google will direct them to the site that can best provide details in response to the search terms they entered. Indexed MLS data helps buyers find their targeted data faster. Buyers also find indexed data on the sites of Third Party Aggregators (3PAs). Indexed data, when it leads to an accurate detail page, is good for buyers.

SELLERS generally want the most expansive exposure possible for their properties. Sellers want buyers to find their properties. Indexed data increases exposure for the seller and makes the property easier for the buyer to locate. Asked whether they want their properties to appear on a Google search engine results page (SERP), sellers would overwhelmingly say yes.

REALTORS® are divided in their positions on indexed IDX data. Many REALTORS® do not have websites or IDX solutions. REALTORS® who do have websites and IDX solutions may not understand indexing, SERPs, crawling, or scraping. Consequently, there are REALTORS® who are unable to voice an opinion because they lack a clear understanding.

REALTOR® SUPPORT The most progressive REALTORS® who maintain websites and utilize technology to meet client expectations strongly support query compliant previews of IDX MLS data on REALTOR® websites, including identification of the Broker or Agent as required.  REALTORS® who support indexed IDX data favor transparency in real estate, synergistic with the goals of consumers.

REALTOR® OPPOSITION REALTORS® who oppose indexed IDX data fear loss of control over their listings and possible leads. This preference may be in conflict with the best interest of the seller.

THIRD PARTY AGGREGATORS (3PAs) Third Party Aggregators invest heavily to achieve high placement on search engine results pages (SERPs). Not only do they allow their data to be indexed, they concentrate on search engine optimization with a high degree of emphasis on listing data, MLS numbers, and street addresses. These 3PAs invest aggressively in these results because they are considered the key to attracting consumers. This fact supports the value of accurate and compliant IDX data that is indexed. Many 3PAs gather their listings from Brokers. Some MLSs do feed listings to 3PAs; examples include ListHub and Trulia. If an MLS provides a data feed to a 3PA that allows Google to index its site, how and why would a rule exist that would suggest an IDX Participant be prohibited from presenting IDX data on their site that can be indexed? In both instances, the source of the data is the MLS. In both instances, the data appears on a website, and it can be seen, read, and indexed by Google.

REALTOR.COM is provided MLS data feeds from across the country, and Realtor.com allows Google to index MLS numbers, street addresses, etc. Whether the data feeds received by Realtor.com are categorized as IDX feeds or defined under different terms, the end result is that MLSs provide advantages to Realtor.com and deny the same to dues-paying members. A quote from REALTOR® DR: “On a side-note: the fact that Realtor.com uses the same data that brokers use, from the same source (the MLS of the local association), but are not required to adhere to the same rules, is UNCONSCIONABLE. It is completely indefensible. The fact that our National Association controls the use of its data by its paying members, but not by a private company, is a disgrace.”

WHAT ABOUT THE BAD GUYS? The purpose of the rule was to protect the database from falling into hands which may manipulate, convert, alter content, copy, distribute or otherwise misappropriate data. The data already exists on the web in a format which can be scraped. The rule no longer provides that protection.

AN EXAMPLE OF SCRAPING The following site was created solely to illustrate how quickly and easily MLS data could be scraped from Realtor.com: http://www.retechulous.com/category/nar-blows/

RESTRAINT OF TRADE? A rule which restricts IDX Participants (REALTORS®) from utilizing indexable data feeds restricts only the dues-paying members. Those who wish to scrape already can and do. A Participant (REALTOR®), as well as a seller, might deem such a rule an excessive restraint if denied the use of indexable MLS data.

WHERE DO WE GO FROM HERE? Prohibiting an IDX Participant (REALTOR®) from placing indexable MLS data on their website will not lock the door against scraping. The barn door is already open and the horses are all out. The task is managing the data that is out there and promoting to the public the value of obtaining accurate information from REALTORS®. The rule needs to be modified to reflect the current use of technology, along with the benefits and challenges it presents.

Read the Agent Genius Blog that gathered 300+ comments where Paula Henry first posted her blog on this subject: http://tinyurl.com/o883yk

To reach Paula Henry directly: Red Door Real Estate, C: 317.605.4174, Paula {at} HometoIndy.com

To reach Deborah Madey directly: Peninsula Realty Group, Deborah (at) PeninsulaFirst.com, 732.530.6350