There are a few issues to be cautious about when using HPI (Home Price Index) data. Various data sources provide a variety of products that use the fundamental principles of benchmark pricing. I came across a couple of examples today that could produce, in a best-case scenario, some puzzling results. Although this post is generally geared towards other appraisers, realtors or those who are already very familiar with the housing market and the available data analysis tools, it may provide some insight for those interested in understanding the HPI, or benchmark prices as they may be referred to in publications and media.
I have previously found anomalies in our MLS system's HPI data, brought them to the MLS association's attention, and had them corrected, so I suspect that what I am presenting today is, in part, some type of data anomaly. There can also be some significant differences in the published data for 1 story vs 2 story homes. There will be obvious differences for other types such as apartments or townhouses, but those are generally expected and those types are not often considered as possible comparable sales. The HPI database will attempt to divide the numerous types of detached single family designs into either 1 story or 2 story, which can also present some challenges depending on the types of homes being contemplated.
In some markets, buyers or analysts may never need or want to cross over the analysis of a 1 story home with a 2 story, but what about a Split Level and a 1.5 story? The HPI will segregate the data into one or the other. What about a 1 story with a full walkout basement versus a 2 story? In many cases, a buyer may consider either, and they may actually be in a similar price bracket depending on the utility and appeal provided by the basement level. I understand why the creators of the HPI data model would need to combine data. In many markets there just is not enough data to provide consistent trends for each design style. Most markets (whether that be a regional market, a neighbourhood, a building type, etc.) tend to rise and fall together or, in other words, trend together. Compiling types together to create more consistent trends is a perfectly acceptable solution when the data is limited. How that is applied in the real world, however, could have some significant impacts on the results or conclusions applied to a subject or comparable of interest.
Many MLS systems publish tables of this data while others publish interactive graphics like ours. It doesn't make a big difference, except that if we were not analyzing our data graphically, issues buried within a table could go unnoticed or become unwittingly embedded in the user's analysis or conclusions. Differences that aren't issues per se could also be less obvious while still being a perplexing cause for concern. The next image is the HPI showing almost a 20% difference between Dec 2022 (1 story) and Jan 2023 (2 story). I did choose those dates because they are an extreme example, but it is a completely possible scenario which may cause issues or confusion if all of the parameters are not fully considered.


The next image is where I suspect there is some type of error or anomaly in another data set, which is significantly skewing all of one Major Market Area and its Submarket Areas. The issue is most notable in Jan 2023 but also appears in Nov 2022. These are just the Single Family figures, which are the average of 1 story and 2 story. The disparities were even greater when considering the different styles, but the display was difficult to read graphically, so I chose the combined set for this example. The red, blue and brown trend lines (or the bottom three noted in Jan 2023) show somewhat more moderated trends. Coincidentally or not, they are generally larger markets with more data. I would find it very unlikely that Benchmark Prices would shift that much (close to 10%) over the 2 month period from Nov 2022 to Jan 2023 for the NW Salmon Arm area alone, and particularly not for the entire Shuswap Revelstoke region.

It is interesting to note that the entire region of Shuswap Revelstoke and its sub-neighbourhoods are all trending similarly to each other and contrary to all of the other major areas. The data analyst needs to make a decision about this apparent anomaly, and having alternate data sources that they can explain and rely on can be critical.
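For those who work from the tables rather than the graphics, the same kind of check can be roughed out in a few lines of Python. This is only a minimal sketch of one possible approach, not how the MLS system or the HPI model works: it compares each area's percentage change over a period against the median change of the other areas and flags anything that diverges by more than a chosen threshold. The area names, the prices, the function names and the 5 percentage point threshold are all hypothetical and purely for illustration.

```python
from statistics import median

def pct_change(old_price, new_price):
    """Percentage change from old_price to new_price."""
    return (new_price - old_price) / old_price * 100.0

def flag_divergent_areas(benchmarks, start, end, threshold_pct=5.0):
    """Flag areas whose change from start to end differs from the median
    change of the other areas by more than threshold_pct percentage points.

    benchmarks: {area name: {period label: benchmark price}}
    Returns (changes, flagged), where flagged maps each flagged area to
    its gap from the median of the other areas.
    """
    changes = {
        area: pct_change(series[start], series[end])
        for area, series in benchmarks.items()
    }
    flagged = {}
    for area, change in changes.items():
        others = [c for a, c in changes.items() if a != area]
        gap = change - median(others)
        if abs(gap) > threshold_pct:
            flagged[area] = gap
    return changes, flagged

# Hypothetical benchmark prices, NOT actual HPI figures.
benchmarks = {
    "Area A": {"2022-11": 800_000, "2023-01": 790_000},
    "Area B": {"2022-11": 650_000, "2023-01": 640_000},
    "Area C": {"2022-11": 700_000, "2023-01": 693_000},
    "Area D": {"2022-11": 600_000, "2023-01": 654_000},
}

changes, flagged = flag_divergent_areas(benchmarks, "2022-11", "2023-01")
for area, change in changes.items():
    note = "  <-- review" if area in flagged else ""
    print(f"{area}: {change:+.1f}%{note}")
```

A flag like this does not say the data is wrong, only that one area is moving differently from the rest over the period and deserves a second look against other data sources and actual market observations.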
Just to show that these figures are not written in stone and are subject to errors or issues, the next image is one I captured in Feb 2022, when I noticed a significant drop in the data for the September 2021 period that, from my observations, had no real explanation in the actual market. There was no significant change in the market that should have caused that drastic a change for all of those markets together. I brought that to the attention of the MLS association and, after researching the issue, they determined there was a problem that they would resolve. By my timelines, that anomaly existed in the data for about 8 to 9 months before it was resolved.

I just looked that up again today and it appears they did make some updates, but I was never informed as to what caused the problem.
