FAIR Health products are based on a comprehensive, proprietary database of billions of healthcare charges:
- Records of 15 billion billed medical and dental services, continuously expanded and refreshed
- Data represent over 126 million insured plan participants and family members covered by health plans that participate in the FAIR Health data contribution program
- To protect privacy, all patient and other identifying information has been removed from claims
- Organized by procedure code, geographic area and date of service
- The United States is divided into 491 distinct “geozips” providing meaningful charge data at the local level
- Geozips are generally defined by the first three digits of a zip code – for example, zip codes 44301, 44302 and 44303 in the Akron, Ohio area are all part of geozip “443”
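The general three-digit rule can be illustrated with a short Python sketch. The function name `geozip_for` is hypothetical, and because geozips are only "generally" defined this way, the sketch does not cover any geozip that departs from the simple prefix rule.

```python
def geozip_for(zip_code: str) -> str:
    """Return the three-digit prefix of a five-digit zip code.

    Assumption: this is the general rule only; exceptions are not modeled.
    """
    digits = zip_code.strip()
    if len(digits) != 5 or not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"expected a five-digit zip code, got {zip_code!r}")
    return digits[:3]


# The Akron, Ohio example from the list above.
akron_zips = ["44301", "44302", "44303"]
geozips = {geozip_for(z) for z in akron_zips}
```

All three Akron zip codes collapse into the single geozip "443", so charges from those areas would be pooled when charge data are compiled at the local level.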
FAIR Health was established to bring fairness and transparency to health insurance information by shedding light on what was often described as the “black box” of out-of-network reimbursement. As part of its core mission, FAIR Health is dedicated not only to providing detailed, unbiased charge information, but also to revealing the methodologies used to collect, compile and array the data. To this end, FAIR Health consults several advisory groups to review the statistical methodologies employed to generate its products and validate data contributions before they are added to the database.
Methodology and Oversight
FAIR Health works with the Upstate Health Research Network (UHRN), a consortium of health researchers from leading universities in New York State and around the country, to recommend and review the statistical methodologies that underlie the database.
A national Scientific Advisory Board of prominent experts conducts a detailed review of the statistical procedures proposed by the UHRN. Before any change to FAIR Health methodologies is introduced, the Scientific Advisory Board must recommend the change to the FAIR Health Board of Directors, which must then formally approve it.
The methodologies employed in compiling the database and producing FAIR Health products are subject to continuous review. Revisions are based on recommendations from FAIR Health expert advisory panels. FAIR Health is committed to providing transparency into how the data in the database are collected and compiled. See “More about the Database” for detailed information on the statistical methodologies used to generate FAIR Health benchmark charge products.
Rigorous Quality Assurance Every Step of the Way
FAIR Health has established a thorough process to help ensure the integrity of the data in its database and products. Data are carefully validated before being accepted through the FAIR Health data contribution program. Each release of FAIR Health products is examined by in-house experts in statistics and technology and audited and validated through a comprehensive external review process.
More about the Database
As part of its commitment to bring fairness and transparency to health insurance information, FAIR Health provides detailed information about the validation techniques and statistical methodologies it employs to generate the products based on its database.
Data contributed through the FAIR Health data contribution program undergo a rigorous validation process before being accepted into the database. Each data submission is carefully screened, and data points that fail to meet quality standards are rejected. Grounds for rejection include:
- Invalid procedure codes – Codes that do not match those provided by the AMA, ADA, ASA or HCPCS
- Dates of service that are too old – Charges that do not fit within the rolling 12-month period for data used in product modules
- Invalid zip codes – Data associated with erroneous zip codes or zip codes that do not correspond to provider offices or locations where services are performed (e.g., special zip codes that correspond to a single government building)
- Outliers – To distinguish erroneous charges from those that are true outliers, FAIR Health employs a Median Absolute Deviation (MAD) algorithm to identify and eliminate invalid charges. This is a standard statistical method for data that are not normally distributed, that is, data that do not follow a symmetric bell curve. A detailed explanation of the MAD algorithm can be found in the Resource Center.
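To make the outlier step concrete, here is a minimal Python sketch of a generic MAD-based filter. It is not FAIR Health's published algorithm: the function name `mad_filter`, the 0.6745 scaling constant and the 3.5 cutoff are assumptions taken from the common "modified z-score" convention, and the charge amounts are invented for illustration.

```python
from statistics import median


def mad_filter(charges, threshold=3.5):
    """Split charges into (kept, rejected) using a MAD-based modified z-score.

    Assumptions: 0.6745 and the 3.5 threshold follow a widely used
    convention; they are not parameters stated in the source text.
    """
    med = median(charges)
    # MAD: the median of the absolute deviations from the median.
    mad = median(abs(c - med) for c in charges)
    if mad == 0:
        # At least half the charges equal the median; the score is undefined.
        return list(charges), []
    kept, rejected = [], []
    for c in charges:
        score = 0.6745 * abs(c - med) / mad
        (kept if score <= threshold else rejected).append(c)
    return kept, rejected


# Hypothetical charges for one procedure, with one likely keying error.
charges = [100, 105, 110, 115, 120, 125, 10000]
kept, rejected = mad_filter(charges)
```

Because both the center and the spread are medians, the single extreme value barely shifts them, which is why a MAD-based rule suits skewed charge data better than a rule built on the mean and standard deviation.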
In addition to data validation, FAIR Health employs statistical methodologies to compile and array data within its products:
- Blended Methodology – FAIR Health products in the FH Benchmark series generally reflect actual charge data. The FH RV Benchmark product modules (and some of the FH Benchmark series modules) utilize a blended methodology that takes actual data as a starting point and applies statistical techniques including relative values and conversion factors to derive benchmark charge amounts. This methodology extrapolates values for low occurrence and new procedures in addition to providing values for common procedure codes.
- Relative Value Scale – FAIR Health uses relative values to produce the FH RV Benchmark modules as well as to populate some of the FH Benchmark product series. Relative values take into account the work performed by the provider (time, intensity, level of skill and training required); the expenses related to running the practice, including rent, equipment, supplies and the cost of non-physician staff; and the cost of professional liability coverage.
- Data Array – In the production of FAIR Health product modules, charge data are grouped into geozip/procedure code combinations (sometimes called “cells”) and arrayed into percentiles based on where the charges fall along the range of charges. For example, 80% of all charges for a given geozip/procedure combination are equal to or lower than the amount displayed in the 80th percentile.
- Low Frequency Data (Small Cell) – For some procedure/geozip combinations, there may be few occurrences of data. That may occur because the geozip is a rural area with few providers offering a particular service or because a procedure is performed rarely. In these instances, there may not be enough data to provide statistically reliable information. These cells are called “small cells” and a small cell methodology is employed to “fill” the cell with statistically relevant data.
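The data array and small cell ideas above can be sketched in Python as follows. This is an illustration under stated assumptions, not FAIR Health's method: the nearest-rank percentile definition, the function names and the `min_count` cutoff of 9 are all hypothetical, the records are invented, and the sketch only flags small cells rather than attempting to "fill" them.

```python
from collections import defaultdict


def nearest_rank(sorted_values, p):
    """Smallest value such that at least p percent of values are <= it."""
    n = len(sorted_values)
    rank = max(1, (p * n + 99) // 100)  # integer ceiling of p * n / 100
    return sorted_values[rank - 1]


def array_cells(records, percentiles=(50, 80, 90), min_count=9):
    """Group (geozip, procedure_code, charge) records into cells.

    Assumption: min_count is a made-up cutoff for flagging small cells.
    """
    cells = defaultdict(list)
    for geozip, code, charge in records:
        cells[(geozip, code)].append(charge)
    result = {}
    for key, charges in cells.items():
        charges.sort()
        result[key] = {
            "count": len(charges),
            "small_cell": len(charges) < min_count,
            "percentiles": {p: nearest_rank(charges, p) for p in percentiles},
        }
    return result


# Hypothetical data: ten charges in one cell, two in another.
records = [("443", "99213", 100 + 10 * i) for i in range(10)]
records += [("443", "99499", 200), ("443", "99499", 300)]
table = array_cells(records)
```

In the ten-charge cell, the 80th percentile is 170, and eight of the ten charges are at or below that amount. The two-charge cell still produces numbers, but they are not statistically reliable, which is the situation the small cell methodology is meant to address.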