Proposed CMS data changes risk major research fallout
There is still time to find a better way to keep data secure
The Centers for Medicare and Medicaid Services (CMS) recently announced a plan to change data access protocols and fees for researchers using Medicare claims data, reportedly due to data security concerns. In practice, the changes announced would make data access more expensive and eliminate institutional investments in data infrastructure made under the current access regime. This would reduce the volume and quality of healthcare research, which would limit information available for policymakers. Instead of proceeding with the announced changes, CMS should work collaboratively with stakeholders to identify a less disruptive approach to ensuring researchers can access data while maintaining security.
CMS plans to restrict data access further and raise fees
Medical billing data are sensitive, and access has always been highly controlled. Under the current system, universities and research institutions purchase individual data files and can reuse them for subsequent projects at low cost. Data purchasers are responsible for setting up their own secure environments in which to store and analyze the data. In addition to their own costs for establishing this infrastructure, institutions pay CMS for access to the data. The costs range from over $100,000 for purchasing all of the data generated in a given year, down to zero for graduate students re-using data their institution has already acquired.
CMS announced two major changes to the data access regime. First, researchers would be required to access data through servers controlled by CMS in a “virtual research data center” (VRDC). With this setup, the past investments in secure data systems would no longer be usable. Instead, all research would occur virtually in the VRDC, relying on CMS’ likely inadequate computing resources, with software insufficient for cutting-edge research. This would make some of the best projects entirely infeasible, and delay others substantially.
Second, the fees CMS would charge in this new regime are on a different scale. A project that might have cost as little as $2,000 (or zero for a student) would easily cost hundreds of thousands of dollars in the VRDC. Across a whole institution hosting multiple research projects, costs for acquiring and re-using data were previously on the scale of hundreds of thousands of dollars. Under the new regime, they could reach millions or even tens of millions.
CMS cites “growing data security concerns and an increase in data breaches across the healthcare ecosystem” in announcing its changes. It does not discuss alternative ways of improving security while protecting research and institutions’ existing investments. To our knowledge, CMS did not seek input from data users before announcing this plan, but many have offered comments and suggestions since the announcement, and engaging with their perspectives could still help balance security and privacy against usability. CMS took a step in that direction last Friday by requesting input on the structure of fees and the timing of its proposed changes.
Research using these data has been valuable
The research in question tackles critical problems for CMS programs and for patients. To do this, researchers rely on hard data about the programs and their patients. Using Medicare data, researchers have shown that long-term care hospitals waste $4.6 billion per year with no detectable patient benefit. They have shown that durable medical equipment could be purchased for 42% less, but that cutting prices in the wrong way leads to equipment shortages. Detailed patient-level data from both Medicare and Medicaid allowed researchers to measure how payment policies create disparities in access to primary care. One of our colleagues combined Medicare patient data with audit records obtained through FOIA to find that hospital audits have an extraordinary financial return for taxpayers. Others have measured the quality of hospitals and of insurance plans, and the effectiveness of quality improvement programs.
The impacts of this research on healthcare policy, and real patients, are profound. Research with CMS claims data laid the intellectual foundation for Accountable Care Organizations and other types of performance-based payment models that are now central to health policy.
Planned changes would have devastating consequences for research and researchers, especially students
Researchers in less-well-funded institutions, and especially graduate students and early-career scholars, will be most affected by these changes. Under the current system, an institution could purchase data for one project, then other researchers could reuse the data at minimal cost—and, for graduate students, at no cost.
This facilitated innovative work and experimentation with new methods and ideas, contributing to the creativity and success of recent research programs in health economics. Critically, the current system charges no new fees when additional researchers access the data on an existing project. Once a project is approved, new researchers can join by signing confidentiality agreements and completing the appropriate training. This encourages student participation, extra quality checks, and creative new approaches.
A small research project might currently require only a $2,000 “data reuse fee.” Under the new system, the same project’s costs would generally reach into the six figures. Even those who have enough resources to make this viable will be unlikely to embark on creative projects or those with less certain outcomes. The long duration of such projects (with fees multiplying each year), combined with their low success rate, would discourage researchers from trying new ideas. Instead, the proposed cost regime favors the easiest and most predictable projects, affordable only by the most established researchers. This would stifle the creativity that has led to influential and important healthcare research in recent decades.
High fees for re-use, and for adding researchers to a project, would make it difficult for graduate students to use CMS data, cutting off the pipeline of researchers with expertise in this critical area. Most importantly, if students shy away from studying these programs, we would have fewer creative early-career researchers detecting Medicare fraud, measuring the impacts of pre-authorization in Medicare Part D, and coming up with ideas to save money or improve patient care.
A fee schedule that more closely matches CMS’s true cost structure could address some of these concerns. For example, having additional researchers join a project imposes negligible additional cost on CMS and so should be priced accordingly. Additional data storage fees of $1,500 per terabyte per year are hard to justify when buying such storage from Best Buy costs only 2% of what CMS charges. CMS should consider lower fees for graduate students and early-career researchers, in light of these groups’ reduced access to resources and high potential for future contributions.
Changes would also undermine institutional investments in analyzing claims data
Many universities and research institutions have made significant investments in using CMS data under the current access regime, in which re-use is relatively affordable. A comprehensive dataset would have cost millions of dollars over the years, on top of the investments institutions make in their own computing infrastructure, compliance, and researcher training. Having made these investments, researchers continue to use them repeatedly and effectively. Much of this infrastructure also reflects taxpayer investments, having been financed in part by grants from the National Institutes of Health and other agencies. The proposed changes would wipe out these currently productive investments. Combining these costs across projects, many institutions would face costs ten to hundreds of times higher for accessing the data that they thought they had already purchased.
Revoking access to previously purchased data undermines trust in the agency. Institutions, researchers, and funders made these purchases under the premise of availability for future re-use. CMS should consider past commitments, whether implicit or explicit, and do its best to honor them while ensuring data security. Recognizing the investments scholars have made in ongoing research projects, which can last multiple years, CMS should allow a multi-year transition period in which existing data users can maintain their present systems and methods for accessing data (perhaps with additional security protocols).
CMS should collaboratively seek a less disruptive way to safeguard data
There is always tension between the competing values of keeping data secure and using data to generate useful insights, and views may differ on how to strike the appropriate balance. Agencies rarely address this tradeoff explicitly, but changes in access conditions and fees implicate it directly. In practice, the proposed changes may improve data security, but in an extremely blunt way that would have devastating consequences for data utility: by making access costs and conditions so onerous that the number of data users shrinks dramatically.
There are likely better ways for CMS to strike the balance between data security and research. For instance, if some institutions are not following sufficient security protocols, CMS could monitor those institutions’ systems and policies more carefully. If a move to centralized access is ultimately necessary, CMS should delay it and extend the transition window, giving researchers and institutions time to adjust with minimal disruption to ongoing work. CMS could work with the Census Bureau to add data to the existing Census network of Federal Statistical Research Data Centers throughout the country (including the Secure Remote Research Environment), which have a strong track record of enabling high-quality research that informs policy. This approach would benefit from the Census Bureau’s existing infrastructure and its long experience providing secure access, including to health data from other parts of HHS and from CMS itself, while partnering with researchers and other agencies. Finally, CMS should consider the implications of the high fees it charges for data access. These costs have implications for equity in access and for the prospect of future healthcare research.
Research is an opportunity, and researchers are a resource
Policymakers in many fields should envy what those in health policy currently have available: a large amount of relevant data, a community of researchers ready and willing to analyze them, and an established infrastructure for connecting those two things and generating useful insights. All agencies that hold data should be aware that ecosystems of this kind are incredibly valuable, try to build them where they do not yet exist, and make every effort to maintain them where they do. It can be difficult to resist limiting access in the face of concerns about privacy or security, but taking that step unilaterally is shortsighted. Refusing to let anyone use existing data may help with security, but will hamstring learning. Engagement with researchers can help maximize the value of an agency’s data assets for informing policy.
Other agencies have recently made improvements to planned privacy-related policy changes following engagement with data users. In early 2022, the Census Bureau announced changes to several privacy protection procedures used to produce the public-use microdata files for the Current Population Survey. The announced changes would have significantly degraded multiple important features of the data and prompted strong reactions from data users, who had not previously had an opportunity to provide input. In response, the Census Bureau promptly revised its plan, largely preserving the most important data capabilities that had been threatened by the originally announced changes while still enhancing privacy protections. CMS should be similarly open to changing course in response to feedback from its data users.