Tax Notes

The Virtues of a Simple Excise Tax on Personal Consumer Data

Posted on Dec. 12, 2022
Robert D. Plattner

Robert D. Plattner is a former deputy commissioner for tax policy in the New York State Department of Taxation and Finance. He now serves as a senior adviser to New York Senate Finance Committee Chair Liz Krueger (D) and consults on state and local tax policy and New York state tax matters.

In this article, Plattner makes the case for taxing corporations that profit from collecting, analyzing, and using large quantities of personal consumer data.

Big data is not the equivalent of Big Oil. The latter is ExxonMobil, Chevron, Shell, and a handful of other major players in the fossil fuel extraction business. “Big” refers to the companies — their huge physical and economic presence around the globe and the oversized political influence they wield. In contrast, big data is just what it says it is, extremely large data sets, which are often analyzed computationally to reveal patterns, trends, and associations, especially related to human behavior and interactions. Collecting, storing, and analyzing big data is inexpensive nowadays, and analyzed data has proven valuable in numerous business contexts. As a result, collecting and analyzing big data are now standard procedures in much of the business world.

The data sets are big; the businesses amassing that data need not be. That said, the most visible and successful collectors, analyzers, and users of big data also happen to be among today’s largest, wealthiest corporations. These companies, often with websites visited by hundreds of millions, even billions of individual users, have mastered the art of amassing staggering quantities of personal consumer data and spinning it into gold.

I. New York’s Personal Consumer Data Excise Tax

New York Senate bill S.4959 has its origin in the early, dark days of New York’s COVID-19 crisis. Those days were made darker still by the prospect of a huge state budget deficit, the consequence of an economy slowed to a crawl almost overnight. Just one example — all of Broadway was shut down with little more than two hours’ notice, including the much-awaited opening performance of Six, leaving bewildered actors to figure out what next. The outlook was dire, but not every business sector was suffering, and some businesses, including the biggest players in the digital economy, were thriving.

Conceived in this environment, S.4959 had two broad objectives. The first was to bring in substantial new revenue from industries that were thriving despite the pandemic while sparing businesses that were struggling to survive.1 This objective was reinforced by the widely held perception that corporate income taxes paid by the handful of companies that had achieved superstar status did not adequately reflect their remarkable growth in wealth and market power. The second goal was to address widespread sentiment that existed before COVID that New York consumers were not profiting as they ought to from the lucrative use of their personal data.2 “Free” access by the consumer to a company’s website, which costs the company next to nothing to provide, was not a fair trade.3 This perceived injustice was heightened by concerns that consumers were often unaware of how much of their personal information was being collected, how it was acquired, and how it was being used.

Two other objectives were identified at that time, specific to the drafting of the legislation. One was to keep the tax as simple as possible. The second was to make certain the new law did not violate the 1998 Internet Tax Freedom Act or the commerce or due process clauses of the U.S. Constitution. S.4959, as drafted, meets both these objectives. Other states following in New York’s footsteps would also need to draft their legislation to preclude any argument that it violated ITFA or the due process or commerce clauses. Foreign nations would not be constrained in this way. With that reminder to state tax policymakers in place, this article, which is intended for an international audience, goes on to offer some policy choices that are not available to the states, without repeating the caveat each time.

II. Overview: The Virtues of S.4959

In March 2021 Tax Notes published an article entitled “Taxing Big Data: The Severance Tax Model.”4 That article called attention to S.4959, which imposes an excise tax on the collection of data about New York consumers, the culmination of the work begun in the early days of the pandemic.

Sometimes referred to as a “data mining” tax, S.4959 adopts a novel approach. It treats New York consumers’ personal data as a valuable commodity, like oil or precious metals, and establishes an excise tax, not so different from a typical severance tax. The data collector is in effect “extracting” personal information from New York consumers for commercial use. The calculation of the monthly tax is based on the number of New York consumers about whom a business collects data that month. The greater the number of New York consumers, the greater the tax liability.

S.4959 has several strengths. One crucial characteristic is its simplicity. Taxpayers need not calculate either the volume or the value of the data collected from consumers to compute their tax. They simply count the number of distinct New York consumers about whom information is collected monthly and apply the applicable rate. Moreover, because the tax applies only to consumers whose primary residence is in New York, perennial problems like sourcing rules and apportionment formulas do not come into play. Also, businesses with a significant internet presence track and report their number of “unique monthly visitors” or “average monthly users”5 as an indicator of their reach into the consumer population. These monthly statistics provide a huge head start for forecasting revenues, filing returns, and monitoring compliance.

Another virtue of the data-mining tax model is that the basic framework can easily be tailored to suit specific policy goals. Data collectors subject to tax can be broadly defined, as in S.4959,6 or narrowly limited to a few business models — for example, social media sites, marketplace platforms, and search engines. The imposition of the tax can easily be limited to larger businesses by establishing a high threshold amount of consumers whose data is collected before the tax is imposed. Moreover, the rate structure can be fixed or graduated, with the rates and brackets chosen to meet a variety of concerns, including overall revenues and the distribution of the tax burden.

One more advantage of a data-mining tax is that the amount of tax can be expressed on a per consumer basis. That is, after the tax liability is calculated, it can be divided by the total number of consumers to arrive at an average — for example, $4 per consumer per year.

In New York, quite significant revenues can be raised at a very modest price paid by data collectors per consumer. While New York consumers may feel shortchanged when it is disclosed that the biggest data collectors would pay less than $4 a year for their data under S.4959, if the taxpayer paying $4 per consumer collects data on 15 million consumers, the tax liability is $60 million. The taxpayer may balk, but at less than $4 per year per consumer, the number is easily defended based on the value of that data to the data collector.

S.4959 also responds to a broader issue about taxing data — the limitations of using income-based taxes to get internet heavyweights to pay their fair share of the costs of maintaining a civil society. The crux of the issue is that in recent years internet companies at the top of the food chain have been amassing more and more data, which in turn has resulted in the accumulation of enormous wealth and the ongoing concentration of economic power. That wealth is not tied to current income, so taxes on income may prove ineffective at capturing and taxing this growth. S.4959 would directly tax the wealth-producing activity of amassing data. This advantage of taxing data directly could prove to be the most persuasive argument of all in favor of a data-mining tax.

S.4959 also establishes two important premises underpinning the taxation of data. First, it makes clear that the state recognizes the commercial value of the data consumers provide to internet companies in return for free access to their websites. It also stakes out the position that New York consumers have an ongoing interest in their personal data wherever a transaction may occur and that the state has the authority to legislate regarding that data.7

The response to S.4959 from academics in the United States whose focus is state and local taxes has been overwhelmingly favorable,8 and tax professionals representing the digital services industry in state tax matters have begun to display a grudging respect.9 However, interest in S.4959 has not been limited to the state and local tax community. International tax policy experts have also taken note of it, most notably Reuven Avi-Yonah, Christine Kim, and Karen Sam in a forthcoming article in the Harvard International Law Journal, “A New Framework for Digital Taxation.”10 The article advocates replacing digital services taxes, a major obstacle to progress on pillar 1, with data excise taxes.

S.4959, as the authors acknowledge, was one of two proposals that were central in shaping their own data excise tax. Ultimately, however, they rejected a key feature of S.4959 — I believe mistakenly — in favor of an alternative means of calculating tax liability. I will highlight their criticism and respond to it in section III.D.

The potential role of data-mining taxes should not be seen as limited to the replacement of DSTs. Developing countries short on technical infrastructure and experienced staff should find a simple excise tax a much-needed, practical, and efficient way to raise revenue. Countries weaning themselves from heavily taxed fossil fuel products could replace some of that revenue with a data-mining tax, citing data as the successor to oil as the most valuable asset of the most profitable companies on the planet. Most significantly, tax policymakers should carefully consider the imposition of a data excise tax as necessary to tap into the accumulation of enormous wealth resulting from the accumulation of enormous amounts of data. Data is a valuable commodity, with the biggest collectors continually extracting it — and unlike oil, the supply of data is inexhaustible and environmentally benign.

III. S.4959: A Closer Look

A detailed description of S.4959 follows. Some of the provisions described are essential to the structure of a data-mining tax. Others reflect choices made by policymakers in New York from available options. Where choices were made, the available options are noted.

A. Activities Subject to Tax: Collection of Data

S.4959 taxes only the collection of data, not its storage or sale or use.11 This is an immutable feature of the tax. Limiting the imposition of the tax to one activity is essential to avoid multiple taxation of related activities, and collection is the obvious choice as the activity that identifies the source of the data. It is also the activity tracked by industry in reporting average monthly users and unique monthly visitors.

B. Taxpayers: Commercial Data Collectors

The tax is imposed on “commercial data collectors,”12 which are for-profit entities that collect consumer data, other than basic consumer contact information, on more than 1 million New York consumers within a month.

While most of the largest commercial data collectors are internet-based, many others — including big-box retailers, credit card companies, and multimedia entertainment and news organizations — engage in business both on and off the internet. S.4959 explicitly includes business activity that does not take place on the internet.13 To do otherwise would run the risk of violating the ITFA, which prohibits states from taxing digital goods and services more heavily than their real-world counterparts.

The for-profit qualifier excludes, for example, political parties, educational institutions and organizations, religious and charitable organizations, and advocacy groups.

The definition of commercial data collector in S.4959 is extremely broad,14 an intentional design feature — it is easier to exclude entities from the initial group in a rewrite than to add new ones. Policymakers are essentially invited to take the S.4959 definition as a starting point and then carve out exemptions as they see fit. An exemption might be warranted for news organizations, for example. Alternatively, the tax could only apply to a small number of enumerated activities — social media websites, search engines, and platform marketplaces are likely candidates. Clear definitions distinguishing these business activities from others can be found in existing DST statutes.15

C. Consumer Data

Consumer data is defined more narrowly in S.4959 than the term is understood in general use.16 It includes data connected with a particular individual, that is, information that identifies, relates to, describes, or could reasonably be associated with or linked to a specific New York consumer. The term does not include aggregate data: data separated from identifying information that is combined with similar data to provide statistical information about a specified population, such as average age, income level, and education. To qualify as aggregate data, the data must be stripped of identifying information before any sale or use takes place.

Another category of data excluded from consumer data is basic consumer contact information, which encompasses name, home address, mailing address, email, telephone, and fax as well as payment information a vendor generally requires to complete a basic sales transaction.17 The intent here is to exclude data that is acquired by small sellers limited in their use of technology in basic sales transactions. The exclusion may be unnecessary given a reasonable threshold before the tax is imposed, and it may be open to abuse by big taxpayers that sell bare-bones data to third parties.

Consumer data includes not only data collected directly from the consumer but also data acquired from other sources,18 one of the circumstances in which the analogy to a severance tax breaks down. The reason for including data collected from third parties is straightforward. The data about New York consumers purchased from third parties is no less a New York resource than the data obtained from the consumers themselves. The tax is imposed on the collector based on the total number of unique New York consumers about whom the collector acquires data.19 Auditors may be required to dig for information regarding the existence of third-party contracts and the number of new consumers added as a result. That said, the very biggest data users rely heavily on getting their data directly from the consumers visiting their websites.
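The head count described above, in which a consumer is counted once whether the data comes directly from the consumer or from a third party, can be sketched in a few lines. This is a minimal illustration rather than statutory language: the function name and the consumer identifiers are hypothetical, and it assumes each consumer can be matched across sources by a single identifier.

```python
from typing import Iterable, Set


def unique_consumer_count(direct: Iterable[str], third_party: Iterable[str]) -> int:
    """Count distinct consumers across directly collected and purchased data.

    Assumption (not from S.4959): each consumer is matched across
    sources by one hypothetical identifier string.
    """
    consumers: Set[str] = set(direct)
    consumers.update(third_party)  # a consumer seen in both sources counts once
    return len(consumers)


# Hypothetical identifiers: "ny-003" appears in both sources.
direct_ids = ["ny-001", "ny-002", "ny-003"]
purchased_ids = ["ny-003", "ny-004"]
print(unique_consumer_count(direct_ids, purchased_ids))  # 4
```

Only consumers who appear solely in the purchased data add to the count, which is why auditors would care about how many new consumers a third-party contract brings in.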

The examples below illustrate the tax consequences of common transactions involving big data.

1. Example 1

Atom Inc. collects data and contracts for the storage of that data with Ballco. Because Ballco is providing a service for Atom and has no right to use the data other than to perform the service for which it is contracted, Ballco is not a commercial data collector.

2. Example 2

Ableco collects data and contracts with Buzzco to analyze it. Buzzco is not a commercial data collector. Like Ballco in Example 1, Buzzco is providing a service to Ableco, and has no right to use the data for any purpose other than to analyze it as contracted.

A different result occurs under the next fact pattern.

3. Example 3

The Acme cable television company provides cable services to 2 million customers in New York. It routinely amasses consumer data about its customers, including, for example, information about subscribers who pay an amount over and above the basic monthly fee for access to additional programming. Acme analyzes the data for its own business purposes and sells access to this consumer data to Cazco under a multiyear contract. Under these facts, both Acme and Cazco qualify as commercial data collectors for the data originally collected from New York consumers by Acme. Acme could in fact sell the data to multiple parties for nonexclusive use. Here again the analogy to a severance tax breaks down, but the better policy is to tax both Acme and Cazco as data collectors. The same data, unlike a barrel of oil, can be in multiple places at the same time.

4. Example 4

Ajax, a content provider that relies entirely on subscriptions for revenue, collects data from Mr. and Mrs. Jones that is stripped of their personal identification information and used in developing aggregate statistics. As long as the data collected consists solely of data stripped and aggregated by Ajax before a sale, it does not constitute consumer data.

5. Example 5

Ahab, a big-box retailer, maintains a modest amount of data about its customers, including a history of their purchases and returns over the years. The data constitutes consumer data.

6. Policy Choices

While the issue of multiple taxation of the same data may be raised, the argument that this constitutes a problem that needs a fix is unpersuasive. As noted, any number of companies will have the same data on file about the same consumer at the same time. That circumstance has no impact on the industry’s calculation of unique monthly visitors and should not alter the calculation of the number of consumers about whom a collector gathers data.

The exclusion for basic contact information should be reviewed and potentially repealed.

D. The Tax Liability Calculation

The measurement of the data-mining tax is another of its immutable features — the key, one might argue, to its simplicity and practicality.20 Measuring the tax based on the number of consumers about whom data is collected best reflects the way the industry itself measures the value of the data collected. The digital advertising industry maintains a statistic known by one of two names — “unique monthly visitors” or “average monthly users.” Both translate to the number of different consumers on whom a data collector collects data over the course of a month. This figure represents the reach of the data collector’s website to consumers. The higher the number of average monthly users, the more the data collector can charge for advertising on its website.

The existence and availability of this industry statistic greatly facilitates compliance and enforcement. It is a relatively easy matter to calculate the tax once the number of average monthly users is known. And because a data collector has a clear business incentive to report a higher rather than lower head count to advertisers, it has a strong incentive not to understate the number of consumers for tax purposes. Also, measuring the tax by head count fulfills the goal of assigning value to the consumer data collected by data collectors.

The alternatives to using consumer head count are the typical measures of an excise tax — the value of the commodity or the volume/weight of the commodity at the time of a transaction. For sound practical and policy reasons, neither of these two measures works in the context of consumer data. Imputing value to data in a transaction — for example, between YouTube and a user in which the consumer has not paid for the service provided and the service provider has not paid for the data it receives — would be an artificial exercise at best.

Similarly, taxing data based on volume would produce results that make little sense. Ten barrels of crude oil are worth approximately if not exactly half of what 20 barrels are worth — the amount of oil and its value are closely tied to each other. The same is not true of data, which often costs virtually nothing to acquire and is very inexpensive to store, and so huge amounts of it are collected and maintained, some of which proves to be of little value. Imposing a tax based on the volume of data would artificially link the total volume of data to its total value, producing arbitrary results.

Avi-Yonah, Kim, and Sam have nonetheless chosen to base their excise tax measurement on the volume of data, criticizing the use of a consumer head count as “neither fair nor efficient.” I will simply reiterate that the industry has chosen otherwise, using reach as measured by head count to make everyday decisions about where to advertise and how much to pay for that advertising. If there were a better indicator of the value of data than head count, industry would already be making use of it.21

E. New York Consumer

A New York consumer is defined as a consumer whose primary residence is in New York state, as determined under the state’s personal income tax rules.22 S.4959 creates a rebuttable presumption that a consumer is a New York consumer if information on record with, or available to, a commercial data collector indicates a New York state home address, a New York mailing address, or an internet protocol address connected with a New York location.23 A credit is available to a taxpayer if another state with an equivalent tax determines that a consumer considered to be a New York consumer under New York law is a resident of the other state.24 The credit prevents double taxation and precludes any claim of discrimination against interstate commerce.

The number of unique monthly visitors is generally calculated at the national, not state, level. In light of this limitation, the law specifically allows the tax department and taxpayers to agree on a methodology to determine the number of New York consumers other than calculating a precise head count.25 One likely methodology would be to calculate the New York consumer head count by multiplying a national figure by New York’s percentage of the population, approximately 6 percent.
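The population-share methodology mentioned above amounts to a single multiplication. The sketch below is illustrative only: the function name and the 250 million national figure are hypothetical, the 6 percent share is the approximate figure given in the text, and any methodology actually used would be whatever the tax department and the taxpayer agreed on.

```python
def estimate_ny_consumers(national_monthly_users: int,
                          ny_population_share: float = 0.06) -> int:
    """Estimate New York head count from a national unique-visitor figure.

    Assumption: New York consumers are spread across a collector's
    national audience in proportion to the state's share of the population.
    """
    return round(national_monthly_users * ny_population_share)


# Hypothetical collector reporting 250 million national monthly users.
print(estimate_ny_consumers(250_000_000))  # 15000000
```

The share is left as a parameter because an agreed methodology could use a different percentage for a collector whose audience is not distributed like the general population.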

There is flexibility to discuss alternatives in determining a consumer’s home state. Rules specific to another country might dictate a change from the New York standard of primary residence, which has clear meaning in the New York tax law but may not in other jurisdictions. The availability of a credit allows for an expansive definition of resident as long as the country imposing the tax defers to the competing jurisdiction should residence be contested.

F. Rate Schedule and Threshold

The tax imposed under S.4959 is calculated using a graduated rate table,26 an unusual feature for an excise tax but a good choice in this instance. While I am not aware of any study on it, one argument made in favor of a graduated rate is that at higher levels of data collection, the value of 2X data is more than twice the value of 1X data. Also, while the volume of data does not enter into the calculation of a data-mining tax, there is anecdotal evidence that the companies with the greatest market reach also maintain the most data per consumer. A third argument in favor of a graduated rate is that policymakers are aware that the big players would bear a greater tax burden under a graduated rate structure and are comfortable directing an increased burden their way.

The threshold for being subject to tax is collecting data on more than 1 million New York consumers in a month, approximately 5 percent of the state’s population.27 This minimum requirement will knock out most businesses. Keeping smaller, less technologically savvy businesses out of the taxpayer base makes administration of the tax easier for the state and spares smaller businesses an additional tax burden.

The details of the rate structure above the 1 million consumer threshold reflect several factors:

  • New York’s population;

  • the number of average monthly users reported nationally by the most significant commercial data collectors;

  • a policy choice regarding the appropriate level of taxation of taxpayers from the low end to the high end of the liability range; and

  • a policy choice regarding the overall revenue target.

Under the rate structure,28 a taxpayer collecting data on just over 1 million consumers (say, 1,000,010) would owe 5 cents a month on just 10 consumers, or $6 for the year. The rate per month per consumer rises gradually, avoiding notch effects. At 10 million New York consumers, the tax is $2.70 per consumer per year, for a total liability of $27 million.
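A graduated, marginal-rate calculation of this kind can be sketched as follows. The bracket schedule in the code is an assumption, not a quotation of the bill's rate table: it starts at the 5 cents per month stated in the text and rises by 5 cents for each additional million consumers up to a 50-cent top rate, a pattern chosen because it reproduces the two figures given above ($6 at 1,000,010 consumers and $27 million at 10 million). The constant and function names are hypothetical, and section 186-h(3) of the bill should be consulted for the actual schedule.

```python
# Illustrative schedule (an assumption, not quoted from S.4959): the marginal
# monthly rate starts at 5 cents per consumer above 1 million and rises by
# 5 cents for each additional million consumers, topping out at 50 cents.
THRESHOLD = 1_000_000
BRACKET_WIDTH = 1_000_000
RATE_STEP_CENTS = 5
MAX_RATE_CENTS = 50


def monthly_tax_cents(consumers: int) -> int:
    """Monthly liability in cents under the illustrative marginal schedule."""
    tax = 0
    lower = THRESHOLD
    rate = RATE_STEP_CENTS
    while consumers > lower:
        if rate >= MAX_RATE_CENTS:
            # Top bracket: every remaining consumer is taxed at the top rate.
            tax += (consumers - lower) * MAX_RATE_CENTS
            break
        upper = lower + BRACKET_WIDTH
        # Only the consumers falling inside this bracket bear this rate.
        tax += (min(consumers, upper) - lower) * rate
        lower = upper
        rate += RATE_STEP_CENTS
    return tax


def annual_tax_dollars(consumers: int) -> float:
    """Annual liability in dollars: twelve monthly payments, cents to dollars."""
    return monthly_tax_cents(consumers) * 12 / 100


print(annual_tax_dollars(1_000_010))                 # 6.0
print(annual_tax_dollars(10_000_000))                # 27000000.0
print(annual_tax_dollars(10_000_000) / 10_000_000)   # 2.7
```

Because each rate applies only to the consumers within its bracket, crossing into a higher bracket never causes a jump in total liability, which is the sense in which a marginal schedule avoids notch effects.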

The threshold level is a particularly important policy tool. It determines the size and composition of the taxpayer population and plays a significant role in the distribution of the tax burden. The configuration of the rate and bracket structure has unlimited possibilities. For example, S.4959 has an unusually large number of brackets. Having more brackets minimizes notch effects without complicating compliance. Industry reports on average monthly users will help inform decisions on choosing rate schedules and thresholds to hit a specific revenue target.

IV. Conclusion

This article introduces a new tax construct — a data-mining tax — to policymakers as they continue to grapple with the appropriate income tax treatment of some of the world’s most successful corporations, creations of the digital economy whose greatest asset is the staggering amount of personal consumer data they continue to accumulate. As highlighted throughout this article, an excise tax on data is a better choice than a DST to complement pillar 1. Most significantly, unlike an income tax, a data-mining tax can generate revenue from the accumulation of personal consumer data by these companies that has so dramatically added to their wealth and power.

FOOTNOTES

1 New York’s financial condition changed dramatically with the passage of federal legislation providing generous financial assistance to the states. New York’s tax revenues exceeded forecasts for a time as well, and the state has been operating with a surplus. As a consequence, there has been little interest in acting on S.4959. It is likely, however, that this rosy financial picture will not last long.

2 There is less concern in New York than overseas about the amount of corporate income tax paid by the internet giants. The New York corporate income tax underwent sweeping reform in 2014 and 2015, and New York now imposes its taxing authority based on economic nexus and apportions income using a single-sales factor formula with customer-based sourcing. That is not to say the states are not vulnerable to international tax avoidance schemes. State corporate income taxes generally piggyback on the federal tax, using a line from the federal return as their starting point, leaving the states reliant on the IRS to address international tax concerns.

3 State and local sales taxes are not levied on the free access consumers are given to websites such as Google, Facebook, and YouTube. The access, of course, is not truly free. What’s really transpiring is a barter transaction, with free access exchanged for valuable personal data. The failure to capture sales tax on these barter arrangements leaves what should be a major contributor of revenue out of the sales tax base.

4 Robert D. Plattner, “Taxing Big Data: The Severance Tax Model,” Tax Notes State, Mar. 22, 2021, p. 1227. The term “severance tax” is used to describe a state tax imposed on the extraction of nonrenewable natural resources.

5 The two terms are used interchangeably. Both describe consumers who visit a particular website regularly.

6 S.4959, adding section 186-h of the Tax Law, section 186-h(2)(a).

7 For example, the state would exercise its taxing authority over Company A as a data collector when Company A, with no connection to New York, buys New York personal consumer data from Company B, which also has no connection to New York other than its ownership of New York consumer data. The state would assert that New York consumers retain an interest in their consumer data wherever it goes, and New York will enforce its consumer protection and tax laws regarding that data.

8 See, e.g., Peter Enrich et al., “The Maryland and New York Approaches to Taxing the Data Economy,” Tax Notes State, Apr. 12, 2021, p. 147; Andrew Appleby, “Subnational Digital Services Taxation,” 81 Md. L. Rev. 1, at 22-23 (2022).

9 Joe Crosby et al., “Served Up on a Plattner: A Response to Big Data Tax Proposals,” Tax Notes State, May 24, 2022, p. 817. The criticisms came in bunches: economic arguments like pyramiding, compliance concerns like sourcing, philosophical concerns that “a data tax is as crazy as a tax on oxygen,” and many more. The opposition strategy was articulated this way: “Bad ideas spread like a virus unless they are shut down quickly.”

By July 2022, the tone had changed some. See Lauren Loricchio, “New York’s Data Collection Proposal Raises Concerns,” Tax Notes Today State, July 14, 2022. After a discussion of the Maryland gross receipts tax on digital advertising revenue, the discussion turned to New York’s data excise tax. Doug Lindholm, executive director of the Council On State Taxation, told the audience, “The one that scares me the most is New York’s S.4959. The real problem here is if you look at the amount of the tax . . . it is a huge, huge dollar amount that this bill would generate.” Stephen Kranz of McDermott Will & Emery acknowledged that Plattner, who wrote the legislation, “was a very smart guy” who tried to avoid the ITFA issues and constitutional issues. Kranz concluded, “I’m not saying he got it right . . . but he tried to.”

10 Reuven Avi-Yonah, Young Ran (Christine) Kim, and Karen Sam, “A New Framework for Digital Taxation,” 63 Harv. Int’l L.J. 400 (2022).

11 S.4959, section 186-h(1).

12 S.4959, section 186-h(2)(a).

13 S.4959, section 186-h(1).

14 S.4959, section 186-h(2)(a).

15 See, e.g., U.K. Public General Acts 2020, c.14, part 2, section 43.

16 S.4959, section 186-h(2)(d). The states are limited in their policy options in this regard by ITFA.

17 S.4959, section 1, adding section 186-h(2)(c).

18 S.4959, section 186-h(1).

19 S.4959, section 186-h(4)(b).

20 S.4959, section 186-h(3).

21 The article by Avi-Yonah, Kim, and Sam, supra note 10, offers several additional criticisms, some their own and some citing other critics. The first drawback of the bill noted is its “extreme effectiveness” at raising revenue. This “flaw” is not inherent to a data-mining tax: the rate and bracket structure and the threshold can be calibrated to hit a lower revenue target. But why consider it a flaw? The primary goal of the legislation was to raise a lot of revenue from businesses that were thriving during the pandemic, without doing harm to those that were suffering. S.4959 might do that, and its rate structure can be defended as taxing data collectors less than $4 yearly for each consumer’s data.

It’s worth noting that, as discussed in note 3 supra, no sales tax is imposed on the free access to websites involved in barter transactions. If the state’s revenue agency imputed a value of $5 a month to access to Google’s website, the annual cost would be $60, and the sales tax paid at an 8 percent rate would be $4.80. If Google had 15 million users in New York, it would generate $72 million in sales tax revenue, about 20 percent more than S.4959. Indeed, the revenue agency could impute a value of $5 a month to the data transferred to Google in the barter arrangement. This would generate an additional $72 million in sales tax revenue. Just about any imposition of tax will sound big when it’s multiplied by 15 million.

The criticisms passed along from the Council On State Taxation and others are the usual objections, sometimes meritorious, offered whenever a tax increase on businesses is proposed. Business inputs should not be taxed; businesses will leave the jurisdiction or stop doing business there; compliance will be burdensome; the tax will be passed through to consumers. In this instance, none of the arguments are persuasive. Also, in case there is confusion, while I sometimes use the term “head count” to describe the way the data-mining tax is calculated, the tax is not a “head tax,” a term referring to a regressive tax that charges each taxpayer the same amount regardless of ability to pay.

22 Tax Law, article 22.

23 S.4959, section 186-h(4)(a).

24 S.4959, section 186-h(6).

25 S.4959, section 186-h(5).

26 S.4959, section 186-h(3).

27 Id.

28 S.4959, section 186-h(3).

END FOOTNOTES
