Estimating Firm Digitalization: A Method for Disaggregating Sector-level Digital Intensity to Firm-level

Mucha Tomasz, Seppälä Timo


The digital transformation of firms plays an increasingly important role in the economy and society. However, limited access to data on firm-level digital intensity impedes research on firm digitalization. To alleviate this problem, this paper proposes a method for estimating firm-level digital intensity from other, more readily available firm-level data combined with reference data on digitalization that is available at the sector level. The proposed method uses each firm's revenue breakdown by sector to compute sector revenue-weighted digital intensity scores, which are then used to classify firms into low, medium and high digital intensity groups. The output can be used directly in research on firm digitalization, a multifaceted phenomenon. Results from applying the method to an illustrative sample of large US and non-US firms (2000 observations in total) indicate that firm-level digital intensity can be efficiently estimated for large samples using data commonly available to researchers.
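
The weighting and classification steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' published source code: the sector names, the 0 to 2 sector score scale, and the cut-offs that separate the low, medium and high groups are hypothetical placeholders.

```python
from typing import Dict

# Hypothetical sector-level reference scores (assumed scale: 0 = low, 2 = high).
SECTOR_SCORES: Dict[str, float] = {"retail": 1.0, "software": 2.0, "mining": 0.0}


def firm_digital_intensity(revenue_by_sector: Dict[str, float]) -> float:
    """Return the sector revenue-weighted digital intensity score of one firm."""
    total = sum(revenue_by_sector.values())
    if total <= 0:
        raise ValueError("total revenue must be positive")
    # Each sector contributes its score weighted by its share of firm revenue.
    return sum(
        (revenue / total) * SECTOR_SCORES[sector]
        for sector, revenue in revenue_by_sector.items()
    )


def classify(score: float, low_cut: float = 2 / 3, high_cut: float = 4 / 3) -> str:
    """Map a score to a group; both cut-offs are assumptions for illustration."""
    if score < low_cut:
        return "low"
    if score < high_cut:
        return "medium"
    return "high"


# Hypothetical firm earning 60% of its revenue in retail and 40% in software.
firm = {"retail": 60.0, "software": 40.0}
score = firm_digital_intensity(firm)
print(round(score, 2), classify(score))
```

A firm active in a single sector simply inherits that sector's score, so the multi-sector case is what distinguishes the firm-level estimate from a plain sector lookup.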

The key differences between the proposed method and alternative methods are:

  • Recognition of the fact that firms might participate in more than one sector or industry, which partially explains within-sector heterogeneity in firm-level digital intensity. We found that 67.8% of large US firms and 78.6% of large non-US firms were engaged in more than one industry.
  • Use of reference sector-level digital intensity scores, which allows for rapid updates, application across geographies and time, and parallel calculation of multiple digital intensity scores, one for each reference data set. Furthermore, reference data can supplement existing firm-level data on digitalization.
  • Replicability of the method and reproducibility of the results through inclusion of the source code and availability of data through research and commercial databases.
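
Two of the points above, the share of firms active in more than one sector and the parallel calculation of scores for several reference data sets, can be illustrated with a short Python sketch. The firms, sector names, reference set names (`taxonomy_x`, `taxonomy_y`) and all scores are hypothetical and serve only to show the mechanics.

```python
from typing import Dict

# Hypothetical firm-level revenue breakdowns (firm -> sector -> revenue).
FIRMS: Dict[str, Dict[str, float]] = {
    "firm_a": {"retail": 100.0},
    "firm_b": {"retail": 50.0, "software": 50.0},
    "firm_c": {"mining": 30.0, "software": 70.0},
}

# Two hypothetical reference data sets with sector-level scores.
REFERENCES: Dict[str, Dict[str, float]] = {
    "taxonomy_x": {"retail": 1.0, "software": 2.0, "mining": 0.0},
    "taxonomy_y": {"retail": 0.5, "software": 1.0, "mining": 0.25},
}


def multi_sector_share(firms: Dict[str, Dict[str, float]]) -> float:
    """Return the fraction of firms with positive revenue in more than one sector."""
    multi = sum(
        1
        for breakdown in firms.values()
        if sum(1 for revenue in breakdown.values() if revenue > 0) > 1
    )
    return multi / len(firms)


def scores_by_reference(
    breakdown: Dict[str, float], references: Dict[str, Dict[str, float]]
) -> Dict[str, float]:
    """Return one revenue-weighted score per reference data set for a single firm."""
    total = sum(breakdown.values())
    return {
        name: sum((rev / total) * scores[sector] for sector, rev in breakdown.items())
        for name, scores in references.items()
    }


print(multi_sector_share(FIRMS))
print(scores_by_reference(FIRMS["firm_c"], REFERENCES))
```

Because the firm-level revenue shares are computed independently of the reference scores, swapping in an updated or alternative reference data set only requires adding another entry to `REFERENCES`.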

Read more in MethodsX (2021).

Publication information

Digital transformation, Digital taxonomy, IT intensity, Data disaggregation